Annotation explanation

  • pubmed_id
  • genbank_id
  • seq_length: length of the sequenced region; Intact: >= 7 kb
  • 5LTR_length: 5’ LTR length (HXB2: 1-634)
  • TAR_stem_loop_insertion: insertions in TAR stem loop (HXB2: 453-513)
  • TAR_stem_loop_deletion: deletions in TAR stem loop
  • polyA_insertion: insertions in PolyA region (HXB2: 527-532)
  • polyA_deletion: deletions in PolyA region
  • u5_gag_pair_R1_insertion: insertions in the region of U5 that pairs with gag start AUG (HXB2: 559-569)
  • u5_gag_pair_R1_deletion: deletions in the region of U5 that pairs with gag start AUG
  • first_half_pbs_insertion: insertions in the first 8 bases of PBS (HXB2: 636-644)
  • first_half_pbs_deletion: deletions in the first 8 bases of PBS
  • MSD_status: mutations or deletions in the major splice donor site (HXB2: 744-745); Intact: no deletions, no mutations
  • packing_insertion: insertions in the packaging signal region (HXB2: 695-810)
  • packing_deletion: deletions in the packaging signal region; Intact: deletions < 8 nucleotides
  • u5_gag_pair_R2_insertion: insertions in gag start AUG region (HXB2: 788-798) that pairs with U5
  • u5_gag_pair_R2_deletion: deletions in gag start AUG region (HXB2: 788-798); Intact: deletion < 3 nucleotides
  • 2base_before_gag_deletion: deletions in the 2 bases before gag AUG (HXB2: 788-789); Intact: no deletions
  • gag_start_AUG: (HXB2: 790-792); Intact: no deletions, no mutations
  • gag_insertion: insertions in gag (HXB2: 790-2292); Intact: < 50 nucleotides
  • gag_deletion: deletions in gag; Intact: < 50 nucleotides
  • gag_premature_stop: out of frame stop codons in gag; Intact: no frameshift, no premature stop codons
  • pol_insertion: insertions in pol (HXB2: 2085-5096); Intact: < 50 nucleotides
  • pol_deletion: deletions in pol; Intact: < 50 nucleotides
  • pol_premature_stop: out of frame stop codons in pol; Intact: no frameshift, no premature stop codons
  • env_insertion: insertions in env (HXB2: 6225-8795); Intact: < 150 nucleotides
  • env_deletion: deletions in env; Intact: < 100 nucleotides
  • env_premature_stop: out of frame stop codons in env; Intact: no frameshift, no premature stop codons
  • rre_status: deletions in RRE region (HXB2: 7710-8061); Intact: > 233 nucleotides in length and key domain deletions < 10 nucleotides (O’Carroll, IP, 2017, JV, 91:e00746-17)
  • 3LTR_length: 3’ LTR length (HXB2: 9086-9719)
  • two_LTR_similarity: the similarity between the 5’ and 3’ LTRs. The two LTRs are compared only when both are longer than 250 bp. If no significant similarity is found, “no similarity” is shown.
  • Inferred_intactness: intact only if every region specified above is intact. Accessory genes are not currently used in inferring intactness.
  • sequence_completeness: whether the sequence covers the span from the start of the 5’ LTR R region to the end of the 3’ LTR U3 region, regardless of any deletions between these two ends
  • Infectious: experimental evidence of provirus replication reported by the authors
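
The size thresholds in the list above can be sketched as a simple check. This is only an illustration, not the tool's actual implementation: the record layout (a dictionary of integer nucleotide counts keyed by field name), the function name failed_criteria, the reading of 7 kb as 7000 nucleotides, and the treatment of a missing field as 0 are all assumptions. Only the fields with a numeric threshold are covered; status fields such as MSD_status, gag_start_AUG, the premature-stop fields, and rre_status are left out.

```python
# Hypothetical sketch: the thresholds are taken from the field list above,
# but the record layout (dict of integer nucleotide counts) is an assumption.

# Exclusive upper bounds, in nucleotides, for a region to count as intact.
MAX_INDEL = {
    "packing_deletion": 8,
    "u5_gag_pair_R2_deletion": 3,
    "2base_before_gag_deletion": 1,  # "no deletions"
    "gag_insertion": 50,
    "gag_deletion": 50,
    "pol_insertion": 50,
    "pol_deletion": 50,
    "env_insertion": 150,
    "env_deletion": 100,
}

# "Intact: >= 7 kb", read here as 7000 nucleotides (an assumption).
MIN_SEQ_LENGTH = 7000


def failed_criteria(record):
    """Return the names of the checked fields that fail their threshold."""
    failed = []
    if record.get("seq_length", 0) < MIN_SEQ_LENGTH:
        failed.append("seq_length")
    for field, limit in MAX_INDEL.items():
        # A missing field is treated as 0 nucleotides (an assumption).
        if record.get(field, 0) >= limit:
            failed.append(field)
    return failed


example = {"seq_length": 9012, "gag_deletion": 12, "env_deletion": 120}
print(failed_criteria(example))  # ['env_deletion']
```

An empty result means that none of the checked size criteria failed; it does not by itself establish Inferred_intactness, which also depends on the fields not covered here.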

If there are any questions, please contact Wei Shao at rid@mail.nih.gov

© RID. Supported by Advanced Biomedical Computational Science (ABCS) at Leidos/FNLCR.