IGV tracks gene annotation
6.5 years ago
bitpir ▴ 250

Hi there, I'm trying to visualize a reference genome in IGV and annotate the genes using a custom-made GFF file. A snapshot of the IGV view is attached below. I am particularly curious about the pink-labeled track. The GFF for the pink track looks something like this:

NC_023010.2 glimmer cds 5011 3686 3.08 - 1 orf00005;

NC_023010.2 glimmer cds 5052 5264 0.82 + 2 orf00006;

NC_023010.2 glimmer cds 5637 6800 3.11 + 2 orf00007;

As you can see, orf00005 is labeled differently from the others. Does anyone know why? Is it a partial gene?

[Screenshot of the IGV window, taken 2018-05-04 at 4:27:59 PM]

Thanks for the help!

IGV gene GFF • 4.2k views
6.5 years ago
h.mon 35k

Probably IGV doesn't like the Glimmer gff, as it does not conform to the specification:

Columns 4 & 5: "start" and "end"

[...] Start is always less than or equal to end.

For orf00005, start > end.
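A quick way to find every offending line is to scan the file and compare columns 4 and 5. This is only a minimal sketch: the function name `find_inverted_features` is made up for illustration, and it assumes the columns are separated by whitespace (as in the lines pasted above) with no spaces inside the first five columns.

```python
def find_inverted_features(lines):
    """Return (line_number, start, end) for every feature with start > end.

    Hypothetical helper, not part of IGV or Glimmer. Assumes the first
    five columns contain no internal spaces.
    """
    inverted = []
    for line_no, line in enumerate(lines, start=1):
        # Skip blank lines and GFF comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(None, 8)
        if len(fields) < 5:
            continue
        try:
            start, end = int(fields[3]), int(fields[4])
        except ValueError:
            # Columns 4 and 5 are not integers; not a feature line.
            continue
        if start > end:
            inverted.append((line_no, start, end))
    return inverted


if __name__ == "__main__":
    example = [
        "NC_023010.2 glimmer cds 5011 3686 3.08 - 1 orf00005;",
        "NC_023010.2 glimmer cds 5052 5264 0.82 + 2 orf00006;",
        "NC_023010.2 glimmer cds 5637 6800 3.11 + 2 orf00007;",
    ]
    for line_no, start, end in find_inverted_features(example):
        print(f"line {line_no}: start {start} > end {end}")
```

On the three lines from the question, only the orf00005 line is reported.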


Hmm, I don't think that's the problem, because there are other ORFs that go in the reverse direction too (HISP_RS15385). I found this answer on another site (https://biology.stackexchange.com/questions/68431/clarification-on-refseq-genes-track-on-igv): the thinner line is supposed to be an untranslated region. Now I have to figure out why this feature is drawn that way while the features in the other tracks are treated as translated regions.


Whether an ORF "goes in the reverse direction" has nothing to do with the start and end coordinates; direction is indicated by the strand column:

Column 7: "strand"

The strand of the feature. + for positive strand (relative to the landmark), - for minus strand, and . for features that are not stranded. In addition, ? can be used for features whose strandedness is relevant, but unknown.

The feature you indicated (HISP_RS15385) is on the minus strand, as is orf00005, hence both have left-facing arrows. However, if you look at the GFF, its start coordinate is less than its end coordinate (36612 < 37661):

NC_023010.2 RefSeq  gene    36612   37661   .   -   .   ID=gene-HISP_RS15385;Dbxref=GeneID:23802828;Name=HISP_RS15385;gbkey=Gene;gene_biotype=protein_coding;locus_tag=HISP_RS15385;old_locus_tag=HISP_16005
NC_023010.2 Protein Homology    CDS 36612   37661   .   -   0   ID=cds35;Parent=gene-HISP_RS15385;Dbxref=Genbank:WP_014030602.1,GeneID:23802828;Name=WP_014030602.1;gbkey=CDS;inference=COORDINATES: similar to AA sequence:RefSeq:WP_014030602.1;product=radical SAM protein;protein_id=WP_014030602.1;transl_table=11
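
If the Glimmer-derived file is to be repaired rather than regenerated, one possible approach is to swap columns 4 and 5 wherever start > end and leave the strand column untouched. The sketch below is an assumption-laden illustration, not a complete converter: `normalize_coordinates` is a hypothetical name, the input is assumed to be whitespace-separated with no spaces inside the first eight columns (so it would not handle a source column such as "Protein Homology"), and it does not touch the phase or attribute columns, which may need separate attention.

```python
def normalize_coordinates(line):
    """Return the feature line, tab-separated, with start <= end.

    Hypothetical helper. Only columns 4 and 5 are changed; strand,
    phase and attributes are passed through as they are.
    """
    stripped = line.rstrip("\n")
    fields = stripped.split(None, 8)
    # Leave anything that does not look like a feature line unchanged.
    if len(fields) < 8 or stripped.startswith("#"):
        return stripped
    start, end = int(fields[3]), int(fields[4])
    if start > end:
        # Swap so that start is the smaller coordinate; direction is
        # already recorded in the strand column (column 7).
        fields[3], fields[4] = str(end), str(start)
    return "\t".join(fields)


if __name__ == "__main__":
    fixed = normalize_coordinates(
        "NC_023010.2 glimmer cds 5011 3686 3.08 - 1 orf00005;"
    )
    print(fixed)
```

Lines whose coordinates are already in order come back unchanged apart from the tab separators.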

I see! Got it, thank you so much for pointing that out. It's really weird that the Glimmer GFF uses that kind of format. I'll recheck the formatting of that file. Thanks!
