6.1 years ago
misterie
Hi,
I have combined two GFF files for my annotation. One GFF covers chromosomes 1-29 plus X, and the second covers the Y chromosome. I did the same with the genome (I combined two genome assemblies for my purposes).
The problem is that the ID and Parent fields of the two GFFs overlap: a feature on the Y chromosome has the same ID as a feature on chromosome 29:
Y Gnomon gene 2502499 2571410 . + . ID=gene32386;Dbxref=GeneID:100849399;Name=LOC100849399;gbkey=Gene;gene=LOC100849399;gene_biotype=protein_coding
and the same ID (gene32386) here:
29 Gnomon pseudogene 37912602 37922321 . - . ID=gene32386;Dbxref=GeneID:615840;Name=LOC615840;gbkey=Gene;gene=LOC615840;gene_biotype=pseudogene;pseudo=true
How can I fix this? Because of these duplicate IDs I cannot annotate my Y chromosome. Should I modify the ID and Parent fields in my GFF, or is there another way?
That would work, I suppose.
But won't modifying the IDs affect the results of the snpEff annotation? And how should I modify the IDs: should I change the number, or can I just add an extra character?
Adding an extra character is enough and probably the easiest. If you have already done downstream analysis with snpEff, I don't know what the change might imply. If it was using the IDs, I guess it would have complained about encountering duplicate IDs.
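For what it's worth, here is a minimal sketch of the "add an extra character" approach in Python. It assumes a tab-separated GFF3 file with nine columns and `key=value` attributes separated by `;`. The `_Y` suffix, the function names `suffix_ids` and `suffix_gff`, and the file names in the usage note are just my own placeholders, not anything snpEff requires. The important part is that both ID and Parent get the same suffix, so the gene/transcript/exon hierarchy stays intact.

```python
def suffix_ids(line, suffix="_Y"):
    """Append `suffix` to every ID= and Parent= value in one GFF3 line.

    Comment lines, blank lines, and lines without nine tab-separated
    columns are returned unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return line
    attrs = []
    for attr in cols[8].split(";"):
        key, sep, value = attr.partition("=")
        if sep and key in ("ID", "Parent"):
            # Parent may list several comma-separated parents.
            value = ",".join(v + suffix for v in value.split(","))
        attrs.append(key + sep + value)
    cols[8] = ";".join(attrs)
    return "\t".join(cols) + "\n"


def suffix_gff(in_path, out_path, suffix="_Y"):
    """Rewrite a whole GFF3 file, suffixing IDs and Parents on every feature line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(suffix_ids(line, suffix))
```

You would run something like `suffix_gff("Y.gff", "Y.renamed.gff")` on the Y chromosome file only, before concatenating it with the chromosome 1-29 + X file. Other attributes such as Dbxref and Name are left untouched, so the gene names should stay the same.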