Hey all, I used the following command to align my proteome to the genome:
exonerate --model protein2genome --showvulgar no --showalignment no --showtargetgff yes --percent 70 protein genome > output.gff
The gene_id value is written without quotes (" ") around it, so I wrote an awk script to add them, but I still run into problems when I try to use my GFF.
For example, when I try to convert it to FASTA with TransDecoder, I get this error:
Use of uninitialized value $type in string eq at Error, no gene_id at Chr03 exonerate:protein2genome:local exon 10813108 10813326 . + . insertions 0 ; deletions 0 ; identity 89.04 ;
A gene_id is present only on records of type gene.
Another example is when I try to convert it to genePred format using the UCSC tool:
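To make the missing gene_id problem concrete, here is a minimal Python sketch of what I think a fix would have to do: carry the gene_id from each gene record down to the exon and cds records that follow it. It assumes the columns are tab-separated and that records belonging to a gene come directly after its gene line. The function name `add_gene_ids`, the reuse of gene_id as transcript_id, and the renaming of `cds` to `CDS` are my own assumptions, not something exonerate or TransDecoder prescribes. It also does not check that gene_id values are unique across the whole file.

```python
import re

# Matches gene_id with or without quotes around the value.
GENE_ID_RE = re.compile(r'gene_id\s+"?([^";\s]+)"?')

# Features to keep; assumption: downstream GTF tools expect upper-case "CDS".
KEEP = {"gene": "gene", "exon": "exon", "cds": "CDS"}


def add_gene_ids(lines):
    """Return GTF-style lines in which every kept record carries the
    gene_id of the most recent 'gene' record above it."""
    out = []
    current_id = None
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip comments, log lines, and anything that is not a GFF record.
        if (line.startswith("#") or len(fields) < 8
                or not (fields[3].isdigit() and fields[4].isdigit())):
            continue
        fields += [""] * (9 - len(fields))  # cds rows may have no column 9
        feature = fields[2]
        if feature == "gene":
            match = GENE_ID_RE.search(fields[8])
            current_id = match.group(1) if match else None
        if feature not in KEEP or current_id is None:
            continue
        fields[2] = KEEP[feature]
        # Hypothetical choice: reuse the gene_id as the transcript_id.
        fields[8] = f'gene_id "{current_id}"; transcript_id "{current_id}";'
        out.append("\t".join(fields[:9]))
    return out
```

Usage would be something like `print("\n".join(add_gene_ids(open("output.gff"))))`, redirected to a new file.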
gtfToGenePred -allErrors gtf_input output.gp
I get:
Word count less than 8 Bad line 1 of file.gtf:
I also tried switching between GTF and GFF and rerunning.
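The "Word count less than 8" message makes me suspect that line 1 of my file is not a proper tab-separated record, for example a log line at the top of the exonerate output or columns that ended up separated by spaces after my awk step. That is only a guess. A small Python check like the following sketch would list such lines; the function name `find_bad_lines` and the threshold of 9 columns (the number of columns GTF defines) are my assumptions.

```python
def find_bad_lines(lines, min_fields=9):
    """Return (line_number, field_count) for every non-comment line that
    has fewer than min_fields tab-separated fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Ignore blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        count = len(line.rstrip("\n").split("\t"))
        if count < min_fields:
            bad.append((number, count))
    return bad
```

Running `find_bad_lines(open("file.gtf"))` should show whether line 1 is among the offenders and how many fields it really has.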
Part of my gtf file:
Chr03 exonerate:protein2genome:local gene 18514887 18517034 2101 - . gene_id "1" sequence sp|P12459|TBB1_SOYBN gene_orientation + identity 88.84 similarity 95.22
Chr03 exonerate:protein2genome:local cds 18516641 18517034 . - .
Chr03 exonerate:protein2genome:local exon 18516641 18517034 . - . insertions 0 ; deletions 0 ; identity 86.26 ; similarity 93.13
Chr03 exonerate:protein2genome:local splice5 18516639 18516640 . - . intron_id 1 ; splice_site "GT"
Chr03 exonerate:protein2genome:local intron 18515928 18516640 . - . intron_id 1
Chr03 exonerate:protein2genome:local splice3 18515928 18515929 . - . intron_id 0 ; splice_site "AG"
Chr03 exonerate:protein2genome:local cds 18515658 18515927 . - .
Chr03 exonerate:protein2genome:local exon 18515658 18515927 . - . insertions 0 ; deletions 0 ; identity 93.26 ; similarity 95.51
Chr03 exonerate:protein2genome:local splice5 18515656 18515657 . - . intron_id 2 ; splice_site "GT"
Chr03 exonerate:protein2genome:local intron 18515546 18515657 . - . intron_id 2
Chr03 exonerate:protein2genome:local splice3 18515546 18515547 . - . intron_id 1 ; splice_site "AG"
Chr03 exonerate:protein2genome:local cds 18514887 18515545 . - .
Chr03 exonerate:protein2genome:local exon 18514887 18515545 . - . insertions 0 ; deletions 0 ; identity 88.58 ; similarity 96.35
Chr03 exonerate:protein2genome:local similarity 18514887 18517034 2101 - . alignment_id 1 ; Query sp|P12459|TBB1_SOYBN ; Align 18517035 1 393 ; Align 18515926 133 267 ; Alig
Any kind of help is highly appreciated.
You should post part of your GFF file, as the problem seems to come from there.
Thank you for the reminder and for the wonderful work with the website. I updated the post 🤞🏻
Consider posting it as inline text, not a screenshot.
Thanks for the suggestion, I updated the post 🤞🏻