Hi, I'm new to bioinformatics. I'm analyzing transposable elements (TEs) in RNA-seq data from Musa acuminata (banana), and I'm planning to use TEtranscripts. TEtranscripts needs its reference annotations (genome and TE) in GTF format. I can't find a banana annotation in GTF format, but I found the genome and TE annotations on the Banana Genome Hub in GFF3 format.
This is what the TE GFF3 file looks like:
##gff-version 3
##sequence-region chr01 1 29070452
chr01 RepeatExplorer repeat_region 10954 11572 . + . Name=Ma_chr01-te0111212;Note=LTR~ Gypsy~ Chromovirus~ Monkey;_Label=LTR-Gypsy-Chromovirus-Monkey;ID=Ma_chr01-te0111212
chr01 RepeatExplorer repeat_region 30672 30962 . + . Name=Ma_chr01-te0111213;Note=LTR~ Copia~ SireMaximus;_Label=LTR-Copia-SireMaximus;ID=Ma_chr01-te0111213
###
chr01 RepeatExplorer repeat_region 31004 31490 . + . Name=Ma_chr01-te0111214;Note=LTR~ Copia~ SireMaximus;_Label=LTR-Copia-SireMaximus;ID=Ma_chr01-te0111214
chr01 RepeatExplorer repeat_region 31594 31924 . + . Name=Ma_chr01-te0111215;Note=LTR~ Copia~ SireMaximus;_Label=LTR-Copia-SireMaximus;ID=Ma_chr01-te0111215
Are there any tools that can help me convert GFF3 to GTF? (I tried gffread, but it didn't work.) Also, since the TE GFF3 files are provided per chromosome, how can I merge them?
Thank you for sparing your time helping me :)
The TEtranscripts site, where you can find the annotated TE GTFs, includes a script (makeTEgtf.pl) for creating GTFs from various sources, for example from RepeatMasker files downloaded from UCSC.
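If makeTEgtf.pl doesn't accept the Banana Genome Hub layout directly, a small script can do the conversion. Below is a minimal Python sketch, not an official method. It assumes the GFF3 columns are tab-separated, that the TE GTF should use the "exon" feature with gene_id, transcript_id, family_id and class_id attributes (as the TE GTFs distributed with TEtranscripts appear to do), and that the Note field holds the classification as "class~ family~ ...". How the Note parts map onto class, family and gene_id is my own guess, so please check it against your data. The function names are hypothetical.

```python
def parse_attributes(field):
    """Turn a GFF3 attribute string 'key=value;key=value' into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gff3_to_te_gtf_line(line):
    """Return a GTF line for one GFF3 record, or None for comments/blank lines."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return None  # not a well-formed, tab-separated GFF3 record
    attrs = parse_attributes(cols[8])
    # Assumption: Note looks like "LTR~ Gypsy~ Chromovirus~ Monkey",
    # with the first part as the class and the second as the family.
    parts = [p.strip() for p in attrs.get("Note", "Unknown").split("~")]
    class_id = parts[0]
    family_id = parts[1] if len(parts) > 1 else parts[0]
    gene_id = "-".join(parts)  # assumption: full classification as TE name
    transcript_id = attrs.get("ID", gene_id)  # unique per TE copy
    cols[2] = "exon"
    cols[8] = (f'gene_id "{gene_id}"; transcript_id "{transcript_id}"; '
               f'family_id "{family_id}"; class_id "{class_id}";')
    return "\t".join(cols)
```

To convert a whole file, call gff3_to_te_gtf_line on each line and write out every result that is not None.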
Thank you! Do you happen to know how to merge gff files? Because I have 11 TE gff files (one for each chromosome). I think I need to merge them if I want to use them.
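Merging per-chromosome GFF3 files is mostly concatenation, since each record already carries its chromosome name in the first column. The only thing to watch is the "##gff-version 3" header, which should appear once at the top. Here is a rough Python sketch under that assumption; the function names are made up for illustration, and it keeps all other lines (including "##sequence-region" and "###") as they are.

```python
def merge_gff3_lines(files_lines):
    """Concatenate GFF3 contents, keeping a single ##gff-version header."""
    merged = ["##gff-version 3"]
    for lines in files_lines:
        for line in lines:
            line = line.rstrip("\n")
            # Skip blank lines and the repeated version header.
            if not line or line.startswith("##gff-version"):
                continue
            merged.append(line)
    return merged


def merge_gff3_files(paths, out_path):
    """Read each GFF3 file in order and write one merged file."""
    contents = []
    for path in paths:
        with open(path) as handle:
            contents.append(handle.readlines())
    with open(out_path, "w") as out:
        out.write("\n".join(merge_gff3_lines(contents)) + "\n")
```

You would call merge_gff3_files with the list of your 11 chromosome files, in chromosome order, and the name of the output file. The merged file can then be converted to GTF in one go.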