4.5 years ago
tianshenbio
I have a gff file like this:
Bany_Scaf24 maker gene 41357 46444 . + . ID=Bany_09696;Name=Bany_09696;Alias=maker-Bany_Scaf24-exonerate_est2genome-gene-0.0;
Bany_Scaf24 maker mRNA 41357 46444 . + . ID=Bany_09696-RA;Parent=Bany_09696;Name=Bany_09696-RA;Alias=maker-Bany_Scaf24-exonerate_est2genome-gene-0.0-mRNA-1;_AED=0.00;_QI=277|1|1|1|0|0|2|35|76;_eAED=0.00;score=1829;
Bany_Scaf24 maker exon 46189 46444 . + . ID=Bany_09696-RA:1;Parent=Bany_09696-RA;
Bany_Scaf24 maker exon 41357 41643 . + . ID=Bany_09696-RA:2;Parent=Bany_09696-RA;
Bany_Scaf24 maker five_prime_UTR 41357 41633 . + . ID=Bany_09696-RA:five_prime_utr;Parent=Bany_09696-RA;
Bany_Scaf24 maker CDS 41634 41643 . + 0 ID=Bany_09696-RA:cds;Parent=Bany_09696-RA;
Bany_Scaf24 maker CDS 46189 46409 . + 2 ID=Bany_09696-RA:cds;Parent=Bany_09696-RA;
Bany_Scaf24 maker three_prime_UTR 46410 46444 . + . ID=Bany_09696-RA:three_prime_utr;Parent=Bany_09696-RA;
B
I also have the genome FASTA file. How do I generate two FASTA files, one for 'gene' features and one for 'three_prime_UTR' features, based on the coordinates?
Have you looked into this toolbox: https://github.com/NBISweden/AGAT ? It likely has a function that does this.
You should search Biostars for questions like this one, since it has been asked and answered before. Use Google (or your favorite search engine) to search externally.
How to extract exon sequences from annotated genome
Extracting transcript sequences with gene name (gffread)
http://ccb.jhu.edu/software/stringtie/gff.shtml#gffread
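If you would rather not install a toolkit, the extraction can also be sketched in plain Python. This is only a minimal illustration, not how AGAT or gffread do it. It assumes the GFF is tab-separated (as GFF3 requires, even though the pasted lines show spaces), that coordinates are 1-based and inclusive, and that the genome fits in memory. The function names and the file names in the trailing comments (`genome.fa`, `annotation.gff`) are made up for this example. Each matching GFF row is written as its own record, so a UTR split across several exons would come out as several sequences.

```python
from collections import defaultdict

# Translation table used to complement minus-strand features.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Return {sequence_id: sequence} from an iterable of FASTA lines."""
    chunks, name = defaultdict(list), None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]  # ID is the first word of the header
        elif line and name is not None:
            chunks[name].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


def extract_features(gff_lines, genome, feature_type):
    """Yield (header, sequence) for every GFF row whose type matches."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature_type:
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # GFF is 1-based inclusive; Python slices are 0-based, end-exclusive.
        seq = genome[seqid][start - 1:end]
        if strand == "-":
            seq = seq.translate(COMPLEMENT)[::-1]  # reverse complement
        feature_id = attrs.get("ID", feature_type)
        yield f"{feature_id} {seqid}:{start}-{end}({strand})", seq


def write_fasta(records, path, width=60):
    """Write (header, sequence) pairs to a FASTA file, wrapping lines."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(f">{header}\n")
            for i in range(0, len(seq), width):
                out.write(seq[i:i + width] + "\n")


# Example usage (hypothetical file names):
#   with open("genome.fa") as fa:
#       genome = read_fasta(fa)
#   for ftype in ("gene", "three_prime_UTR"):
#       with open("annotation.gff") as gff:
#           write_fasta(extract_features(gff, genome, ftype), f"{ftype}.fa")
```

The coordinates are included in each header because MAKER-style IDs such as `Bany_09696-RA:cds` can repeat across rows, so the ID alone does not always identify a record.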