How to get CDS.fasta from gff file + sequenceFile.fasta
9.6 years ago
moranr ▴ 290

I have two files:

  1. An annotation file in gff format
  2. A FASTA-formatted sequence file for a whole genome.

I want to produce a file containing the CDS sequences: myGenomeCDS.fasta

How can I do this with Python or with existing tools?

Thanks

sequence python gff annotation • 7.3k views
9.6 years ago
iraun 6.2k

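If your annotation file (GFF or GTF) includes CDS features, you can extract the CDS sequences with gffread. Here -g is the genome FASTA and -x is the output file that receives the spliced CDS of each transcript: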

gffread -g genome.fa -x CDS.fa annotation.gtf
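
Since the question also asks about Python, here is a minimal standard-library sketch of the same idea. It is not how gffread works internally, just an illustration under some assumptions: the annotation is GFF3 (attributes written as key=value, so a GTF file would need a different attribute parser), CDS segments of one transcript share a Parent attribute, and the whole genome fits in memory. The function names read_fasta, extract_cds and write_cds_fasta are made up for this example.

```python
from collections import defaultdict

# Translation table for complementing DNA (IUPAC ambiguity codes other than N are left as-is).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Parse FASTA lines into {sequence_id: sequence}; the id is the first word after '>'."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def extract_cds(gff_lines, genome):
    """Return {parent_id: spliced CDS sequence} (assumes GFF3 key=value attributes)."""
    segments = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(f.strip().split("=", 1) for f in cols[8].split(";") if "=" in f)
        parent = attrs.get("Parent") or attrs.get("ID")
        if parent is None:
            continue
        segments[parent].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))

    result = {}
    for parent, segs in segments.items():
        segs.sort(key=lambda s: s[1])  # order segments by genomic start
        # GFF coordinates are 1-based and inclusive, hence start - 1 for slicing.
        seq = "".join(genome[seqid][start - 1:end] for seqid, start, end, _ in segs)
        if segs[0][3] == "-":
            seq = reverse_complement(seq)  # minus strand: flip the joined sequence
        result[parent] = seq
    return result


def write_cds_fasta(gff_path, genome_path, out_path):
    """Read both input files and write one FASTA record per transcript."""
    with open(genome_path) as fh:
        genome = read_fasta(fh)
    with open(gff_path) as fh:
        cds = extract_cds(fh, genome)
    with open(out_path, "w") as out:
        for name, seq in cds.items():
            out.write(f">{name}\n{seq}\n")
```

Calling write_cds_fasta("annotation.gff", "genome.fa", "myGenomeCDS.fasta") would write the result; for real data gffread is likely to be faster and to handle more edge cases (multiple parents, phase, malformed records).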
9.6 years ago
arnstrm ★ 1.9k

There are many ways to do this. If you don't want to write your own code, use the utilities from existing packages such as BEDTools (fastaFromBed, also available as bedtools getfasta), Glimmer (extract), or Galaxy (the "extract features" tool).
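
For the BEDTools route the CDS rows first have to be turned into BED intervals. A small Python sketch of that conversion, assuming GFF3 key=value attributes (gff_cds_to_bed is a made-up helper name, not part of BEDTools):

```python
def gff_cds_to_bed(gff_lines):
    """Convert GFF3 CDS rows to BED6 lines (seqid, start, end, name, score, strand)."""
    bed = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(f.strip().split("=", 1) for f in cols[8].split(";") if "=" in f)
        name = attrs.get("Parent") or attrs.get("ID") or "."
        # GFF is 1-based inclusive; BED is 0-based half-open, so only the start shifts.
        start, end = int(cols[3]) - 1, int(cols[4])
        bed.append("\t".join([cols[0], str(start), str(end), name, "0", cols[6]]))
    return bed
```

After writing these lines to a file such as cds.bed, something like `bedtools getfasta -fi genome.fa -bed cds.bed -s -name -fo cds_segments.fa` should return the sequences, with -s reverse-complementing minus-strand features. Note that this gives one record per CDS segment, so the segments of a multi-exon gene still have to be joined afterwards.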
