Perl Script To Retrieve All ORFs?
10.7 years ago
Adrian Pelin ★ 2.6k

Hello,

I have a draft assembly. Does anyone know of a script to retrieve all ORFs, in protein format, from a FASTA file of contigs?

Adrian

orf perl • 4.3k views

Unless this is a prokaryote, extracting all open reading frames from a draft assembly is not an informative analysis (because of splicing and low gene density), and even for bacteria it is of very limited use. Instead, look into gene prediction, e.g. on BioStar: gene-prediction

I work on microsporidia. They are eukaryotes, but they have small genomes and very few introns (at most about 20 genes with introns, and some species have none at all).

Then you can use getorf as suggested by R@hul, but you should still attempt a proper gene prediction.

Please share a fully functional Perl script that translates cDNA to protein, keeping only the longest ORF. I have ActivePerl installed.
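For what it's worth, the longest-ORF idea can be sketched in a few lines. This is a minimal sketch in Python rather than Perl, not a tested pipeline. It assumes the standard genetic code (NCBI table 1), counts only ATG as a start codon, searches all six frames, and also accepts an ORF that runs off the end of the sequence without a stop codon. The function names are mine, made up for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1) in the usual TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string (other letters are kept)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate codon by codon; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(cdna):
    """Return the longest ATG-initiated protein found in any of the six frames.

    ASSUMPTION: an ORF may end at a stop codon or at the end of the sequence.
    Returns an empty string when no ATG is found.
    """
    seq = cdna.upper().replace("U", "T")
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            # Stop codons translate to '*', so splitting on '*' gives
            # the stop-free stretches of this reading frame.
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best
```

Call `longest_orf(sequence)` once per cDNA record, using whatever FASTA reader you already have. If your organism uses a different genetic code or alternative start codons, the table and the start test need changing.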

10.7 years ago
Rahul Sharma ▴ 660

Hi

From EMBOSS toolkit:

getorf -sequence genome.fasta -outseq genome.ORFs -minsize 180 -find 1 &
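As I read the EMBOSS documentation, `-find 1` reports translations of regions between a START and a STOP codon, and `-minsize 180` drops ORFs shorter than 180 nucleotides. If only the longest ORF per contig is wanted, the output can be filtered afterwards. Below is a rough sketch in Python. It assumes each output record is named `<input ID>_<number>`, which is how I understand getorf labels its ORFs, so check that against your own `genome.ORFs` file first. The function names are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_per_contig(records):
    """Keep the longest ORF for each contig.

    ASSUMPTION: the first word of each header is '<contig ID>_<ORF number>',
    so the contig ID is everything before the last underscore.
    Returns a dict mapping contig ID to (header, protein sequence).
    """
    best = {}
    for header, seq in records:
        words = header.split()
        orf_id = words[0] if words else ""
        contig = orf_id.rsplit("_", 1)[0]
        if contig not in best or len(seq) > len(best[contig][1]):
            best[contig] = (header, seq)
    return best
```

Typical use would be something like `with open("genome.ORFs") as fh: best = longest_per_contig(read_fasta(fh))`, then writing out `best.values()` as FASTA.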

Cheers!
