I'm looking for software that can translate the open reading frames of many sequences at once. I've tried EMBOSS Sixpack, but even the local version limits the number of input sequences. What can I use?
You can split your file into smaller files with GenomeTools (gt splitfasta -numfiles 60 seqs.fasta), faSplit by Jim Kent, or fasta-splitter by Kirill Kryukov, then loop over the files or process them in parallel.
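As a rough illustration of the "loop over the files" step, here is a minimal Python sketch using only the standard library. The names `read_fasta` and `process_chunks` are hypothetical helpers made up for this example, and the glob pattern in the usage note is an assumption; adjust it to whatever file names your splitter actually produces.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file, one record at a time."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def process_chunks(directory, pattern, handle_record):
    """Call handle_record(header, sequence) on every record of every matching file."""
    results = []
    for path in sorted(Path(directory).glob(pattern)):
        for header, sequence in read_fasta(path):
            results.append(handle_record(header, sequence))
    return results
```

For example, `process_chunks(".", "seqs.fasta.*", lambda h, s: (h, len(s)))` would return the header and length of every record, assuming the chunks are named that way. Since the reader streams one record at a time, the same loop may also work on the unsplit file if memory is the only concern.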
Do you have to extract the ORF, or do your DNA sequences already run from ATG to stop? Either way, Biopython can make quick work of it. Also, see this post.
I need to extract the ORF! Can I do this with Biopython?
You will need to know a little more about your sequences. Are the ORFs in the forward or reverse direction? If reverse, you'll need to take the reverse complement first. Then find 'ATG' by scanning the sequence in windows of 3 nucleotides. Once you have found 'ATG', translate in the forward direction from its first position until you reach a stop codon. Do you happen to know the approximate length of the ORFs?
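The steps above can be sketched in plain Python, without Biopython, so that the logic is visible. This is only an illustration under some assumptions: it uses the standard genetic code, treats the longest ATG-to-stop stretch in the three forward frames as "the" ORF, and ignores ORFs that lack a stop codon. The function names (`reverse_complement`, `longest_orf`, `translate`) are made up for this example.

```python
# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq):
    """Return the longest ATG..stop ORF (stop included) in the forward frames, or ''."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:pos + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


def translate(orf):
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE.get(orf[pos:pos + 3], "X")  # X for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Usage would look like `translate(longest_orf(sequence))` for forward ORFs, or `translate(longest_orf(reverse_complement(sequence)))` for the reverse strand. With Biopython installed, the translation step could instead use `Seq.translate(to_stop=True)`.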
They are in the forward direction and are about 7000 nt long.
See Emboss sixpack replacement: download and install the standalone version.