7.7 years ago
kabir.deb
Is there a suitable tool for extracting the start and stop codons from a FASTA file of multiple protein-coding genes (PCGs)? I have downloaded multi-FASTA PCGs (nucleotide sequences) from NCBI, and now I want to get the start and stop codons of all the sequences in one go. Thanks in advance.
It seems you just need the first 3 bases and the last 3 bases of each CDS sequence?
ORFfinder
Yes, the first 3 bases for the start codon and the last 3, 2, or 1 base(s) for the stop codon, because some of the nucleotide sequences have cryptic (incomplete) stop codons. The logic is: count 3 bases from the beginning and print them as the start codon, then step through the sequence codon by codon up to the second-to-last codon and print whatever bases remain as the stop codon.
So the expected output here would be: Start codon: ATG, Stop codon: T
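The logic described above can be sketched in plain Python. This is only a minimal illustration, not an established tool: the function names `read_fasta` and `start_and_stop` and the example records are made up for this post, and it assumes every record is a CDS in the correct reading frame that begins at the first base of the start codon.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from a multi-FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def start_and_stop(seq):
    """Return (start, stop) for a CDS that is assumed to be in frame.

    The start codon is the first 3 bases. If the length is a multiple
    of 3, the stop codon is the last 3 bases; otherwise the 1 or 2
    bases left over after the last complete codon are reported.
    """
    seq = seq.upper()
    start = seq[:3]
    leftover = len(seq) % 3
    stop = seq[-3:] if leftover == 0 else seq[-leftover:]
    return start, stop

# Made-up example records, only to show the call pattern.
example = io.StringIO(">gene1\nATGAAACCC\nTAA\n>gene2\nATGAAACCCT\n")
for header, seq in read_fasta(example):
    start, stop = start_and_stop(seq)
    print(f"{header}\tStart codon: {start}\tStop codon: {stop}")
```

To run it on a real file, replace the `io.StringIO(...)` example with `open("your_file.fasta")`. Note that a sequence whose length is a multiple of 3 is always reported with its last 3 bases, whether or not they form a real stop codon, so it is worth checking the output against the expected genetic code.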
Please use ADD COMMENT to reply to earlier reactions, as such this thread remains logically structured and easy to follow. I have now moved your post, but as you can see it's not optimal.