3.8 years ago
Harper
I am new to Python and need to find the ORFs in some FASTA sequences using SeqIO in Biopython. I followed the steps in the tutorial (with table = 11 and min_pro_len = 100) to get the ORFs in the given FASTA file, then compared the results with those from NCBI's Open Reading Frame Finder for the same sequence.
However, the results are different. Does anyone know why? Thanks!
Please post the code you are using by editing your original post. Text descriptions are insufficient to follow/diagnose the problem.
Differences in algorithm options/parameters. You would need to ensure you are running the NCBI tool with exactly equivalent parameters to expect output to be completely identical. If the NCBI tool is doing any sort of sophisticated screening (I don't know if it is or not), you may find it suggests different candidates.
If you are trying to do 'proper' gene finding, you would be better off using GeneMark or Glimmer (for bacteria) though.
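To make the point about parameters concrete, here is a minimal pure-Python sketch (not the tutorial's code and not NCBI's algorithm) showing how two common ORF definitions give different answers on the same sequence. The function name `find_orfs`, its arguments, and the toy sequence are made up for illustration. It scans the forward strand only, and it assumes the stop codons TAA, TAG and TGA, which are the stops in translation table 11. With `require_atg=False` an ORF is any stretch between stop codons (or the sequence ends), which is roughly what splitting a translation on "*" does; with `require_atg=True` an ORF must begin at an ATG and end at a stop.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons in translation table 11


def find_orfs(seq, min_codons, require_atg):
    """Return sorted (start, end) spans of ORFs on the forward strand.

    Coordinates are 0-based, end-exclusive, and exclude the stop codon.
    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        # Stop-to-stop mode opens a region at the start of the frame;
        # ATG mode waits until an ATG is seen.
        start = None if require_atg else frame
        end = frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            end = i + 3
            if codon in STOP_CODONS:
                if start is not None and (i - start) // 3 >= min_codons:
                    orfs.append((start, i))
                start = None if require_atg else i + 3
            elif start is None and codon == "ATG":
                start = i
        # Stop-to-stop mode also keeps the open region at the end of the
        # frame; ATG mode here discards ORFs that lack a stop codon.
        if not require_atg and (end - start) // 3 >= min_codons:
            orfs.append((start, end))
    return sorted(orfs)


toy = "CCCGGGATGAAATTTTAAGGG"  # made-up sequence, not real data
print("stop-to-stop:", find_orfs(toy, min_codons=3, require_atg=False))
print("ATG-to-stop: ", find_orfs(toy, min_codons=3, require_atg=True))
print("ATG, longer: ", find_orfs(toy, min_codons=4, require_atg=True))
```

The three calls return different sets of ORFs from the same input, purely because of the start-codon rule and the minimum length. When comparing against the NCBI tool, it is worth checking its settings for genetic code, minimum ORF length, allowed start codons and whether partial ORFs are reported, and matching them one by one.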
Here goes the code, thanks again:
from Bio import SeqIO
record = SeqIO.read("TestBacteria.fasta", "fasta")
table = 11
min_pro_len = 100