I am a novice with very little background in bioinformatics.
I have a list of GenBank accession numbers for mRNA variants, which I want to convert to a list of ORF sequences.
Here are just a few of the sequences I need to process: NM_175063.5, NM_005789.3, NM_005789.3, NM_024516.3, NM_032144.2, NM_001911.2
If I put any one of the accession numbers into NCBI ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder/), the longest ORF it reports is the one I need. But I need to do this for a large number of sequences, so doing it one by one is very tedious. Does anyone know how to automate this? I don't know any tools such as R or Python, but a friend suggested using Python for this kind of work (this is the very first step of the bioinformatics analysis I need to tackle), so an answer using Python would be preferable. I can use PowerShell to run commands in Linux mode.
You can use Entrez Direct to retrieve the sequences for your GenBank accession numbers.
Afterwards, you can work through this Biopython exercise on finding ORFs: https://munch-lab.org/2013/11/19/finding-open-reading-frames/
To retrieve all the sequences corresponding to your accession list with Entrez Direct (the command-line interface to the E-utilities), you can do the following:
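The command itself is not shown here, so what follows is only a sketch of what such a call might look like. It assumes Entrez Direct's `efetch` is installed and that the accessions are nucleotide records in the `nuccore` database. The helper name `build_efetch_command` and the output file name `sequences.fasta` are made up for illustration. It assembles the command line from the accession list in the question, dropping the repeated NM_005789.3:

```python
import shlex

# Accessions from the question (NM_005789.3 is listed twice there).
accessions = [
    "NM_175063.5", "NM_005789.3", "NM_005789.3",
    "NM_024516.3", "NM_032144.2", "NM_001911.2",
]


def build_efetch_command(ids, out_file="sequences.fasta"):
    """Return a shell command string that fetches the given IDs as FASTA.

    Hypothetical helper: it only builds the text of an Entrez Direct
    `efetch` call (assumed database: nuccore); it does not run anything.
    """
    unique_ids = list(dict.fromkeys(ids))  # drop duplicates, keep order
    args = ["efetch", "-db", "nuccore",
            "-id", ",".join(unique_ids),
            "-format", "fasta"]
    # Redirect efetch's output into a FASTA file.
    return shlex.join(args) + " > " + shlex.quote(out_file)


cmd = build_efetch_command(accessions)
print(cmd)
```

The printed line can be pasted into a terminal where Entrez Direct is available, and the records should end up in `sequences.fasta`.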
If you want to use this inside Python, you can simply "import subprocess", then store the command above in a string.
For example, if you named this string "cmd", you can call Entrez Direct from Python like this:
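The call is not shown here either. One way to do it is sketched below. It assumes the standard-library `subprocess` module is used to run the command. `run_shell` is a hypothetical helper name, and the `efetch` command string is an assumed example rather than the exact command from this answer:

```python
import shutil
import subprocess


def run_shell(cmd):
    """Run a shell command string and return its standard output as text.

    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(cmd, shell=True, capture_output=True,
                            text=True, check=True)
    return result.stdout


# Assumed example command; replace the IDs with your own accession list.
cmd = "efetch -db nuccore -id NM_175063.5,NM_005789.3 -format fasta"

if shutil.which("efetch") is None:
    # Entrez Direct has to be installed separately.
    print("efetch not found on PATH; install Entrez Direct first")
else:
    try:
        fasta_text = run_shell(cmd)
        print(fasta_text[:200])  # show the start of the first record
    except subprocess.CalledProcessError as err:
        print("efetch failed:", err.stderr)
```

Capturing the output as a string, rather than redirecting it to a file, keeps the FASTA text available for the next step in the same script.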
Then use whichever Python package you prefer to call ORFs on these sequences.
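If you would rather not install a package for this step, a minimal plain-Python sketch of the idea is given below. It makes several assumptions that are mine, not from the linked exercise: an ORF starts at ATG and ends at the first in-frame stop codon, only the three forward frames are searched (reasonable for mRNA records), and the stop codon is included in the result. The function names `parse_fasta` and `longest_orf` and the toy record are made up for illustration, and the result may not match NCBI ORFfinder in every case, since that tool has its own settings.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record ID -> sequence."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line.upper())
    return {rid: "".join(parts) for rid, parts in records.items()}


def longest_orf(seq):
    """Return the longest ATG..stop ORF found in the three forward frames.

    The stop codon is included; an empty string means no complete ORF.
    """
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # position of the first ATG since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


# Toy record (not a real sequence) to show how the pieces fit together.
toy_fasta = ">toy_record made-up sequence\nCCATGAAATGA\nGGATGCCCGGGTTTTAAAC\n"
for record_id, sequence in parse_fasta(toy_fasta).items():
    print(record_id, longest_orf(sequence))
```

In practice you would pass the FASTA text retrieved from GenBank to `parse_fasta` instead of the toy string, and write each record's longest ORF to a file.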