Question

Using efetch (Biopython) to get the FASTA file of an mRNA, as well as the exon locations

0

7.6 years ago

Jacob ▴ 10

I want to use one of the NCBI E-utilities to obtain the mRNA sequence of a gene in FASTA format. First I get the UID from the Gene database with the code below:

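from Bio import Entrez

Entrez.email = "email@address.com"  # NCBI asks for a contact address
# Search the Gene database; esearch returns UIDs, so no rettype is needed
handle = Entrez.esearch(db="gene", term="Acan[Gene Name] AND Homo sapiens[Organism]")
record = Entrez.read(handle)
handle.close()
id = record["IdList"]  # list of Gene UIDs as strings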

Then I would like to use this UID to get the FASTA sequence of the gene's mRNA. I would also like to obtain the bp positions of the exons in this gene and the bp positions of the longest ORF. This is how I'm trying to use efetch:

Entrez.efetch(db="nucleotide", term=id[0]+"[uid]", rettype="fasta")

It is not working, and I don't know how to request the exon positions either. I know the exon positions are available because I can see them on the website:

https://www.ncbi.nlm.nih.gov/nuccore/NM_000046.3

biopython python efetch entrez • 3.3k views

ADD COMMENT • link updated 7.6 years ago by Ben ▴ 60 • written 7.6 years ago by Jacob ▴ 10

score 0 · Answer 1 · 2017-06-01

0

7.6 years ago

Ben ▴ 60

First, obtain the GTF/GFF annotation and the FASTA file of the reference genome. Then extract the cDNA sequence of the mRNA with a custom script. Finally, convert the cDNA sequence to the mRNA sequence (replace T with U).
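The three steps above could be sketched roughly as follows. This is only a minimal illustration, assuming a GTF file whose exon rows carry a `transcript_id "..."` attribute and a genome already loaded into a dict of chromosome name to sequence string. The function names (`exons_from_gtf`, `spliced_cdna`, `cdna_to_mrna`) are made up for this example and are not part of any library.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (plain A/C/G/T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def exons_from_gtf(gtf_lines, transcript_id):
    """Collect (chrom, start, end, strand) for the exon rows of one transcript."""
    exons = []
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        # Assumption: attributes look like: transcript_id "NM_000046.3";
        if f'transcript_id "{transcript_id}"' not in fields[8]:
            continue
        exons.append((fields[0], int(fields[3]), int(fields[4]), fields[6]))
    # Sort by genomic start so exons are joined in coordinate order
    return sorted(exons, key=lambda exon: exon[1])


def spliced_cdna(exons, genome):
    """Join exon sequences; genome maps chromosome name -> sequence string."""
    # GTF coordinates are 1-based and inclusive, Python slices are 0-based
    pieces = [genome[chrom][start - 1:end] for chrom, start, end, _ in exons]
    seq = "".join(pieces)
    if exons and exons[0][3] == "-":
        seq = reverse_complement(seq)  # minus-strand transcripts
    return seq


def cdna_to_mrna(cdna):
    """Write the cDNA sequence in RNA letters (T -> U)."""
    return cdna.upper().replace("T", "U")
```

With a real genome you would read the GTF with `open(path)` and fill the `genome` dict from the FASTA file first; for a whole genome an indexed FASTA reader is probably a better choice than holding everything in memory.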

ADD COMMENT • link 7.6 years ago by Ben ▴ 60

0

Is there a way to get all the exon regions from that website using a command line tool?

ADD REPLY • link 7.6 years ago by Jacob ▴ 10
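One possible way to pull the exon regions from a script, sketched under a few assumptions: `Entrez.efetch` takes the record through its `id` argument (not `term`), and requesting `rettype="gb"` returns the GenBank flat file, which for a RefSeq mRNA such as NM_000046.3 should carry the `exon` and `CDS` features shown on the website. The helper names `fetch_genbank_record` and `feature_spans` are invented for this example. Note that the `CDS` feature is the annotated coding region, which may or may not be the longest ORF.

```python
def fetch_genbank_record(accession, email):
    """Download one nucleotide record in GenBank format (needs network access)."""
    # Imported here so feature_spans() below can be used without Biopython.
    from Bio import Entrez, SeqIO

    Entrez.email = email
    handle = Entrez.efetch(db="nucleotide", id=accession,
                           rettype="gb", retmode="text")
    try:
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()


def feature_spans(record, feature_type):
    """Return 1-based inclusive (start, end) pairs for features of one type."""
    spans = []
    for feature in record.features:
        if feature.type == feature_type:
            # Biopython locations are 0-based with an exclusive end
            start = int(feature.location.start) + 1
            end = int(feature.location.end)
            spans.append((start, end))
    return spans


# Example usage (commented out because it contacts NCBI):
# record = fetch_genbank_record("NM_000046.3", "email@address.com")
# print(feature_spans(record, "exon"))  # exon positions on the mRNA
# print(feature_spans(record, "CDS"))   # annotated coding region
# print(record.format("fasta"))         # the mRNA sequence as FASTA
```

The Gene UID from esearch is not a nucleotide UID, so it cannot be passed straight to `db="nucleotide"`; this sketch sidesteps that by starting from the RefSeq accession instead.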