Get fastas for genes by gene symbol
8.3 years ago
acorella ▴ 30

Hi,

Note: This is similar but not the same as: Get Fasta File With Protein Sequences Given Entrez Gene Ids

I have a file of gene symbols (one per line). I would like to get the FASTA record for each gene and compile them all into one text file.

I have tried:

genelist=$(<genelist.txt)
echo -e "$genelist" | while read G; do
  curl -s "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=gene&db=nucleotide&id=${G}&linkname=gene_nuccore" |
    grep -A 1 "<Link>" | grep "<Id>" | cut -d '>' -f 2 | cut -d '<' -f 1 |
    while read S; do
      curl -s "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${S}&retmode=text&rettype=fasta"
    done
done

which results in no output.

Do I need to convert the symbols to Entrez Gene IDs first? If so, can I do this with eutils as well?
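
For context: the `id` parameter of elink takes numeric UIDs, so symbols would have to be resolved first. A minimal sketch of one way to do that with esearch against the gene database follows. The function names, the `[Gene Name]` field restriction, and the organism are assumptions for illustration; the thread does not say which species is meant.

```shell
#!/usr/bin/env bash
# Sketch: resolve a gene symbol to NCBI Gene IDs via esearch.
# ORGANISM is an assumption; adjust it to the species in your gene list.
EUTILS="https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
ORGANISM="Homo sapiens"

# Pull the numeric UIDs out of <Id>...</Id> elements in E-utilities XML.
extract_ids() {
    grep -o '<Id>[0-9]*</Id>' | sed -e 's/<Id>//' -e 's#</Id>##'
}

# Print the Gene IDs matching one symbol, one per line.
# curl -G sends the --data-urlencode fields as a URL-encoded query string.
symbol_to_gene_ids() {
    curl -s -G "$EUTILS/esearch.fcgi" \
        --data-urlencode "db=gene" \
        --data-urlencode "term=${1}[Gene Name] AND ${ORGANISM}[Organism]" \
        --data-urlencode "retmode=xml" | extract_ids
}
```

Calling `symbol_to_gene_ids BRCA1` would print whatever Gene IDs the search returns. A symbol can match more than one record, so the result may need checking before it is passed on to elink.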

Also, how can I write all printed fastas to a file?
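
One common way to collect everything a loop prints is to redirect the whole loop rather than each command inside it. The sketch below assumes a file with one nucleotide UID per line; `fetch_fasta` and `collect_fasta` are hypothetical helper names, and the efetch URL is the one from the attempt above with https.

```shell
#!/usr/bin/env bash
# Sketch: write every FASTA record printed by a loop into a single file.

# Print the FASTA record for one nucleotide UID.
fetch_fasta() {
    curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${1}&retmode=text&rettype=fasta"
    sleep 1   # stay well under NCBI's request-rate limit
}

# collect_fasta ID_FILE OUT_FILE: fetch each listed ID and save all records.
collect_fasta() {
    local id
    while read -r id || [ -n "$id" ]; do
        [ -n "$id" ] || continue    # skip blank lines
        fetch_fasta "$id"
    done < "$1" > "$2"              # one redirect captures the whole loop
}
```

For example, `collect_fasta ids.txt genes.fasta` would leave all records in `genes.fasta`. The same redirect works on the one-liner above by appending `> genes.fasta` after the final `done`.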

Thanks for any help you are willing to provide!


Have you tried the Unix E-utilities (EDirect)? For example:

elink -target protein -db nuccore -id "19084"|efetch -format fasta
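
Starting from gene symbols instead of a UID, the same tools can be chained in a loop. This is only a sketch: it assumes EDirect (`esearch`, `elink`, `efetch`) is installed, the function names are hypothetical, and the organism is a guess because the thread does not name a species.

```shell
#!/usr/bin/env bash
# Sketch only: assumes NCBI EDirect (esearch, elink, efetch) is on PATH.
# ORGANISM is an assumption; change it to match your gene list.
ORGANISM="Homo sapiens"

# Build the Entrez query string for one gene symbol.
gene_query() {
    printf '%s[Gene Name] AND %s[Organism]' "$1" "$ORGANISM"
}

# Print the nucleotide records linked to each symbol listed in a file.
symbols_to_fasta() {
    local symbol
    while read -r symbol || [ -n "$symbol" ]; do
        [ -n "$symbol" ] || continue
        # </dev/null keeps esearch from reading the rest of the symbol list
        esearch -db gene -query "$(gene_query "$symbol")" </dev/null |
            elink -target nuccore |
            efetch -format fasta
    done < "$1"
}
```

Running `symbols_to_fasta genelist.txt > genes.fasta` would write everything to one file. Gene-to-nuccore links can cover many records per gene (transcripts as well as genomic sequences), so the output may be large and need filtering.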

Thanks, this might work! How can I modify this to get the genomic FASTA from an Entrez ID or gene symbol? What type of ID is "19084" in your example?


Could you please post an example gene symbol or Entrez ID?
