Get fastas for genes by gene symbol

8.3 years ago

acorella ▴ 30

Hi,

Note: This is similar but not the same as: Get Fasta File With Protein Sequences Given Entrez Gene Ids

I have a file of gene symbols (one per line) and would like to get the FASTA record for each gene and compile them all into one text file.

I have tried:

genelist= $(<genelist.txt)
echo -e "$genelist" | while read G; do curl -s "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=gene&db=nucleotide&id=${G}&linkname=gene_nuccore" | grep -A 1 "<Link>" | grep "<Id>" | cut -d '>' -f 2 | cut -d '<' -f 1 | while read S ; do curl -s "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${S}&retmode=text&rettype=fasta" ; done;  done

which results in no output.

Do I need to convert to entrez IDs first? If so, can I do this with eutils as well?

Also, how can I write all printed fastas to a file?

Thanks for any help you are willing to provide!

eutils fasta • 1.8k views

Have you tried the Unix E-utilities (Entrez Direct)? For example:

elink -target protein -db nuccore -id "19084"|efetch -format fasta
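As a rough sketch of how the same tools could cover the original question (gene symbols in, one FASTA file out), something like the following might work. It assumes Entrez Direct is installed, that the symbols are human, and that RefSeq RNA records are what is wanted; the function names, the organism default, and the choice of the `gene_nuccore_refseqrna` link are illustrative assumptions, not part of the command above.

```shell
# Sketch only: assumes Entrez Direct (esearch, elink, efetch) is on PATH.

# Build a Gene-database query for one symbol.
# The default organism ("human") is an assumption; pass another as $2.
gene_query() {
  local symbol="$1" organism="${2:-human}"
  printf '%s [GENE] AND %s [ORGN]' "$symbol" "$organism"
}

# Read gene symbols (one per line) from file $1 and write the linked
# RefSeq RNA FASTA records to file $2 (hypothetical helper name).
symbols_to_fasta() {
  local infile="$1" outfile="$2" symbol
  : > "$outfile"                       # start from an empty output file
  while IFS= read -r symbol; do
    [ -z "$symbol" ] && continue       # skip blank lines
    # < /dev/null keeps esearch from consuming the loop's input
    esearch -db gene -query "$(gene_query "$symbol")" < /dev/null |
      elink -target nuccore -name gene_nuccore_refseqrna |
      efetch -format fasta >> "$outfile"
  done < "$infile"
}
```

Usage would be along the lines of `symbols_to_fasta genelist.txt genes.fa`. The `esearch` step is what turns a symbol into Gene IDs, and appending with `>>` inside the loop is what collects every record in one file.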

8.3 years ago by Sej Modha 5.3k

Thanks, this might work! How can I modify this to get the genomic fasta from entrez id or gene symbol? What type of id is "19084" in your example?
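On the ID question: because the command above passes `-db nuccore`, the `-id` value is read as a nucleotide record identifier, not as a Gene ID. For the genomic sequence, one possible approach is to read the gene's chromosome accession and coordinates from its Gene document summary and then fetch that slice from nuccore. The sketch below assumes Entrez Direct is installed, that the docsum reports 0-based `ChrStart`/`ChrStop`, and that a start larger than the stop indicates the minus strand; the function names are hypothetical.

```shell
# Sketch only: assumes Entrez Direct (efetch, xtract) is on PATH.

# Convert 0-based ChrStart/ChrStop into 1-based start, stop and strand
# (1 = plus, 2 = minus). Assumes start > stop means minus strand.
docsum_to_range() {
  local start="$1" stop="$2"
  if [ "$start" -le "$stop" ]; then
    echo "$((start + 1)) $((stop + 1)) 1"
  else
    echo "$((stop + 1)) $((start + 1)) 2"
  fi
}

# Print the genomic FASTA for one Entrez Gene ID (hypothetical helper).
gene_genomic_fasta() {
  local gene_id="$1" acc start stop
  efetch -db gene -id "$gene_id" -format docsum < /dev/null |
    xtract -pattern GenomicInfoType -element ChrAccVer ChrStart ChrStop |
    head -n 1 |                        # use the first reported location
    while read -r acc start stop; do
      # split "start stop strand" into $1 $2 $3
      set -- $(docsum_to_range "$start" "$stop")
      efetch -db nuccore -id "$acc" -format fasta \
        -seq_start "$1" -seq_stop "$2" -strand "$3" < /dev/null
    done
}
```

For example, `gene_genomic_fasta 672 >> genes.fa` would append the region for Gene ID 672 (human BRCA1). To start from a symbol instead, the Gene ID can first be looked up with `esearch -db gene`.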