8.3 years ago
acorella
Hi,
Note: This is similar to, but not the same as: Get Fasta File With Protein Sequences Given Entrez Gene Ids
I have a file of gene symbols (one per line). I would like to get the FASTA records for each gene and compile them into one text file.
I have tried:
genelist= $(<genelist.txt)
echo -e "$genelist" | while read G; do
  curl -s "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=gene&db=nucleotide&id=${G}&linkname=gene_nuccore" | grep -A 1 "<Link>" | grep "<Id>" | cut -d '>' -f 2 | cut -d '<' -f 1 | while read S; do
    curl -s "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${S}&retmode=text&rettype=fasta"
  done
done
which results in no output.
Do I need to convert the symbols to Entrez Gene IDs first? If so, can I do this with E-utilities as well?
Also, how can I write all of the printed FASTA records to a file?
Thanks for any help you are willing to provide!
Have you tried Unix e-utils, for example
Thanks, this might work! How can I modify this to get the genomic FASTA from an Entrez ID or a gene symbol? What type of ID is "19084" in your example?
Could you please post an example gene symbol or Entrez ID?
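For illustration, here is a minimal sketch of the symbol-to-ID step asked about above, followed by the elink/efetch steps from the question. The `id` parameter of elink expects numeric UIDs, so passing gene symbols directly is a likely reason the original loop prints nothing. The function names (`esearch_url`, `extract_ids`, `extract_link_ids`, `fetch_fasta_for_symbols`), the organism filter, and the `[Gene Name]`/`[Organism]` search fields are assumptions chosen for this example, not something from the thread. The XML parsing also assumes one tag per line in the response.

```shell
#!/usr/bin/env bash
# Sketch only: symbol -> Gene ID (esearch) -> nucleotide IDs (elink) -> FASTA (efetch).
EUTILS="https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Build the esearch URL for a gene symbol ($1) restricted to an organism ($2).
# Assumption: the organism filter is added because a bare symbol can match many species.
esearch_url() {
  printf '%s/esearch.fcgi?db=gene&term=%s%%5BGene+Name%%5D+AND+%s%%5BOrganism%%5D\n' \
    "$EUTILS" "$1" "$2"
}

# Print every numeric <Id> value found in E-utilities XML on stdin, one per line.
extract_ids() {
  grep -o '<Id>[0-9]*</Id>' | tr -dc '0-9\n'
}

# Print only the linked IDs of an elink response (skips the query ID in <IdList>).
extract_link_ids() {
  sed -n '/<LinkSetDb>/,/<\/LinkSetDb>/p' | extract_ids
}

# Read gene symbols from stdin (one per line) and print FASTA records to stdout.
fetch_fasta_for_symbols() {
  local organism=$1 symbol gene_id nuc_id
  while read -r symbol; do
    [ -z "$symbol" ] && continue          # skip blank lines
    curl -s "$(esearch_url "$symbol" "$organism")" | extract_ids |
    while read -r gene_id; do
      curl -s "$EUTILS/elink.fcgi?dbfrom=gene&db=nucleotide&id=${gene_id}&linkname=gene_nuccore" |
      extract_link_ids |
      while read -r nuc_id; do
        curl -s "$EUTILS/efetch.fcgi?db=nucleotide&id=${nuc_id}&rettype=fasta&retmode=text"
        sleep 1                           # crude throttle for NCBI's request-rate limit
      done
    done
  done
}

# Example call (hypothetical file names), collecting everything in one file:
#   fetch_fasta_for_symbols human < genelist.txt > all_genes.fasta
```

Redirecting the output of the whole function, rather than of each `curl`, is what puts all records into a single file. Note that the `gene_nuccore` link can return many nucleotide records per gene, so the result may need further filtering.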