I want to download the coding DNA sequences of TrpB in Streptomyces.
I would like to ask how to find the nucleotide sequences that correspond to protein sequences in NCBI. Clicking the nucleotide button shows the genome sequence rather than the DNA sequence of the gene.
As you can see above, WP* accession numbers point to multiple species.
One other way may be to use Entrez Direct (there are 6134 hits as of today).
While the following should do this in one step, it may time out:
$ esearch -db nuccore -query "trpB [GENE] AND streptomyces [orgn]" | efetch -format fasta_cds_na
Doing it in two steps seems to work better.
First, get all the accession numbers and save them to a file:
$ esearch -db nuccore -query "trpB [GENE] AND streptomyces [orgn]" | efetch -format acc > id
and then fetch the coding sequences using the file id created in the step above:
$ for i in $(cat id); do efetch -db nuccore -id "${i}" -format fasta_cds_na >> trpB.fa; done
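One caveat with the loop above: if a hit is a large record (a contig or a whole genome), fasta_cds_na may return every annotated CDS on that record, not only trpB. A minimal sketch of a post-filter follows. It assumes the FASTA headers carry a `[gene=...]` tag, which is how efetch normally writes them for annotated CDSs; the function name `filter_gene` and the output file name are just illustrative.

```shell
# Keep only FASTA records whose header contains the tag [gene=<name>].
# Assumption: headers written by "efetch -format fasta_cds_na" include
# a "[gene=...]" field. Records without a gene tag are dropped.
filter_gene() {
    # $1 = gene name (case-sensitive), $2 = input FASTA file
    awk -v tag="[gene=$1]" '
        /^>/ { keep = (index($0, tag) > 0) }  # decide per header line
        keep                                   # print header + sequence lines if kept
    ' "$2"
}

# Example usage (hypothetical output file name):
#   filter_gene trpB trpB.fa > trpB_only.fa
```

The match is a literal, case-sensitive string comparison, so entries annotated under a different gene name or with only a locus tag would be missed; checking a few headers with `grep '>' trpB.fa | head` first is probably worthwhile.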
You can search the "Gene" database instead of the "Protein" database. Then click on your gene of interest, scroll down to "NCBI Reference Sequences (RefSeq)", and click "FASTA" in the "Genomic" section.
This is not going to work the way you describe it. At a minimum, one will need to run this query to limit the results to genes in Streptomyces: https://www.ncbi.nlm.nih.gov/gene/?term=trpB%20%5BGENE%5D%20AND%20Streptomyces%20%5BORGN%5D . Then, once you click on a gene of interest, you will get the sequence of just that one gene.
OP wants to get the sequences for all trpB genes in Streptomyces. So this can be a cumbersome process and will yield only 15 (or 76 if you include discontinued entries).