How to download fasta sequence of my interest from NCBI?
3.2 years ago
Kumar ▴ 170

Hi all,

I am trying to download the protein FASTA sequences for "HSP listeria monocytogenes" from NCBI. Please suggest how I can download all of these sequences from the command line. I tried the following command, but it requires a file of accession numbers, whereas I only have a keyword (HSP listeria monocytogenes) to search with and want to retrieve the matching sequences in FASTA format.

Please help me find a way to do this.

epost -db protein -input <Accessions file>  | efetch -format fasta  > <Output file>
NCBI FASTA • 904 views
3.2 years ago
GenoMax 147k

While this can be done with an NCBI Entrez web search followed by a download, if you still want to use Entrez Direct, something like the following would work.

$ esearch -db protein -query "HSP AND listeria monocytogenes [orgn]" | efetch -format fasta

Representative result (sequences truncated to save space):

>sp|Q71Z71.1|RS4_LISMF RecName: Full=30S ribosomal protein S4
MARYTGPSWKVSRRLGISLSGTGKELERRPYAPGQHGPTQRKKISEYGLQQAEKQKLRHMYGLTERQFKN

>WP_052960683.1 30S ribosomal protein S4 [Listeria monocytogenes]
MARYTGPSWKVSRRLGISLSGTGKELERRPYAPGQHGPTQRKKISEYGLQQAEKQKLRHMYGLTERQFKN

>WP_031665574.1 30S ribosomal protein S4 [Listeria monocytogenes]
MARYTGPSWKVSRRLGISLSGTGKELERRPYAPGQHGPTQRKKISEYGLQQAEKQKLRHMYGLTERQFKN
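
If you want to reuse this for other keyword and organism pairs, the same pipeline can be wrapped in a small script. This is only a sketch: the function names `build_query` and `fetch_fasta` and the output file name are made up for illustration, and it assumes Entrez Direct (`esearch`, `efetch`) is installed and on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: build an Entrez query string from a keyword and an
# organism name, in the same form as the query used above.
build_query() {
    printf '%s AND %s [orgn]' "$1" "$2"
}

# Hypothetical helper: search the protein database and save the hits as FASTA.
#   $1 = keyword, $2 = organism, $3 = output file
# Needs network access and Entrez Direct; it is only defined here, not run.
fetch_fasta() {
    esearch -db protein -query "$(build_query "$1" "$2")" |
        efetch -format fasta > "$3"
}

# Example call (uncomment to run):
# fetch_fasta "HSP" "listeria monocytogenes" hsp_listeria.fasta
```

Once the download finishes, `grep -c '^>' hsp_listeria.fasta` counts the header lines, which is a quick way to see how many sequences were retrieved.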

That is great! Thank you for your help.
