7.1 years ago
horsedog
Hi all, I need a lot of bacterial sequences from NCBI, and I have the GI number, start position, and end position of each sequence I want. Is it possible to download only the targeted sequences instead of the whole genome? I used Batch Entrez before, but it gives me the whole genome, which I don't need. Thank you.
I'm sorry, could you please be a bit more specific? For example, how do I specify the start and end positions?
For example:
$ efetch -db nuccore -format fasta -id CP005986 -chr_start 1600000 -chr_stop 1600020
brings back a ~20 bp chunk from this genome.
BTW: CP005986 can be replaced by the GI number 640840007 to get the same result.
Oh! Thank you very much, that's really amazing. But what if I have a batch of sequences to extract? I tried saving all the CP numbers, start positions, and end positions in three separate txt files and ran: efetch -db nuccore -format fasta -id name.txt -chr_start start.txt -chr_stop end.txt, but it doesn't work.
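The batch attempt fails because efetch treats name.txt, start.txt, and end.txt as literal option values rather than reading the files. One possible workaround is to join the three files line by line with paste and call efetch once per region. This is only a sketch, assuming EDirect is installed, that the three files have one value per line in matching order, and that the -chr_start / -chr_stop options behave as in the single-region command above. The function name fetch_regions is made up for illustration.

```shell
# fetch_regions NAMES STARTS STOPS
# Hypothetical helper: each argument is a text file with one value per line,
# and line N of each file describes the same region.
fetch_regions() {
    # paste joins the three files into tab-separated rows:
    #   accession <TAB> start <TAB> stop
    paste "$1" "$2" "$3" | while read -r acc start stop; do
        # Skip blank rows.
        [ -n "$acc" ] || continue
        # One efetch call per region. stdin is redirected from /dev/null
        # so efetch cannot consume the rows still waiting in the loop.
        efetch -db nuccore -format fasta -id "$acc" \
            -chr_start "$start" -chr_stop "$stop" < /dev/null
    done
}
```

With the three files from the question this could be run as: fetch_regions name.txt start.txt end.txt > regions.fasta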