Question

Metagenomic Degenerate Primers Design - How to download multiple gene sequences from Genbank?


11.2 years ago

Tim ▴ 130

Hello, everyone. My question may have a very simple answer, and it has probably been answered here before, but I would really appreciate any help.

I am trying to design a new set of degenerate primers to amplify a gene (pstS) from bacterial metagenomes. Although I have never worked with metagenomic samples before, I understand the steps that need to be taken:

Find and download nucleotide sequences of my gene from different bacterial species (GenBank);
Perform a multiple alignment of these nucleotide sequences and/or their protein translations (CLUSTAL, MUSCLE, T-COFFEE etc.);
Identify conserved regions;
Select primer sequences, either manually or using a specialized program (CODEHOP, Primaclade, HYDEN).
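
To make the last two steps above concrete, here is a minimal Python sketch that scans an existing nucleotide alignment for gap-free windows and writes each one as an IUPAC degenerate sequence. It is only an illustration, not what CODEHOP, Primaclade, or HYDEN actually do. The function names, the primer length of 20, and the degeneracy cap of 64 are assumptions chosen for the example, and it ignores melting temperature, 3' end stability, and codon usage.

```python
from math import prod

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def column_sets(aligned):
    """Per alignment column, return the set of bases seen, or None if the
    column contains a gap or any character other than A, C, G, T."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for i in range(length):
        bases = {seq[i].upper() for seq in aligned}
        columns.append(frozenset(bases) if bases <= set("ACGT") else None)
    return columns


def candidate_primers(aligned, primer_len=20, max_degeneracy=64):
    """Return (start, degenerate_sequence, degeneracy) for every gap-free
    window whose degeneracy (product of per-column base counts) is small enough."""
    columns = column_sets(aligned)
    hits = []
    for start in range(len(columns) - primer_len + 1):
        window = columns[start:start + primer_len]
        if None in window:
            continue  # skip windows that touch a gap or ambiguous base
        degeneracy = prod(len(col) for col in window)
        if degeneracy <= max_degeneracy:
            hits.append((start, "".join(IUPAC[col] for col in window), degeneracy))
    return hits


if __name__ == "__main__":
    # Toy alignment, invented for this example only.
    toy = ["ATGCATGC", "ATGTATGC", "ATGCAT-C"]
    for hit in candidate_primers(toy, primer_len=4, max_degeneracy=2):
        print(hit)
```

In practice the alignment would come from CLUSTAL, MUSCLE, or T-COFFEE output, and the windows reported here would only be starting points for a dedicated primer design tool.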

So my question is simple: how do I batch-download gene sequences from GenBank? If I search Entrez Nucleotide, it returns every sequence that contains pstS, including whole genomes, plasmids and so on, and I have no idea how to filter those out. I am not afraid to use BioPerl/BioPython or any other way of collecting data from GenBank programmatically, but I am worried that there is a simpler method that I am missing.
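
One possible way to keep whole genomes and plasmids out of the results is to restrict the Entrez search with field tags, in particular a sequence length range. The sketch below builds such a query and sends it to the NCBI E-utilities `esearch` endpoint using only the Python standard library. The function names, the `Bacteria` organism filter, and the 500 to 3000 bp length range are assumptions for illustration and would need tuning for pstS.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_term(gene, organism="Bacteria", min_len=500, max_len=3000):
    """Build an Entrez query that asks for records annotated with the gene
    name, limited by organism and by sequence length (to drop genomes)."""
    return (
        f"{gene}[Gene Name] AND {organism}[Organism] "
        f"AND {min_len}:{max_len}[Sequence Length]"
    )


def esearch_url(term, retmax=200):
    """Return the esearch URL for the nucleotide (nuccore) database."""
    params = {"db": "nuccore", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def search_ids(term, retmax=200):
    """Run the search (needs network access) and return a list of record IDs."""
    with urlopen(esearch_url(term, retmax)) as response:
        return json.load(response)["esearchresult"]["idlist"]


if __name__ == "__main__":
    term = build_term("pstS")
    print(term)
    print(search_ids(term, retmax=20))
```

The same query string can be pasted into the Entrez Nucleotide search box to check what it returns before automating anything. A length filter will miss pstS copies that are only annotated inside complete genomes, so it trades completeness for simplicity.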

Thank you in advance, I am really struggling with this simple step and that makes me uncomfortable.

sequence primer Genbank • 4.7k views

updated 2.4 years ago by Ram 45k • written 11.2 years ago by Tim ▴ 130

Ram · Answer 1 · 2014-07-10


11.2 years ago

umer.zeeshan.ijaz ★ 1.8k

If you have a locally installed nt or nr database, you can use blastdbcmd from the BLAST+ suite to extract the sequences. For example, the following link shows how to extract 16S rRNA sequences:

http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/oneliners.html?#BLASTDBCMD
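
As a rough sketch of the blastdbcmd route (not the one-liner from the linked page), the Python snippet below assembles a `blastdbcmd` call that pulls every accession listed in a text file out of a local database and writes the result as FASTA. The wrapper function names and the file names `ids.txt` and `pstS.fasta` are hypothetical; it assumes BLAST+ is installed and the `nt` database is on the BLAST database path.

```python
import subprocess


def blastdbcmd_args(db, id_file, out_fasta):
    """Build the argument list for a batch extraction in FASTA format.
    -entry_batch takes a file with one sequence identifier per line."""
    return [
        "blastdbcmd",
        "-db", db,
        "-entry_batch", id_file,
        "-outfmt", "%f",      # %f = FASTA output
        "-out", out_fasta,
    ]


def extract_sequences(db, id_file, out_fasta):
    """Run blastdbcmd; raises CalledProcessError if the tool reports failure."""
    subprocess.run(blastdbcmd_args(db, id_file, out_fasta), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    extract_sequences("nt", "ids.txt", "pstS.fasta")
```

Passing the arguments as a list avoids shell quoting problems with the `%f` format string.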

To download sequences from UniProt (Swiss-Prot, TrEMBL), you can use the extract_fasta_swissprot.py script from here.

Also read section 9 of the Biopython Cookbook.
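
A minimal Biopython sketch along those lines: fetch the coding sequences of a set of nucleotide records with `Bio.Entrez.efetch` and keep only the ones tagged with the gene of interest, so that a whole genome contributes just its pstS CDS. This is an illustration rather than code from the Cookbook. It assumes the `fasta_cds_na` return type, whose headers normally carry a `[gene=...]` tag, and the function names, e-mail address, and accession used here are placeholders.

```python
def keep_gene_records(fasta_text, gene="pstS"):
    """Return (header, sequence) pairs whose FASTA header contains [gene=<gene>].
    Parsing is line based because headers may contain '<' or '>' in locations."""
    tag = f"[gene={gene.lower()}]"
    records, header, seq = [], None, []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if header is not None and tag in header.lower():
                records.append((header, "".join(seq)))
            header, seq = line[1:].strip(), []
        elif header is not None:
            seq.append(line.strip())
    if header is not None and tag in header.lower():
        records.append((header, "".join(seq)))
    return records


def fetch_gene_cds(ids, gene="pstS", email="you@example.org"):
    """Fetch CDS nucleotide FASTA for the given record IDs (needs network
    access and Biopython) and keep only the requested gene."""
    from Bio import Entrez  # imported here so the parser works without Biopython

    Entrez.email = email  # NCBI asks for a contact address; placeholder here
    handle = Entrez.efetch(
        db="nucleotide", id=",".join(ids), rettype="fasta_cds_na", retmode="text"
    )
    try:
        return keep_gene_records(handle.read(), gene)
    finally:
        handle.close()


if __name__ == "__main__":
    # Example accession (E. coli K-12 genome), used only for illustration.
    for header, seq in fetch_gene_cds(["NC_000913.3"]):
        print(header, len(seq))
```

Records whose CDS features lack a gene name would be missed by this filter, so matching on the product description may be needed as a fallback.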

Best Wishes,
Umer

updated 3.8 years ago by Ram 45k • written 11.2 years ago by umer.zeeshan.ijaz ★ 1.8k


Thank you, Umer, I will try your suggestions.

updated 3.8 years ago by Ram 45k • written 11.2 years ago by Tim ▴ 130