Where can I find genome of a single bacteria?
10.7 years ago
matija.sosic ▴ 110

Where can I find the genome of a single bacteria, e.g. E. coli? I downloaded RNA-seq reads of E. coli from SRA and now I would like to align them to the E. coli genome using BWA.

bacteria genome • 3.6k views

Single bacterium; single bacterial species; single bacteria.

10.7 years ago

Ensembl would be my first bet. Odds are good you mean this one in particular, though there are many other substrains that have been sequenced.


I have one FASTA file from that strain, but also this one: http://www.ncbi.nlm.nih.gov/sra/?term=SRR1187101

Is there some way to know whether this strain's genome has been sequenced, besides manually checking all sources? Would it be OK if I just aligned to the strain you proposed?

Thanks!


(N.B., I don't work on E. coli, so take this with an appropriately sized grain of salt!) Yeah, I'd go ahead and align it to the aforementioned reference. If you really want, you could then take the sequence of a gene that shows a lot of differences from the reference and BLAST it to see whether there's a closer strain. At the end of the day, it really depends on what your goals are. The original study that you just linked to was looking at the association of strain sequence with a clinical phenotype, so in many ways the exact reference strain used may not have been that important.

BTW, you might also consider de novo or reference-based assembly.
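
For the alignment step itself, here is a minimal sketch, assuming BWA is installed and on the PATH. The file names are placeholders rather than files from this thread, and the reads are assumed to be single-end (for paired-end data, `bwa mem` takes a second FASTQ file after the first). The helper names `bwa_commands` and `run_alignment` are made up for illustration.

```python
import subprocess


def bwa_commands(reference, reads):
    """Return the two BWA command lines: index the reference, then align."""
    index_cmd = ["bwa", "index", str(reference)]
    # bwa mem writes SAM to stdout, so the caller redirects it to a file.
    align_cmd = ["bwa", "mem", str(reference), str(reads)]
    return index_cmd, align_cmd


def run_alignment(reference, reads, out_sam):
    """Run both commands. Assumes the bwa executable is on the PATH."""
    index_cmd, align_cmd = bwa_commands(reference, reads)
    subprocess.run(index_cmd, check=True)
    with open(out_sam, "w") as sam:
        subprocess.run(align_cmd, check=True, stdout=sam)


# Placeholder file names, shown only to illustrate the command lines.
index_cmd, align_cmd = bwa_commands("ecoli_ref.fa", "reads.fastq")
print(" ".join(index_cmd))
print(" ".join(align_cmd))
```

Calling `run_alignment("ecoli_ref.fa", "reads.fastq", "aligned.sam")` would then run both steps for real.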

10.7 years ago
Neilfws 49k

Lots of places.

All easily found via a web search for "bacterial genomes database".

10.7 years ago

From memory U00096 :o)
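
U00096 is the GenBank accession for E. coli K-12 MG1655. If you already know an accession like this, one way to fetch the FASTA is through NCBI E-utilities. The following is a rough sketch under that assumption; the function names are made up for illustration, and only the URL construction is exercised here (the download needs network access).

```python
from urllib.parse import urlencode
from urllib.request import urlretrieve

# NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build an efetch URL that returns a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def download_fasta(accession, out_path):
    """Download the record to out_path. Requires network access."""
    urlretrieve(efetch_fasta_url(accession), out_path)
    return out_path


print(efetch_fasta_url("U00096"))
```

`download_fasta("U00096", "U00096.fasta")` would save the sequence locally, ready to be indexed by an aligner.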

7.8 years ago

I know that this question is already almost 3 years old, but I hope that my answer might be useful to others anyway.

I implemented a standardized way to automate the genome retrieval process in R (see biomartr package).

To retrieve a bacterial reference genome from several database sources using only the scientific name of the bacterium of interest, one can simply type:

# download Escherichia coli reference genome from NCBI RefSeq
biomartr::getGenome(db = "refseq", organism = "Escherichia coli")

or

# download Escherichia coli reference genome from NCBI Genbank
biomartr::getGenome(db = "genbank", organism = "Escherichia coli")

In case you wish to download all available bacterial genomes at once, simply type:

# download all bacterial reference genomes from NCBI RefSeq
biomartr::meta.retrieval(kingdom = "bacteria", db = "refseq", type = "genome")

For more details about downloading specific genomes from specific kingdoms or subkingdoms of life please consult the Genomic Sequence Retrieval vignette of the biomartr package. For metagenome downloads, please consult the Meta-Genome Retrieval vignette and for entire database retrieval the Database Retrieval vignette.

Please note that to promote computational reproducibility in genomics and metagenomics studies, biomartr stores log files for each downloaded genome, proteome, or CDS file.

An example log file looks as follows:

File Name: Escherichia_coli_genomic_refseq.fna.gz
Organism Name: Escherichia_coli
Database: NCBI refseq
URL: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_genomic.fna.gz
Download_Date: Wed Feb 15 15:17:50 2017
refseq_category: reference genome
assembly_accession: GCF_000005845.2
bioproject: PRJNA57779
biosample: SAMN02604091
taxid: 511145
infraspecific_name: strain=K-12 substr. MG1655
version_status: latest
release_type: Major
genome_rep: Full
seq_rel_date: 2013-09-26
submitter: Univ. Wisconsin
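
If you want to record these fields in a downstream script, for example to note which assembly the reads were aligned to, the log can be read as simple `key: value` lines. The parser below is a small sketch that assumes the layout shown above. `parse_biomartr_log` is a hypothetical helper, not part of biomartr, and the example text is an excerpt of the log above.

```python
def parse_biomartr_log(text):
    """Parse 'key: value' lines into a dict, skipping blank or malformed lines."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            record[key.strip()] = value.strip()
    return record


example = """File Name: Escherichia_coli_genomic_refseq.fna.gz
Download_Date: Wed Feb 15 15:17:50 2017
assembly_accession: GCF_000005845.2
taxid: 511145
infraspecific_name: strain=K-12 substr. MG1655
"""

info = parse_biomartr_log(example)
print(info["assembly_accession"], info["infraspecific_name"])
```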
