Download fungal genomes
8.7 years ago
d.pinili • 0

Hi. I need to download all available fungal genomes for a community analysis with Kraken (a sequence classification tool). Kraken offers no built-in way to obtain a fungal database, so I have to download the genomes myself. For a custom database, the program needs genome sequences in FASTA format, and each header must contain a GI number. I looked at NCBI first, but the fungal genomes on the FTP site (the refseq and genbank folders under genomes/) do not contain GI numbers. I have also tried websites other than NCBI, but to no avail. Can someone help me? Thank you very much!

fungi fungal genomes ncbi • 5.5k views

See ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/README for information on accession-to-GI mapping.

You can parse the accessions from the downloaded data and use a tool such as sed to replace them with GI numbers.
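
As a rough illustration of that replacement step, here is a minimal sketch that uses awk rather than sed, since a lookup table is easier to handle there. Everything in it is an assumption made for the example: the file names, the made-up accessions and GI values, the four-column mapping layout (accession, accession.version, taxid, gi), and the `>gi|<gi>|ref|<accession>|` header layout, which mimics older NCBI-style headers (the `ref` tag would apply to RefSeq records only). Check what your Kraken version actually expects before relying on it.

```shell
#!/bin/sh
# Toy inputs with made-up values. In practice mapping.tsv would come from an
# accession2taxid file and genome.fa from the NCBI FTP download.
workdir=$(mktemp -d)
printf 'accession\taccession.version\ttaxid\tgi\n' > "$workdir/mapping.tsv"
printf 'FAKE_000001\tFAKE_000001.1\t12345\t111\n' >> "$workdir/mapping.tsv"
printf '>FAKE_000001.1 Example chromosome 1\nACGT\n>FAKE_999999.1 unmapped\nGGCC\n' > "$workdir/genome.fa"

# First file: build an accession.version -> gi lookup (skipping the header row).
# Second file: rewrite each FASTA header whose accession is in the lookup;
# headers without a mapping and all sequence lines pass through unchanged.
awk -F '\t' '
  NR == FNR { if (FNR > 1) gi[$2] = $4; next }
  /^>/ {
    acc = substr($0, 2)          # drop the leading ">"
    sub(/[ \t].*/, "", acc)      # keep only the first token (the accession)
    if (acc in gi) {
      rest = substr($0, length(acc) + 2)
      $0 = ">gi|" gi[acc] "|ref|" acc "|" rest
    }
  }
  { print }
' "$workdir/mapping.tsv" "$workdir/genome.fa" > "$workdir/genome.gi.fa"

result=$(cat "$workdir/genome.gi.fa")
printf '%s\n' "$result"
rm -r "$workdir"
```

The real accession2taxid files are very large, so loading the whole table into awk may need a lot of memory; filtering the mapping down to the accessions present in your FASTA files first is one way around that.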

7.8 years ago

I know this question is quite old, but I have since implemented an R package named biomartr that can perform bulk retrieval of genomes, proteomes, CDS, GFF files, etc. Since the question is "Download fungal genomes", here are some biomartr-based examples as a reference for anyone who later searches for a way to bulk download all fungal genomes from NCBI RefSeq or GenBank.

To download all fungal genomes from NCBI RefSeq, type:

# download all fungi genomes from NCBI RefSeq
biomartr::meta.retrieval(kingdom = "fungi", db = "refseq", type = "genome")

Alternatively, genomes from NCBI GenBank can be retrieved by typing:

# download all fungi genomes from NCBI Genbank
biomartr::meta.retrieval(kingdom = "fungi", db = "genbank", type = "genome")

However, you are not limited to genomes. You can also download proteomes (type = "proteome"), coding sequences (type = "CDS"), and annotation files (type = "gff").

If you want to download only specific subgroups of fungal genomes, use the getGroups() function to list the available subgroups:

# retrieve available subgroups for the fungi kingdom
biomartr::getGroups(db = "refseq", kingdom = "fungi")

"Ascomycetes" "Basidiomycetes" "Other Fungi"

We can now choose the group "Ascomycetes" and download the genomes of all fungal species in that group by typing:

# download all fungi genomes from NCBI RefSeq that belong to the subgroup Ascomycetes
biomartr::meta.retrieval(kingdom = "fungi", group = "Ascomycetes", db = "refseq", type = "genome")

For more information please consult the Metagenome Retrieval Vignette. I hope it helps.
