Taxonomic analysis of metagenomic data
7.2 years ago
bird77 ▴ 80

How can I quickly calculate the taxonomic distribution of a metagenome (assembled or unassembled data)?

sequence • 2.0k views
7.2 years ago

You can do this very quickly (in a few seconds) with BBMap's Sketch tool, which will compare the data to RefSeq:

sendsketch.sh in=reads.fq reads=4m depth

or

sendsketch.sh in=contigs.fa
7.2 years ago
Sej Modha 5.3k

There are a number of k-mer based taxonomic tools available: https://omictools.com/taxonomy-dependent2-category

4.9 years ago
onestop_data ▴ 330

What do you mean by taxonomic analysis? Do you mean taxonomic binning of the sequences, or taxonomic profiling?

I wrote a tool in graduate school that uses k-mers and an optimization method to produce a taxonomic profile of a metagenome dataset in seconds. The tool is named FOCUS and you can learn more about it here. The page also explains binning and profiling in more detail, in case you are interested.
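To make the idea of k-mer-based profiling concrete, here is a toy sketch. It is not FOCUS's code: the function names, the choice of k=2, and the solver (a plain projected-gradient fit of a non-negative least-squares problem) are all assumptions made for illustration. It builds a normalized k-mer frequency vector for each reference and for the sample, then looks for non-negative reference weights whose mixture best reproduces the sample's vector.

```python
from collections import Counter
from itertools import product


def kmer_profile(seqs, k=2):
    """Normalized k-mer frequencies over the A/C/G/T alphabet for a list of sequences."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    # k-mers containing other characters (e.g. N) are counted but ignored here
    total = sum(counts[km] for km in kmers) or 1
    return [counts[km] / total for km in kmers]


def estimate_abundances(ref_profiles, sample_profile, steps=2000, lr=0.1):
    """Hypothetical helper: minimize ||A x - b||^2 with x >= 0 by projected
    gradient descent, then rescale x to sum to 1."""
    n = len(ref_profiles)
    x = [1.0 / n] * n
    for _ in range(steps):
        # residual of the current mixture against the sample profile
        resid = [
            sum(x[j] * ref_profiles[j][i] for j in range(n)) - b
            for i, b in enumerate(sample_profile)
        ]
        for j in range(n):
            grad = 2 * sum(r * a for r, a in zip(resid, ref_profiles[j]))
            x[j] = max(0.0, x[j] - lr * grad)  # project back onto x >= 0
    total = sum(x) or 1.0
    return [v / total for v in x]
```

The weights estimate each reference's share of the sample's k-mers, not cell counts. Real tools use larger k and full reference databases, so treat this only as an illustration of the principle.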

4.9 years ago
Shicheng Guo ★ 9.6k

How about fastq_screen and DIAMOND? In my experience, I prefer DIAMOND.

fastq_screen : https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/_build/html/index.html

Diamond: https://github.com/bbuchfink/diamond and DIAMOND_analysis_counter.py (SAMSA2)

Reply:

DIAMOND is simply an aligner. Recommending it without naming a database to align against, and a tool to interpret the hits, does not help much. Maybe you can add some databases and downstream tools to your answer. For example, MEGAN takes the DIAMOND output from an alignment against a protein database such as NR and gives you back both a taxonomic and a functional analysis.

I talk more about DIAMOND and RAPSearch2 here.
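To illustrate why the aligner output needs a downstream step, here is a minimal sketch that turns tabular hits into per-taxon read counts. It assumes a three-column, tab-separated input (read ID, taxon ID, bit score); I believe DIAMOND can emit such a table with `--outfmt 6 qseqid staxids bitscore` when the database was built with taxonomy information, but check the DIAMOND manual before relying on that. The function name is hypothetical, and this best-hit count is far cruder than the lowest-common-ancestor assignment MEGAN performs.

```python
from collections import Counter


def tally_best_hits(lines):
    """Keep the highest-bit-score hit per read, then count reads per taxon ID.

    Assumes each line is: read_id <TAB> taxon_id <TAB> bitscore
    """
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        read_id, taxid, bitscore = line.split("\t")[:3]
        score = float(bitscore)
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (taxid, score)
    return Counter(taxid for taxid, _ in best.values())
```

It accepts any iterable of lines, so an open file handle works: `with open("hits.tsv") as fh: counts = tally_best_hits(fh)` (the file name is a placeholder).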

