How can I quickly calculate the taxonomic distribution of a metagenome (assembled or unassembled data)?
You can do this very quickly (in a few seconds) with BBMap's Sketch tool, which will compare the data to RefSeq:
sendsketch.sh in=reads.fq reads=4m depth
or
sendsketch.sh in=contigs.fa
There are a number of k-mer based taxonomic tools available: https://omictools.com/taxonomy-dependent2-category
What do you mean by taxonomic analysis? Do you mean taxonomic binning of the sequences, or profiling of the sequences?
I wrote a tool in graduate school that uses k-mers and an optimization method to produce a taxonomic profile of a metagenome dataset in seconds. The tool is named FOCUS and you can learn more about it here. The page also explains binning and profiling in more detail, in case you are interested.
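To make the k-mer idea concrete, here is a minimal Python sketch of the first step such a profiler might take: turning a set of reads into a vector of k-mer frequencies. This is not FOCUS's code. The function name `kmer_frequencies`, the default `k=3`, and the choice to skip k-mers containing ambiguous bases are all assumptions made for illustration, and the optimization step that fits the sample vector against reference genomes is not shown.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequences, k=3):
    """Return the relative frequency of every DNA k-mer across the sequences.

    Hypothetical helper for illustration only; k-mers containing anything
    other than A, C, G or T (for example N) are skipped.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    # Emit all 4**k k-mers so that every sample yields a vector of equal length.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


freqs = kmer_frequencies(["ACGTAC", "NNACG"], k=3)
```

Because every sample is reduced to a vector of the same length, a sample can be compared against reference genomes without aligning any reads, which is where the speed comes from.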
How about fastq_screen and DIAMOND? In my experience, I prefer DIAMOND.
fastq_screen : https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/_build/html/index.html
Diamond: https://github.com/bbuchfink/diamond and DIAMOND_analysis_counter.py (SAMSA2)
DIAMOND is simply an aligner. Unless you pair it with a reference database and a tool that interprets its output, it does no good for this question. Maybe you can add some databases and the tools that take DIAMOND output, such as MEGAN, which takes DIAMOND alignments against the NCBI NR protein database and gives you back a taxonomic and functional analysis.
I talk here more about DIAMOND and Rapsearch2.
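To illustrate why the aligner alone is not enough, here is a minimal Python sketch that summarizes DIAMOND's tabular output by keeping the top-scoring hit per read and counting hits per subject sequence. It assumes the default 12-column BLAST tabular layout (`--outfmt 6`), with the query ID in column 1, the subject ID in column 2 and the bit score in column 12. The function name `count_best_hits` and the example rows are made up for illustration. Note that the result is counts per database sequence, not per taxon: turning these into a taxonomic profile still needs an accession-to-taxonomy mapping, or a tool such as MEGAN, neither of which is shown here.

```python
import csv
import io
from collections import Counter


def count_best_hits(handle):
    """Count how many queries have each subject as their top-bitscore hit.

    Assumes 12-column BLAST tabular input; shorter rows are skipped.
    """
    best = {}  # query id -> (bitscore, subject id)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 12:
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, subject)
    return Counter(subject for _, subject in best.values())


# Made-up alignments: read1 hits two proteins, read2 and read3 hit one each.
example = io.StringIO(
    "read1\tprotA\t98.0\t50\t1\t0\t1\t150\t1\t50\t1e-20\t95.0\n"
    "read1\tprotB\t80.0\t50\t10\t0\t1\t150\t1\t50\t1e-10\t60.0\n"
    "read2\tprotB\t90.0\t50\t5\t0\t1\t150\t1\t50\t1e-15\t80.0\n"
    "read3\tprotA\t85.0\t50\t7\t0\t1\t150\t1\t50\t1e-12\t70.0\n"
)
counts = count_best_hits(example)
```

To run it on a real file, pass an open file handle, for example `count_best_hits(open("matches.tsv"))`, where `matches.tsv` is a placeholder name for your DIAMOND output.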