MetaPhlAn 3.0
2.9 years ago
Ap1438 ▴ 50

When I run the default command from the MetaPhlAn 3 manual, I get a very high unknown estimate, around 80%:
metaphlan SK_1-forward_paired.fq.gz,SK_1-reverse_paired.fq.gz,SK_1-forward_unpaired.fq.gz,SK_1-reverse_unpaired.fq.gz --bowtie2out sample1.bowtie2.bz2 --nproc 5 --bt2_ps very-sensitive-local --add_viruses --unknown_estimation --input_type fastq -o profiled_sample1.txt

Can anyone suggest how I can reduce the unknown fraction? And what is considered a normal unknown estimate for soil samples?

2.7 years ago
boaty ▴ 220

MetaPhlAn 3 uses the ChocoPhlAn database, which is UniRef-based (~17,000 reference genomes; that is a lot, but not enough). I think it is fine for gut microbiome research but not sufficient for soil samples, where many organisms have no sequenced reference.

A better approach is de novo assembly (fastq -> contigs -> bins -> MAGs), followed by taxonomic classification with GTDB-Tk or annotation with Prokka or eggNOG.
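
As a rough illustration of that route, here is a minimal sketch. The tool choices (MEGAHIT for assembly, Bowtie 2 plus samtools for coverage, MetaBAT 2 for binning, GTDB-Tk for classification) are my assumptions, not the only option, and the output prefix and thread count are placeholders. It only prints the commands unless DRY_RUN=0 is set, and it leaves out bin quality checking.

```shell
#!/usr/bin/env bash
# Sketch only: prints each command unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

assemble_and_bin() {
    # $1 = forward reads, $2 = reverse reads, $3 = output prefix (hypothetical name)
    local r1="$1" r2="$2" out="$3"
    local contigs="$out.megahit/final.contigs.fa"

    # 1. reads -> contigs
    run megahit -1 "$r1" -2 "$r2" -o "$out.megahit" -t 5
    # 2. map reads back to the contigs to get per-contig coverage
    run bowtie2-build "$contigs" "$out.idx"
    run bowtie2 -p 5 -x "$out.idx" -1 "$r1" -2 "$r2" -S "$out.sam"
    run samtools sort -o "$out.bam" "$out.sam"
    run jgi_summarize_bam_contig_depths --outputDepth "$out.depth.txt" "$out.bam"
    # 3. contigs -> bins (MetaBAT 2 writes bin.N.fa files under the given prefix)
    run metabat2 -i "$contigs" -a "$out.depth.txt" -o "$out.bins/bin"
    # 4. bins -> taxonomy (newer GTDB-Tk releases may need extra options)
    run gtdbtk classify_wf --genome_dir "$out.bins" --out_dir "$out.gtdbtk" --extension fa --cpus 5
}
```

For the sample in the question, this would be called as: assemble_and_bin SK_1-forward_paired.fq.gz SK_1-reverse_paired.fq.gz sample1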

There are Snakemake pipelines such as Sunbeam, metagenome-atlas, and metaGEM that run all of these steps together. Another option is to run Kraken 2 with a much larger reference database.
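
A minimal sketch of the Kraken 2 route, assuming you already have a database built or downloaded. KRAKEN_DB and the output prefix are placeholders. The second function reads the unclassified percentage from the Kraken 2 report (the row with rank code "U"), which is the closest analogue to MetaPhlAn's unknown estimate.

```shell
#!/usr/bin/env bash
# Sketch only: KRAKEN_DB is a hypothetical path to a Kraken 2 database.
KRAKEN_DB="${KRAKEN_DB:-/path/to/kraken2_db}"

classify_sample() {
    # $1 = forward reads, $2 = reverse reads, $3 = output prefix
    kraken2 --db "$KRAKEN_DB" --threads 5 --paired --gzip-compressed \
        --report "$3.kreport" --output "$3.kraken" "$1" "$2"
}

unclassified_percent() {
    # $1 = Kraken 2 report; print the percentage from the "U" (unclassified) row
    awk -F'\t' '$4 == "U" { gsub(/ /, "", $1); print $1 }' "$1"
}
```

Usage would look like: classify_sample SK_1-forward_paired.fq.gz SK_1-reverse_paired.fq.gz sample1 && unclassified_percent sample1.kreport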
