Hello, I have classified 16S rRNA metagenomic reads against the RDP and SSU reference databases. More than 30% of the reads came back as unclassified (derived from Bacteria).
Is there any way to reduce the proportion of unclassified reads?
Thank you
Sebastian
You could try a different database. RDP does not include environmental clusters, which can make up a substantial proportion of amplicons. See if more of your reads get classified when you use GreenGenes or SILVA.
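If you do rerun against another database, it helps to compare the unclassified fraction rank by rank rather than as a single number. Below is a minimal sketch of that tally, not tied to any particular classifier: it assumes you have already parsed each read's assignment into a list of taxon names ordered from domain to genus. The function name `unclassified_fraction`, the `RANKS` list, and the toy lineages are all hypothetical and only there for illustration.

```python
# Assumed rank order of each parsed lineage (hypothetical layout).
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]

def unclassified_fraction(lineages, rank="genus"):
    """Return the fraction of reads with no usable label at `rank`.

    A read counts as unclassified at a rank if its lineage stops before
    that rank, or if the label there is empty or starts with "unclassified".
    """
    idx = RANKS.index(rank)
    if not lineages:
        return 0.0
    missing = 0
    for lineage in lineages:
        label = lineage[idx].strip().lower() if idx < len(lineage) else ""
        if not label or label.startswith("unclassified"):
            missing += 1
    return missing / len(lineages)

# Toy data: the same three reads as assigned by two made-up databases.
db_a = [
    ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
     "Lactobacillaceae", "Lactobacillus"],
    ["Bacteria", "unclassified"],
    ["Bacteria", "Proteobacteria"],
]
db_b = [
    ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
     "Lactobacillaceae", "Lactobacillus"],
    ["Bacteria", "Acidobacteria"],
    ["Bacteria", "Proteobacteria"],
]

for name, lineages in (("db_a", db_a), ("db_b", db_b)):
    for rank in ("phylum", "genus"):
        frac = unclassified_fraction(lineages, rank)
        print(f"{name} {rank}: {frac:.2f} unclassified")
```

A per-rank breakdown like this shows whether a second database is resolving reads at the phylum level or only changing things near genus.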
That sounds pretty reasonable to me, depending on where your samples are from (there is more unknown diversity in prairie soil than in mammal gut, for instance). If your data are sequenced amplicons, as I suspect, did you "denoise" and filter chimeras as a preliminary step? That should help.
The short answer: No. We need better databases.