454 technology produces a number of errors in the reads, mostly (but not only) related to homopolymeric runs. It requires some degree of quality filtering, that is, removing reads that contain false information. Filtering is often based on quite simple measures, such as read length and the number of consecutive low-quality bases. Are there any approaches to quality filtering other than the ones implemented in the PyroNoise/AmpliconNoise packages?
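For reference, the kind of simple filter described above could look roughly like the sketch below. It assumes each read is already available as a sequence string plus a list of per-base Phred scores; the function names and the thresholds (`min_len`, `min_q`, `max_low_run`) are made up for illustration and are not taken from PyroNoise, AmpliconNoise, or any other package.

```python
def longest_low_quality_run(quals, min_q):
    """Return the length of the longest run of consecutive scores below min_q."""
    longest = current = 0
    for q in quals:
        if q < min_q:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def passes_filter(seq, quals, min_len=200, min_q=20, max_low_run=5):
    """Keep a read only if it is long enough and has no long low-quality stretch.

    All thresholds are illustrative defaults (assumptions), not recommended values.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have the same length")
    if len(seq) < min_len:
        return False
    return longest_low_quality_run(quals, min_q) <= max_low_run
```

A read is discarded whole rather than trimmed here; trimming at the first low-quality stretch would be an equally plausible variant.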
Most "denoising" methods come at the cost of losing information or data. When it is possible to process the raw data, it is better to work with that. 454 reads are not so difficult or different to process. I do not see much need for denoising, and few people are doing it.
When using unfiltered data, one risks over-predicting the microbial diversity in metagenomic samples. See "The 'rare biosphere': a reality check".
I actually think the word "denoising" is a little misleading. All PyroNoise does is "model" sequencing errors, and because its primary goal is clustering, its way of "denoising" by clustering is the right thing to do. Nonetheless, if you do not model sequencing errors in your application but simply rely on an independent denoising procedure, you will probably have to make a compromise. My overall advice is: explicitly model sequencing errors in your application, and do not rely on a 3rd-party "denoiser" that is not built for your application.
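As a toy illustration of handling the error model inside the application rather than in a separate denoiser: since 454 errors are mostly homopolymer-length errors, an application that compares reads could ignore run lengths when deciding whether two reads differ. The sketch below does this by collapsing homopolymer runs before comparison. This is only an assumed, simplified stand-in for a real error model; it is not what PyroNoise does, and the function names are made up.

```python
from itertools import groupby


def compress_homopolymers(seq):
    """Collapse each run of identical bases to a single base, e.g. AAACC -> AC."""
    return "".join(base for base, _ in groupby(seq))


def same_up_to_homopolymer_length(a, b):
    """True if two reads differ only in the lengths of their homopolymer runs."""
    return compress_homopolymers(a) == compress_homopolymers(b)
```

The obvious cost is that genuine differences in run length are also ignored, which is exactly the kind of trade-off that should be decided per application.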
Are you asking this for amplicon (PCR product sequencing) or shotgun reads?
lh3, I see. Yes, "denoising" is indeed misleading, as I see people use it in quite a different context. I will re-edit the question in a minute.
fixlex, mostly for amplicon-based reads.