I have been exploring how well different QC methods detect cross-sample contamination in NGS data. So far, I have found only these two methods (listed below).
Both use IBD/IBS values to detect sample contamination. Are there any other methods for checking NGS data for sample contamination? Please suggest any that you are aware of. Thanks!
If you are designing an experiment, I recommend using a unique spike-in sequence per well to detect cross-contamination (BBMap's Seal works well for calculating the cross-contamination rates).
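To make the spike-in idea concrete, here is a rough sketch of the arithmetic once you have per-well read counts for each spike-in sequence. It is not Seal's output format or implementation: the function name, the well and spike-in labels, and the counts are all made up for illustration, and it assumes the fraction of foreign spike-in reads is a fair proxy for the fraction of foreign library reads.

```python
def cross_contamination_rates(counts, expected):
    """Estimate per-well cross-contamination from spike-in read counts.

    counts:   {well: {spike_in_name: read_count}}  (hypothetical input layout)
    expected: {well: spike_in_name that was added to that well}
    Returns   {well: fraction of spike-in reads that came from other wells},
              or None for a well with no spike-in reads at all.
    """
    rates = {}
    for well, spike_counts in counts.items():
        total = sum(spike_counts.values())
        if total == 0:
            # No spike-in reads observed, so no rate can be estimated.
            rates[well] = None
            continue
        own = spike_counts.get(expected[well], 0)
        rates[well] = (total - own) / total
    return rates


# Illustrative counts only, not real data.
counts = {
    "A1": {"spike_A1": 9900, "spike_A2": 100},
    "A2": {"spike_A1": 0, "spike_A2": 5000},
}
expected = {"A1": "spike_A1", "A2": "spike_A2"}

rates = cross_contamination_rates(counts, expected)
for well, rate in rates.items():
    print(well, rate)
```

Which foreign spike-in shows up in a well also tells you the source of the contamination, which is something the IBD/IBS estimates cannot give you directly.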
For scenarios where you are multiplexing different organisms together, you can again use Seal for detection/quantification, or use CrossBlock for fully automatic, reference-free cross-contamination detection and removal. It does not work for cross-contamination between same-organism libraries.
If you already have data, and it's all one organism with no spike-ins, it's too late to accurately detect or remove cross-contamination. As your links indicate, there are ways to estimate it, but it's best to design the experiment correctly from the beginning, with spike-ins.
Thank you, Brian. I have designed a method that can accurately identify (or, as you point out, estimate) the contaminated samples, but I am not sure whether other methods already exist. I want to compare my algorithm against them, and searching PubMed or Google has not been much help.
FastQ Screen is still viable for human cross-sample contamination detection if your samples have different target regions. You just have to define your target regions as the genomes for FastQ Screen.