As described in the title, what level of sample contamination can we usually accept? I am asking because, although contamination as low as 1% may produce a false positive variant at ~1% VAF, rejecting samples for such low contamination seems too strict. Is there a common standard or a reasonable rationale on this topic? Thanks
P.S. Sorry for the lack of clarity. The contamination I am referring to is cross-sample contamination. The concern is generating false positive somatic variants at polymorphic loci when the contaminating and contaminated samples have different genotypes at those SNPs.
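For concreteness, here is a minimal sketch of the mixing model behind that concern, under the simple assumption that all alternate reads at a discordant locus come from the contaminating sample (the function and numbers are just illustrative):

```python
# Expected apparent VAF at a locus where the host sample is hom-ref and
# the contaminating sample carries the alternate allele.
# alpha = fraction of the library contributed by the contaminating sample.

def expected_vaf(alpha, contaminant_alt_copies):
    """contaminant_alt_copies: 1 for a het contaminant, 2 for hom-alt."""
    return alpha * contaminant_alt_copies / 2

for copies, label in [(1, "het contaminant"), (2, "hom-alt contaminant")]:
    print(f"{label}: expected VAF ~ {expected_vaf(0.01, copies):.2%}")
# het contaminant: expected VAF ~ 0.50%
# hom-alt contaminant: expected VAF ~ 1.00%
```

Under this model, 1% contamination only reaches ~1% VAF where the contaminant is homozygous for the discordant allele; at its heterozygous loci the expected signal is ~0.5%, so how much contamination is tolerable depends on the lowest VAF your pipeline tries to call.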
How do you know it's contamination, and do you know the source? I often check reads for contamination using metagenomic databases, and you can remove reads mapping to specific species if necessary, but that is done on a sample-by-sample basis rather than against fixed % thresholds.
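If you go the read-removal route, here is a minimal sketch of the filtering step, assuming you have already produced a list of read IDs assigned to the unwanted species by your classifier (file names are placeholders, and the per-read output format varies between tools):

```python
# Drop reads whose IDs were classified as an unwanted species.
contaminant_ids = set()
with open("classified_as_contaminant.txt") as handle:   # one read ID per line (assumption)
    for line in handle:
        contaminant_ids.add(line.strip())

with open("sample.fastq") as fin, open("sample.clean.fastq", "w") as fout:
    while True:
        record = [fin.readline() for _ in range(4)]      # FASTQ records are 4 lines
        if not record[0]:
            break
        read_id = record[0][1:].split()[0]               # strip '@' and any description
        if read_id not in contaminant_ids:
            fout.writelines(record)
```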
Contamination of what with what exactly? A cell line with another cell line? With bacteria or other culture contaminants? A human xenograft with mouse reads? We need more details.
STR profiling is usually done for cell line authentication, to ensure lines aren't misidentified or contaminated with other cell lines. The thresholds for detecting mixing in those assays are fairly clear (multiple tri-allelic markers and high Masters scores against unexpected lines, for instance).
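For illustration, a rough sketch of the kind of pairwise STR comparison those services run; the allele calls are made up, and the score computed here is the Tanabe-style 2 × shared / (total alleles) variant rather than the Masters score itself:

```python
# Compare two STR profiles: percent-match score plus a flag for markers
# with >2 alleles (a possible sign of a mixed/contaminated line).

def str_match(profile_a, profile_b):
    shared = total = 0
    for marker in set(profile_a) & set(profile_b):
        a, b = set(profile_a[marker]), set(profile_b[marker])
        shared += len(a & b)
        total += len(a) + len(b)
    return 2 * shared / total if total else 0.0

query = {"D5S818": {11, 12}, "TH01": {7, 9, 9.3}, "vWA": {16, 18}}
reference = {"D5S818": {11, 12}, "TH01": {7, 9.3}, "vWA": {17, 18}}

tri_allelic = [m for m, alleles in query.items() if len(alleles) > 2]
print(f"match score: {str_match(query, reference):.0%}")
print(f"markers with >2 alleles (possible mixture): {tri_allelic}")
```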
If these are cell lines, STR profiling should reveal contamination - it's quite sensitive. If not, then I think you're out of luck. I'd be looking to throw out samples I thought were contaminated.