Hi, I'm trying to estimate tumor purity and ploidy with PureCN, an R package that is run through Rscript. I only have tumor panel samples, with no matched normal samples. I chose PureCN because it can estimate purity without matched normals, but the example commands still take arguments such as --normaldb and --normal_panel.
How can I calculate tumor purity and ploidy without normal samples?
Without a matched normal (minimal test run):

    $ Rscript $PURECN/PureCN.R --out $OUT/$SAMPLEID \
        --tumor $OUT/$SAMPLEID/${SAMPLEID}_coverage_loess.txt \
        --sampleid $SAMPLEID \
        --vcf ${SAMPLEID}_mutect.vcf \
        --normaldb $OUT_REF/normalDB_hg19.rds \
        --intervals $OUT_REF/baits_hg19_intervals.txt \
        --genome hg19

Production pipeline run:

    $ Rscript $PURECN/PureCN.R --out $OUT/$SAMPLEID \
        --tumor $OUT/$SAMPLEID/${SAMPLEID}_coverage_loess.txt \
        --sampleid $SAMPLEID \
        --vcf ${SAMPLEID}_mutect.vcf \
        --statsfile ${SAMPLEID}_mutect_stats.txt \
        --normaldb $OUT_REF/normalDB_hg19.rds \
        --normal_panel $OUT_REF/mapping_bias_hg19.rds \
        --intervals $OUT_REF/baits_hg19_intervals.txt \
        --intervalweightfile $OUT_REF/interval_weights_hg19.txt \
        --snpblacklist hg19_simpleRepeats.bed \
        --genome hg19 \
        --force --postoptimize --seed 123
The description of the production pipeline run includes the note "A pool of normal samples is recommended when matched normal samples are not available". The manual also lists as a requirement "At least one BAM file from a normal control sample, either matched or process-matched." (see http://bioconductor.org/packages/release/bioc/vignettes/PureCN/inst/doc/PureCN.pdf ). So I guess it is not exactly "without normal samples", it is more "without MATCHED normal samples": you still need normal samples that went through the same assay and pipeline, but they do not have to come from the same patients as your tumors.
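If you can get process-matched normals, the normalDB_hg19.rds file passed to --normaldb is built from their coverage files. Below is a minimal dry-run sketch of that step, not a definitive recipe. It assumes the normals were already run through the same coverage step as the tumor and therefore end in _coverage_loess.txt. The normals/ directory, the list file name and the build_normal_list helper are hypothetical. The NormalDB.R flags shown (--outdir, --coveragefiles) are my assumption based on the older command-line style used in the question and may be spelled differently in your PureCN version, so check Rscript $PURECN/NormalDB.R --help before running anything.

```shell
#!/bin/sh
# Dry-run sketch: collect coverage files from process-matched normals
# and PRINT (not run) a NormalDB.R command that would pool them.
# Paths, directory layout and file names below are hypothetical.

PURECN="${PURECN:-/path/to/PureCN/extdata}"
OUT_REF="${OUT_REF:-reference_files}"
NORMAL_DIR="${NORMAL_DIR:-normals}"
LIST="$OUT_REF/normal_coverages.list"

# build_normal_list DIR LISTFILE
# Writes one normal coverage file path per line, sorted.
build_normal_list() {
    find "$1" -type f -name '*_coverage_loess.txt' | sort > "$2"
}

mkdir -p "$OUT_REF" "$NORMAL_DIR"
build_normal_list "$NORMAL_DIR" "$LIST"

N=$(wc -l < "$LIST" | tr -d ' ')
echo "Found $N normal coverage file(s) in $NORMAL_DIR"

# ASSUMPTION: flag names follow the older style seen in the question;
# verify them with: Rscript $PURECN/NormalDB.R --help
echo "Rscript $PURECN/NormalDB.R --outdir $OUT_REF --coveragefiles $LIST --genome hg19"
```

The script only prints the command, so you can compare it with the --help output for your installed version before running it for real.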