The center in scatter plot generated by CNVkit looks off
4.7 years ago
Jordan ★ 1.3k

Hi,

I ran the CNVkit pipeline on WGS samples. I have 4 tumor/normal pairs and pooled the normals.

Here is the command I used:
cnvkit.py batch -p $OMP_NUM_THREADS $BAMs/*_T*.bam -n $BAMs/*_B*.bam -m wgs -f $refs --annotate $refFlat --output-reference $out/project.cnn --output-dir $out

I dropped low coverage reads using the following command:

cnvkit.py segment $file -o $out/drop_low_cov/${sample}.cns

But my scatter plot looks quite odd. The Y chromosome shows too many deletions, and in general it looks like the deletions are on a much larger scale.

Is there a way to address this?

Here is the plot

Thanks for the help!

cnvkit wgs scatter-plot • 1.7k views

What sex is your sample?


These are female samples.


So why are you worried about the CN profile on the Y chromosome? There is no Y in your samples, so everything you're seeing can be explained as signal from one of the pseudoautosomal regions or from sequence that has homology with an autosome.


All the samples are female. I was a bit worried to see the Y chromosome showing so many deletions even though both the normals and the tumors are female. Other papers I have seen do not show such extensive deletions on the Y chromosome either.


Do these other papers completely ignore Y for female samples? I know many pipelines just throw out anything on Y once the sample has been determined to be female. More sophisticated pipelines have extra logic to handle less common scenarios such as Down and Klinefelter syndromes because, if you have a large cohort, you'll almost certainly encounter them.
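
As a rough illustration of the "throw out anything on Y" approach, here is a minimal Python sketch that drops Y-chromosome rows from a tab-separated bin table. It only assumes the table has a `chromosome` column; the helper name `drop_y_bins`, the contig names in `Y_NAMES`, and the toy rows are all made up for the example and are not part of CNVkit.

```python
import csv
import io

# Assumed contig names for the Y chromosome; adjust to match your reference.
Y_NAMES = {"Y", "chrY"}

def drop_y_bins(in_handle, out_handle):
    """Copy a tab-separated bin table, skipping rows on the Y chromosome.

    Hypothetical helper: only requires a 'chromosome' column.
    Returns the number of rows kept.
    """
    reader = csv.DictReader(in_handle, delimiter="\t")
    writer = csv.DictWriter(
        out_handle,
        fieldnames=reader.fieldnames,
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    kept = 0
    for row in reader:
        if row["chromosome"] in Y_NAMES:
            continue  # female sample: ignore anything reported on Y
        writer.writerow(row)
        kept += 1
    return kept

# Toy rows invented for illustration (not taken from the samples in this thread).
toy = io.StringIO(
    "chromosome\tstart\tend\tgene\tlog2\n"
    "chr1\t10000\t15000\t-\t0.02\n"
    "chrX\t20000\t25000\t-\t-0.01\n"
    "chrY\t30000\t35000\t-\t-5.30\n"
)
out = io.StringIO()
kept = drop_y_bins(toy, out)
print(kept)
print(out.getvalue())
```

With real files you would pass open file handles instead of the `io.StringIO` objects. Filtering before plotting only hides Y; it does not change how the remaining log2 ratios were computed.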

4.7 years ago
d-cameron ★ 2.9k

This appears to be a sex determination issue. As per the CNVkit documentation:

By default, copy number calls and log2 ratios will be relative to a diploid X chromosome and haploid Y.

This can be adjusted if you know the sex of your sample (or you want CNVkit to predict for you). See https://cnvkit.readthedocs.io/en/stable/sex.html for more details.
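
To see why a female sample can look heavily "deleted" on Y under that default, here is a back-of-the-envelope sketch of the arithmetic. It assumes the idealised relationship log2 ratio = log2(sample copies / reference copies); the helper `expected_log2` is made up for illustration and is not CNVkit's actual code.

```python
import math

def expected_log2(sample_copies, reference_copies):
    """Idealised log2 ratio of sample to reference copy number (hypothetical helper)."""
    if reference_copies <= 0:
        raise ValueError("reference copy number must be positive")
    if sample_copies == 0:
        # No copies at all: the ideal ratio is 0, so log2 is negative infinity.
        # Real data gives a large negative but finite value from stray reads.
        return float("-inf")
    return math.log2(sample_copies / reference_copies)

# Default reference ploidy quoted above: diploid X, haploid Y.
REFERENCE = {"X": 2, "Y": 1}
FEMALE = {"X": 2, "Y": 0}
MALE = {"X": 1, "Y": 1}

for label, sample in (("female", FEMALE), ("male", MALE)):
    for chrom in ("X", "Y"):
        print(label, chrom, expected_log2(sample[chrom], REFERENCE[chrom]))
```

Under these assumptions a female sample sits at 0 on X and far below 0 on Y, while a male sample sits at -1 on X and 0 on Y, which is why telling CNVkit the sample sex matters for how the sex chromosomes are interpreted.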

The Y chromosome has too many deletions and in general it looks like the deletions are on a much larger scale. Is there a way to address this?

In general, a deletion should have both a CN loss in the deleted region and a breakpoint that spans it. If you want to do a comprehensive genomic rearrangement assessment of your tumour samples, I would suggest the GRIDSS/PURPLE/LINX pipeline [shameless plug disclaimer - I'm first author of that preprint]. On our cohort, 2.1% of somatic CN transitions (excluding those at centromeres and reference gaps) have no explanatory SV. We have a few samples in the 20-40% range that show signs of DNA degradation, hence the higher CN false positive rate.
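
The "CN loss plus a spanning breakpoint" check can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how GRIDSS/PURPLE/LINX works: it only considers simple deletion calls, and the names `Segment`, `Deletion`, `has_supporting_sv`, and `unsupported_losses`, the -0.25 loss threshold, the 1000 bp tolerance, and the toy coordinates are all assumptions made up for the example.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    log2: float

class Deletion(NamedTuple):
    chrom: str
    start: int
    end: int

def has_supporting_sv(segment, deletions, tolerance=1000):
    """True if a deletion call has both breakpoints within `tolerance` bp of the segment ends."""
    return any(
        sv.chrom == segment.chrom
        and abs(sv.start - segment.start) <= tolerance
        and abs(sv.end - segment.end) <= tolerance
        for sv in deletions
    )

def unsupported_losses(segments, deletions, loss_threshold=-0.25, tolerance=1000):
    """Return loss segments (log2 <= loss_threshold) with no matching deletion call."""
    return [
        seg
        for seg in segments
        if seg.log2 <= loss_threshold
        and not has_supporting_sv(seg, deletions, tolerance)
    ]

# Toy calls invented for illustration.
segments = [
    Segment("chr1", 1_000_000, 2_000_000, -0.90),  # loss with a matching SV
    Segment("chr2", 3_000_000, 4_000_000, 0.05),   # copy-neutral, ignored
    Segment("chr3", 5_000_000, 6_000_000, -0.80),  # loss with no SV support
]
deletions = [Deletion("chr1", 1_000_200, 1_999_700)]

suspicious = unsupported_losses(segments, deletions)
print(suspicious)
```

Segments that come back from `unsupported_losses` are the ones worth a second look as possible CN false positives. Real rearrangements are often more complex than a single deletion, so a missing match here is a prompt for inspection rather than proof of an artifact.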
