gender determination and chrX CN calls
3.7 years ago
erikt ▴ 30

I'm running CNVkit in amplicon mode on a set of tumor BAM files generated with a small amplicon panel of 45 genes. The panel includes just one gene on chrX and none on chrY. My reference was built from 10 normal male samples sequenced with the same panel.

Initially I ran the pipeline using -y at all appropriate steps. The reference samples are correctly assumed male, but almost all of my tumor samples are also treated as male. I then ran the same pipeline but without the -y. The reference samples are still assumed male, but now I have a far more even breakdown of male and female in the sample set.

I prefer the output without the -y.

My understanding is that without -y the chrX coverage of male reference samples is doubled, i.e. their chrX log2 values are shifted by +1 so that the reference looks diploid on chrX. I confirmed that the diploid chrX log2 ratios are +1 with respect to the haploid chrX ratios. Since only one chrX region and no chrY region is covered, it seems that automatic determination of sample gender will be confounded whenever there is a CN change in that X region.
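To make the +1 offset concrete, here is a minimal sketch of the expected chrX log2 ratio for a sample with no copy-number change. The function name and arguments are illustrative and not part of CNVkit; the only assumption is that the log2 ratio equals log2(sample copies / reference copies).

```python
import math

def expected_chrx_log2(sample_x_copies, reference_x_copies):
    """Expected chrX log2 ratio for a sample without any CN change on chrX."""
    return math.log2(sample_x_copies / reference_x_copies)

# Haploid (male-like) chrX reference, as with -y
male_vs_haploid = expected_chrx_log2(1, 1)    # 0.0
female_vs_haploid = expected_chrx_log2(2, 1)  # 1.0

# Diploid (female-like) chrX reference, as without -y
male_vs_diploid = expected_chrx_log2(1, 2)    # -1.0
female_vs_diploid = expected_chrx_log2(2, 2)  # 0.0
```

Under this model a normal male and a female tumor that has lost one copy of the single chrX gene both carry one copy, so they produce the same log2 value. With no chrY target to break the tie, that is exactly the situation in which automatic gender inference can go wrong.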

Are there any special considerations I should bear in mind with only one gene on chrX and no genes on chrY sequenced?

How, if at all, would incorrect gender identification for a sample affect CN calls on chrX?

cnvkit sequencing next-gen cnv • 2.9k views
ADD COMMENT
3.3 years ago
erikt ▴ 30

The safest thing to do, in my experience, is to specify gender manually. I am running an amplicon panel. In my case the reference samples were correctly identified as male when using the batch command, so I left that as is.

I should note that my pipeline is unorthodox because I am working with a small, sparse targeted amplicon panel. I use genemetrics to get log2 ratios on a per-gene basis rather than relying on a segmentation algorithm, then use call to get CN values.

Both genemetrics and call rely on gender information, and without -y male X log2 ratios are shifted, so to avoid a double shift for male samples I did the following. For genemetrics I treated all samples as female (no X shift) and for call I manually specified the correct gender.

Hope this helps.

ADD COMMENT
3.3 years ago
enes ▴ 40

In my case, I want to visualize the copy-number ratio because I am working on an application. The default chart is too simple, and users probably cannot interpret it, so I want to add a threshold layer for the log2 ratios, like this:

chart

But these thresholds are not consistent for the sex chromosomes, especially in male samples. That is why I want to find the default thresholds for the sex chromosomes: when I pass thresholds with the -t parameter of the call command, they work only for chr1 to chr22 (not for the sex chromosomes).

I am confused.

ADD COMMENT

The expected log2 ratio is log2(copy number in case / copy number in control).

If your control is male, a male case will have log2(1 / 1) = 0 on chrX.

If your control is female, a male case will have log2(1 / 2) = -1.

Overall it is very easy to calculate log2({0, 1, ..., 10} / copy number in control sample) and put the thresholds wherever you want between those values (normally in the middle of the non-log values, but whatever).
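As a rough illustration of the "middle of the non-log values" convention, the sketch below places each cut-off halfway between consecutive integer copy numbers and then takes log2 relative to the control. The helper name and the max_copies argument are made up for this example.

```python
import math

def midpoint_thresholds(control_copies, max_copies=4):
    """log2 cut-offs halfway (in linear space) between consecutive copy numbers."""
    return [math.log2((n + 0.5) / control_copies) for n in range(max_copies + 1)]

# Control carries 2 copies (autosomes, or chrX with a female control)
diploid = midpoint_thresholds(2)
# Control carries 1 copy (chrX/chrY with a male control)
haploid = midpoint_thresholds(1)

print([round(t, 4) for t in diploid])
print([round(t, 4) for t in haploid])
```

The haploid list is simply the diploid list shifted up by 1, because halving the denominator adds log2(2) = 1 to every value.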

ADD REPLY

Hi again :)

Following the calculation you described above, I used this command to call the expected copy number:

cnvkit.py call {snakemake.input.cns_file} -m threshold -t=-2,-0.4150375,0.3219281,0.8073549,1.169925 -o {snakemake.output.cns_call}

And this is my part of output:

please focus on chrY and its log2 value: 0.809917

According to my threshold parameters, the expected copy number should be 3, right? But we see 2. So this result suggests that the thresholds work only for chr1 to chr22 (not for the sex chromosomes).

The question is: what are the thresholds for the sex chromosomes (chrX and chrY)?

I know the thresholds for chr1, chr2, chr3 ... chr22 because I entered them myself. But the thresholds that I defined do not work for the sex chromosomes. Do you see what I mean?

How can I specify the thresholds for chrX and chrY? Is it possible? If not, what are the default thresholds for chrX and chrY? If there are no default thresholds for the sex chromosomes, what is the logic behind assigning copy-number values to them?

I hope I have explained myself. Thanks.

ADD REPLY

Yep, the thresholds lie between the values of log2({0, 1, ..., 10} / copy number in control sample).

if you open R you can see that these lines

plot(0:10, cex=0, ylim=c(-5,5));abline(h=log2(c(0.1,1:10) / 1))

give you the expected log2 ratio for each copy number on chrX/chrY if the control sample is male,

if you execute

abline(h=log2(c(0.1,1:10) / 2), col="red")

you will see the expected log2 ratios for each copy number when your control sample is female.

Your thresholds are between these lines.

ADD REPLY

You are right. Let me show you my outputs for this case.

First, this is my cns file; all reference samples are male: cns_file

Now I want to find discrete copy numbers, so I use the call command.

As a first step, I use my own thresholds:

PS: I calculate my thresholds like that: log2( (0:4 + .5) / 2)

cnvkit.py call male_sample.cns -m threshold -t=-2,-0.4150375,0.3219281,0.8073549,1.169925 -o my_threshold_output.tsv

And here is the output:

my_thresholds

According to these thresholds (-2,-0.4150375,0.3219281,0.8073549,1.169925), the first chrY segment (log2=-0.64) should have copy number 1, right? And the second chrY segment (log2=-0.1145) should have copy number 2. But the first one is called 0 and the other one 1.

Then I tried the thresholds that you gave me ( log2(c(0.1,1:10) / 1) ). That makes sense because the sex chromosomes are haploid in males. Here is my command:

cnvkit.py call male_sample.cns -m threshold -t=-3.321928,0.000000,1.000000,1.584963 -o your_threshold_output.tsv

And this is the output of this code:

your thresholds output

With those thresholds, the discrete copy number is generally 1 on chr1, chr2, etc. That is expected because these chromosomes are naturally diploid. Now let us check the sex chromosomes (chrX and chrY). The first chrY segment (log2 = -0.64) gets copy number 0, but according to these thresholds (-t=-3.321928,0.000000,1.000000,1.584963) it should be 1, right? The other chrY segment (log2 = -0.1145) also gets copy number 0, but according to the thresholds it should be 1 as well.

This is the main problem...
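One rule that happens to reproduce every chrY call reported in this thread so far: find the index of the first threshold that the log2 value does not exceed, then scale that index by (reference copies / ploidy) and truncate. The sketch below is only a guess at the behaviour, reconstructed from the numbers posted here; it is not taken from CNVkit's documentation, and the function name is hypothetical.

```python
def call_with_scaled_index(log2, thresholds, ref_copies, ploidy=2):
    """Hypothetical reconstruction of the calls seen in this thread."""
    index = len(thresholds)  # used when log2 exceeds every cut-off
    for i, cutoff in enumerate(thresholds):
        if log2 <= cutoff:
            index = i
            break
    # Scale the index by the reference copy number (no change when it equals ploidy)
    return int(index * ref_copies / ploidy)

mine = [-2, -0.4150375, 0.3219281, 0.8073549, 1.169925]
suggested = [-3.321928, 0.0, 1.0, 1.584963]

# (chrY log2, thresholds used, copy number reported in the thread)
reported = [
    (0.809917, mine, 2),
    (-0.64, mine, 0),
    (-0.1145, mine, 1),
    (-0.64, suggested, 0),
    (-0.1145, suggested, 0),
]
results = [call_with_scaled_index(v, t, ref_copies=1) for v, t, _ in reported]
print(results)  # [2, 0, 1, 0, 0], the same as the reported calls
```

If this reading is right, the cut-offs given with -t are always interpreted on the diploid scale and the result is halved for chromosomes with one reference copy, which would explain why thresholds tuned for autosomes look wrong on chrX and chrY in males.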

Thanks a lot :)

ADD REPLY

"According to these thresholds (-2,-0.4150375,0.3219281,0.8073549,1.169925), chrY(log2=-0.64) should be 1 copy-number right? and the second chrY(log2=-0.1145)"

Indeed, log2=-0.64 is not exactly 0 (the value expected for one copy). 2 ^ (-0.64) = 0.6417, which is more like 0.5 and may indicate a) mosaic loss of the Y chromosome or b) large background noise (reads with MAPQ = 0). But chrY (log2=-0.1145) is certainly copy number 1: 2 ^ (-0.1145) = 0.924, which rounds to 1.

The values I gave are the "expected copy-numbers"; the actual thresholds lie between those lines, so they are similar to the ones you use:

"PS: I calculate my thresholds like that: log2( (0:4 + .5) / 2)"

Simply replace 2 with 1. The copy number of the Y chromosome in males is 1.

ADD REPLY

I have completely solved my problem.

I should use the "clonal" method, which uses the nearest-integer approach, so I can easily predict the discrete copy number myself. The threshold method, in contrast, can end up applying different thresholds to chrX and chrY in male samples.
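For anyone who wants to check the numbers by hand, here is a minimal sketch of nearest-integer calling. It assumes a pure sample and the relation copies = reference copies × 2^log2; the function name is illustrative and is not CNVkit's internal API.

```python
def nearest_integer_copies(log2, ref_copies):
    """Round the absolute copy number implied by a log2 ratio."""
    absolute = ref_copies * 2 ** log2
    return int(round(absolute))

# chrY in a male sample against a male reference: one reference copy
chry_calls = [nearest_integer_copies(v, ref_copies=1) for v in (-0.64, -0.1145, 0.809917)]

# An autosomal segment with log2 = 0 against two reference copies
autosome_call = nearest_integer_copies(0.0, ref_copies=2)
```

With one reference copy, log2 values of -0.64 and -0.1145 correspond to about 0.64 and 0.92 copies, so both round to 1, while 0.809917 corresponds to about 1.75 copies and rounds to 2.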

Thanks :)

ADD REPLY
3.3 years ago
enes ▴ 40

You are totally right. I am facing the same problem and still cannot solve it. Did you find a solution?

In my case the reference samples are generally mixed (both male and female) and I do not use the -y option. Normally, in the calling step, I use thresholds like these:

  • for cn = 0 (loss) ------> -1
  • for cn = 1 (no alteration) ------> 0.5849625
  • for cn = 2 (gain) -----> 1.321928
  • for cn = 3 (gain) ------> 1.807355

but these thresholds do not work for the sex chromosomes. I couldn't find any default thresholds for the sex chromosomes in the documentation...

ADD COMMENT
