I'm running CNVkit in amplicon mode on a set of tumor BAM files generated with a small amplicon panel of 45 genes. The panel includes just one gene on chrX and none on chrY. My reference was built from 10 normal male samples sequenced with the same panel.
Initially I ran the pipeline using -y at all appropriate steps. The reference samples are correctly assumed male, but almost all of my tumor samples are also treated as male. I then ran the same pipeline but without the -y. The reference samples are still assumed male, but now I have a far more even breakdown of male and female in the sample set.
I prefer the output without the -y.
My understanding is that without -y the chrX coverage of male reference samples is doubled, which shifts their log2 values by +1. I confirmed that the diploid chrX log2 ratios are +1 with respect to the haploid chrX ratios. Since only one chrX region and no chrY region is covered, it seems automatic determination of sample gender will be confounded if there is a CN change in that X region.
Are there any special considerations I should bear in mind with only one gene on chrX and no genes on chrY sequenced?
How, if at all, would incorrect gender identification for a sample affect CN calls on chrX?
The expected log2 ratio is equal to log2(copy number in case / copy number in control).
If your control is male, a single-copy chrX gives log2(1 / 1) = 0.
If your control is female, it gives log2(1 / 2) = -1.
Overall, it is very easy to calculate log2({0,1,..,10} / copy number in control sample) and put thresholds wherever you want between those values (normally just in the middle of the non-log values, but whatever).
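To make the arithmetic above concrete, here is a minimal Python sketch that tabulates the expected log2 ratio for a few integer copy numbers against a one-copy or two-copy control. The function name `expected_log2` and the 0.1 floor used in place of a true zero are assumptions made for illustration only; they are not part of CNVkit.

```python
import math

def expected_log2(case_copies, control_copies):
    """Expected log2 ratio for case_copies in the sample vs control_copies in the control."""
    # log2(0) is undefined, so a complete loss is approximated by 0.1 copies
    # (an arbitrary floor chosen for this sketch).
    return math.log2(max(case_copies, 0.1) / control_copies)

if __name__ == "__main__":
    for control_copies, label in ((1, "male control, chrX"), (2, "female control, chrX")):
        values = [round(expected_log2(n, control_copies), 3) for n in range(5)]
        print(label, values)
```

Any threshold you choose then sits somewhere between two neighbouring values in the printed list.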
Hi again :)
Following the calculation that you mentioned above, I used this command to call the expected copy number:
cnvkit.py call {snakemake.input.cns_file} -m threshold -t=-2,-0.4150375,0.3219281,0.8073549,1.169925 -o {snakemake.output.cns_call}
And this is part of my output:
Please focus on chrY and its log2 value: 0.809917
According to my threshold parameters, the expected copy number should be 3, right? But we see 2. So this result suggests that the thresholds work only for chr1 to chr22 (not for the sex chromosomes).
The question is: what are the thresholds for the sex chromosomes (chrX and chrY)?
I know the thresholds for chr1, chr2, chr3 ... chr22 because I entered them. But the thresholds that I defined do not work for the sex chromosomes. Do you know what I mean?
How can I specify the thresholds for chrX and chrY? Is it possible? If not, what are the default thresholds for chrX and chrY? If there are no default thresholds for the sex chromosomes, what is the logic behind assigning copy-number values to them?
I hope I explained myself clearly. Thanks
Yep, the thresholds are between the values of log2({0,1,..,10} / copy number in control sample).
If you open R, you can see that
log2(c(0.1,1:10) / 1)
gives you the expected log2 ratios for each copy number if the control sample is male and the region is on chrX/chrY.
If you execute the same expression with the divisor set to 2,
you will see the expected log2 ratios when your control sample is female.
Your thresholds are between these values.
You're right. I want to show you my outputs in this case.
First, this is my cns file; all of the references came from male samples:
Now I want to find discrete copy numbers, so I use the call command.
As a first step, I want to use my thresholds like this:
PS: I calculate my thresholds like this: log2( (0:4 + .5) / 2)
And here is the output:
According to these thresholds (-2,-0.4150375,0.3219281,0.8073549,1.169925), chrY (log2=-0.64) should have copy number 1, right? And the second chrY (log2=-0.1145) should have copy number 2. But the first one is 0 and the other one is 1.
Then I tried the thresholds that you gave me (
log2(c(0.1,1:10) / 1)
). It makes sense because the male sex chromosomes are haploid. Here is my code: And this is the output of this code:
According to those thresholds, the discrete copy numbers are generally shown as 1 on chr1, chr2, and so on. This is normal because these chromosomes are naturally diploid. Now we can check the sex chromosomes (chrX and chrY). The chrY segment with log2 = -0.64 gets copy number 0, but according to these thresholds (-t=-3.321928,0.000000,1.000000,1.584963) it should be 1, right? And the other chrY segment (log2 = -0.1145) also gets copy number 0, but according to the thresholds it should be 1 as well.
This is the main problem...
Thanks a lot :)
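One reading that reproduces every call quoted in this thread (0.809917 called as 2, -0.64 as 0, and -0.1145 as 1 or 0 depending on the threshold list) is that the threshold index is computed as for a diploid chromosome and then scaled by the reference copy number of the chromosome. The Python sketch below models that reading. It is an assumption inferred from the reported numbers, the function name is hypothetical, and it should be checked against the CNVkit source before being relied on.

```python
def call_with_thresholds(log2_ratio, thresholds, ref_copies, ploidy=2):
    """Hypothetical model of a threshold call, inferred from the numbers in this thread."""
    for index, upper in enumerate(thresholds):
        if log2_ratio <= upper:
            # Assumption: when the chromosome has fewer reference copies than
            # the ploidy (e.g. chrY in a male), the index is scaled down and
            # truncated to an integer.
            return int(index * ref_copies / ploidy)
    # Values above the last threshold are not modelled in this sketch.
    return None

if __name__ == "__main__":
    first_try = [-2, -0.4150375, 0.3219281, 0.8073549, 1.169925]
    second_try = [-3.321928, 0.0, 1.0, 1.584963]
    for log2_ratio in (0.809917, -0.64, -0.1145):
        print(log2_ratio,
              call_with_thresholds(log2_ratio, first_try, ref_copies=1),
              call_with_thresholds(log2_ratio, second_try, ref_copies=1))
```

Under this model, the index 1 that a one-copy chrY segment earns is halved and truncated to 0, which would explain why both threshold lists appear to fail on the sex chromosomes.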
Indeed, log2=-0.64 is not exactly 0. 2 ^ (-0.64) = 0.6417, which is closer to 0.5 and may indicate a) mosaic loss of the Y chromosome or b) large background noise (reads with MAPQ = 0). But chrY (log2=-0.1145) is certainly copy number 1: 2 ^ (-0.1145) = 0.924, which rounds to 1.
The values I gave are the expected log2 ratios for each copy number; the actual thresholds lie between them, so they are similar to the ones that you use.
Simply replace 2 with 1. The copy number of the Y chromosome in males is 1.
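As a rough illustration of "replace 2 with 1", the Python sketch below computes the midpoint cut points log2((n + 0.5) / reference copies) for a two-copy and a one-copy reference and classifies a log2 value by hand. The function names are hypothetical, and the sketch says nothing about how CNVkit itself applies the values passed to -t on sex chromosomes.

```python
import bisect
import math

def midpoint_thresholds(ref_copies, max_copies=4):
    """Cut points halfway (in linear space) between consecutive integer copy numbers."""
    return [math.log2((n + 0.5) / ref_copies) for n in range(max_copies + 1)]

def copies_from_thresholds(log2_ratio, thresholds):
    """Copy number = how many cut points lie strictly below the log2 ratio."""
    return bisect.bisect_left(thresholds, log2_ratio)

if __name__ == "__main__":
    diploid = midpoint_thresholds(2)   # autosomes, or chrX with a two-copy control
    haploid = midpoint_thresholds(1)   # chrX/chrY with a one-copy (male) control
    print([round(t, 7) for t in diploid])
    print([round(t, 7) for t in haploid])
    for log2_ratio in (-0.64, -0.1145, 0.809917):
        print(log2_ratio, copies_from_thresholds(log2_ratio, haploid))
```

With the one-copy cut points, both chrY segments discussed above (log2 = -0.64 and -0.1145) fall between the first and second cut point and are classified as one copy.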
I totally solved my problem.
I should use the "clonal" option; this method uses the nearest-integer approach, so I can easily predict the discrete copy number myself. The threshold option can use different thresholds for chrX and chrY in male samples.
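For checking calls by hand, the nearest-integer idea can be sketched in Python as follows: convert the log2 ratio back to a linear ratio, multiply by the number of copies in the reference, and round. This is a simplified sketch under the assumption of a pure sample; the function name is hypothetical, and CNVkit's own clonal method may treat purity, ploidy and edge cases differently.

```python
import math

def nearest_integer_copies(log2_ratio, ref_copies):
    """Round ref_copies * 2**log2_ratio to the nearest whole copy number."""
    linear = ref_copies * 2 ** log2_ratio
    # floor(x + 0.5) rounds halves upward, avoiding Python's round-half-to-even.
    return int(math.floor(linear + 0.5))

if __name__ == "__main__":
    # chrY in a male sample with a male reference: one reference copy.
    for log2_ratio in (-0.64, -0.1145, 0.809917):
        print(log2_ratio, round(2 ** log2_ratio, 4), nearest_integer_copies(log2_ratio, ref_copies=1))
    # An autosomal segment near log2 = 0 with two reference copies.
    print(0.0, nearest_integer_copies(0.0, ref_copies=2))
```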
Thanks :)