Question

CNVkit detected CNV number

3.6 years ago

enes ▴ 40

Hi, I am a newbie with CNVkit.

I want to analyse a clinical exome panel sample with CNVkit to detect copy number variants. When I ran it with a BED file containing 113k rows, I got 60k targeted CNVs. After filtering the results by log2 ratio and morbid genes, I was left with 20k results.

I am confused by the number of CNVs and I don't trust my results. Can anyone comment on this?

CNVKit copynumbervariation cnr • 2.7k views

ADD COMMENT • link updated 3.6 years ago by jared.andrews07 ★ 18k • written 3.6 years ago by enes ▴ 40

Try following the batch command from the CNVkit pipeline. Another good idea is to also run ExomeDepth and look at the intersection of both callers!

ADD REPLY • link 3.6 years ago by brunobsouzaa ▴ 830

Pretty impossible for anyone to comment without additional details, including the code you ran and plots of CN ratio for your segments across the genome.

ADD REPLY • link 3.6 years ago by jared.andrews07 ★ 18k

Is it a germline or a somatic analysis? I think CNVkit is best for somatic CNV calling. For germline, it may be better to use something like ExomeDepth (or ClinCNV).

ADD REPLY • link 3.6 years ago by German.M.Demidov ★ 2.9k

It is germline.

I applied the standard CNVkit pipeline. My commands:

Target

cnvkit.py target hg38_file.bed --split -o my_targets.bed

Anti-target

cnvkit.py antitarget my_targets.bed -g access-5kb.hg10.bed -o my_antitargets.bed

Coverage

cnvkit.py coverage Sample.bam my_targets.bed -o Sample.targetcoverage.cnn
cnvkit.py coverage Sample.bam my_antitargets.bed -o Sample.antitargetcoverage.cnn

Reference

cnvkit.py reference -o Reference.cnn -f Homo_sapiens_assembly38.fasta -t Sample.targetcoverage.cnn -a Sample.antitargetcoverage.cnn

(I also tried a reference .cnn that I created from multiple samples, but I again got 15-20k results.)

fix

cnvkit.py fix Sample.targetcoverage.cnn Sample.antitargetcoverage.cnn Reference.cnn -o Sample.cnr

Sample.cnr has approximately 20k results... Is this a problem?

ADD REPLY • link 3.6 years ago by enes ▴ 40

score 1 · Answer 1 · 2021-04-14

3.6 years ago

jared.andrews07 ★ 18k

Those are just bins for coverage. You still need to run the segment command, potentially followed by the call command if you want to derive absolute copy numbers for each segment.
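As a rough sketch of those two follow-up steps, the snippet below builds the `cnvkit.py segment` and `cnvkit.py call` invocations and prints them. The file names (`Sample.cnr`, `Sample.cns`, `Sample.call.cns`) simply follow the naming used in this thread, and the Python helper names are hypothetical wrappers, not part of CNVkit. By default nothing is executed; pass `dry_run=False` to actually run the commands.

```python
import shlex
import subprocess


def segment_and_call_commands(cnr_path, sample_prefix):
    """Build argv lists for `cnvkit.py segment` followed by `cnvkit.py call`."""
    cns_path = f"{sample_prefix}.cns"        # segments inferred from the per-bin ratios
    call_path = f"{sample_prefix}.call.cns"  # segments with integer copy numbers
    return [
        ["cnvkit.py", "segment", cnr_path, "-o", cns_path],
        ["cnvkit.py", "call", cns_path, "-o", call_path],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(segment_and_call_commands("Sample.cnr", "Sample"))
```

The .cnr file holds one row per bin, so tens of thousands of rows are expected there; the candidate CNVs are the far fewer rows of the .cns files.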

ADD COMMENT • link 3.6 years ago by jared.andrews07 ★ 18k

I couldn't understand what a segment actually is. When I run the segment and then the call command, I get 58 copy number variants in the sample.call.cns file.

This is one of them:

start 70949893 end 81349838 gene DACH1,TBC1D4,CLN5,EDNRB,RNF219,SPRY2 log2 0.0791004 baf 0.264706 cn 2 cn1 1 cn2 1 depth 28.866 probes 106 weight 53.7732

How should I interpret this variant? cn equals 2, so there is no copy number change, is there? I am really confused.

ADD REPLY • link 3.6 years ago by enes ▴ 40

segment infers discrete copy number segments from the given coverage table. call then applies thresholds to the log2 ratios for each segment to derive absolute copy number for each. Yes, a cn of 2 would be 2 copies. I recommend reading the documentation closely for these commands, especially call, as you may need to adjust the thresholds. I'd also recommend you use the various plot commands to help you interpret the results and determine appropriate thresholds for your samples.
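To make the thresholding step concrete, here is a minimal sketch of how a segment's log2 ratio could be mapped to an integer copy number. It assumes the cutoffs -1.1, -0.25, 0.2 and 0.7, which I believe are the documented defaults of the threshold method of call; check the documentation for your version. The function names are hypothetical, and the real command also handles sex chromosomes, purity and values above the last threshold in more detail than this does.

```python
# Cutoffs assumed to match the documented defaults of `cnvkit.py call -m threshold`.
# A segment at or below the i-th cutoff is assigned copy number i.
DEFAULT_THRESHOLDS = (-1.1, -0.25, 0.2, 0.7)


def absolute_copies(log2_ratio, ploidy=2):
    """Unrounded copy number implied by a log2 ratio in a pure diploid sample."""
    return ploidy * 2 ** log2_ratio


def call_copy_number(log2_ratio, thresholds=DEFAULT_THRESHOLDS):
    """Integer copy number by hard thresholds (simplified, autosomes only)."""
    for copies, upper_bound in enumerate(thresholds):
        if log2_ratio <= upper_bound:
            return copies
    # Above the last cutoff: reported here simply as "4 or more".
    return len(thresholds)


if __name__ == "__main__":
    segment_log2 = 0.0791004  # log2 of the segment quoted earlier in this thread
    print(call_copy_number(segment_log2), round(absolute_copies(segment_log2), 2))
```

Under these assumed cutoffs, the segment quoted above (log2 of about 0.08) falls in the neutral band between -0.25 and 0.2, which is why it is reported with cn 2.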

You should really be creating your reference from germline samples, or at least a panel of normals from the 1000 genomes project or such. At minimum, you should be creating a "flat reference" via the -n flag using multiple samples. Deriving your reference solely from the same sample you are trying to find copy number alterations in doesn't really make much sense, which is what it seems you are doing in this case.
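A sketch of the two ways to build a reference, assuming you have already run the coverage command on a set of normal samples. The normal-sample file names are hypothetical placeholders and the Python helpers are my own wrappers; only the `cnvkit.py reference` options shown (`-f`, `-o`, `-t`, `-a`) come from CNVkit itself.

```python
import shlex


def pooled_reference_command(normal_coverage_files, fasta_path, output_path="Reference.cnn"):
    """argv for a pooled reference built from normal-sample coverage (.cnn) files."""
    if not normal_coverage_files:
        raise ValueError("at least one normal coverage file is required")
    return ["cnvkit.py", "reference", *sorted(normal_coverage_files),
            "-f", fasta_path, "-o", output_path]


def flat_reference_command(targets_bed, antitargets_bed, fasta_path,
                           output_path="FlatReference.cnn"):
    """argv for a flat reference, which assumes equal coverage in every bin."""
    return ["cnvkit.py", "reference", "-o", output_path, "-f", fasta_path,
            "-t", targets_bed, "-a", antitargets_bed]


if __name__ == "__main__":
    # Hypothetical coverage files from two normal samples.
    normals = [
        "Normal1.targetcoverage.cnn", "Normal1.antitargetcoverage.cnn",
        "Normal2.targetcoverage.cnn", "Normal2.antitargetcoverage.cnn",
    ]
    print(shlex.join(pooled_reference_command(normals, "Homo_sapiens_assembly38.fasta")))
    print(shlex.join(flat_reference_command("my_targets.bed", "my_antitargets.bed",
                                            "Homo_sapiens_assembly38.fasta")))
```

Either resulting reference file is then passed to fix in place of a reference built from the test sample itself.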

ADD REPLY • link 3.6 years ago by jared.andrews07 ★ 18k