Question

Getting very large CNV regions from CNVkit with hybrid capture data

4.8 years ago

Makplus T ▴ 100

Hello,

I have some matched case/normal whole-exome sequencing (WES) samples, and I tried to call CNV events in the case cohort with CNVkit. My command follows the CNVkit manual, with the parameters below:

# Placeholder variables; a dot is not valid inside a shell variable name
cnvkit.py batch ${all_tumor_bams} --normal ${all_normal_bams} \
--targets ${target_bed} --fasta ${ref_genome} \
--access ${access_bed} \
--output-reference cnv_reference.cnn --output-dir FOO

The target_bed contains the exon positions that I extracted from a GTF file.

CNVkit ran without errors. But when I summarized the results, I found that my CNV regions have an average size of 50 Mb, with some as large as 200 Mb. According to the literature, CNVs typically range from 1 kb to 5 Mb.

These sizes are very different from what I expected, so I am not sure the results are reliable.

Are there any suggestions for evaluating such results?
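One thing worth checking when summarizing sizes, offered as a rough sketch rather than a diagnosis: a CNVkit `.cns` file lists every segment along the genome, including copy-neutral ones with log2 near 0, and those can span whole chromosome arms. If the average was taken over all rows, it would be dominated by neutral segments. The snippet below compares segment sizes with and without a log2 filter. The column names follow the `.cns` layout, the 0.2 cutoff is only an illustrative assumption, and the example rows are made up.

```python
import csv
import io
import statistics


def segment_sizes(cns_handle, min_abs_log2=0.0):
    """Return sizes (bp) of segments whose |log2| is at least min_abs_log2."""
    reader = csv.DictReader(cns_handle, delimiter="\t")
    sizes = []
    for row in reader:
        if abs(float(row["log2"])) >= min_abs_log2:
            sizes.append(int(row["end"]) - int(row["start"]))
    return sizes


# Made-up rows in .cns layout, for illustration only (not real results).
EXAMPLE_CNS = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tprobes\tweight\n"
    "chr1\t10000\t120000000\t-\t0.02\t95.1\t4000\t3500.0\n"
    "chr1\t120000000\t123000000\tGENE_A\t0.85\t170.3\t120\t100.0\n"
    "chr2\t10000\t240000000\t-\t-0.01\t94.0\t7000\t6500.0\n"
    "chr3\t5000000\t5400000\tGENE_B\t-1.10\t44.2\t30\t25.0\n"
)

# All segments, including copy-neutral ones.
all_sizes = segment_sizes(io.StringIO(EXAMPLE_CNS))
# Only segments that deviate from neutral (0.2 is an assumed cutoff).
altered_sizes = segment_sizes(io.StringIO(EXAMPLE_CNS), min_abs_log2=0.2)

print("all segments, median size (bp):", statistics.median(all_sizes))
print("altered segments, median size (bp):", statistics.median(altered_sizes))
```

To run this on real output, replace `io.StringIO(EXAMPLE_CNS)` with an open file handle for one of the `.cns` files in the output directory.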

CNV WES CNVkit • 1.2k views

ADD COMMENT • link 4.8 years ago by Makplus T ▴ 100

Hi. The 'target BED' is supposed to contain the hybridisation capture targets from the platform you used for WES, such as Agilent or Nextera.

ADD REPLY • link 4.8 years ago by Amitm ★ 2.3k

Agreed, the better solution is to provide the target BED from my sequencing platform, although the positions in the two files are normally similar. I re-ran CNVkit with the Agilent exon target region BED and got a similar result, again with huge CNV sizes.

I think the problem may lie in how the CNV breakpoints are determined, since CNVkit also uses off-target reads. However, I don't know how to evaluate whether my data (100X whole exome, with poor coverage in off-target regions) are suitable for this method.
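One rough way to look at the off-target coverage would be to summarize the per-bin depth in the `*.antitargetcoverage.cnn` files that `batch` writes for each sample. The sketch below reports the median depth and the fraction of bins below a cutoff. It assumes the usual `.cnn` columns (`chromosome`, `start`, `end`, `gene`, `depth`, `log2`); the 0.05 cutoff and the example rows are made up for illustration, not recommended values.

```python
import csv
import io
import statistics


def summarize_bins(cnn_handle, low_depth=0.05):
    """Summarize per-bin depth from a CNVkit coverage (.cnn) table."""
    reader = csv.DictReader(cnn_handle, delimiter="\t")
    depths = [float(row["depth"]) for row in reader]
    if not depths:
        raise ValueError("no bins found in input")
    n_low = sum(1 for d in depths if d < low_depth)
    return {
        "bins": len(depths),
        "median_depth": statistics.median(depths),
        "frac_low": n_low / len(depths),
    }


# Made-up antitarget bins, for illustration only (not real results).
EXAMPLE_CNN = (
    "chromosome\tstart\tend\tgene\tdepth\tlog2\n"
    "chr1\t10000\t160000\tAntitarget\t0.0\t-20.0\n"
    "chr1\t160000\t310000\tAntitarget\t0.02\t-5.64\n"
    "chr1\t310000\t460000\tAntitarget\t0.4\t-1.32\n"
    "chr1\t460000\t610000\tAntitarget\t0.6\t-0.74\n"
)

summary = summarize_bins(io.StringIO(EXAMPLE_CNN))
print(summary)
```

If most antitarget bins turn out to be nearly empty, the off-target signal is probably contributing little beyond noise, which would be worth knowing before trusting breakpoints that fall between targets.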

ADD REPLY • link 4.8 years ago by Makplus T ▴ 100