How to compare CNV events?
9.2 years ago
Chico ▴ 40

Hi everyone,

I'm running experiments with CNV detection tools and I'm stuck on one question: what is the best way to find the subset of calls that two tools have in common? I have seen some authors use a 1 bp reciprocal overlap for this, but I think that is a very weak criterion.

So, is there any strategy to find the same CNV events?

Thanks in advance,

next-gen • 4.8k views

Why not summarize the CNV events at the gene level and, for each gene, compare the copy number from method A with that from method B? That is: download all gene coordinates, intersect them with both sets of CNV results, and calculate correlation statistics between the two.
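A minimal sketch of that gene-level comparison in plain Python. Everything here is illustrative: the gene list, the two call sets and the function names `gene_level_cn` and `pearson` are made up for the example, coordinates are assumed to be half-open, segments from one tool are assumed not to overlap each other, and bases with no call are assumed to sit at a diploid copy number of 2.

```python
from math import sqrt


def gene_level_cn(genes, segments, default=2.0):
    """Length-weighted mean copy number per gene.

    genes:    list of (name, chrom, start, end), half-open coordinates
    segments: list of (chrom, start, end, copy_number), non-overlapping
    Bases not covered by any segment are assumed to be at `default`.
    """
    result = {}
    for name, g_chrom, g_start, g_end in genes:
        length = g_end - g_start
        covered = 0
        weighted = 0.0
        for s_chrom, s_start, s_end, cn in segments:
            if s_chrom != g_chrom:
                continue
            overlap = min(g_end, s_end) - max(g_start, s_start)
            if overlap > 0:
                covered += overlap
                weighted += overlap * cn
        # Fill the uncovered part of the gene with the default copy number.
        weighted += (length - covered) * default
        result[name] = weighted / length
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, not real calls.
genes = [("G1", "chr1", 100, 200), ("G2", "chr1", 300, 400), ("G3", "chr2", 0, 100)]
tool_a = [("chr1", 50, 250, 3.0), ("chr2", 0, 50, 1.0)]
tool_b = [("chr1", 100, 150, 3.0), ("chr1", 300, 400, 4.0), ("chr2", 0, 100, 1.0)]

cn_a = gene_level_cn(genes, tool_a)
cn_b = gene_level_cn(genes, tool_b)
names = [g[0] for g in genes]
r = pearson([cn_a[n] for n in names], [cn_b[n] for n in names])
print(cn_a, cn_b, r)
```

With real data you would probably restrict the correlation to genes called as altered by at least one tool, since the many unaltered genes otherwise dominate the statistic.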

9.2 years ago
Shicheng Guo ★ 9.5k

If you think a 1 bp overlap is too loose, you can use `-f 0.5 -r` in `bedtools intersect` to require that each call overlaps the other by at least 50%. However, since CNVs are usually long-range genomic variants, I think you can apply a two-stage screen: 1) use a 1 bp overlap to find all overlapping events (CNV hotspots), and then 2) apply a stricter criterion, maybe `-f 0.5 -r` or `-f 0.8 -r`, to identify the most similar or homogeneous CNV regions.

On the other hand, the genomic location of the CNVs is also very important. If they fall in an enhancer region, I think even a 1 bp overlap could carry similar biological significance. If they fall in an intergenic region, I think a reciprocal overlap of at least 50% is a better way to define the same CNV event. In conclusion, you should dig into the data, show the relationships among the events comprehensively, and look for the criterion that is most biologically relevant to your CNVs.
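To make the two-stage idea concrete, here is a small pure-Python sketch of the same logic, so you can see what the reciprocal fraction actually measures. The function names `reciprocal_overlap` and `two_stage_match` and the toy calls are assumptions for illustration, not part of bedtools; coordinates are treated as half-open, BED style.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions for intervals (chrom, start, end).

    Returns 0.0 when the intervals are on different chromosomes or do not
    overlap. A value of 0.5 means the shared bases cover at least 50% of
    BOTH intervals.
    """
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def two_stage_match(calls_a, calls_b, strict=0.5):
    """Stage 1: keep every pair sharing at least 1 bp (candidates).
    Stage 2: keep pairs whose reciprocal overlap is >= `strict` (matches).
    """
    candidates = []
    matches = []
    for a in calls_a:
        for b in calls_b:
            frac = reciprocal_overlap(a, b)
            if frac > 0:
                candidates.append((a, b, frac))
                if frac >= strict:
                    matches.append((a, b, frac))
    return candidates, matches


# Hypothetical toy calls from two tools.
calls_a = [("chr1", 1000, 2000), ("chr1", 5000, 9000)]
calls_b = [("chr1", 1200, 2100), ("chr1", 8900, 12000), ("chr2", 1000, 2000)]

candidates, matches = two_stage_match(calls_a, calls_b, strict=0.5)
for a, b, frac in candidates:
    print(a, b, round(frac, 3), "match" if frac >= 0.5 else "hotspot only")
```

The nested loop is quadratic in the number of calls, which is fine for a few thousand CNVs per sample; for larger sets, sorting by chromosome and start (or just using bedtools) is the better choice.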

9.2 years ago
ivivek_ngs ★ 5.2k

Not every CNV detection tool lets you set comparable parameters for calling regions. However, if you can make the parameters close enough, for example the number of reads required for a call in the tumor and the normal, and the number of windows used for binning or segmentation, then you can compare the CNV calls with bedtools. Start with `-f 1.0` (100% overlap), which will most likely give few matches because each tool uses its own statistical model, and then relax it to `-f 0.8` (80%) or `-f 0.5` (50%). Finally, once you have the matching regions, you can plot the correlation of the segment ratio scores between the overlapping regions to see how similar the calls are.


Hi guys,

Could you please share how you analyzed population-level CNVs? I am trying to use JISTIC/CMDS. Building the input matrix is slightly tricky, either because of missing data or because of multiple segments within one exonic region. Any suggestions would be helpful.
