My research group is gradually moving toward targeted resequencing. Our main interests are SNP and indel discovery, but we would also like to detect CNVs if possible. As far as I know, the available CNV detectors use whole-genome data to estimate background noise and average read coverage, and then infer repeats from that. For targeted resequencing, however, I haven't found anything. I therefore have two questions, which I'm combining in this single BioStar question:
- (bio) Is it possible to reliably detect CNVs from targeted resequencing data? Is it really necessary to have the whole genome covered with reads in order to infer repeats? Wouldn't it be enough to compute the average coverage along the targeted regions, and then infer possible repeats from deviations in coverage?
- (informatics) Is there any program available for CNV detection on targeted resequencing data? We currently use BioScope, and its CNV detection module only works in the Whole Genome Resequencing pipeline; it refuses to accept data from the Targeted Resequencing one.
I have a rough implementation of #1 above; see https://github.com/seandavi/ngCGH . It is not polished, but it does give usable results for paired tumor/normal samples. I typically load the results into Nexus Copy Number or into R for segmentation and downstream processing.
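To illustrate the general idea of a paired tumor/normal comparison on targeted regions, here is a minimal sketch. It is not the ngCGH code: the function name `log2_ratios`, the `min_normal` threshold, and the 0.5 pseudocount are assumptions made for this example, and it assumes you have already counted reads per targeted region in each sample.

```python
import math
from statistics import median


def log2_ratios(tumor_counts, normal_counts, min_normal=10):
    """Return median-centred log2(tumor/normal) ratios per targeted region.

    tumor_counts / normal_counts: dicts mapping a region name to its read
    count (hypothetical input format). Regions with fewer than min_normal
    reads in the normal sample are skipped as too noisy.
    """
    tumor_total = sum(tumor_counts.values())
    normal_total = sum(normal_counts.values())

    raw = {}
    for region, n in normal_counts.items():
        if n < min_normal:
            continue
        t = tumor_counts.get(region, 0)
        # Scale by library size; the 0.5 pseudocount avoids log2(0).
        ratio = ((t + 0.5) / tumor_total) / ((n + 0.5) / normal_total)
        raw[region] = math.log2(ratio)

    if not raw:
        return {}

    # Centre on the median so that the bulk of regions sits near 0.
    centre = median(raw.values())
    return {region: value - centre for region, value in raw.items()}


if __name__ == "__main__":
    normal = {"exon1": 100, "exon2": 100, "exon3": 100, "exon4": 100, "exon5": 5}
    tumor = {"exon1": 100, "exon2": 100, "exon3": 200, "exon4": 100, "exon5": 3}
    for region, value in sorted(log2_ratios(tumor, normal).items()):
        print(region, round(value, 2))
```

Ratios near 0 suggest no change, while clearly positive or negative values hint at a gain or loss. The per-region values would still need segmentation (for example in R, as mentioned above) before calling anything a CNV.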
Thanks, Chris, for your answer. I had something like option 1 in mind; option 2 looked too complex to me (modelling particular regions would take a lot of effort if you are not always restricted to the same ones). However, I haven't found any program or publication that actually implements this idea. We are in contact with a Dutch research group that is aiming for a publication on this matter, so we'll see whether they come up with something soon.