Question

Finding Novel Disease-Causing CNVs in a Large Number of Patients

2

12.9 years ago

Vikas Bansal ★ 2.4k

Dear all,

After some analysis, I used a tool to call copy numbers from my sequencing data and got output like this:

CHROM   START   END     CopyNumber
chr1    0       1000    2.151000
chr2    0       1000    4.478000
chr2    1000    2000    5.431000

I ran this analysis for 50 patients, so I have 50 files like the one shown above, each with about 10,000 CNVs. Now I want to find out which of these CNVs are disease-causing. What I am thinking is:

1.) Take the common CNVs that are present in all 50 patients.

2.) Filter out those that are already present in a database (DGV).
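The two steps above can be sketched roughly as follows. This is only a minimal illustration, assuming that every patient file uses the same fixed windows (as in the sample output), that a window counts as altered when its copy number falls outside a hypothetical 1.5 to 2.5 band, and that the DGV entries have already been loaded as (chrom, start, end) tuples. The function names and thresholds are made up for this example and are not part of any tool mentioned here.

```python
from collections import defaultdict

def read_calls(path):
    """Parse one whitespace-delimited file: CHROM START END CopyNumber."""
    calls = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) != 4 or fields[0] == "CHROM":
                continue  # skip the header and blank lines
            chrom, start, end, cn = fields
            calls.append((chrom, int(start), int(end), float(cn)))
    return calls

def is_altered(cn, low=1.5, high=2.5):
    # Hypothetical cutoffs around the diploid value of 2; tune for your data.
    return cn < low or cn > high

def recurrent_windows(patients, min_fraction=1.0):
    """Step 1: windows altered in at least min_fraction of the patients.

    patients is a list with one list of (chrom, start, end, cn) per patient.
    Gains and losses are pooled here; split them if direction matters.
    """
    counts = defaultdict(int)
    for calls in patients:
        altered = {(c, s, e) for c, s, e, cn in calls if is_altered(cn)}
        for window in altered:
            counts[window] += 1
    needed = min_fraction * len(patients)
    return sorted(w for w, n in counts.items() if n >= needed)

def overlaps_known(window, known):
    """True if the window overlaps any known (chrom, start, end) region."""
    chrom, start, end = window
    return any(k_chrom == chrom and k_start < end and start < k_end
               for k_chrom, k_start, k_end in known)

def novel_windows(windows, known):
    """Step 2: drop windows that overlap a known (e.g. DGV) region."""
    return [w for w in windows if not overlaps_known(w, known)]
```

Lowering `min_fraction` below 1.0 relaxes the "present in all 50 patients" requirement, which may be worth trying if the strict intersection turns out to be empty.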

Is there a better strategy (a pipeline, a filtering method, or a way to visualize all samples at once) for finding novel CNVs in this kind of data?

Thanks and Best regards,

Vikas

cnv next-gen sequencing visualization • 3.2k views

ADD COMMENT • link updated 12.9 years ago by Chris Miller 22k • written 12.9 years ago by Vikas Bansal ★ 2.4k

1

What is your phenotype? Is this tumor vs normal tissue? I think several of the replies here assume you are looking at a cancer phenotype. However, if you want to identify "disease-causing CNVs" in a phenotype associated with germline mutations/CNVs, you need to know the status of parental inheritance (inherited CNVs are less likely to be pathogenic), and it really comes down to size of CNV (larger = more likely pathogenic) and gene content (very small CNVs can be pathogenic if the right gene is deleted).
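As a rough illustration of the three criteria in this comment (inheritance, size, gene content), here is a small ranking sketch. The `Cnv` class, the `priority` function, the 500 kb size cutoff, and the gene names are all hypothetical placeholders; this is a way to order candidates for manual review, not a pathogenicity classifier.

```python
from dataclasses import dataclass

@dataclass
class Cnv:
    chrom: str
    start: int
    end: int
    inherited: bool      # True if the same CNV is seen in a parent
    genes: tuple = ()    # names of genes overlapping the CNV

def priority(cnv, genes_of_interest=frozenset(), large_bp=500_000):
    """Crude ordinal score from 0 to 3; higher means review it first."""
    score = 0
    if not cnv.inherited:
        score += 1       # de novo CNVs are more suspicious than inherited ones
    if cnv.end - cnv.start >= large_bp:
        score += 1       # larger CNVs are more likely to be pathogenic
    if genes_of_interest.intersection(cnv.genes):
        score += 1       # even a small CNV matters if it hits the right gene
    return score

def rank(cnvs, genes_of_interest=frozenset()):
    """Sort candidates from highest to lowest priority."""
    return sorted(cnvs, key=lambda c: priority(c, genes_of_interest), reverse=True)
```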

ADD REPLY • link 12.8 years ago by Alex Paciorkowski 3.5k

1

@Alex: Could you please explain what you mean by "and it really comes down to size of CNV (larger = more likely pathogenic) and gene content (very small CNVs can be pathogenic if the right gene is deleted)"?

ADD REPLY • link 12.8 years ago by Vikas Bansal ★ 2.4k

0

Just curious: which tool did you use in the end?

ADD REPLY • link 12.9 years ago by Christof Winter ★ 1.1k

0

I used mrCaNaVar.

ADD REPLY • link 12.9 years ago by Vikas Bansal ★ 2.4k

0

@Vikas, take a look at the 2011 review by Girirajan, Campbell, and Eichler (PMID:21854229). I think that paper gives a good overview.

ADD REPLY • link 12.8 years ago by Alex Paciorkowski 3.5k

score 2 · Answer 1 · 2012-02-29

For visualization I would recommend the Broad Institute's IGV. It is a great tool for starting to explore genomic alteration data across many samples. If you import your files into IGV, you will probably be able to spot highly recurrent alterations by eye. As far as I know, you can import BED files, but here is the documentation on supported file formats just in case: http://www.broadinstitute.org/software/igv/FileFormats
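Since the copy-number calls carry a numeric value per interval, one option is to write each patient as a bedGraph track (a BED-like format with a value column, listed among the formats on the IGV page above) and load all 50 tracks together. The sketch below assumes the coordinates in the question's files are already 0-based and half-open, which is a guess based on the first window starting at 0; `to_bedgraph` and the track names are made up for this example.

```python
def to_bedgraph(calls, track_name):
    """Render (chrom, start, end, copy_number) tuples as bedGraph text.

    Assumes the coordinates are already 0-based, half-open intervals.
    """
    lines = [f'track type=bedGraph name="{track_name}"']
    # Sort by chromosome name and start position so the track is ordered.
    for chrom, start, end, cn in sorted(calls):
        lines.append(f"{chrom}\t{start}\t{end}\t{cn:g}")
    return "\n".join(lines) + "\n"

def write_bedgraph(calls, track_name, path):
    """Write one patient's calls to a .bedgraph file for loading into IGV."""
    with open(path, "w") as handle:
        handle.write(to_bedgraph(calls, track_name))
```

Writing one file per patient, each with its own track name, keeps the samples stacked as separate rows in the viewer.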

For the analysis, I have heard of a couple of people using the RAE algorithm to find significant/recurrent CNAs. Here is the original paper if you want to see it in action:

Taylor BS, Barretina J, Socci ND, DeCarolis P, Ladanyi M, et al. (2008) Functional Copy-Number Alterations in Cancer. PLoS ONE 3(9): e3179. doi:10.1371/journal.pone.0003179

score 2 · Answer 2 · 2012-03-01

2

12.8 years ago

Chris Miller 22k

In addition to RAE, there are algorithms called RTS, GISTIC, and JISTIC, which do much the same thing: they look across a cohort and find focal regions of statistically significant amplification and deletion.
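To give a feel for what "statistically significant recurrence" means, here is a toy permutation test. It is not an implementation of RAE, RTS, GISTIC, or JISTIC, which do considerably more; it only asks whether a window is altered in more patients than expected if each patient's alterations were scattered at random. The function name and the boolean-matrix input are assumptions made for this sketch.

```python
import random

def recurrence_pvalues(matrix, n_perm=1000, seed=0):
    """Empirical p-value per window for recurrence across patients.

    matrix[p][w] is True when patient p is altered in window w.
    The null shuffles each patient's windows independently, which keeps
    that patient's total number of altered windows fixed. Comparing
    against the per-permutation maximum corrects for testing many windows.
    """
    rng = random.Random(seed)
    n_windows = len(matrix[0])
    observed = [sum(row[w] for row in matrix) for w in range(n_windows)]
    exceed = [0] * n_windows
    for _ in range(n_perm):
        shuffled = [rng.sample(row, len(row)) for row in matrix]
        null_max = max(sum(row[w] for row in shuffled) for w in range(n_windows))
        for w in range(n_windows):
            if null_max >= observed[w]:
                exceed[w] += 1
    # Add-one correction so a p-value is never exactly zero.
    return [(e + 1) / (n_perm + 1) for e in exceed]
```

Windows with small p-values are altered in more patients than the shuffled data ever produce, which is the basic intuition behind cohort-level recurrence tools.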

ADD COMMENT • link 12.8 years ago by Chris Miller 22k