Question

common CNVs among multiple files

6.5 years ago

popayekid55 ▴ 110

Hi all, I have analyzed 35 normal WGS samples for CNVs using CNVnator. Now I want to find the CNV regions that are common among these files so that they can be used as a control panel.

Is there a tool or method to obtain the regions common to all files at once?

Thank you

cnv genome • 2.0k views

ADD COMMENT • link updated 6.5 years ago by Kevin Blighe 89k • written 6.5 years ago by popayekid55 ▴ 110

We are not necessarily familiar with the output format of cnvNator, so it would be best if you could elaborate on which files you have.

ADD REPLY • link 6.5 years ago by WouterDeCoster 48k

The output will be converted into BED format, like below:

1   629471  638210  1   0.431094
1   671461  675070  3   2.75301
1   1414076 1416640 1   0.560963
1   2583526 2591885 3   12.4121
1   2634161 2684320 1   0.000940585

The columns are chromosome, CNV start, CNV end, CNV type, and a score.
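As a minimal sketch of how rows in this layout could be read into records, assuming whitespace-separated columns in the order described above (the `CNV` type and `parse_bed_lines` function are hypothetical names for illustration, not part of CNVnator or BEDTools):

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int
    end: int
    cnv_type: int   # 4th column of the post's example (type of CNV)
    score: float    # 5th column of the post's example


def parse_bed_lines(lines):
    """Parse whitespace-separated rows: chrom, start, end, type, score."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end, cnv_type, score = line.split()[:5]
        records.append(CNV(chrom, int(start), int(end), int(cnv_type), float(score)))
    return records


# Two rows copied from the example above.
example = [
    "1   629471  638210  1   0.431094",
    "1   671461  675070  3   2.75301",
]
cnvs = parse_bed_lines(example)
```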

ADD REPLY • link 6.5 years ago by popayekid55 ▴ 110

bedtools multiinter will help you

ADD REPLY • link 6.5 years ago by IP ▴ 780

Parse the chromosome, start, and end from the CNVnator results, then overlap the output files using BEDTools multiIntersectBed or BEDOPS.
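To make the idea concrete, here is a rough pure-Python sketch of a multi-sample intersection: a sweep over interval boundaries that counts how many samples cover each segment. It is not the BEDTools or BEDOPS implementation. The function name `common_regions`, the `min_samples` parameter, and the toy coordinates are all invented for illustration, and intervals are assumed to be half-open (BED-style).

```python
from collections import defaultdict


def common_regions(samples, min_samples=None):
    """Return (chrom, start, end, n_samples) for segments covered by at
    least `min_samples` samples (default: all samples).

    samples: dict mapping sample name -> list of (chrom, start, end).
    """
    if min_samples is None:
        min_samples = len(samples)

    # Collect start (+1) and end (-1) events per chromosome.
    events = defaultdict(list)
    for name, intervals in samples.items():
        for chrom, start, end in intervals:
            events[chrom].append((start, 1, name))
            events[chrom].append((end, -1, name))

    out = []
    for chrom in sorted(events):
        open_count = defaultdict(int)  # sample -> number of open intervals
        prev = None
        for pos, delta, name in sorted(events[chrom]):
            if prev is not None and pos > prev:
                active = sum(1 for v in open_count.values() if v > 0)
                if active >= min_samples:
                    out.append((chrom, prev, pos, active))
            open_count[name] += delta
            prev = pos
    return out


# Toy data, not from the thread.
toy = {
    "A": [("1", 100, 200), ("1", 500, 600)],
    "B": [("1", 150, 250), ("1", 550, 700)],
    "C": [("1", 180, 300)],
}
in_all = common_regions(toy)                  # covered by every sample
in_two = common_regions(toy, min_samples=2)   # covered by two or more
```

With 35 files, lowering `min_samples` would give regions shared by most rather than all samples, which may be more realistic for a control panel. Adjacent segments with the same count are not merged in this sketch.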

ADD REPLY • link 6.5 years ago by Arup Ghosh 3.3k

score 0 · Answer 1 · 2018-11-27

6.5 years ago

Kevin Blighe 89k

GAIA will find recurrent copy number regions in your input data and assign a p-value to each region to help with filtering them. I believe you have the required information to run GAIA. The starting data is a row-bound (stacked) list of all regions from all samples, with an extra column that indicates the sample from which each region was derived. You decide your own cut-off points for gain (1) and loss (0) based on the segment mean.
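The stacking step described above might look roughly like the sketch below. This is not GAIA's own loader and does not reproduce its exact input format. The thresholds `loss_below` and `gain_above` are arbitrary placeholders, the sample "S2" is invented, and treating the fifth BED column as a segment-mean-like value is an assumption.

```python
def stack_samples(per_sample, loss_below=0.75, gain_above=1.25):
    """Row-bind all samples into one table with a sample column and a
    gain (1) / loss (0) label; segments between the cut-offs are dropped.

    per_sample: dict mapping sample name -> list of (chrom, start, end, value).
    The cut-offs are placeholders; choose your own.
    """
    rows = []
    for sample, segments in per_sample.items():
        for chrom, start, end, value in segments:
            if value <= loss_below:
                kind = 0  # loss
            elif value >= gain_above:
                kind = 1  # gain
            else:
                continue  # neither gain nor loss under these cut-offs
            rows.append((sample, chrom, start, end, kind))
    # Sort by position so that nearby calls from different samples sit together.
    rows.sort(key=lambda r: (r[1], r[2], r[3], r[0]))
    return rows


per_sample = {
    # First two rows of the example posted above.
    "S1": [("1", 629471, 638210, 0.431094), ("1", 671461, 675070, 2.75301)],
    # Invented rows for a second sample.
    "S2": [("1", 630000, 640000, 0.5), ("1", 700000, 710000, 1.0)],
}
table = stack_samples(per_sample)
```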

A practical example for cancer is given here: C: How to extract the list of genes from TCGA CNV data

Kevin

ADD COMMENT • link 6.5 years ago by Kevin Blighe 89k

I did not understand completely. I am looking for common (overlapping) CNV coordinates among these 35 files.

ADD REPLY • link 6.5 years ago by popayekid55 ▴ 110

GAIA will find the common regions and assign a p-value based on how recurrent (frequent) they are in your dataset. The idea is that the more recurrent ones are more important.

If you literally just want to see the overlapping BED regions, even those that occur in just 2 samples, then use the BEDTools solutions that were suggested. However, what would you do in the situation where one region is a gain (amplified) in one sample but a loss (deleted) in another? Does it make sense to merge these, given your downstream analysis plan?
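One way to check for the gain-versus-loss situation before merging is to flag overlaps between calls of opposite kinds from different samples. The sketch below uses a simple pairwise comparison, which is fine for a few thousand calls but not tuned for speed. The function name `conflicting_overlaps` and the toy calls are invented for illustration.

```python
from itertools import combinations


def conflicting_overlaps(calls):
    """Find overlaps where two different samples disagree on the CNV kind.

    calls: list of (sample, chrom, start, end, kind) with half-open intervals.
    Returns a list of (chrom, overlap_start, overlap_end, sample_a, sample_b).
    """
    conflicts = []
    for a, b in combinations(calls, 2):
        sample_a, chrom_a, start_a, end_a, kind_a = a
        sample_b, chrom_b, start_b, end_b, kind_b = b
        if sample_a == sample_b or chrom_a != chrom_b or kind_a == kind_b:
            continue
        lo, hi = max(start_a, start_b), min(end_a, end_b)
        if lo < hi:  # non-empty overlap
            conflicts.append((chrom_a, lo, hi, sample_a, sample_b))
    return conflicts


# Toy calls, not from the thread.
calls = [
    ("S1", "1", 100, 200, "gain"),
    ("S2", "1", 150, 250, "loss"),
    ("S3", "1", 160, 180, "gain"),
]
conflicts = conflicting_overlaps(calls)
```

If conflicts turn up, one option is to intersect gains and losses separately rather than pooling all calls.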

ADD REPLY • link 6.5 years ago by Kevin Blighe 89k