common CNVs among multiple files
6.1 years ago
popayekid55 ▴ 110

Hi all, I have analyzed 35 normal WGS samples for CNVs using CNVnator. Now I want to find the CNV regions that are common to these files so that they can be used as a control panel.

Is there a tool or method to obtain these common regions across all files at once?

Thank you

cnv genome • 1.8k views

We are not necessarily familiar with the output format of cnvNator, so it would be best if you could elaborate on which files you have.


The output will be converted into BED format, as below:

1   629471  638210  1   0.431094
1   671461  675070  3   2.75301
1   1414076 1416640 1   0.560963
1   2583526 2591885 3   12.4121
1   2634161 2684320 1   0.000940585

The columns are chromosome, start and end of the CNV, type of CNV, and a score.


bedtools multiinter will help you


Parse the chromosome, start, and end from the CNVnator results, then overlap the output files using BEDTools multiIntersectBed or BEDOPS.
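For readers who want to see what such a multi-file overlap computes, here is a minimal pure-Python sketch of the idea. It is not BEDTools or BEDOPS themselves, only an illustration. The function names `read_bed` and `common_regions` and the `min_samples` threshold are hypothetical, and the sketch assumes whitespace-separated files with half-open intervals, as in the BED example above.

```python
from collections import defaultdict


def read_bed(path):
    """Read (chrom, start, end) from a whitespace-separated BED-like file."""
    regions = []
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if len(parts) < 3:
                continue  # skip blank or malformed lines
            regions.append((parts[0], int(parts[1]), int(parts[2])))
    return regions


def common_regions(samples, min_samples):
    """Return (chrom, start, end, n_samples) for every segment that is
    covered by at least `min_samples` of the input samples.

    `samples` is a list with one list of (chrom, start, end) per sample.
    """
    events = defaultdict(list)
    for idx, intervals in enumerate(samples):
        for chrom, start, end in intervals:
            events[chrom].append((start, +1, idx))  # interval opens
            events[chrom].append((end, -1, idx))    # interval closes

    result = []
    for chrom in sorted(events):
        open_count = defaultdict(int)  # open intervals per sample
        active = 0                     # samples with at least one open interval
        prev_pos = None
        for pos, delta, idx in sorted(events[chrom]):
            # Emit the segment since the previous breakpoint if enough samples cover it.
            if prev_pos is not None and pos > prev_pos and active >= min_samples:
                result.append((chrom, prev_pos, pos, active))
            before = open_count[idx]
            open_count[idx] += delta
            if before == 0 and open_count[idx] == 1:
                active += 1
            elif before == 1 and open_count[idx] == 0:
                active -= 1
            prev_pos = pos
    return result


# Toy data: three samples with overlapping calls on chromosome 1.
toy = [[("1", 100, 200)], [("1", 150, 250)], [("1", 180, 300)]]
print(common_regions(toy, min_samples=3))  # [('1', 180, 200, 3)]
```

With 35 files, setting `min_samples=35` would keep only segments present in every sample, while a lower value would keep segments shared by a subset.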

6.1 years ago

GAIA will find recurrent copy number regions in your input data and assign a p-value to each region to help with filtering them. I believe you have the required information to run GAIA. The starting data is a row-bound list of all regions from all samples, with an extra column that indicates the sample from which each region was derived. You decide your own cut-off points for gain (1) and loss (0) based on the segment mean.
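The row-binding step described above could be sketched as follows. This only illustrates the stacking and the gain/loss coding. The function name `stack_samples`, the column order, and the `type_to_call` mapping are assumptions made for the example, so check the GAIA documentation for the exact layout it expects. In particular, treating CNV type 3 as gain and type 1 as loss is a guess about the BED example in this thread, not something confirmed here.

```python
def stack_samples(per_sample, type_to_call):
    """Stack per-sample regions into one table with a sample column.

    `per_sample` maps a sample name to a list of
    (chrom, start, end, cnv_type, score) tuples.
    `type_to_call` maps a CNV type code to 1 (gain) or 0 (loss);
    regions with any other type code are skipped.
    """
    rows = []
    for sample, regions in per_sample.items():
        for chrom, start, end, cnv_type, _score in regions:
            if cnv_type not in type_to_call:
                continue  # neither a gain nor a loss under the chosen mapping
            rows.append((sample, chrom, start, end, type_to_call[cnv_type]))
    return rows


# Hypothetical mapping: type 3 -> gain (1), type 1 -> loss (0).
example = {
    "sample_01": [("1", 629471, 638210, 1, 0.431094),
                  ("1", 671461, 675070, 3, 2.75301)],
}
for row in stack_samples(example, {3: 1, 1: 0}):
    print(*row, sep="\t")
```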

A practical example for cancer is given here: C: How to extract the list of genes from TCGA CNV data

Kevin


I did not understand completely. I am looking for common (overlapping) CNV coordinates among these 35 files.


GAIA will find the common regions and assign a p-value based on how recurrent (frequent) they are in your dataset. The idea is that the more recurrent ones are more important.

If you literally just want to see the overlapping BED regions, even if they occur in just 2 samples, then use the BEDTools solutions that were suggested. However, what would you do in the situation where one region is a gain (amplified) in one sample but a loss (deleted) in another? Does it make sense to merge these, given your downstream analysis plan?
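One way to avoid mixing gains and losses is to split the calls by CNV type first and then overlap each type on its own. A minimal sketch, assuming each region carries its type code as a fourth field (the function name `split_by_type` is hypothetical):

```python
from collections import defaultdict


def split_by_type(per_sample):
    """Group regions by CNV type so each type can be overlapped separately.

    `per_sample` maps a sample name to a list of
    (chrom, start, end, cnv_type) tuples.
    Returns {cnv_type: {sample: [(chrom, start, end), ...]}}.
    """
    by_type = defaultdict(dict)
    for sample, regions in per_sample.items():
        for chrom, start, end, cnv_type in regions:
            by_type[cnv_type].setdefault(sample, []).append((chrom, start, end))
    return dict(by_type)


calls = {
    "sample_01": [("1", 100, 200, 1), ("1", 300, 400, 3)],
    "sample_02": [("1", 150, 250, 3)],
}
for cnv_type, samples in split_by_type(calls).items():
    print(cnv_type, samples)
```

Each per-type group can then be written out as its own set of BED files and overlapped independently, so a shared region is always one direction of change.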
