Question

How to create a pooled normal exome sample?

9.8 years ago

Dataman ▴ 380

Hi

I am performing copy number analysis on some exome sequencing data where some of the samples do not have a matched normal (the tool that I am using - ADTEx - requires a matched normal sample for the copy number analysis). However, I have 3 normal samples (bam files) which were prepared and sequenced the same way as the tumor samples that lack a matched normal. I was wondering what the best practice is for creating a pooled normal exome sample?

Currently, I am thinking of merging the 3 normal bam files using 'samtools merge', then finding the coverage for each exon using 'coverageBed', and then dividing the coverage for each exon by 3. However, I am not sure whether this approach is correct or whether I need to do some kind of normalization. In addition, I have noticed that the 'ExomeCNV' R package has a function called 'pool.coverage()' which is meant for this purpose, but unfortunately this package has been removed from CRAN!
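The divide-by-3 step can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes that counting reads on the merged bam gives the same per-exon totals as summing the counts from each bam separately, and that the per-exon read count sits in the 4th tab-separated column (as it would for a 3-column BED passed through coverageBed). The helper names `parse_counts` and `average_counts` and the toy numbers are hypothetical.

```python
from collections import defaultdict


def parse_counts(lines, count_col=3):
    """Parse tab-separated per-exon lines into {(chrom, start, end): read_count}.

    count_col is the 0-based column holding the read count (an assumption
    about the input layout; adjust it to match your own files).
    """
    counts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        exon = (fields[0], int(fields[1]), int(fields[2]))
        counts[exon] = int(fields[count_col])
    return counts


def average_counts(per_sample_counts):
    """Per-exon mean read count: sum across samples divided by sample count."""
    n = len(per_sample_counts)
    totals = defaultdict(int)
    for counts in per_sample_counts:
        for exon, count in counts.items():
            totals[exon] += count
    return {exon: total / n for exon, total in totals.items()}


# Toy per-exon counts for three normal samples (made-up numbers).
normal_a = ["chr1\t100\t200\t30", "chr1\t300\t400\t60"]
normal_b = ["chr1\t100\t200\t36", "chr1\t300\t400\t54"]
normal_c = ["chr1\t100\t200\t24", "chr1\t300\t400\t66"]

averaged = average_counts([parse_counts(s) for s in (normal_a, normal_b, normal_c)])
print(averaged)
```

In a real run the lists of lines would come from reading the per-sample coverage files rather than from inline strings.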

I would like to thank you in advance for your thoughts and answers.

copy-number-analysis exome next-gen-sequencing • 3.6k views

updated 2.4 years ago by Ram 45k • written 9.8 years ago by Dataman ▴ 380

Hi, I would like to know whether you found an alternative solution for the normalization, because I am trying to do a similar type of work. Thank you.

7.6 years ago by mittu1602 ▴ 200

Ram · Answer 1 · 2015-07-03

I found the answer to my question in an article (EXCAVATOR). They use the following strategy:

In the pooling scheme, each test sample is compared with a pooled reference obtained by summing the total number of reads for each exon across all the control samples.

So, what I do is sum the number of reads for each exon across all the normal samples. The result is the pooled normal coverage file, which can be given to the tool as the input for the normal sample. I do not need to worry about the normalization part, since ADTEx performs 'mean coverage normalization': the tool divides the number of reads at each exon by the mean number of reads before calculating the tumor/normal ratios.
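A minimal Python sketch of the pooling described above, together with a toy version of mean coverage normalization. The functions `pool_counts` and `mean_normalize` are hypothetical illustrations, not ADTEx code, and the counts are made up. The sketch also shows why dividing the pooled counts by the number of normals is unnecessary: a constant factor cancels once every exon is divided by the mean.

```python
def pool_counts(per_sample_counts):
    """Sum the read counts for each exon across all normal samples."""
    pooled = {}
    for counts in per_sample_counts:
        for exon, count in counts.items():
            pooled[exon] = pooled.get(exon, 0) + count
    return pooled


def mean_normalize(counts):
    """Divide each exon's count by the mean count over all exons.

    Toy stand-in for the 'mean coverage normalization' described in the
    answer; not taken from the ADTEx source.
    """
    mean = sum(counts.values()) / len(counts)
    return {exon: count / mean for exon, count in counts.items()}


# Made-up per-exon read counts for three normal samples.
normals = [
    {"exon1": 30, "exon2": 60, "exon3": 90},
    {"exon1": 36, "exon2": 54, "exon3": 90},
    {"exon1": 24, "exon2": 66, "exon3": 90},
]

pooled = pool_counts(normals)
pooled_norm = mean_normalize(pooled)

# Dividing the pooled counts by the number of samples first gives the same
# normalized values, because the constant factor cancels against the mean.
averaged = {exon: count / len(normals) for exon, count in pooled.items()}
averaged_norm = mean_normalize(averaged)

print(pooled)
print(pooled_norm)
print(averaged_norm)
```

The cancellation only holds for a normalization that rescales by a per-sample constant such as the mean. A tool that works on raw counts would see the pooled file as a sample with roughly three times the depth.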