Question

Identification Of Relevant Germline Mutations In Illumina Data.... Filtering Strategies?


13.2 years ago

Wayne ★ 1.0k

Hello all, I am working with Illumina sequencing data and trying to identify germline mutations that are relevant to the cancer. To filter out the noise I have required that: more than 25% of the reads support a given variant in both the tumor and normal samples; known dbSNP variants are filtered out; synonymous mutations are filtered out (except at splice sites); and any variant whose overall depth is less than 15 reads is filtered out. I also crossed the data with COSMIC to see whether mutations have been reported in the gene.
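For concreteness, the filters listed above could be sketched roughly as follows. This is only an illustration: the `Variant` record, its field names, and the effect labels are hypothetical and not tied to any particular caller or annotation tool, and the depth cutoff is assumed to apply to the tumor and normal samples separately.

```python
from dataclasses import dataclass

MIN_DEPTH = 15           # minimum read depth, as stated in the post
MIN_ALT_FRACTION = 0.25  # variant must be on more than 25% of the reads


@dataclass
class Variant:
    # Hypothetical record; field names are illustrative only.
    gene: str
    effect: str            # e.g. "missense", "synonymous"
    at_splice_site: bool
    in_dbsnp: bool
    tumor_depth: int
    tumor_alt_reads: int
    normal_depth: int
    normal_alt_reads: int


def alt_fraction(alt_reads: int, depth: int) -> float:
    """Fraction of reads supporting the variant (0.0 if there is no coverage)."""
    return alt_reads / depth if depth > 0 else 0.0


def passes_filters(v: Variant) -> bool:
    # Assumption: the depth cutoff is checked in each sample separately.
    if v.tumor_depth < MIN_DEPTH or v.normal_depth < MIN_DEPTH:
        return False
    # Germline candidates should be supported in both tumor and normal.
    if alt_fraction(v.tumor_alt_reads, v.tumor_depth) <= MIN_ALT_FRACTION:
        return False
    if alt_fraction(v.normal_alt_reads, v.normal_depth) <= MIN_ALT_FRACTION:
        return False
    # Drop known dbSNP variants.
    if v.in_dbsnp:
        return False
    # Drop synonymous changes unless they sit at a splice site.
    if v.effect == "synonymous" and not v.at_splice_site:
        return False
    return True
```

A list of records could then be reduced with `[v for v in variants if passes_filters(v)]` before any per-gene counting.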

Recurrence analysis (counting how many samples each gene is mutated in) shows that many genes are mutated across many of the samples I have pooled, but the list is still far too large to be useful. I need a way to filter it down further to remove sequencing artifacts and other systematic errors. One thing I tried was using the synonymous mutations as a negative control, i.e. if a gene is relevant to the disease, then the number of samples with non-synonymous mutations should be significantly higher than the number of samples with synonymous mutations (assuming that the latter are due to chance alone). This filter did not work well for my goal. Does anyone have suggestions for things I could try, papers I could read, or statistics I could use to help prioritize genes that are likely to be important? Any help at all would be greatly appreciated. Thanks again.
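One possible way to make the synonymous-as-negative-control idea concrete is a one-sided Fisher's exact test per gene, comparing that gene's non-synonymous and synonymous sample counts against the totals for all other genes. This is a swapped-in technique for illustration, not necessarily the test used above; the gene names and counts are made up, and the resulting p-values would still need multiple-testing correction.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) under the hypergeometric null for the table [[a, b], [c, d]].

    a, b: samples with non-synonymous / synonymous mutations in the gene
    c, d: the same counts summed over all other genes
    """
    row1 = a + b          # all counts for this gene
    col1 = a + c          # all non-synonymous counts
    n = a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


def rank_genes(counts: dict) -> list:
    """Rank genes by enrichment of non-synonymous over synonymous counts.

    counts maps gene -> (n_nonsynonymous_samples, n_synonymous_samples).
    Returns (gene, p_value) pairs, smallest p-value first.
    """
    total_ns = sum(ns for ns, _ in counts.values())
    total_s = sum(s for _, s in counts.values())
    results = []
    for gene, (ns, s) in counts.items():
        p = fisher_one_sided(ns, s, total_ns - ns, total_s - s)
        results.append((gene, p))
    return sorted(results, key=lambda r: r[1])


# Made-up counts, for illustration only.
example = {"GENE_A": (10, 0), "GENE_B": (2, 5), "GENE_C": (3, 5)}
ranked = rank_genes(example)
```

Using the pooled counts from the other genes as the background avoids having to assume a fixed expected ratio of non-synonymous to synonymous changes, although it does not account for gene length or local mutation rate.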

illumina sequencing cancer mutation gene • 3.6k views

updated 13.2 years ago by Bioinfosm ▴ 620 • written 13.2 years ago by Wayne ★ 1.0k


You could use VAAST to rank the mutations.

Comment • 13.2 years ago by Zev.Kronenberg 12k

score 0 · Answer 1 · 2012-06-07


13.2 years ago

Bioinfosm ▴ 620

There are a lot of annotation tools that can help with the filtering. VAAST is definitely a popular one; ANNOVAR is another.

Here is another post that might be helpful with filtering: http://www.biostars.org/post/show/44540/variant-filtration-by-exclusion-of-common-or-well-known-variants/
