Question

Pathway Enrichment Analysis Of Family Sequencing Data

2

12.2 years ago

Rainer ▴ 130

Is anybody aware of a published software tool for pathway enrichment analysis of NGS data in families (in particular, exome sequencing data)? My understanding is that common enrichment analysis techniques developed for microarrays or SNP arrays cannot be applied here because of sequencing-specific biases. I would also be interested in pathway analysis tools for case/control studies, but family sequencing is the main area of interest.

Many thanks in advance.

pathway enrichment sequencing ngs exome • 5.0k views

ADD COMMENT • link updated 11.0 years ago by Biostar 20 • written 12.2 years ago by Rainer ▴ 130

2

Most pathway enrichment software just takes lists of genes as input, and frankly, most of the stats aren't operating off great models anyway, so I wouldn't be that concerned about the biases introduced by exome sequencing. More important is what question you are asking. Are you planning on compiling lists of genes with variants in them and looking for enrichment? I'm not exactly sure what sorts of questions you are planning on asking with exome sequencing data in families versus gene expression data, which is where you typically apply these sorts of tools.

ADD REPLY • link 12.2 years ago by DG 7.3k

0

We are looking for combinatorial effects of variants that explain complex polygenic disease phenotypes using family exome sequencing data. Since I expect the variant data to be very noisy (or to have many random effects), a pathway analysis could theoretically help to identify robust and interpretable deregulation of specific cellular processes. However, I think the biases cannot be ignored (e.g. longer genes accumulate more SNPs, and there is linkage disequilibrium between neighbouring genes). What I am looking for is a proper permutation-based p-value estimation approach for pathway enrichment analysis in this setting.
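To make the request concrete, here is a rough sketch of what a length-aware permutation test could look like. It is not taken from any published tool: the function name, the binning scheme, and the inputs (a gene-to-length mapping and a set of genes carrying candidate variants) are all assumptions for illustration. It only matches random gene sets to the pathway by gene length; it does not handle linkage disequilibrium or pedigree structure.

```python
import random
from collections import defaultdict


def length_matched_permutation_p(pathway, hit_genes, gene_lengths,
                                 n_perm=2000, n_bins=5, seed=0):
    """Empirical p-value for the number of pathway genes carrying variants.

    Hypothetical helper: null gene sets are drawn so that they contain the
    same number of genes per length bin as the pathway itself.
    Every pathway gene must be a key of gene_lengths.
    """
    rng = random.Random(seed)

    # Rank genes by length and split the ranking into n_bins equal-size bins.
    ranked = sorted(gene_lengths, key=gene_lengths.get)
    bin_of = {g: i * n_bins // len(ranked) for i, g in enumerate(ranked)}
    genes_in_bin = defaultdict(list)
    for g in ranked:
        genes_in_bin[bin_of[g]].append(g)

    # Number of pathway genes that fall into each length bin.
    need = defaultdict(int)
    for g in pathway:
        need[bin_of[g]] += 1

    hits = set(hit_genes)
    observed = sum(g in hits for g in pathway)

    # Count how often a length-matched random gene set does at least as well.
    at_least = 0
    for _ in range(n_perm):
        null_count = 0
        for b, k in need.items():
            null_count += sum(g in hits for g in rng.sample(genes_in_bin[b], k))
        if null_count >= observed:
            at_least += 1

    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)
```

A call would look like `length_matched_permutation_p(pathway_genes, genes_with_variants, lengths)`, returning the observed overlap and the empirical p-value. Something similar could in principle be extended to sample within genomic regions to respect LD, but that is beyond this sketch.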

ADD REPLY • link 12.2 years ago by Rainer ▴ 130

1

I hope you did a power analysis before you started, to determine the sample size you will need to see these combinatorial effects. This was a major (as in, major) weakness of the previous generation of GWAS studies. Also, I agree with Dan about the currently available pathway analysis stuff. Most publicly available pathway analysis tools do not integrate enough context-specific data (primarily timing and cell type of expression), so their results are heavy on the false positives.

ADD REPLY • link 12.2 years ago by Alex Paciorkowski 3.5k

0

We are still in the planning phase, so the main first question is which analysis tools are available for this purpose (if any) and what are their strengths/limitations. Sample size estimation is of course also to be done afterwards, but first we need to know what analysis methods are available at all.

ADD REPLY • link 12.2 years ago by Rainer ▴ 130

score 1 · Answer 1 · 2013-02-14

Given your clarifications, you may be interested in the following paper, which uses a variation of pathway enrichment analysis in exome data to explain genetic heterogeneity of particular phenotypes: Sequencing and Sequence Analysis BioGranat-IG: A network analysis tool to suggest mechanisms of genetic heterogeneity from exome sequencing data

You may also just want to do standard pathway enrichment analysis using DAVID, GOA, or your other software of choice. But you will want to think carefully about how you generate your gene lists for input: how you filter your variants, which genes you count as affected, and so on.
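
For reference, the kind of test that list-based enrichment tools typically run is a one-sided hypergeometric (Fisher-type) over-representation test on the gene list. The sketch below is a generic illustration with made-up function names, not the exact procedure of DAVID or any other specific tool, and it assumes every gene in the background is equally likely to end up in your list, which is exactly the assumption that gene length can violate.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_hits, n_overlap):
    """P(X >= n_overlap) when n_hits genes are drawn at random from a
    background of n_universe genes, n_pathway of which are in the pathway."""
    total = comb(n_universe, n_hits)
    upper = min(n_pathway, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def enrichment_from_sets(pathway_genes, hit_genes, universe):
    """Convenience wrapper (hypothetical): restrict both gene sets to the
    background universe, then run the test on the resulting counts."""
    universe = set(universe)
    pathway = set(pathway_genes) & universe
    hits = set(hit_genes) & universe
    return hypergeom_enrichment_p(
        len(universe), len(pathway), len(hits), len(pathway & hits)
    )
```

The choice of `universe` matters as much as the variant filtering: for exome data it should probably be the set of genes actually covered by the capture kit rather than all annotated genes.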