Best Way To Do Pathway Analysis Of A Set Of Genes?
5
32
12.7 years ago
Wayne ★ 1.0k

What is the best way to do computational pathway analysis for a set of genes or proteins of interest? Specifically, I am trying to identify common functions or pathways in a set of genes mutated in cancer samples. I know I could look at GO terms and use tools like DAVID. Does anyone have other really good techniques for this?

protein genes pathway analysis function • 110k views
30
12.7 years ago
Occam ▴ 410

ConsensusPathDB is a meta-search engine for pathway analysis. It incorporates all or most of the reputable public-access pathway databases out there.

http://cpdb.molgen.mpg.de/

One major source outside of CPDB is Ingenuity IPA. This is proprietary software and, in addition to public-access database information, has a manually curated database of millions of pathway "associations" mined from academic papers.

http://www.ingenuity.com/products/pathways_analysis.html

Between these two, I think you can capture most compiled pathway information.

5

+1 for CPDB. Useful resource.

1

Can anyone tell me how to use IPA? I have a list of differentially expressed genes and want to view the pathways in IPA. Can anyone guide me?

1

Yes, CPDB was incredibly useful. This database deserves to be better known. Reactome and DAVID also worked well for me.

10
12.7 years ago
Gareth Morgan ▴ 310

There are a lot of posts here and elsewhere about pathway analysis. How you go about it depends on what data you have and what you want to see. This post and the review it refers to are good places to start: http://gettinggeneticsdone.blogspot.com/2012/03/pathway-analysis-for-high-throughput.html

8
12.7 years ago

To begin with, there is no single best method. It always depends on the data you have in hand.

Also remember

"Gene Ontology enrichment analysis != Pathway analysis"

For a detailed explanation of GO term enrichment see this previous discussion at Biostars.

You mentioned that

I am trying to identify common functions or pathways in a set of genes mutated in cancer samples.

I assume your data came from a genome/exome/transcriptome analysis workflow. If your list of genes is from an exome or genome workflow, the approaches discussed in the previous answers will be enough, but you need to take care of a few important things.

To do a pathway analysis you primarily need

  • A list of background genes
  • A list of perturbed genes
  • An annotation file that maps each gene to a pathway

Be very careful when you define your background. If your data is from a tumor-normal pair, your background should contain only the genes that are specific to the cell line or tissue of interest. Consult databases like HPRD or the Human Protein Atlas to find cell- or tissue-specific genes. Once you have these files, you can perform enrichment analysis (a standard statistical test followed by multiple testing correction) in R to find significant pathways. Use external tools only if they let you supply a user-defined or experimental-platform-specific background.
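To make the three inputs and the "statistical test followed by multiple testing correction" step concrete, here is a minimal sketch in plain Python rather than R. It assumes a one-sided hypergeometric test and Benjamini-Hochberg correction, which are common choices but not the only ones. The gene IDs, pathway names, and function names are all made up for illustration.

```python
from math import comb


def hypergeom_upper_tail(hits, n_background, n_pathway, n_perturbed):
    """P(X >= hits) when n_perturbed genes are drawn from n_background,
    of which n_pathway belong to the pathway."""
    total = comb(n_background, n_perturbed)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_perturbed - i)
        for i in range(hits, min(n_pathway, n_perturbed) + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(background, perturbed, pathways):
    """Over-representation test of each pathway against a user-defined background."""
    background = set(background)
    perturbed = set(perturbed) & background  # ignore genes outside the background
    names, overlaps, pvals = [], [], []
    for name, members in pathways.items():
        members = set(members) & background
        if not members:
            continue  # pathway has no genes in this background
        hits = len(perturbed & members)
        names.append(name)
        overlaps.append(hits)
        pvals.append(
            hypergeom_upper_tail(hits, len(background), len(members), len(perturbed))
        )
    adjusted = benjamini_hochberg(pvals)
    return sorted(zip(names, overlaps, pvals, adjusted), key=lambda row: row[2])


# Hypothetical toy inputs: 20 background genes, 4 perturbed genes, 2 pathways.
background = {f"G{i}" for i in range(1, 21)}
perturbed = {"G1", "G2", "G3", "G4"}
pathways = {
    "pathway_A": {"G1", "G2", "G3", "G5"},
    "pathway_B": {"G4", "G10", "G11", "G12", "G13", "G14"},
}

results = enrich(background, perturbed, pathways)
for name, hits, p, q in results:
    print(f"{name}\toverlap={hits}\tp={p:.4g}\tadjusted_p={q:.4g}")
```

The point of passing `background` explicitly is that every count in the test changes with it, so a whole-genome background and a tissue-specific background can give very different p-values for the same gene list.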

If your data is from transcriptome/RNA-Seq, you may use goseq. It uses a statistical approach developed specifically for RNA-Seq data that can incorporate the length or total-count bias of RNA-Seq data into gene set tests.
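A quick way to see whether length bias is present in your own list is to bin genes by length and look at the fraction called differentially expressed in each bin. The sketch below is only a diagnostic in plain Python. It is not goseq's implementation and does no correction. The gene names, lengths, and DE calls are invented for illustration.

```python
import statistics


def de_fraction_by_length_bin(gene_lengths, de_genes, n_bins=4):
    """Return (median_length, fraction_DE) for genes grouped into length-ordered bins."""
    ranked = sorted(gene_lengths, key=gene_lengths.get)  # shortest to longest
    bin_size = -(-len(ranked) // n_bins)  # ceiling division
    rows = []
    for start in range(0, len(ranked), bin_size):
        chunk = ranked[start:start + bin_size]
        median_len = statistics.median(gene_lengths[g] for g in chunk)
        frac_de = sum(g in de_genes for g in chunk) / len(chunk)
        rows.append((median_len, frac_de))
    return rows


# Hypothetical toy data: transcript lengths in bp and a set of DE calls.
gene_lengths = {
    "g1": 500, "g2": 800, "g3": 1200, "g4": 1500,
    "g5": 3000, "g6": 4000, "g7": 6000, "g8": 9000,
}
de_genes = {"g4", "g6", "g7", "g8"}

bias_table = de_fraction_by_length_bin(gene_lengths, de_genes, n_bins=4)
for median_len, frac_de in bias_table:
    print(f"median length {median_len:>7.0f} bp\tfraction DE {frac_de:.2f}")
```

If the DE fraction climbs steadily with length, as it does in this toy data, a plain over-representation test will tend to favor pathways rich in long genes, which is the situation length-aware methods are meant to handle.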

If you are working with a whole-genome background, you can use web-based tools like Panther Pathways, Reactome Pathways, KEGG pathway analysis using SubPathwayMiner, or other R/Bioconductor packages.

You may also refer to a previous post here

1

For gene ontology, is it necessary to do length bias correction when using RNA-seq data, even if, for example, I do differential expression in a count-based manner?

4
12.7 years ago

There are many, many potential methods here:

http://www.biostars.org/post/show/9394/mapping-genes-to-pathway/#9415

http://www.biostars.org/post/show/15101/comparing-pathways-between-two-different-cancer-cell-lines/#15103

Getting GO terms is a good start, but even here the level of curation is mixed.

Always treat pathway analyses with caution, and have a plan for how to biologically validate your results if you plan to publish. Most publicly available analysis algorithms work from publicly available data, and these data are just not complete for most genes of interest. This is true for online web tools such as STRING and GeneMANIA, but if filtered with the most stringent search criteria, interesting connections can be found. Also take a look at the NCI Pathway Interaction Database.

Do you have questions about how to approach specific hypotheses through pathway analysis?

4
12.2 years ago
Guangchuang Yu ★ 2.6k

You can use my package ReactomePA for Reactome pathway analysis: http://www.bioconductor.org/packages/2.11/bioc/html/ReactomePA.html

