Question

During GO / pathway enrichment analysis, should we exclude genes that are not expressed in either group?

0

6.8 years ago

CY ▴ 750

We found DE genes and then performed GO / pathway enrichment analysis (Fisher's exact test is used, right?).

What we are doing now is based on these two ratios:

1. number of genes in a specific GO term / total number of genes
2. number of DE genes in that GO term / total number of DE genes

Someone suggested that we should exclude genes that are not expressed in either group:

1. total number of genes -> total number of genes - genes not expressed in either group
2. total number of DE genes -> total number of DE genes - genes not expressed in either group

This suggestion also kind of makes sense, considering that genes in the DE list have to be expressed in at least one group.
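To make the effect of the background choice concrete, here is a minimal sketch of the over-representation test described above, written as a one-sided hypergeometric (Fisher's exact) p-value using only the Python standard library. The function name `enrichment_pvalue` and the toy gene IDs are hypothetical, made up purely for illustration; real tools may apply additional corrections.

```python
from math import comb

def enrichment_pvalue(universe, term_genes, de_genes):
    """One-sided hypergeometric p-value P(X >= k) for over-representation.

    Genes outside `universe` are ignored, so changing the universe
    changes every count that enters the test.
    """
    universe = set(universe)
    term = set(term_genes) & universe
    de = set(de_genes) & universe
    N = len(universe)    # background size
    K = len(term)        # genes annotated to the term within the background
    n = len(de)          # DE genes within the background
    k = len(term & de)   # DE genes annotated to the term
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / total

# Toy data (hypothetical): 20 annotated genes, only the first 10 expressed.
all_genes = [f"g{i}" for i in range(20)]
expressed = all_genes[:10]
term_genes = ["g0", "g1", "g2", "g3", "g10", "g11"]
de_genes = ["g0", "g1", "g2", "g4"]

p_all = enrichment_pvalue(all_genes, term_genes, de_genes)
p_expressed = enrichment_pvalue(expressed, term_genes, de_genes)
print(f"all genes as background:       {p_all:.4f}")
print(f"expressed genes as background: {p_expressed:.4f}")
```

In this toy case the p-value is smaller when unexpressed genes are left in the background, i.e. the term looks more enriched than it does against the expressed genes only.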

Can anyone share some comments? Thanks

RNA-Seq enrichment analysis DE analysis • 1.9k views

6.8 years ago by CY ▴ 750

0

Have a look here, it might be helpful: http://lrpath.ncibi.org/

6.8 years ago by zizigolu ★ 4.3k

0

I didn't see anything answering my question in your link. Can you share some comments?

6.8 years ago by CY ▴ 750

0

I suggested this because GO and pathway analysis can be done with respect to differential expression. For instance, you provide raw read counts from RNA-seq and the program gives you GO terms and pathways.

6.8 years ago by zizigolu ★ 4.3k

0

Yes. My question is, when working from the raw read counts, should I exclude the genes that were not expressed in either group?

6.8 years ago by CY ▴ 750

0

If you mean genes with extremely low expression or genes that are zero in all samples, I usually remove these genes beforehand. If not, sorry, I don't know.

6.8 years ago by zizigolu ★ 4.3k

0

Yes, that is what I am asking. Until now I did not exclude genes that are not expressed at all. I guess that was wrong.

6.8 years ago by CY ▴ 750

2

Yes, just remove them, but this is done at the raw count stage, for example, by removing all transcripts (genes) whose mean raw count is <10. Genes with high numbers of NAs can also be filtered out. How filtering is done prior to normalisation and differential expression analysis can vary from study to study.

Then, by the time that you reach the gene enrichment stage, you can have high confidence that the genes that you have included are by default expressed in both groups.
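As a rough illustration of the raw-count filter mentioned above, here is a minimal sketch in plain Python. The threshold of 10 comes from the example in this answer; the function name `filter_low_counts`, the gene names, and the counts are all hypothetical, and real pipelines typically use a dedicated filtering function from their DE package instead.

```python
from statistics import mean

def filter_low_counts(counts, min_mean=10):
    """Keep genes whose mean raw count across all samples is >= min_mean."""
    return {gene: values for gene, values in counts.items()
            if mean(values) >= min_mean}

# Hypothetical raw counts: first two samples are group 1, last two are group 2.
raw_counts = {
    "geneA": [0, 0, 0, 0],        # never detected
    "geneB": [3, 5, 2, 4],        # very low in every sample
    "geneC": [120, 98, 15, 22],   # detected in both groups
    "geneD": [0, 0, 45, 60],      # detected in one group only
}

kept = filter_low_counts(raw_counts)

# The surviving genes are what goes into DE testing and, later,
# into the background ("total number of genes") for enrichment.
background = sorted(kept)
print(background)
```

Note that a mean-based threshold like this one still keeps a gene such as `geneD`, which has counts in only one group, so it can remain a DE candidate.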

6.8 years ago by Kevin Blighe 88k