Question

Significance of Clusters

0

5.7 years ago

naseerkhan861 ▴ 10

I have 128 clusters of genes, each containing a number of genes. The dataset is a case-control dataset with two classes, diseased and non-diseased; the disease is autism. SFARI maintains a well-defined list of genes associated with autism, compiled and scored into different categories.

My first question: how can I check the significance of those clusters, that is, how can I test that I did not get them by chance? What is the procedure, and is there a tool or R package for it? My second question: how can I check the biological significance of the genes in those clusters, and is there a tool or R package for that as well?

It would be very helpful if somebody could suggest the relevant steps I should take, ideally based on the literature.

Regards

clusters genes • 1.3k views

score 1 · Answer 1 · 2019-10-18

1

5.7 years ago

German.M.Demidov ★ 3.0k

In general this is done with a hypergeometric test (the simplest version). You can use online tools such as http://cbl-gorilla.cs.technion.ac.il . Enrichment analysis becomes trickier if your data come from some sort of enrichment protocol and cover only part of the genome; in that case you can specify the covered genes as the background set in the GOrilla tool's input. In general, IMO, the most reliable technique is a simulation/permutation-based approach.
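A minimal sketch of the permutation idea, assuming you have three gene lists: one cluster, a reference list (for example the SFARI genes), and the background of all genes that could have ended up in a cluster. The function name `permutation_pvalue` and the toy gene IDs are made up for illustration; this is one way to set it up, not a standard tool.

```python
import random

def permutation_pvalue(cluster, reference, background, n_perm=2000, seed=0):
    """Empirical p-value for the overlap between a cluster and a reference list.

    Random gene sets of the same size as the cluster are drawn from the
    background, and we count how often their overlap with the reference
    list is at least as large as the observed one.
    """
    background = sorted(set(background))           # fixed order for reproducibility
    cluster = set(cluster) & set(background)       # ignore genes outside the background
    reference = set(reference) & set(background)
    observed = len(cluster & reference)

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(background, len(cluster))
        if sum(g in reference for g in sample) >= observed:
            hits += 1
    # add-one correction so the empirical p-value is never exactly zero
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (hypothetical gene IDs): 1000 background genes, 50 reference genes
background = [f"G{i}" for i in range(1000)]
reference = background[:50]
enriched_cluster = background[:10] + background[500:510]   # 10 of 20 in reference
null_cluster = background[500:520]                          # 0 of 20 in reference

print(permutation_pvalue(enriched_cluster, reference, background))
print(permutation_pvalue(null_cluster, reference, background))
```

With 128 clusters you would run this once per cluster and then correct the resulting p-values for multiple testing (for example with Benjamini-Hochberg).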

0

I understand that, but how can I use a gold-standard list of high-quality genes? As I said, a list of autism genes is available for download at SFARI, so how can I use it to test the significance of my gene clusters? Could you please elaborate?

5.7 years ago by naseerkhan861 ▴ 10

2

To test any form of significance (enrichment analysis), IMO you need to create a contingency table: https://en.wikipedia.org/wiki/Contingency_table . Once you have it, you're golden. However, the table depends entirely on the scientific question you want to answer, and it is up to you how you build it. I am answering the 2nd question, btw; for the 1st one (significance of the clusters) there is no well-defined answer. I think you can extract the relevant bits of information from the methods of this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5802446/
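For the SFARI case, one possible contingency table is "gene in cluster or not" versus "gene in SFARI list or not", counted over the background genes. The sketch below builds that 2x2 table and computes a one-sided hypergeometric p-value (the same quantity as a one-sided Fisher's exact test). The function names and the toy gene IDs are illustrative assumptions, and the choice of background is up to you.

```python
from math import comb

def contingency_table(cluster, reference, background):
    """2x2 counts: (in both, cluster only, reference only, in neither)."""
    background = set(background)
    cluster = set(cluster) & background
    reference = set(reference) & background
    a = len(cluster & reference)           # in cluster and in reference list
    b = len(cluster - reference)           # in cluster, not in reference list
    c = len(reference - cluster)           # in reference list, not in cluster
    d = len(background) - a - b - c        # in neither
    return a, b, c, d

def enrichment_pvalue(a, b, c, d):
    """P(overlap >= a) when the cluster is a random draw from the background."""
    n = a + b + c + d      # background size
    n_ref = a + c          # reference genes in the background
    k = a + b              # cluster size
    upper = min(n_ref, k)  # largest possible overlap
    tail = sum(comb(n_ref, x) * comb(n - n_ref, k - x) for x in range(a, upper + 1))
    return tail / comb(n, k)

# Toy data (hypothetical gene IDs): 20 background genes, 5 reference genes,
# and a cluster of 4 genes of which 3 are in the reference list
background = [f"G{i}" for i in range(20)]
reference = background[:5]
cluster = ["G0", "G1", "G2", "G10"]

table = contingency_table(cluster, reference, background)
print(table, enrichment_pvalue(*table))
```

A different scientific question (for example, restricting the reference list to the top SFARI score categories) changes only how the table is filled in, not the test itself.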

5.7 years ago by German.M.Demidov ★ 3.0k