Hello friends,
I have a question about the gene ontology tool DAVID. In DAVID it is possible to create a sublist after a first round of functional annotation. The sublist is created by the user: for example, if I tick the checkbox for the GO term Extracellular Matrix, I can create a sublist from that term (or from several terms selected in the same way). My term has p < 0.05 but a corrected p-value (Benjamini) of 0.1 (not significant). If I now select my sublist and perform a second round of analysis (the background stays the same), I get significant values (let's say p < 0.000005 and Benjamini p < 0.0001). In a third round of analysis, after creating a third sublist with the same (!) terms, the values do not change. Do you think this is correct? Is this a valid way to perform the analysis? Is there a justification or explanation for it?
Thank you very much,
kind regards!
Martin
The hypergeometric test can definitely be tricky.
What is the function of sublists? A sublist represents features (e.g. genes) that share a particular concept. The basic idea behind them is to help you better understand your list of significantly regulated genes.
One can also do a focused analysis of the genes belonging to a particular concept, for example by looking specifically at those genes across your different conditions using a heatmap coupled with hierarchical clustering.
A 5% FDR is the conventional significance threshold. But p-values are not the only statistical measure of interest, and they are highly sensitive to sample size. To conclude, this is only one step in your analysis: it helps build confidence in your results, but you will have to consider the context and think about what biological validation you should perform to make your point.
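To make the sublist effect concrete, here is a minimal sketch of a plain hypergeometric enrichment test in Python. All counts (background size, list size, number of annotated genes) are made-up illustration values, not numbers from the question, and a plain hypergeometric tail is not necessarily the exact statistic DAVID reports. The point is only that a sublist built from a term consists entirely of genes carrying that term, so testing it against the unchanged background will look extremely enriched, and repeating the step with the same genes gives the same inputs and therefore the same p-value.

```python
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when drawing n genes from a background of N genes,
    K of which are annotated with the term (upper hypergeometric tail)."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical numbers, chosen only for illustration.
background_size = 20000   # genes in the background
background_hits = 400     # background genes annotated with the term
list_size = 500           # genes in the original list
list_hits = 17            # genes in the original list annotated with the term

# Round 1: the full gene list against the background.
p_full = hypergeom_tail(list_hits, background_size, background_hits, list_size)

# Round 2: the sublist contains only the annotated genes, so every gene is a hit.
p_sub = hypergeom_tail(list_hits, background_size, background_hits, list_hits)

# Round 3: a sublist of the sublist with the same term is the same set of genes.
p_third = hypergeom_tail(list_hits, background_size, background_hits, list_hits)

print(f"full list: p = {p_full:.3g}")
print(f"sublist:   p = {p_sub:.3g}")
print(f"3rd round: p = {p_third:.3g}")
```

Under these assumptions the second-round value is tiny because the genes were selected for having the term in the first place, so it should not be read as independent evidence of enrichment.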