Dataset normalization before gene ontology analysis
9.1 years ago
tiago211287 ★ 1.5k

I performed a GO analysis on lists of genes using the goseq package from Bioconductor. After plotting the results, I could see that the bigger the gene list was, the more counts it had in each category. Is there a way to normalize this by the size of each gene list?

Gene ontology Normalization RNA-Seq bioconductor • 2.0k views
9.1 years ago
svlachavas ▴ 790

Dear Tiago211287,

I believe you get this result when plotting because, in RNA-seq, the length of a gene strongly affects its read count, which in turn affects the power to detect it as differentially expressed. One way to adjust for this when performing a GO analysis with RNA-seq data is to first call the function nullp (see ?nullp):

nullp(DEgenes, genome, id, bias.data=NULL,plot.fit=TRUE)

This produces a set of relative weights (a probability weighting function) that depend on the length of each input gene.

Then you can pass the result directly to goseq().
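
The two steps above can be sketched as follows. This is a minimal, offline illustration and not the original poster's analysis: the gene names, lengths, DE calls and the 20 categories are all simulated, and bias.data and gene2cat are supplied by hand so that no genome annotation has to be downloaded. The frac_of_de_list column is only one possible way to relate counts to list size; it is an assumption about what is wanted, not something goseq computes.

```r
library(goseq)

set.seed(1)
n <- 2000
genes <- paste0("gene", seq_len(n))          # hypothetical gene IDs

# Simulated gene lengths (bp); longer genes are made more likely to be DE.
len <- round(exp(rnorm(n, mean = 7.5, sd = 0.8)))
names(len) <- genes
de <- rbinom(n, 1, plogis(-3 + 0.8 * (log(len) - 7.5)))
names(de) <- genes                           # named 0/1 vector, 1 = DE

# Simulated gene-to-category mapping: two distinct categories per gene.
cats <- paste0("cat", 1:20)
gene2cat <- lapply(genes, function(g) sample(cats, 2))
names(gene2cat) <- genes

# Step 1: fit the probability weighting function from the length bias.
pwf <- nullp(de, bias.data = len, plot.fit = FALSE)

# Step 2: length-corrected enrichment test using the fitted PWF.
res <- goseq(pwf, gene2cat = gene2cat)

# Illustrative only: DE count per category as a fraction of the DE list size.
res$frac_of_de_list <- res$numDEInCat / sum(de)
head(res)
```

With real data, genome and id (for example "hg19" and "ensGene") would normally replace the hand-made bias.data and gene2cat, provided the matching annotation is available.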

Hope that helps,
Efstathios


I did that in goseq, generating a PWF (Probability Weighting Function).


Well then, excuse me, I misunderstood your question. Did you mean that you used more than one gene list? If so (and I am not an expert on RNA-seq analysis), why do you want to normalize for the size of each list?
