Question

Gene set size effect on Gene ontology Semantic Similarity score

0


7.9 years ago

ash3m21 • 0

Hello everyone,

My name is Ravi, and I am a doctoral student studying the biological processes involved in human ageing. We recently decided to add a bioinformatic analysis to this work. I am trying to understand how gene set size affects the GO semantic similarity score computed with the R package 'GOSemSim'.

I have a fixed data set containing about 2000 genes, labelled TraitA.

I compute the semantic similarity between TraitA and several other traits, each labelled Trait_Random. A Trait_Random gene set contains anywhere from 10 to 2000 genes.

How does this difference in gene set size affect the score that I get?

Also, if the score is biased by gene set size, is there a statistical method I could use to correct for it?

Any thoughts or inputs on this would be very helpful. Thank you so much for your time.

GO SemanticSimilarity GOSemSim R • 1.9k views


score 1 · Answer 1 · 2017-01-11

1


7.9 years ago

Guangchuang Yu ★ 2.6k

Gene set size should not bias the score. Please refer to the vignette, which describes the calculation in detail.
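For illustration only, here is a rough sketch of a best-match average (BMA), one of the combine strategies documented for GOSemSim. It is written in plain Python rather than R and is not the package's own code. The function name `bma_score` and the toy similarity values are made up for this example. The point it shows is that the summed best matches are divided by the total number of genes in both sets, so the score is an average rather than a sum that grows with set size.

```python
def bma_score(sim):
    """Best-match average of a pairwise similarity matrix.

    `sim` is a list of rows: sim[i][j] is the similarity between
    gene i of set A and gene j of set B (hypothetical input layout).
    """
    if not sim or not sim[0]:
        raise ValueError("similarity matrix must be non-empty")
    # Best match in set B for every gene in set A (row maxima).
    row_best = [max(row) for row in sim]
    # Best match in set A for every gene in set B (column maxima).
    col_best = [max(col) for col in zip(*sim)]
    # Normalise by the combined size of both sets.
    return (sum(row_best) + sum(col_best)) / (len(row_best) + len(col_best))


# Made-up similarities between 2 genes in set A and 3 genes in set B.
toy = [
    [0.9, 0.2, 0.4],
    [0.1, 0.8, 0.3],
]
score = bma_score(toy)  # roughly 0.76 for these toy values
```

To check this empirically on your own data, one option is to draw random gene sets of the same size as each Trait_Random set and compare the observed score against that size-matched distribution.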
