Question

Any suggestions for combining different types of scores?

1

10.1 years ago

Na Sed ▴ 310

I am investigating the similarity between a reference gene list, A, and another gene list, B.

I measure the similarity between them in several respects, such as their GO similarity, their DO similarity, the number of publications that report them in a specific disease, etc.

I'd like to find a way to integrate all of these scores into a single quantity.

Do you have any suggestions for methods to combine them, or have you seen papers that do this? I would appreciate any help.

GO similarity DO similarity • 2.1k views

updated 10.1 years ago by Jean-Karim Heriche 27k • written 10.1 years ago by Na Sed ▴ 310

2

I would advise against collapsing different quantities into a single score. As attractive as it may sound, the potential for pitfalls and misleading results is just as high.

10.1 years ago by Istvan Albert 103k

1

Indeed, you can't just take any set of scores, combine them, and expect to still get a valid measure of similarity. For example, if one of the scoring functions is not symmetric, i.e. S(a,b) != S(b,a), then a combination that includes it is not guaranteed to be interpretable as a similarity measure. That said, data integration of this kind using kernels already has a long history in bioinformatics.

10.1 years ago by Jean-Karim Heriche 27k
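
The symmetry condition mentioned in the comment above can be checked numerically before any scores are combined. The following is a minimal sketch, assuming each score is available as a square NumPy matrix over the same set of items; the function names `is_symmetric` and `is_valid_kernel` and the toy matrices are hypothetical and not taken from the thread. A valid kernel matrix must be symmetric and positive semidefinite, so the second function also checks the eigenvalues.

```python
import numpy as np


def is_symmetric(S, tol=1e-8):
    """Return True if S is a square matrix with S[a, b] == S[b, a] (within tol)."""
    S = np.asarray(S, dtype=float)
    if S.ndim != 2 or S.shape[0] != S.shape[1]:
        return False
    return bool(np.allclose(S, S.T, atol=tol))


def is_valid_kernel(S, tol=1e-8):
    """Return True if S is symmetric and positive semidefinite (within tol)."""
    if not is_symmetric(S, tol):
        return False
    # eigvalsh is meant for symmetric matrices and returns real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(np.asarray(S, dtype=float))
    return bool(eigenvalues.min() >= -tol)


# Toy score matrices (made-up numbers, for illustration only).
symmetric_scores = np.array([[1.0, 0.4],
                             [0.4, 1.0]])
asymmetric_scores = np.array([[1.0, 0.4],
                              [0.7, 1.0]])   # S(a,b) != S(b,a)
indefinite_scores = np.array([[1.0, 2.0],
                              [2.0, 1.0]])   # symmetric, but has a negative eigenvalue

for name, S in [("symmetric", symmetric_scores),
                ("asymmetric", asymmetric_scores),
                ("indefinite", indefinite_scores)]:
    print(name, is_symmetric(S), is_valid_kernel(S))
```

Symmetry alone is not sufficient: the third toy matrix is symmetric but still fails the kernel check.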

score 1 · Answer 1 · 2015-07-23

1

10.1 years ago

Jean-Karim Heriche 27k

If your similarity measures are valid kernel functions, then their (weighted) average, with non-negative weights, is still a similarity measure because it is still a kernel. Some other combinations, such as products, also produce valid kernels. For an example, see my paper here.

10.1 years ago by Jean-Karim Heriche 27k
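
The weighted-average combination described in the answer above can be sketched in a few lines. This is an illustration under stated assumptions, not the method from the linked paper: it assumes each similarity measure has already been computed as a valid kernel matrix over the same items in the same order, and the names `combine_kernels`, `go_kernel`, and `do_kernel`, along with all numbers, are hypothetical.

```python
import numpy as np


def combine_kernels(kernels, weights=None):
    """Weighted average of kernel matrices using non-negative weights.

    A non-negative weighted sum of positive semidefinite matrices is itself
    positive semidefinite, so the result is still a valid kernel.
    """
    kernels = [np.asarray(K, dtype=float) for K in kernels]
    if weights is None:
        weights = np.ones(len(kernels))      # default: plain average
    weights = np.asarray(weights, dtype=float)
    if len(weights) != len(kernels):
        raise ValueError("need exactly one weight per kernel")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()        # normalize so the weights sum to 1
    return sum(w * K for w, K in zip(weights, kernels))


# Toy kernel matrices over three items (made-up numbers, for illustration only).
go_kernel = np.array([[1.0, 0.8, 0.1],
                      [0.8, 1.0, 0.2],
                      [0.1, 0.2, 1.0]])
do_kernel = np.array([[1.0, 0.5, 0.3],
                      [0.5, 1.0, 0.4],
                      [0.3, 0.4, 1.0]])

combined = combine_kernels([go_kernel, do_kernel], weights=[0.75, 0.25])
print(combined)
```

How to choose the weights is a separate question; equal weights are a common starting point when there is no reason to trust one source more than another. Scores on very different scales, such as publication counts, would need to be turned into a kernel and rescaled before averaging, otherwise one source dominates the result.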