Bioinformatics Missing Data Set for the EM Algorithm
13.0 years ago
Haluk ▴ 190

Hi,

I would like to test and verify the EM (Expectation-Maximization) algorithm on a given data set. Basically, I am trying to estimate missing data using the EM algorithm. Microarray data could be a good data set, but I have no idea how to approach the verification phase. Also, this is for a weekly seminar, so a small data set would be great.

Can someone recommend a data set for this task?

Thanks.


You might take a look at stats.stackexchange.com; there are many more machine learning people there than on Biostar.

13.0 years ago
Neilfws 49k

You might get some ideas from this excellent introductory article:

What is the expectation maximization algorithm?

It covers the application of EM to several topics in bioinformatics and provides references to studies that you may be able to replicate for your seminar. For example, using microarray data:

Uncovering Gene Regulatory Networks from Time-Series Microarray Data with Variational Bayesian Structural Expectation Maximization (PDF)


They used spline interpolation to get rid of missing values.

13.0 years ago
Fabian Bull ★ 1.3k

The only verification of the EM algorithm I know of is this: the likelihood should never decrease from one iteration to the next. If it does, your implementation is wrong.

There can be no "this is the right missing value" test, because EM only converges to a local optimum that depends on its starting values. Run it several hundred times from random starting points, plot the likelihood, and take the missing values from the run that had the highest likelihood.
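For a seminar-sized check, the likelihood test can be sketched in plain Python. This is only a minimal illustration under assumptions that are not from the thread: the data are simulated (not microarray data), each row is a pair (x, y) from a bivariate normal, x is always observed, and some y values are hidden. The names make_demo_data, em_impute, log_lik and rmse_on_missing are hypothetical helpers made up for this example. Because the hidden values are known in a simulation, you can also compare the imputed values with the truth, which gives a second sanity check in addition to the likelihood.

```python
import math
import random


def make_demo_data(n=200, frac_missing=0.3, seed=0):
    """Simulate (x, y) pairs and hide a fraction of the y values (assumed toy setup)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys_true = [2.0 + 1.5 * x + rng.gauss(0.0, 0.5) for x in xs]
    ys_obs = [None if rng.random() < frac_missing else y for y in ys_true]
    return xs, ys_true, ys_obs


def log_lik(xs, ys, mx, my, sxx, sxy, syy):
    """Observed-data log-likelihood; ys[i] is None where y is missing."""
    det = sxx * syy - sxy * sxy
    total = 0.0
    for x, y in zip(xs, ys):
        dx = x - mx
        if y is None:
            # Only x is observed: use the marginal N(mx, sxx).
            total += -0.5 * (math.log(2 * math.pi * sxx) + dx * dx / sxx)
        else:
            # Both observed: use the full bivariate normal density.
            dy = y - my
            quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
            total += -0.5 * (2 * math.log(2 * math.pi) + math.log(det) + quad)
    return total


def em_impute(xs, ys, n_iter=50, init=(0.0, 0.0, 1.0, 0.0, 1.0)):
    """Run EM; return (imputed y values, log-likelihood per iteration)."""
    n = len(xs)
    mx, my, sxx, sxy, syy = init  # deliberately crude default start
    history = []
    for _ in range(n_iter):
        history.append(log_lik(xs, ys, mx, my, sxx, sxy, syy))
        # E-step: expected y and y^2 for the rows where y is missing.
        beta = sxy / sxx
        resid_var = syy - sxy * sxy / sxx
        ey = [y if y is not None else my + beta * (x - mx)
              for x, y in zip(xs, ys)]
        ey2 = [y * y if y is not None else e * e + resid_var
               for y, e in zip(ys, ey)]
        # M-step: maximum-likelihood moments from the expected statistics.
        mx = sum(xs) / n
        my = sum(ey) / n
        sxx = sum((x - mx) ** 2 for x in xs) / n
        sxy = sum((x - mx) * (e - my) for x, e in zip(xs, ey)) / n
        syy = sum(ey2) / n - my * my
    history.append(log_lik(xs, ys, mx, my, sxx, sxy, syy))
    imputed = [y if y is not None else my + (sxy / sxx) * (x - mx)
               for x, y in zip(xs, ys)]
    return imputed, history


def rmse_on_missing(ys_true, ys_obs, imputed):
    """Root-mean-square error over the rows whose y value was hidden."""
    errs = [(t - i) ** 2
            for t, o, i in zip(ys_true, ys_obs, imputed) if o is None]
    return math.sqrt(sum(errs) / len(errs))


if __name__ == "__main__":
    xs, ys_true, ys_obs = make_demo_data()
    imputed, history = em_impute(xs, ys_obs)
    # The check from the answer: the log-likelihood must never go down.
    monotone = all(b >= a - 1e-6 for a, b in zip(history, history[1:]))
    print("log-likelihood never decreased:", monotone)
    print("RMSE on hidden values:", rmse_on_missing(ys_true, ys_obs, imputed))
```

To try several starting points, call em_impute with different init tuples (mean of x, mean of y, variance of x, covariance, variance of y) and keep the run whose last history entry is highest. In this simple model the runs should end up at about the same likelihood; restarts matter more for models such as mixtures, where EM can stop in different local optima.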
