I am using a yeast microarray dataset from SGD. A sample of the dataset is available at this link. The dataset contains a large number of missing values, and I need to compute the similarity between genes. If I remove every gene row that has an NA in any sample column, the number of genes drops to about half of the total.
How can I handle this large amount of missing data when measuring the similarity between genes in R? What is the standard approach?
I used a yeast microarray dataset compiled from a variety of expression experiments. It provides expression profiles for yeast carrying out a variety of cellular programs and responding to a variety of applied stimuli. A sample of the dataset is available at this link.
If you don't want to bring in complementary data, you have two options: impute the missing values, or ignore them when computing the similarity. You may find this review useful. A third approach is to use a downstream analysis method that can handle missing values itself.
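As a rough illustration of the two options in base R, here is a minimal sketch on a made-up matrix (rows are genes, columns are samples). The toy data, the `min_shared` threshold, and the helper `impute_row_mean` are assumptions for the example, not part of your dataset or of any package. Row-mean imputation is only the simplest stand-in; a k-nearest-neighbour method such as `impute.knn` from the Bioconductor `impute` package is a common choice for microarray data and may suit you better.

```r
# Toy expression matrix (made-up values): 6 genes x 8 samples.
set.seed(1)
expr <- matrix(rnorm(6 * 8), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:8)))

# Introduce missing values at fixed positions so the example is reproducible.
expr[1, c(2, 5)] <- NA
expr[3, 1:3]     <- NA
expr[4, c(4, 8)] <- NA
expr[6, 6:8]     <- NA

# Option 1: ignore missing values pair by pair.
# cor() correlates columns, so transpose to compare genes.
sim_pairwise <- cor(t(expr), use = "pairwise.complete.obs")

# Count how many samples each gene pair has in common, and blank out
# correlations that rest on too few shared samples.
present <- !is.na(expr)
storage.mode(present) <- "numeric"
n_shared <- present %*% t(present)
min_shared <- 4  # hypothetical threshold, tune for your data
sim_pairwise[n_shared < min_shared] <- NA

# Option 2: impute first, then correlate.
# Hypothetical helper: replace each NA with the mean of its gene (row).
impute_row_mean <- function(m) {
  row_means <- rowMeans(m, na.rm = TRUE)
  idx <- which(is.na(m), arr.ind = TRUE)
  m[idx] <- row_means[idx[, "row"]]
  m
}
expr_imputed <- impute_row_mean(expr)
sim_imputed  <- cor(t(expr_imputed))
```

With the pairwise option every gene is kept, but each correlation is based on a different subset of samples, so it is worth inspecting `n_shared` before trusting a value. Imputation gives a complete matrix that any downstream method accepts, at the cost of shrinking the variance of genes with many imputed entries.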