I have RNA-seq and miRNA-seq data, so the two datasets are of different types. I'm interested in finding the mutual information between an miRNA and a protein-coding gene. I have already normalized both datasets. Is it possible to calculate the mutual information when the data are of different types, the vectors are continuous, and the vectors have different lengths? For example, my miRNA vector is of size 100 and my protein-coding gene vector is of size 120...
Yes, it is possible ... as long as you do it correctly ;-)
Damian's link is a helpful read, and there is apparently an updated paper describing a newer version of that resource (MAGIA(2)), here. A Google search for "mirna gene mutual information" also returns lots of hits, which may be helpful to dig through.
The red flag in your question is that you say your miRNA expression vector and gene (aka potential target) expression vector are of different lengths, which shouldn't be the case. For this to make any sense, the expression of both the miRNA and the potential target gene has to be measured in the same samples. If you think about mutual information as a generalized type of correlation, perhaps it's clearer why that has to be the case -- you need measurements of two different "things" (genes) taken from a number of conditions (tissues, perturbations, whatever), but both genes have to be observed in the same conditions, so that the i-th value in one vector pairs with the i-th value in the other.
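To make the "same samples" point concrete, here is a minimal Python sketch of a simple histogram (plug-in) mutual information estimate for two continuous vectors. This is only an illustration under my own assumptions, not the method used by MAGIA or any particular paper: the function names (`equal_width_bins`, `mutual_information`), the choice of 5 equal-width bins, and the simulated expression values are all made up for the example. The estimate is built from the joint counts of (miRNA bin, gene bin) pairs, which only exist if the two vectors are paired sample by sample, so the function refuses vectors of different lengths.

```python
import math
import random
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each continuous value to an integer bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant vector: everything falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=5):
    """Histogram (plug-in) estimate of mutual information, in nats.

    x[i] and y[i] must come from the same sample; the bin count is an
    arbitrary choice made for this illustration.
    """
    if len(x) != len(y):
        raise ValueError(
            "x and y must be measured in the same samples "
            "(equal length, same order)"
        )
    n = len(x)
    bx = equal_width_bins(x, n_bins)
    by = equal_width_bins(y, n_bins)
    joint = Counter(zip(bx, by))  # counts of paired (x bin, y bin)
    px = Counter(bx)              # marginal counts for x
    py = Counter(by)              # marginal counts for y
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi


# Simulated (hypothetical) data: 100 samples with both measurements.
random.seed(0)
mirna = [random.gauss(0, 1) for _ in range(100)]
target = [-0.8 * m + random.gauss(0, 0.5) for m in mirna]  # repressed target
unrelated = [random.gauss(0, 1) for _ in range(100)]        # independent gene

print("miRNA vs target:   ", mutual_information(mirna, target))
print("miRNA vs unrelated:", mutual_information(mirna, unrelated))
```

Calling `mutual_information` with a vector of 100 values and a vector of 120 values raises a `ValueError`, because there is no way to form the pairs. Keep in mind that a histogram estimate like this depends on the bin count and is biased upward for small sample sizes, so treat the numbers as relative rather than absolute.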
Thank you very much for your response. Unfortunately, the expression of the miRNA and of the potential target gene was measured in different samples (e.g., different patients) across many different classes (e.g., types of cancer). In other words, I have two matrices, one for miRNAs and one for target genes. Both cover the same types of cancer, but the two matrices were obtained from different patients. Would this be sufficient to apply MI?
This paper might be relevant: http://nar.oxfordjournals.org/content/38/suppl_2/W352.full