I have immunofluorescence (IF)-measured protein expression for each of my samples (6 samples from two groups, 3 replicates each) and RNA-seq counts for each sample.
1) How can I correlate the IF value with RNA expression for a given gene? Is the correlation coefficient between protein expression and CPM-normalized reads for the gene of interest the right way to correlate these data?
2) Can I do the same with all other genes to find out which genes are closely related to protein expression?
Here is what I did to answer questions 1 and 2:
1) IF values = list of protein expression values in the 6 samples. These are just 6 numeric values:
IF_values = c(2.0, 3.99, 44.9, 50.1, 34.2, 1.0, 23.1)
(These values are not normalized to anything; they are absolute counts as obtained from IF.)
RNA_coutns_ABCD = c(334.0, 56.7, 34.4, 33.3, 22.2, 100.5)
i.e. 6 data points, one CPM-normalized RNA-seq value per sample.
Then I compute the correlation in R:
ABCD_coef <- cor(IF_values, RNA_coutns_ABCD)
Can this value be used to infer the correlation between protein expression and RNA expression? The numbers are for example purposes only and are not real.
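A minimal sketch of the single-gene computation described above. It assumes the first six IF values from the example (the seventh is dropped so that both vectors have one entry per sample) and adds a Spearman coefficient and `cor.test` as optional extras that are not part of the original approach.

```r
# Illustrative values only (the post states the numbers are not real).
# Assumption: only the first six IF values are used, one per sample.
IF_values <- c(2.0, 3.99, 44.9, 50.1, 34.2, 1.0)
RNA_coutns_ABCD <- c(334.0, 56.7, 34.4, 33.3, 22.2, 100.5)

# cor() needs one value per sample in both vectors
stopifnot(length(IF_values) == length(RNA_coutns_ABCD))

# Pearson correlation (the default method of cor)
ABCD_coef <- cor(IF_values, RNA_coutns_ABCD)

# Spearman rank correlation, which is less sensitive to single extreme values
ABCD_rho <- cor(IF_values, RNA_coutns_ABCD, method = "spearman")

# cor.test() additionally returns a p-value and, for Pearson, a confidence interval
ABCD_test <- cor.test(IF_values, RNA_coutns_ABCD, method = "pearson")

print(ABCD_coef)
print(ABCD_rho)
print(ABCD_test$p.value)
```

With only 6 samples, the confidence interval reported by `cor.test` will usually be wide, so the coefficient alone should probably be read with caution.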
2) Compute the correlation of every gene (~17K) in the RNA-seq count list with the IF values.
The genes with the highest correlation values (taking +ve only) are shortlisted as top genes.
Is this the right approach to compare protein expression values to RNA seq?
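The genome-wide step in 2) could be sketched roughly as follows. Everything here is an assumption for illustration: `cpm_mat` is a hypothetical genes-by-samples CPM matrix (simulated so the example runs), the IF values are the first six from the example, and the p-value adjustment is an extra that is not part of the original approach.

```r
# Hypothetical CPM matrix: rows are genes, columns are the 6 samples.
# Simulated here only so the sketch runs; real values would come from the RNA-seq data.
set.seed(1)
n_genes <- 100
cpm_mat <- matrix(rexp(n_genes * 6, rate = 0.01), nrow = n_genes,
                  dimnames = list(paste0("gene", seq_len(n_genes)),
                                  paste0("sample", 1:6)))

# Illustrative IF values, one per sample (same column order as cpm_mat)
IF_values <- c(2.0, 3.99, 44.9, 50.1, 34.2, 1.0)

# Drop genes with zero variance; their correlation is undefined (NA)
keep <- apply(cpm_mat, 1, sd) > 0
cpm_mat <- cpm_mat[keep, , drop = FALSE]

# Correlate every gene with the IF values in one call:
# cor() works column-wise, so the matrix is transposed to samples x genes
gene_cor <- cor(t(cpm_mat), IF_values)[, 1]

# Per-gene p-values, then Benjamini-Hochberg adjustment for testing many genes
gene_p <- apply(cpm_mat, 1, function(x) cor.test(x, IF_values)$p.value)
gene_padj <- p.adjust(gene_p, method = "BH")

# Table sorted by correlation, highest positive values first
results <- data.frame(gene = names(gene_cor), cor = gene_cor,
                      p = gene_p, padj = gene_padj)
results <- results[order(results$cor, decreasing = TRUE), ]
head(results, 10)
```

With 6 samples and ~17K genes, some genes will reach a high correlation by chance alone, which is why the sketch keeps an adjusted p-value next to each coefficient instead of ranking on the coefficient only.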
I don't have an answer to your question.
Out of curiosity, is +ve meant to be positive? If so, how did you come up with that abbreviation? It is not terribly intuitive to me and I had to think for a solid 10 seconds to understand from the context what you meant.
Standard (mathematical) abbreviation: positive (+ve) and negative (-ve)
I have seen all kinds of standard and non-standard abbreviations - and figured out in 2 seconds the other day what YMMV meant - but this is the first time I have seen either +ve or -ve. Good to know.