Hi,
I have a gene expression dataset, and I want to find all the genes that are highly correlated with a given gene, i.e. that have the same expression pattern.
For example, if there were an R package for this, it would take the expression data matrix as one input and the gene of interest as another, and it would return a list of genes that have the same expression pattern or are highly correlated.
I know that I could write a function in R for this and filter the genes by their correlation values, keeping those with a cor value above 0.5. But it takes a very long time when the data has more than 20,000 rows and 1,000 columns.
Could you share your code so we can see why it is so slow? I tried this on my computer with a toy example (20,000 rows and 1,000 columns) and it took only a few seconds. You should be able to do this in R without any problem.
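For reference, here is a minimal sketch of one way to do it, not necessarily the code used for the timing above. It assumes genes are in rows and samples in columns. The simulated matrix, the gene names and the helper `correlated_genes()` are made up for illustration. The key point is to call `cor()` once on the whole matrix instead of looping over genes.

```r
set.seed(1)

# Simulated toy data (an assumption, not real expression values):
# genes in rows, samples in columns. Increase the sizes to test timing.
n_genes <- 2000
n_samples <- 200
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("sample", seq_len(n_samples))))

# Plant one gene that follows gene1 so there is something to find.
expr["gene2", ] <- expr["gene1", ] + rnorm(n_samples, sd = 0.5)

# Hypothetical helper: Pearson correlation of every gene with one gene.
correlated_genes <- function(expr, gene, cutoff = 0.5) {
  stopifnot(gene %in% rownames(expr))
  # cor() correlates columns, so transpose to put genes in columns.
  r <- cor(t(expr), expr[gene, ])[, 1]
  r <- r[names(r) != gene]          # drop the query gene itself
  sort(r[r > cutoff], decreasing = TRUE)
}

hits <- correlated_genes(expr, "gene1", cutoff = 0.5)
print(hits)
```

The cutoff of 0.5 is the one from the question. If anti-correlated genes are also of interest, filter on `abs(r)` instead.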
Deepak, could you help me as well?
Suppose I have already downloaded GSE63706 and normalized it, so I now have a normalized text file. I also have a list of probesets of interest from this array (a text file). I want a heat map showing the expression pattern of those probesets across the array. For example, this array has 4 varieties, different tissues (rind and flesh), and phases (0, 10, 20, 30, 40 and 50 days after harvesting).
A heatmap is not a problem at all. There is an R package called pheatmap, and it offers an easy way to show the groups you mention alongside the heatmap. These are called heatmap annotations. Check this out: How Do I Draw A Heatmap In R With Both A Color Key And Multiple Color Side Bars?
There is example code there as well. You could use any heatmap package; there are many: pheatmap, heatmap, heatmap3, heatmap.3, heatmap.2, whichever you prefer. Read the Biostars thread linked above carefully, try out the example code, and then adapt it to your data.
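As a rough sketch of what annotations look like with pheatmap, the example below uses simulated values rather than the real GSE63706 data. It assumes one sample per variety/tissue/phase combination. The variety names (V1 to V4), the probe names and the sample names are made up for illustration. The only requirement is that the row names of the annotation data frame match the column names of the matrix.

```r
set.seed(1)

# Assumed sample layout: 4 varieties x 2 tissues x 6 phases, no replicates.
samples <- expand.grid(phase = c(0, 10, 20, 30, 40, 50),
                       tissue = c("rind", "flesh"),
                       variety = paste0("V", 1:4))

# One row per sample; row names must match the matrix column names.
annotation_col <- data.frame(
  variety = samples$variety,
  tissue = samples$tissue,
  phase = factor(samples$phase),
  row.names = paste(samples$variety, samples$tissue, samples$phase, sep = "_")
)

# Simulated normalized expression (placeholder for the normalized text file).
expr <- matrix(rnorm(30 * nrow(annotation_col)), nrow = 30,
               dimnames = list(paste0("probe", 1:30),
                               rownames(annotation_col)))

# Placeholder for the list of probesets of interest.
probes_of_interest <- paste0("probe", 1:10)
mat <- expr[rownames(expr) %in% probes_of_interest, , drop = FALSE]

# Draw the heatmap only if pheatmap is installed.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(mat,
                     scale = "row",                  # z-score each probeset
                     annotation_col = annotation_col,
                     cluster_cols = FALSE)           # keep the sample order
}
```

With real data, `expr` would come from the normalized text file and `probes_of_interest` from the probeset list, and `annotation_col` would be built from the actual sample descriptions of the series.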
Thank you very much