Hi all,
I'm new to R and have spent days trawling the internet, trying to find out how to make a heatmap that best displays differential gene expression in some RNA-seq data I have.
I pretty much understand the R code to use with pheatmap now, but I am struggling with how I should process my data before calling the heatmap function, since pheatmap gives me a very odd-looking heatmap at the moment.
I have raw RNA-seq read counts for 719 genes from a gene knockout (KO) with 3 replicates, as well as 3 replicates of a control condition where no gene was knocked out.
My data is currently in an Excel sheet, with gene names in the rows and the six samples (3 knockouts and 3 controls) in the columns, e.g. Control1 Control2 Control3 KO1 KO2 KO3, each holding the raw read counts for every gene. So I have 719 genes with 6 columns of RNA-seq counts per gene. I'm not sure how best to represent this data as a heatmap. Any help with R code would be greatly appreciated.
This is what I have tried. I loaded the relevant packages into R (dplyr, readr, pheatmap, viridis, gplots). I saved my data as a .txt file and read it into R with: data <- read.table("data.txt", header=TRUE, fill=TRUE) (the only way I could get the file to read in was by also including fill=TRUE). I made the data into a matrix, eliminating the gene name column: data_matrix <- as.matrix(data[, 2:7]). I then tried pheatmap(data_matrix), which gave me a heatmap that was pretty much all blue, with no other colours.
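For reference, the import steps described above could be sketched roughly as follows. This is only an illustration under assumptions: the toy table stands in for data.txt, and the tab-separated layout and the column name Gene are made up for the example. Needing fill=TRUE often suggests that rows do not all split into the same number of fields (for instance, spaces inside gene names when splitting on whitespace), so setting sep explicitly may be worth trying. Keeping the gene names as row names, rather than dropping them, lets the heatmap label its rows.

```r
# Hypothetical stand-in for data.txt: a tiny tab-separated count table.
tmp <- tempfile(fileext = ".txt")
example <- data.frame(
  Gene     = c("GeneA", "GeneB", "GeneC"),
  Control1 = c(10, 200, 0),
  Control2 = c(12, 180, 1),
  Control3 = c(9, 220, 0),
  KO1      = c(50, 20, 3),
  KO2      = c(45, 25, 2),
  KO3      = c(60, 15, 4)
)
write.table(example, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# sep = "\t" is an assumption: it presumes the Excel sheet was exported
# as tab-delimited text, so every row splits into the same 7 fields.
data <- read.table(tmp, header = TRUE, sep = "\t")

# Columns 2 to 7 hold the counts; column 1 holds the gene names,
# which are kept as row names instead of being discarded.
data_matrix <- as.matrix(data[, 2:7])
rownames(data_matrix) <- as.character(data[[1]])
```

With the real file, tmp would be replaced by the path to data.txt.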
I think I probably should have processed the data beforehand, e.g. normalised or scaled it, but I'm really not sure how.
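One common reason for an almost single-colour heatmap of raw counts is that a few very highly expressed genes stretch the colour scale. A minimal sketch of one possible preprocessing route is shown here: scale each sample to counts per million, take log2, then convert each gene to a z-score across samples. The toy matrix is made up for illustration. Computing library sizes from colSums assumes the matrix contains all genes, which may not hold if the 719 genes are a filtered subset. Dedicated packages such as DESeq2 or edgeR are the more usual choice for normalisation.

```r
# Toy count matrix (hypothetical values): genes in rows, samples in columns.
counts <- matrix(
  c( 10,  12,   9,  50,  45,  60,
    200, 180, 220,  20,  25,  15,
      0,   1,   0,   3,   2,   4,
    500, 480, 510, 490, 505, 495),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("GeneA", "GeneB", "GeneC", "GeneD"),
    c("Control1", "Control2", "Control3", "KO1", "KO2", "KO3")
  )
)

# Library-size normalisation: counts per million within each sample.
# Assumption: column sums approximate the true library sizes.
cpm <- t(t(counts) / colSums(counts)) * 1e6

# Log transform with a pseudocount of 1 so zero counts stay finite.
log_cpm <- log2(cpm + 1)

# Drop genes that do not vary, since row scaling would divide by zero.
keep <- apply(log_cpm, 1, sd) > 0
log_cpm <- log_cpm[keep, , drop = FALSE]

# Row z-scores: each gene gets mean 0 and standard deviation 1 across samples.
z <- t(scale(t(log_cpm)))

# Draw the heatmap only if pheatmap is installed.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(z)
}
```

An alternative with the same intent is to pass log_cpm directly and let pheatmap do the row scaling via pheatmap(log_cpm, scale = "row").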
@Kevin has some links in his answer here: A: How can I generate Heat Map with dendograms, and PCA analysis in "R Programming"