I'm a beginner in bioinformatics and currently working on RNA-seq analysis. I've run DESeq2 to identify differentially expressed (DE) genes across several conditions, looping over every pair of conditions and collecting the significant genes from each comparison. This is the code I used:
library(DESeq2)
library(dplyr)

lev <- levels(condition)
total_selec <- character(0)
# Loop over every unordered pair of conditions
for (a in seq_len(length(lev) - 1)) {
  for (b in (a + 1):length(lev)) {
    res <- results(dds, contrast = c("condition", lev[a], lev[b]))
    d <- paste(lev[a], lev[b], sep = "&")  # label for this comparison (not used further here)
    resul <- as.data.frame(res)
    resul$genenames <- rownames(resul)
    # Keep genes with padj < 0.05 and |log2FoldChange| > 1
    significantDE <- resul %>% filter(padj < 0.05 & abs(log2FoldChange) > 1)
    total_selec <- c(total_selec, significantDE$genenames)
  }
}
# Union of significant genes across all comparisons
total_selec <- unique(total_selec)
# resdata (defined elsewhere) is indexed by gene ID via its row names
selection <- resdata[total_selec, ]
DEgenes <- selection
For clustering, I scaled the data:
DEgenes <- as.matrix(DEgenes)
# Z-score each gene (row) across samples; scale() works on columns, hence the transposes
scaledata <- t(scale(t(DEgenes)))
In short, I selected genes based on specific criteria (adjusted p-value < 0.05 and |log2FoldChange| > 1), scaled the data, and then performed k-means clustering.
I've successfully generated a heatmap using the DEgenes matrix. However, I'm particularly interested in creating a heatmap for a specific subset of genes from this matrix.
My question is, to visualize this subset of genes in a heatmap, should I extract these genes to form a new matrix and then perform the same scaling process on this new matrix before drawing the heatmap?
I'm curious if this is the correct approach when focusing on a specific set of genes in RNA-seq analysis, especially regarding the scaling process.
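To make the question concrete, here is a minimal sketch of the two orders of operation I am comparing. The matrix `toy`, the helper `scale_rows`, and the gene names in `subset_genes` are made up for illustration and are not my real data. As far as I can tell, since the scaling is done per gene (row), both options should give the same values, but I would like to confirm that this reasoning and practice are sound.

```r
# Toy matrix standing in for DEgenes (made-up values, not real data)
set.seed(1)
toy <- matrix(rnorm(24, mean = 10, sd = 2), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))
subset_genes <- c("gene2", "gene5")  # hypothetical genes of interest

# Row-wise z-score, same operation as t(scale(t(DEgenes))) in my code
scale_rows <- function(m) t(scale(t(m)))

# Option A: scale the full matrix, then pull out the subset
option_a <- scale_rows(toy)[subset_genes, , drop = FALSE]

# Option B: pull out the subset first, then scale it
option_b <- scale_rows(toy[subset_genes, , drop = FALSE])

# Compare the values only, ignoring the attributes that scale() attaches
all.equal(option_a, option_b, check.attributes = FALSE)
```

This only covers row-wise scaling as in my code; if the scaling were done per sample (column) instead, I assume the order would matter.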
Thank you.