RNA-seq: scaling the data for a heatmap
8 months ago
maplewj ▴ 20

I'm a beginner in bioinformatics and am currently working on an RNA-seq analysis. I ran DESeq2 to identify differentially expressed (DE) genes across various conditions, looping over all possible pairs of conditions and collecting the significant genes from each contrast.

library(DESeq2)  # results()
library(dplyr)   # %>% and filter()

total_selec <- list()
x <- 1
for (i in levels(condition)) {
  x <- x + 1
  if (x <= length(levels(condition))) {
    # Compare condition i against every later level, so each pair is tested once
    for (j in x:length(levels(condition))) {
      res <- results(dds, contrast = c("condition", i, levels(condition)[j]))
      d <- paste(i, levels(condition)[j], sep = "&")  # label for this pair (not used below)
      res$genenames <- rownames(res)
      resul <- as.data.frame(res)
      # Keep genes with padj < 0.05 and |log2FoldChange| > 1
      significantDE <- resul %>% filter(padj < 0.05 & abs(log2FoldChange) > 1)
      selec <- as.list(significantDE$genenames)
      total_selec <- append(total_selec, selec)
    }
  }
}

# Collapse to unique gene names and pull those rows out of resdata
total_selec <- c(unique(total_selec))
total_selec <- t(as.data.frame(total_selec))
selection <- resdata[total_selec[,1],]
DEgenes <- selection

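For clustering, I scale the data per gene:

DEgenes <- as.matrix(DEgenes)
# scale() works on columns, so transpose, scale, and transpose back to get row-wise z-scores
scaledata <- t(scale(t(DEgenes)))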

With this code I select genes that meet specific criteria (adjusted p-value < 0.05 and log2FoldChange > 1 or < -1), scale the data, and then perform k-means clustering.

I've successfully generated a heatmap using the DEgenes matrix. However, I'm particularly interested in creating a heatmap for a specific subset of genes from this matrix.

My question is, to visualize this subset of genes in a heatmap, should I extract these genes to form a new matrix and then perform the same scaling process on this new matrix before drawing the heatmap?

I'm curious if this is the correct approach when focusing on a specific set of genes in RNA-seq analysis, especially regarding the scaling process.

Thank you.

8 months ago

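Scaling to assess gene expression variance in a heatmap should be done per gene, i.e. if your matrix has genes as rows and samples as columns, then each row should be scaled separately. That is what t(scale(t(DEgenes))) does.

Because each row is scaled using only that gene's own values across the samples, it does not matter whether you subset before or after scaling. So you should not see any change in values between your subset and the original set.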
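To make this concrete, here is a minimal sketch in R using a made-up toy matrix. The names expr, row_scale and subset_genes are hypothetical and not from the original post; only the t(scale(t(...))) idiom is taken from the question. It scales the rows both ways and checks that the results agree.

```r
# Toy expression matrix: genes as rows, samples as columns (made-up values)
set.seed(1)
expr <- matrix(rnorm(6 * 4, mean = 8, sd = 2), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))

# Row-wise z-score, same idiom as in the question
row_scale <- function(m) t(scale(t(m)))

subset_genes <- c("gene2", "gene5")  # hypothetical genes of interest

# Order 1: scale the full matrix, then take the subset
scaled_then_subset <- row_scale(expr)[subset_genes, , drop = FALSE]
# Order 2: take the subset, then scale it
subset_then_scaled <- row_scale(expr[subset_genes, , drop = FALSE])

# Compare values only; scale() leaves extra attributes on one of the objects
same <- isTRUE(all.equal(scaled_then_subset, subset_then_scaled,
                         check.attributes = FALSE))
print(same)
```

Note that this only holds for per-row scaling. If the matrix were scaled per column (per sample), the result would depend on which genes are present, and subsetting first would change the values. Also, a gene with identical values in every sample has zero standard deviation and comes out of scale() as NaN, so such rows would need to be dropped before plotting.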
