Question

RNA-seq scaling the data for heatmap

13 months ago

maplewj ▴ 20

I'm a beginner in bioinformatics and am currently working on an RNA-seq analysis. I ran DESeq2 to identify differentially expressed (DE) genes, looping over every possible pair of conditions and collecting the significant genes from each contrast.

library(DESeq2)
library(dplyr)

# Collect the significant DE genes from every pairwise contrast between conditions
total_selec <- character()
lv <- levels(condition)
for (i in seq_len(length(lv) - 1)) {
  for (j in (i + 1):length(lv)) {
    res <- results(dds, contrast = c("condition", lv[i], lv[j]))
    resul <- as.data.frame(res)
    resul$genenames <- rownames(resul)
    # filter() also drops genes whose padj is NA
    significantDE <- resul %>% filter(padj < 0.05 & abs(log2FoldChange) > 1)
    total_selec <- c(total_selec, significantDE$genenames)
  }
}

# Keep each gene once, then pull its rows from resdata (defined earlier, not shown)
total_selec <- unique(total_selec)
selection <- resdata[total_selec, ]
DEgenes <- selection

# For clustering: scale the data per gene (row-wise z-score)
DEgenes <- as.matrix(DEgenes)
scaledata <- t(scale(t(DEgenes)))

In short, I select the genes that meet specific criteria in any pairwise comparison (adjusted p-value < 0.05 and log2FoldChange > 1 or < -1), scale the data, and then perform k-means clustering on the scaled matrix.
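To make the scaling and clustering step concrete, here is a minimal sketch on made-up data. The matrix `toy`, its gene and sample names, and the choice of two clusters are assumptions for illustration only, not values from the analysis above. It shows that `t(scale(t(x)))` turns each gene (row) into a z-score across samples, and that the result can be passed to base R's `kmeans()`.

```r
set.seed(1)

# Hypothetical toy matrix: 6 genes (rows) x 4 samples (columns)
toy <- matrix(rnorm(24, mean = 10, sd = 2), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))

# scale() works on columns, so transpose, scale, and transpose back
# to centre and scale each gene across its samples
scaled_toy <- t(scale(t(toy)))

# Each row now has mean 0 and standard deviation 1 (up to rounding)
row_means <- rowMeans(scaled_toy)
row_sds <- apply(scaled_toy, 1, sd)

# k-means on the scaled genes; centers = 2 is an arbitrary choice for the toy data
km <- kmeans(scaled_toy, centers = 2, nstart = 10)
km$cluster
```

The cluster labels depend on the random start, so `set.seed()` is only there to make the sketch reproducible.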

I've successfully generated a heatmap using the DEgenes matrix. However, I'm particularly interested in creating a heatmap for a specific subset of genes from this matrix.

My question is, to visualize this subset of genes in a heatmap, should I extract these genes to form a new matrix and then perform the same scaling process on this new matrix before drawing the heatmap?

I'm curious if this is the correct approach when focusing on a specific set of genes in RNA-seq analysis, especially regarding the scaling process.

Thank you.

RNA-seq heatmap scaling • 477 views

updated 13 months ago by matt.a.bennett25890 ▴ 30 • written 13 months ago by maplewj ▴ 20

score 2 · Answer 1 · 2024-03-20

13 months ago

matt.a.bennett25890 ▴ 30

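Scaling for a heatmap of gene expression should be done per gene, i.e. if your matrix has genes as rows and samples as columns, then each row should be scaled on its own.

Because every row is scaled independently of the others, the scaled values for your subset will be the same whether you scale before or after subsetting. So you should not see any change in values between your subset and the original set.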
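A quick way to convince yourself is to try both orders on a small example. This is only a sketch: the matrix `expr`, the helper `scale_rows()`, and the gene names in `subset_genes` are made up for illustration.

```r
set.seed(42)

# Hypothetical expression matrix: 8 genes (rows) x 5 samples (columns)
expr <- matrix(rnorm(40, mean = 8, sd = 3), nrow = 8,
               dimnames = list(paste0("gene", 1:8), paste0("sample", 1:5)))

# Row-wise z-score, the same operation as t(scale(t(DEgenes))) in the question
scale_rows <- function(m) t(scale(t(m)))

# Hypothetical genes of interest
subset_genes <- c("gene2", "gene5", "gene7")

# Order 1: scale the full matrix, then take the subset
scaled_then_subset <- scale_rows(expr)[subset_genes, , drop = FALSE]

# Order 2: take the subset, then scale it
subset_then_scaled <- scale_rows(expr[subset_genes, , drop = FALSE])

# Compare values only; scale() attaches centre/scale attributes that differ
same_values <- isTRUE(all.equal(scaled_then_subset, subset_then_scaled,
                                check.attributes = FALSE))
same_values
```

In other words, taking the rows of interest straight from the already scaled matrix should be enough. This only holds for per-row scaling: if you scaled per column (per sample), the result would depend on which genes are in the matrix.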
