Hi all,
I am working with RNA-Seq data and need to control for a variable which can be measured via HOX gene expression. The study is a simple case/control study with a treated and untreated condition.
I have performed some clustering on the dds object (code below, all written in RStudio) and generated a correlation heatmap of HOX gene expression. However, I would also like to generate a scores plot, as is often done following PCA.
I have made an attempt to do so, but there are two issues:
1. I believe the plot I generated is only clustering samples based on their correlations with two particular samples (MC15pos and MC16pos).
2. As the labels are unwieldy, I would also like to colour the points by the 4-level case_condition variable (i.e. whether they are case/control and treated/untreated).
Would you have any suggestions to improve my rather poor attempt at plotting this?
Generate and normalise dds object
library("DESeq2")
dds <- DESeqDataSetFromMatrix(counts, meta, design = ~condition)
dds <- estimateSizeFactors(dds)
Perform HOX Clustering (Remove HOXA10-HOXA9 as it has 0 counts)
vst_dds <- vst(dds, blind=TRUE)
vst_mat <- assay(vst_dds)
vst_mat_hox <- vst_mat[grepl("^HOX", row.names(vst_mat)), ]
vst_mat_hox <- vst_mat_hox[row.names(vst_mat_hox) != "HOXA10-HOXA9", ]
vst_cor_hox <- cor(vst_mat_hox)
Generate scores plot
library("pls")
scoreplot(
  vst_cor_hox,
  labels = meta$condition_treatment,
  comps = 1:2,
  identify = FALSE,
  type = "p"
)
[Image: scores plot]
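For reference, here is a minimal sketch of the kind of plot I am after. It runs PCA on the HOX expression matrix itself rather than on the correlation matrix, and colours the points by group. As far as I can tell, scoreplot() treats a plain matrix as a ready-made score matrix and plots its first two columns, which would explain issue 1, but I am not certain of this. The data below are simulated so the block runs on its own; `group` and `group_cols` are hypothetical names standing in for my 4-level variable and a colour choice, and base R `prcomp()` is swapped in for pls.

```r
# Simulated stand-in for vst_mat_hox (genes in rows, samples in columns).
# ASSUMPTION: replace this with the real VST-transformed HOX matrix.
set.seed(1)
n_genes <- 30
n_samples <- 12
vst_mat_hox <- matrix(
  rnorm(n_genes * n_samples, mean = 8, sd = 1),
  nrow = n_genes,
  dimnames = list(paste0("HOX", seq_len(n_genes)),
                  paste0("S", seq_len(n_samples)))
)

# Hypothetical 4-level grouping factor, one entry per sample,
# in the same order as the columns of vst_mat_hox.
group <- factor(rep(
  c("case_treated", "case_untreated", "control_treated", "control_untreated"),
  each = 3
))

# Drop genes with zero variance across samples; they carry no information.
gene_var <- apply(vst_mat_hox, 1, var)
hox_var <- vst_mat_hox[gene_var > 0, , drop = FALSE]

# prcomp() expects samples in rows, so transpose the genes x samples matrix.
pca <- prcomp(t(hox_var), center = TRUE, scale. = FALSE)

# Percentage of variance explained by each component, for the axis labels.
pct_var <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# One colour per group level (arbitrary choice).
group_cols <- c("firebrick", "orange", "steelblue", "darkgreen")

# Scores plot: PC1 vs PC2, coloured by group instead of labelled.
plot(
  pca$x[, 1], pca$x[, 2],
  col = group_cols[as.integer(group)],
  pch = 19,
  xlab = paste0("PC1 (", pct_var[1], "%)"),
  ylab = paste0("PC2 (", pct_var[2], "%)"),
  main = "HOX gene PCA scores"
)
legend("topright", legend = levels(group), col = group_cols, pch = 19)
```

With the real data, `group` would presumably be built from the relevant column of `meta`, provided the rows of `meta` are in the same order as the columns of the count matrix.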