Is it possible to make a PCA plot for samples using TPM expression values in R?
6.3 years ago
rimgubaev ▴ 340

I have a table with TPM expression values for several samples (10-12), and I want to create a PCA plot to assess the similarity of replicates within each condition. If this is possible, could you please suggest some pipelines or commands for R?

RNA-Seq PCA TPM • 4.9k views

I think you can do it like this:

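library(scater)

# raw_counts is a placeholder name: your genes x samples matrix of raw values
example_sce <- SingleCellExperiment(
    assays = list(counts = raw_counts))

cpm(example_sce) <- calculateCPM(example_sce)

# normalize() in older scater releases; newer releases use logNormCounts()
example_sce <- logNormCounts(example_sce)

# newer releases expect runPCA() to be called before plotPCA()
example_sce <- runPCA(example_sce, ncomponents = 2)

plotPCA(example_sce)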

The scater vignette shows many more options for PCA plots:

https://bioconductor.org/packages/devel/bioc/vignettes/scater/inst/doc/vignette-dataviz.html#generating-pca-plots

6.3 years ago

You should transform your data to a log-like scale first. If you're analysing in DESeq2, look at the vst or rlog methods; if you're using limma voom, your data should already be good to go. Have a look at the tximport package if you're confused about these different input metrics.
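If all you have is a TPM table, one common way to get onto a log-like scale is log2(TPM + 1). The sketch below is only an illustration under stated assumptions: the `tpm` matrix is simulated, and the pseudocount of 1 and the zero-variance filter are my own choices, not something DESeq2, limma or tximport prescribe.

```r
# Simulated TPM-like matrix: genes in rows, samples in columns
# (a stand-in for a real TPM table)
set.seed(1)
n.genes   <- 500
n.samples <- 12
tpm <- matrix(rexp(n.genes * n.samples, rate = 1 / 50),
              nrow = n.genes,
              dimnames = list(paste0("gene", seq_len(n.genes)),
                              paste0("SAM", seq_len(n.samples))))

# Move to a log-like scale; the pseudocount of 1 avoids log2(0)
log.tpm <- log2(tpm + 1)

# Drop genes with zero variance across samples (they add nothing to the PCA)
keep    <- apply(log.tpm, 1, var) > 0
log.tpm <- log.tpm[keep, , drop = FALSE]

# prcomp() expects samples in rows, so transpose first
pca.tpm <- prcomp(t(log.tpm))
head(pca.tpm$x[, 1:2])
```

With a real table, `tpm` would be replaced by your own numeric matrix, with gene identifiers as row names and sample names as column names.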

When you've got your data in the correct scale, here's a nice bit of code to produce a PCA - note I'm using dummy data in this case.

library(tidyverse) #CRAN - install.packages("tidyverse")
library(ggrepel)   #CRAN - install.packages("ggrepel")

# Generate some fake data
set.seed(73)
mat.row      <- 1000
mat.col      <- 15
data.pheno   <- data.frame(SampleID   = paste0("SAM", 1:mat.col),
                           SampleType = rep(c("A","B","C"), times = mat.col / 3),
                           stringsAsFactors = FALSE)
foo          <- rnorm(mat.row * mat.col, mean = 300) %>% 
                log2 %>% 
                matrix(., ncol = mat.col) %>% 
                `colnames<-`(data.pheno$SampleID)
# 

# Generate PCA Data & Proportion of variability
pca          <- foo %>% t %>% prcomp
d            <- pca$x %>% as.data.frame %>% 
                rownames_to_column("SampleID") %>%   # replaces the deprecated add_rownames()
                left_join(data.pheno, by = "SampleID") 
pcv          <- round((pca$sdev)^2 / sum(pca$sdev^2)*100, 2)
# 

# Make a pretty Picture
plot.pca    <- ggplot(d, aes(PC1,PC2,colour = SampleType)) +
               geom_point() +
               xlab(label=paste0("PC1 (", pcv[1], "%)")) +
               ylab(label=paste0("PC2 (", pcv[2], "%)")) +
               theme_bw() +
               geom_label_repel(aes(label = SampleType), show.legend = FALSE) +
               theme(axis.title.x = element_text(size=15),
                     axis.title.y = element_text(size=15)) +
               labs(title    = "My Fake PCA",
                    subtitle = "With some random data",
                    caption  = "Coloured by my random phenotype")
print(plot.pca)
#

PCA Plot
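The percentages on the axes come from the squared standard deviations of the components divided by their sum. As a rough sanity check of that calculation, the sketch below rebuilds the same kind of dummy data in base R and compares the manual values with the "Proportion of Variance" row reported by summary(). The variable names `pcv.manual` and `pcv.summary` are just illustrative.

```r
# Same kind of dummy data as in the answer, base R only
set.seed(73)
foo <- matrix(log2(rnorm(1000 * 15, mean = 300)), ncol = 15)
pca <- prcomp(t(foo))

# Manual calculation: variance of each PC over the total variance
pcv.manual  <- pca$sdev^2 / sum(pca$sdev^2)

# The same quantity as reported by summary() (rounded to 5 decimals there)
pcv.summary <- summary(pca)$importance["Proportion of Variance", ]

# Percentages for the first two components, as used in the axis labels
round(100 * pcv.manual[1:2], 2)
```

If the two vectors disagree by more than rounding error, something is off in how the variances were computed.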


Very nice!
