What is the best way to pre-process RNA-Seq data?
2.2 years ago

I'm interested in pre-processing RNA-Seq data. I extracted the raw counts from UCSC Xena and performed the following steps to pre-process the data and get it ready for downstream analysis. The plots obtained after batch effect removal and z-score transformation look interesting, but the median line in the z-score-transformed data is still not straight. Are there still outliers present? How can I remove the remaining biases and noise to fully clean the data?

#### Load the libraries###

library(EDASeq)
library(NOISeq)
library(edgeR)
library(DESeq2)
library(ggplot2)
library(reshape2)
library(gplots)
library(RColorBrewer)
library(limma)
library(sva)
library(biomaRt)

###Load the read counts and phenotype data###

rawCountTable <- as.matrix(read.delim(file.choose(), row.names=1))
Col_data = read.table(file = "LUSC_Phenotype.txt", header = T, sep = "\t")

###Use DESeq2 for normalization and log transformation###

dds = DESeqDataSetFromMatrix(countData = rawCountTable, colData = Col_data, design = ~ Type)
dds = DESeq(dds)
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]
dds = estimateSizeFactors(dds)
sizeFactors(dds)
vsd <- vst(dds)
vsd2 <- assay(vst(dds, blind=FALSE))

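###Use NOISeq to remove batch effects from normalized data###

# NOTE: PHENO1 is not defined earlier in this post; it is assumed to be the
# sample annotation table containing a "batch_number" column.
DATA_BC <- readData(vsd2, factors = PHENO1)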
myPCA = ARSyNseq(DATA_BC, factor = "batch_number", batch = TRUE, norm = "n", logtransf = TRUE)
DATA_BC_DONE <- assayData(myPCA)$exprs

### Perform z score transformation using scale function ###

transposed_matrix <- t(DATA_BC_DONE)
z_tr_mt <- scale(transposed_matrix)
z_score <- t(z_tr_mt)
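
A small sketch of what the z-score step above does, using a made-up toy matrix (the names `expr`, `gene1`..`gene6`, and `sample1`..`sample4` are hypothetical and not from the post). Because `scale()` standardizes columns, the transpose/scale/transpose pattern standardizes each gene across samples. That fixes every gene's mean at 0 and its standard deviation at 1, but it places no constraint on the per-sample medians, so a perfectly flat median line across samples is not guaranteed by this transformation alone.

```r
set.seed(1)

# Hypothetical toy data: 6 genes (rows) x 4 samples (columns),
# standing in for a batch-corrected, log-scale expression matrix.
expr <- matrix(rnorm(24, mean = 8, sd = 2), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))

# scale() works column-wise, so transpose to put genes in columns,
# standardize, then transpose back to genes x samples.
z <- t(scale(t(expr)))

# Per-gene summaries: mean 0 and standard deviation 1 by construction.
gene_means <- rowMeans(z)
gene_sds <- apply(z, 1, sd)

# Per-sample medians: these are what a boxplot's median line shows,
# and nothing in the per-gene z-score forces them to be equal.
sample_medians <- apply(z, 2, median)

print(round(gene_means, 10))
print(round(gene_sds, 10))
print(round(sample_medians, 3))
```

Comparing `sample_medians` on the real `z_score` matrix with the same summary on `DATA_BC_DONE` may help tell whether the uneven median line comes from the z-score step or was already present after batch correction.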

[Image: plots of the data after batch effect removal and z-score transformation]

RNA-Seq z-score Normalization • 899 views

Could you explain what is in the adjusted variable? You never declare it before using it on this line: dds = DESeqDataSetFromMatrix(countData = adjusted, colData = Col_data, design = ~ Type)


Actually, adjusted was written by mistake; it should be rawCountTable. In your opinion, is the data clean enough to proceed with downstream analysis, or is there a need for more cleaning?


I always thought you batch corrected before normalization with DESeq2. Can someone correct me if I'm wrong?
