4.5 years ago
alexa57alexa
Is the TCGAanalyze_Normalization function from TCGAbiolinks a proper way to normalize TCGA data, or should I use another package/workflow?
dataPrep_LGG <- TCGAanalyze_Preprocessing(object = gbm.exp,
                                          cor.cut = 0.6,
                                          datatype = "raw_count",
                                          filename = "LGG_IlluminaHiSeq_RNASeqV2.png")
dataNorm <- TCGAanalyze_Normalization(tabDF = cbind(dataPrep_LGG),
                                      geneInfo = TCGAbiolinks::geneInfo,
                                      method = "gcContent")
dataFilt <- TCGAanalyze_Filtering(tabDF = dataNorm,
                                  method = "quantile",
                                  qnt.cut = 0.25)
Is this enough to obtain correctly normalized values? Does the normalization process depend on what I want to use the values for? Thank you in advance.
TCGAbiolinks uses EDASeq to normalize the expression values, which takes gene length and GC content into account. If differential expression analysis is what you're trying to do, it seems like you can use the EDASeq-normalized counts in DESeq2 and edgeR.
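To make the filtering step in your snippet less of a black box, here is a rough base-R sketch of what a quantile filter with a 0.25 cutoff amounts to: compute each gene's mean across samples and keep the genes whose mean lies above the 25th percentile. This is my understanding of the idea, not a copy of the TCGAbiolinks implementation; `filter_by_quantile` is a hypothetical helper name and the count matrix is made-up toy data, not TCGA values.

```r
# Toy matrix: 8 genes x 3 samples (illustrative values only, not TCGA data).
counts <- matrix(c(
     0,   1,   2,
     5,   6,   7,
    10,  12,  11,
    50,  40,  45,
   100, 120, 110,
     3,   2,   4,
   200, 210, 190,
  1000, 900, 950
), ncol = 3, byrow = TRUE,
dimnames = list(paste0("gene", 1:8), paste0("sample", 1:3)))

# Hypothetical helper (assumption: mirrors the idea of a quantile filter,
# not necessarily the exact TCGAanalyze_Filtering code).
filter_by_quantile <- function(mat, qnt_cut = 0.25) {
  gene_means <- rowMeans(mat)                              # mean per gene
  threshold  <- as.numeric(quantile(gene_means, qnt_cut))  # cutoff value
  mat[gene_means > threshold, , drop = FALSE]              # keep genes above it
}

filtered <- filter_by_quantile(counts, qnt_cut = 0.25)
print(rownames(filtered))
```

With these toy values the two genes with the lowest means (gene1 and gene6) are dropped. Whether a 0.25 cutoff is appropriate depends on what you do downstream, so treat it as a tunable parameter rather than a fixed rule.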