TCGA data normalization

5.0 years ago

alexa57alexa ▴ 10

Is using the TCGAanalyze_Normalization function from TCGAbiolinks a proper way to normalize TCGA data, or should I use another package/workflow?

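library(TCGAbiolinks)

# Sample-level QC: flag outlier samples by pairwise correlation (cutoff 0.6).
# Note: the input object is named gbm.exp although the outputs are labelled LGG.
dataPrep_LGG <- TCGAanalyze_Preprocessing(object = gbm.exp,
                                          cor.cut = 0.6,
                                          datatype = "raw_count",
                                          filename = "LGG_IlluminaHiSeq_RNASeqV2.png")

# Normalize the counts for GC content (cbind() is not needed for a single matrix).
dataNorm <- TCGAanalyze_Normalization(tabDF = dataPrep_LGG,
                                      geneInfo = TCGAbiolinks::geneInfo,
                                      method = "gcContent")

# Drop the lowest-expressed genes using the 25% quantile as the cutoff.
dataFilt <- TCGAanalyze_Filtering(tabDF = dataNorm,
                                  method = "quantile",
                                  qnt.cut = 0.25)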

Is this enough to obtain correctly normalized values? Does the normalization process depend on what I want to use the values for? Thank you in advance.

tcga RNA-Seq • 2.0k views

TCGAbiolinks uses EDASeq for this normalization, which adjusts the counts for either GC content or gene length depending on the method argument (method = "gcContent" in your code). It seems you can then use the EDASeq-normalized counts in DESeq2 or edgeR for differential expression analysis, if that is what you are trying to do.
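
As for the filtering step in your code, here is a rough base-R sketch of what a quantile filter with qnt.cut = 0.25 amounts to, as far as I understand it: genes whose mean expression across samples falls at or below the 25% quantile of all gene means are dropped. The toy matrix and the helper filter_by_quantile are made up for illustration and are not part of TCGAbiolinks; the package's own implementation may differ in details.

```r
# Toy count matrix: 8 genes (rows) x 3 samples (columns); values are made up.
counts <- matrix(
  c(   0,   1,    2,
       5,   4,    6,
      10,  12,   11,
      50,  45,   55,
     100,  90,  110,
     200, 210,  190,
     400, 420,  380,
    1000, 900, 1100),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:8), paste0("sample", 1:3))
)

# Hypothetical helper (not a TCGAbiolinks function): keep genes whose mean
# count across samples is strictly above the qnt.cut quantile of all gene means.
filter_by_quantile <- function(tab, qnt.cut = 0.25) {
  gene_means <- rowMeans(tab)
  threshold <- as.numeric(quantile(gene_means, probs = qnt.cut))
  tab[gene_means > threshold, , drop = FALSE]
}

filtered <- filter_by_quantile(counts, qnt.cut = 0.25)
print(rownames(filtered))
```

With these toy values the two lowest-expressed genes (gene1 and gene2) are removed and the other six are kept. A filter like this only removes low-expression genes; it does not change the values of the genes that remain.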

5.0 years ago by newbio17 ▴ 370