Normalization and Comparison Methods across Gene Expression Data
4 weeks ago
Siqi • 0

Hi,

I am currently working with three breast cancer gene expression datasets from different sources: TCGA, GSE202203, and GSE10886. My research objective is to analyze the expression patterns of five specific genes across four breast cancer subtypes, information that is included within the phenotype data of all datasets.

Having downloaded the datasets, I observed that they appear to have been pre-processed, but the details of these procedures are not documented. Consequently, I am unsure whether additional normalization or scaling is necessary for each dataset. The attached figure shows boxplots of the gene expression distribution for the first 50 samples in each of the three datasets.

Now my questions are:

  1. Is further normalization or scaling needed for each dataset before analysis?
  2. What additional preprocessing steps might be required to accurately compare gene expression within one dataset and across the three datasets?

Any insights or recommendations are appreciated. Thank you!

[Figure: boxplots of gene expression for the first 50 samples in each of the three datasets]

Expression Statistics Breast-Cancer Normalization • 259 views

My 2 cents, 1) the 3rd figure (dataset) seems normalized, as the values are centered around 0. However, the 1st and 2nd do not appear to be normalized, so you need to normalize these two datasets. For TCGA (and likely the GSE data), you can use DESeq2, which takes raw (unnormalized) counts as input, to calculate gene expression. Afterwards, you can extract the normalized counts from the DESeq2 object.
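To make the idea of "normalized counts" concrete, here is a rough, plain-Python sketch of the median-of-ratios size-factor approach that DESeq2 is known for. It is an illustration only, not DESeq2's actual code: the function names `size_factors` and `normalize_counts` and the toy matrix are made up for this example, and the input is assumed to be raw counts with genes as rows and samples as columns.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one scaling factor per sample via median-of-ratios.

    counts: list of rows (genes), each a list of raw counts (one per sample).
    """
    n_samples = len(counts[0])
    # Use only genes with a nonzero count in every sample, so logs are defined.
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    if not log_rows:
        raise ValueError("no gene has nonzero counts in every sample")
    # Log of the per-gene geometric mean across samples (a pseudo-reference).
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        # Ratio of each gene to the reference, on the log scale.
        log_ratios = [row[j] - ref for row, ref in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize_counts(counts):
    """Divide every count by the size factor of its sample."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical toy data: 4 genes x 2 samples, second sample sequenced twice as deep.
toy_counts = [[10, 20], [100, 200], [50, 100], [0, 3]]
toy_normalized = normalize_counts(toy_counts)
```

Note that this only applies if raw counts are available. If the downloaded matrices are already log-transformed or centered, as the third dataset seems to be, this step does not apply to them.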

2) For the TCGA dataset, you can use the TCGAbiolinks R package, which provides useful functionality including quantile normalization. In my opinion, you should calculate DEGs within each dataset separately, and then compare the expression of your genes of interest across datasets. Most likely they will be similarly expressed; however, this is not guaranteed.
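For readers unfamiliar with quantile normalization, a minimal sketch of the general technique is below. It is not the TCGAbiolinks implementation; the function name `quantile_normalize` is hypothetical, the input is assumed to be a genes-by-samples matrix of expression values, and ties are broken by row order rather than averaged.

```python
def quantile_normalize(matrix):
    """Force every sample (column) to share the same value distribution.

    matrix: list of rows (genes), each a list of values (one per sample).
    Returns a new matrix of the same shape.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[matrix[i][j] for i in range(n_genes)] for j in range(n_samples)]
    sorted_columns = [sorted(col) for col in columns]
    # Reference distribution: mean across samples of the values at each rank.
    reference = [sum(col[r] for col in sorted_columns) / n_samples
                 for r in range(n_genes)]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # Gene indices ordered from lowest to highest value in this sample.
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, gene_index in enumerate(order):
            # Replace each value by the reference value at the same rank.
            result[gene_index][j] = reference[rank]
    return result


# Hypothetical toy data: 4 genes x 3 samples.
toy_matrix = [[5, 4, 3], [2, 1, 4], [3, 6, 6], [4, 2, 8]]
toy_quantile_normalized = quantile_normalize(toy_matrix)
```

Because this forces identical distributions, it is usually applied to samples within one dataset. Whether it is sensible to apply it across datasets from different platforms is a separate judgment call.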
