Batch Effect
3.5 years ago

I want to compare 7 RNA-seq datasets from patients with pancreatic cancer with 7 samples of normal pancreatic tissue.

The cancer samples and the controls, however, are not from the same patients.

I would like to be sure my results are trustworthy, so I'm concerned about batch effects.

How can I identify batch effects in my analysis, and how can I correct for them?

I want to perform gene co-expression analysis.

Thank you very much,

Fabiano

r RNA-seq • 1.1k views

Hi,

Since the data are from TCGA, I would first run a PCA to inspect how the samples cluster.

What kind of information does TCGA give you for these data?

If you have batch information, you have two options:

  1. correct the gene expression values using a method like ComBat (from the sva package),
  2. add batch as a covariate in the model during differential expression analysis.
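
To illustrate the second option, here is a minimal sketch in plain Python (standard library only, not the R packages one would actually use). It fits a per-gene linear model with condition and batch as indicator covariates by ordinary least squares. The sample labels, the toy log2 expression values and the helper names `solve_linear_system` and `fit_least_squares` are all made up for the illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design: is batch confounded with condition?")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data: 1 = tumor, 0 = normal; two batches, unevenly mixed.
condition = [1, 1, 1, 1, 0, 0, 0, 0]
batch = [1, 1, 1, 0, 1, 0, 0, 0]
# Simulated log2 expression of one gene: true condition effect 2, batch effect 3.
expr = [5.0 + 2.0 * c + 3.0 * b for c, b in zip(condition, batch)]

# Design matrix columns: intercept, condition, batch.
design = [[1.0, float(c), float(b)] for c, b in zip(condition, batch)]
intercept, condition_effect, batch_effect = fit_least_squares(design, expr)

# Naive tumor-vs-normal difference that ignores batch.
tumor = [e for e, c in zip(expr, condition) if c == 1]
normal = [e for e, c in zip(expr, condition) if c == 0]
naive_difference = sum(tumor) / len(tumor) - sum(normal) / len(normal)

print("adjusted condition effect:", round(condition_effect, 3))
print("naive difference:", round(naive_difference, 3))
```

In this toy data the naive difference is 3.5, because tumor samples are over-represented in the higher batch, while the batch-adjusted estimate recovers the simulated effect of 2. In a real analysis the same idea is expressed through the design formula of the differential expression tool.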

Are the samples (cancer vs. control) from the same study, or from completely different sources? If the latter, you cannot correct for it, because batch would be completely confounded with condition.
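
A quick way to check for this kind of confounding is to cross-tabulate batch against condition. The following is a minimal Python sketch with made-up labels. The function names `batch_condition_table` and `is_fully_confounded` are hypothetical helpers, not part of any package.

```python
from collections import defaultdict


def batch_condition_table(conditions, batches):
    """Count samples per (batch, condition) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for cond, b in zip(conditions, batches):
        table[b][cond] += 1
    return {b: dict(counts) for b, counts in table.items()}


def is_fully_confounded(conditions, batches):
    """True if no batch contains more than one condition.

    In that case the condition effect cannot be separated from the
    batch effect, so no correction method can rescue the comparison.
    """
    table = batch_condition_table(conditions, batches)
    return all(len(counts) == 1 for counts in table.values())


# Hypothetical example 1: every tumor from one source, every normal from another.
conditions = ["tumor"] * 7 + ["normal"] * 7
separate_sources = ["study_A"] * 7 + ["study_B"] * 7
print(batch_condition_table(conditions, separate_sources))
print(is_fully_confounded(conditions, separate_sources))

# Hypothetical example 2: both conditions are present in each batch.
mixed_batches = ["b1", "b1", "b1", "b2", "b2", "b2", "b2",
                 "b1", "b1", "b1", "b1", "b2", "b2", "b2"]
print(batch_condition_table(conditions, mixed_batches))
print(is_fully_confounded(conditions, mixed_batches))
```

If the first check returns true for your samples, any tumor-vs-normal difference could equally be a difference between sources.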


All the samples are from the TCGA.


Should be fine then, I guess. Perform a PCA (e.g. with the PCAtools package from Bioconductor) and see how the samples cluster; this can reveal potential batch effects. For this, use transformed data, such as the output of vst from DESeq2 or normalized counts on the log2 scale.
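
To make the idea concrete, here is a minimal sketch of the PCA step in plain Python (standard library only). It is an illustration of the principle, not a replacement for PCAtools or DESeq2. It assumes the counts are already normalized, uses log2(count + 1) as the transformation, and finds only the first principal component by power iteration. The toy counts and the function names are made up.

```python
import math


def log2_transform(counts):
    """log2(count + 1) for a samples x genes matrix of normalized counts."""
    return [[math.log2(c + 1.0) for c in sample] for sample in counts]


def center_genes(matrix):
    """Subtract each gene's mean across samples."""
    n, g = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(g)]
    return [[row[j] - means[j] for j in range(g)] for row in matrix]


def first_pc_scores(centered, iterations=200):
    """PC1 score per sample, via power iteration on the sample Gram matrix."""
    n = len(centered)
    gram = [[sum(a * b for a, b in zip(centered[i], centered[j]))
             for j in range(n)] for i in range(n)]
    # Non-constant start vector: a constant vector lies in the null space
    # of the Gram matrix once the genes have been centered.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("no variance in the data")
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return [math.sqrt(eigenvalue) * x for x in v]


# Hypothetical normalized counts: 6 samples x 4 genes.
# Samples 0-2 come from one batch, samples 3-5 from another.
counts = [
    [100, 200, 50, 400],
    [110, 190, 55, 390],
    [95, 210, 48, 410],
    [410, 205, 190, 395],
    [390, 195, 210, 405],
    [405, 198, 205, 400],
]
labels = ["batch_A"] * 3 + ["batch_B"] * 3

scores = first_pc_scores(center_genes(log2_transform(counts)))
for label, score in zip(labels, scores):
    print(label, round(score, 2))
```

If the samples separate along PC1 by batch label rather than by tumor/normal status, as they do in this toy data, that points to a batch effect worth modelling.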
