Batch correction


5 weeks ago

Sakura • 0

Hi everyone,

I'm a PhD candidate working on bulk RNA-seq analysis, attempting to integrate local experimental data with public datasets. Despite implementing ComBat for batch correction, I'm encountering significant batch effects that persist after correction.

Details of my analysis:

- Working with bulk RNA-seq data
- Trying to integrate in-house experimental data with public datasets
- Attempted ComBat correction but batch effects remain prominent
- Data were normalized and log-transformed before batch correction

I'm particularly interested in:

- Modern approaches for batch effect removal, especially AI/ML-based methods
- Alternative batch correction tools that might be more effective than ComBat
- Best practices for integrating datasets from different sources

Has anyone successfully dealt with similar issues using newer methods? I've heard about some deep learning approaches but would appreciate specific recommendations or experiences.

Thank you in advance for any suggestions!



Can you give an overview of which groups are in your data and which groups are in the public data? If the design is nested, for example all controls are in your data and all conditions are in the public data, then batch and condition are completely confounded and it is impossible to merge.
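
To make the nesting point concrete, here is a minimal numpy sketch of one way to check it: build a design matrix with batch and condition and test whether it has full column rank. The function names and the sample layouts (`control`, `treated`, `A`, `B`) are hypothetical and only for illustration.

```python
import numpy as np

def design_matrix(batch, condition):
    """Intercept plus treatment-coded indicator columns for batch and condition."""
    cols = [np.ones(len(batch))]
    for labels in (batch, condition):
        labels = np.asarray(labels)
        levels = sorted(set(labels.tolist()))
        # The first sorted level of each factor serves as the reference.
        for level in levels[1:]:
            cols.append((labels == level).astype(float))
    return np.column_stack(cols)

def is_confounded(batch, condition):
    """True if batch and condition effects cannot be estimated separately."""
    X = design_matrix(batch, condition)
    return bool(np.linalg.matrix_rank(X) < X.shape[1])

# Nested layout: every control is local, every treated sample is public.
nested_batch = ["local"] * 3 + ["public"] * 3
nested_cond = ["control"] * 3 + ["treated"] * 3

# Layout where one condition (control) is present in both batches.
shared_batch = ["local"] * 4 + ["public"] * 4
shared_cond = ["control", "control", "A", "A", "control", "control", "B", "B"]

print(is_confounded(nested_batch, nested_cond))  # True
print(is_confounded(shared_batch, shared_cond))  # False
```

In the nested layout the batch column is identical to the condition column, so no method can tell the two effects apart. A condition that appears in both batches is what makes the batch effect estimable.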

Posted 5 weeks ago by ATpoint (86k)


Thanks for your reply.

In the local data we have disease group (1) vs. control disease, and in the public data we have control disease vs. disease group (2). Since the control disease is common to both the local and public batches, we would like to integrate them to compare disease group (1) vs. disease group (2).

Posted 5 weeks ago by Sakura


I would start with the most old-fashioned and robust approach there is: normalize the data with edgeR to get logCPMs, run removeBatchEffect from limma with batch specified as a factor, and then explore the data by PCA/MDS. Please show code and plots for guidance.
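
As a rough illustration of what that pipeline does under the hood, here is a minimal numpy sketch. It is not the edgeR/limma code itself. It makes several assumptions: the logCPM is a simplified version without TMM normalization factors, the batch term uses treatment coding (so results may differ from limma by a per-gene constant), and all function names and the simulated counts are hypothetical. The idea is to fit condition plus batch per gene, subtract only the fitted batch term, and look at the samples by PCA.

```python
import numpy as np

def log_cpm(counts, prior=0.5):
    """Simplified log2 counts-per-million (genes x samples); no TMM scaling."""
    lib_size = counts.sum(axis=0)
    return np.log2((counts + prior) / (lib_size + 1.0) * 1e6)

def dummies(labels):
    """Treatment-coded indicator columns, first sorted level as reference."""
    labels = np.asarray(labels)
    levels = sorted(set(labels.tolist()))
    return np.column_stack([(labels == lv).astype(float) for lv in levels[1:]])

def remove_batch(logexpr, batch, condition):
    """Fit intercept + condition + batch per gene, subtract the batch term only."""
    n = logexpr.shape[1]
    design = np.column_stack([np.ones(n), dummies(condition)])
    batch_cols = dummies(batch)
    X = np.column_stack([design, batch_cols])
    # Least-squares fit for all genes at once; coef is (parameters x genes).
    coef, *_ = np.linalg.lstsq(X, logexpr.T, rcond=None)
    batch_coef = coef[design.shape[1]:, :]
    return logexpr - (batch_cols @ batch_coef).T

def pca_scores(logexpr, n_comp=2):
    """Sample coordinates on the top principal components."""
    centered = logexpr - logexpr.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return (vt[:n_comp] * s[:n_comp, None]).T

# Simulated counts: 200 genes, 8 samples, with a gene-wise shift in the public batch.
rng = np.random.default_rng(0)
batch = ["local"] * 4 + ["public"] * 4
condition = ["control", "control", "A", "A", "control", "control", "B", "B"]
base = rng.uniform(3, 8, size=(200, 1))
shift = rng.normal(0, 1.5, size=(200, 1)) * (np.array(batch) == "public")
counts = rng.poisson(2.0 ** (base + shift)).astype(float)

logexpr = log_cpm(counts)
corrected = remove_batch(logexpr, batch, condition)
print(pca_scores(logexpr).round(2))    # before correction
print(pca_scores(corrected).round(2))  # after correction
```

If the controls from the two batches still separate on the first components after this step, that is worth showing in the plots. The batch-removed matrix is meant for visualization; for differential expression testing, batch is normally included as a term in the design rather than subtracted beforehand.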