15 days ago
Sakura
Hi everyone,
I'm a PhD candidate working on bulk RNA-seq analysis, attempting to integrate local experimental data with public datasets. Despite implementing ComBat for batch correction, I'm encountering significant batch effects that persist after correction.
Details of my analysis:
- Working with bulk RNA-seq data
- Trying to integrate in-house experimental data with public datasets
- Attempted ComBat correction but batch effects remain prominent
- Data has been normalized and log-transformed before batch correction
I'm particularly interested in:
- Modern approaches for batch effect removal, especially AI/ML-based methods
- Alternative batch correction tools that might be more effective than ComBat
- Best practices for integrating datasets from different sources
Has anyone successfully dealt with similar issues using newer methods? I've heard about some deep learning approaches but would appreciate specific recommendations or experiences.
Thank you in advance for any suggestions!
Can you show an overview of which groups are in your data and which groups are in the public data? If your data are nested, meaning for example that all controls are in your data and all conditions are in the public data, then batch and condition are completely confounded and it is impossible to merge.
Thanks for your reply.
In the local data, we have disease group (1) vs. control disease, and in the public data, we have control disease vs. disease group (2). Since the control disease group is present in both the local and the public batch, we would like to integrate them to compare disease group (1) vs. disease group (2).
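A quick way to check whether a layout like this is confounded is to build the design matrix (intercept, group indicators, batch indicator) and compare its rank with its number of columns. The sketch below uses Python with NumPy purely for illustration. The helper name design_rank, the group labels, and the three replicates per group are assumptions, not details from the thread.

```python
import numpy as np

def design_rank(samples, reference="control"):
    """Return (rank, n_columns) of an intercept + group + batch design matrix.

    samples: list of (batch, group) tuples, one per sample.
    """
    groups = sorted({g for _, g in samples if g != reference})
    batches = sorted({b for b, _ in samples})[1:]  # first batch is the baseline
    X = np.array([
        [1.0]
        + [float(g == k) for k in groups]
        + [float(b == k) for k in batches]
        for b, g in samples
    ])
    return int(np.linalg.matrix_rank(X)), X.shape[1]

# Layout described in the thread (3 replicates per group is an assumption).
shared_control = (
    [("local", "control")] * 3 + [("local", "disease1")] * 3
    + [("public", "control")] * 3 + [("public", "disease2")] * 3
)
# Fully confounded layout: every control is local, every case is public.
confounded = [("local", "control")] * 3 + [("public", "disease")] * 3

print(design_rank(shared_control))  # rank equals column count: effects separable
print(design_rank(confounded))      # rank below column count: batch == condition
```

Full rank only means that the batch effect can be estimated at all. With this layout it is estimated from the shared controls alone, so the comparison of disease group (1) vs. disease group (2) rests on the assumption that the batch shift seen in the controls also applies to the disease samples.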
I would first of all run the most old-fashioned and robust approach there is: normalize the data with edgeR to get logCPMs, run removeBatchEffect from limma, specifying batch as a factor, and then explore the data by PCA/MDS. Please show code and plots for guidance.
Great, thanks. I will try this and get back to you if needed.
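For readers who want to see what a linear-model batch correction of this kind does, here is a minimal sketch in Python with NumPy. It is not limma's code and not a replacement for it. It only mirrors the general idea: fit each gene on the biological design plus a sum-to-zero batch term, then subtract the fitted batch term. The function name remove_batch_effect, the simulated log-expression values, and the sample layout are assumptions made for illustration.

```python
import numpy as np

def remove_batch_effect(logexpr, batch, design):
    """Subtract the fitted batch term from log-expression values.

    logexpr: genes x samples array of log-scale expression (e.g. logCPM).
    batch:   length-n_samples array of batch labels.
    design:  samples x p matrix of biological covariates, including intercept.
    """
    batch = np.asarray(batch)
    levels = np.unique(batch)
    # Sum-to-zero coding, so the correction does not shift the overall mean.
    B = np.zeros((len(batch), len(levels) - 1))
    for j, lev in enumerate(levels[:-1]):
        B[batch == lev, j] = 1.0
    B[batch == levels[-1], :] = -1.0
    X = np.hstack([design, B])
    # Least-squares fit for all genes at once (one column of coef per gene).
    coef, *_ = np.linalg.lstsq(X, logexpr.T, rcond=None)
    batch_coef = coef[design.shape[1]:, :]
    return logexpr - (B @ batch_coef).T

# Simulated example with the layout from the thread (all values are made up).
rng = np.random.default_rng(0)
batch = np.array(["local"] * 6 + ["public"] * 6)
group = np.array(["control"] * 3 + ["disease1"] * 3
                 + ["control"] * 3 + ["disease2"] * 3)
design = np.column_stack(
    [np.ones(12), group == "disease1", group == "disease2"]
).astype(float)
n_genes = 50
shift = rng.normal(0, 2, size=(n_genes, 1))  # per-gene batch shift
logexpr = (rng.normal(5, 1, size=(n_genes, 1))
           + shift * (batch == "public")
           + rng.normal(0, 0.1, size=(n_genes, 12)))
corrected = remove_batch_effect(logexpr, batch, design)
```

Passing the biological design alongside the batch labels matters: without it, real group differences that happen to align with batch would be removed as well. Corrected values of this kind are meant for exploration (PCA/MDS, heatmaps). For differential expression testing, the usual advice is to keep the uncorrected data and include batch as a term in the model instead.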