Hello, I'm relatively new to eQTL analysis, and I have a question about batch correction during eQTL analysis.
I have two large datasets. One contains RNA-seq data from a single study; the other contains RNA-seq data from four different studies. In total, the two datasets cover RNA-seq data from 5 different studies.
PCA shows that the data have a batch effect by 'study', so I think I should perform batch correction. eQTL analysis uses a linear regression model with covariate terms, but I'm not sure whether it is okay to include 'study' as a covariate. Please help me do a robust analysis.
I think there are three options:
- Batch correction (with ComBat-seq) before eQTL analysis -> eQTL analysis without 'study' as covariate
- Batch correction (with ComBat-seq) before eQTL analysis -> eQTL analysis with 'study' as covariate
- No batch correction -> eQTL analysis with 'study' as covariate
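To make the third option concrete, here is a minimal Python sketch of what I mean by putting 'study' into the covariate term. It runs on simulated data, not my real data, and the function names (`fit_eqtl_with_study`, `simulate`), the effect size, and the per-study shifts are all made up for illustration. It fits a plain least-squares model for a single gene-SNP pair; a real eQTL tool would do this across many pairs and usually with more covariates.

```python
import numpy as np

def fit_eqtl_with_study(expr, genotype, study):
    """Least-squares fit of expression on genotype dosage, with study
    entered as a categorical covariate. Returns the genotype coefficient."""
    expr = np.asarray(expr, dtype=float)
    genotype = np.asarray(genotype, dtype=float)
    # Map study labels (numbers or strings) to integer codes 0..k-1.
    levels, codes = np.unique(np.asarray(study), return_inverse=True)
    # One-hot encode study; drop the first level so the dummies are not
    # collinear with the intercept column.
    dummies = np.eye(len(levels))[codes][:, 1:]
    X = np.column_stack([np.ones_like(expr), genotype, dummies])
    coef, *_ = np.linalg.lstsq(X, expr, rcond=None)
    return coef[1]  # column order: intercept, genotype, study dummies

def simulate(n=1000, beta=0.5, seed=0):
    """Hypothetical toy data: 5 studies, each with its own expression shift."""
    rng = np.random.default_rng(seed)
    study = rng.integers(0, 5, size=n)
    genotype = rng.binomial(2, 0.3, size=n)  # dosage 0/1/2
    study_shift = np.array([0.0, 2.0, -1.5, 3.0, 1.0])  # assumed batch effect
    expr = beta * genotype + study_shift[study] + rng.normal(0.0, 1.0, size=n)
    return expr, genotype, study

if __name__ == "__main__":
    expr, genotype, study = simulate()
    print("estimated genotype effect:", fit_eqtl_with_study(expr, genotype, study))
```

In this toy setup the study shift is purely additive, so the dummy variables can absorb it; I don't know whether that assumption holds for real count data.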
Which option is right or best for robust eQTL results?
Thank you in advance :D