Question

Batch effect removal for RNA-seq data (single-end and paired-end)

2.8 years ago

hanny • 0

I have some RNA-seq data in which some samples are paired-end reads and some are single-end reads. I used ComBat-seq from the sva package to remove the batch effect, with raw counts as input. I then used the adjusted counts (after ComBat-seq) as input for DESeq2 and made a PCA plot (rld).
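For reference, the workflow described above might look roughly like the sketch below. It is only an illustration on simulated counts: the object names (`counts`, `meta`), the column names (`batch`, `condition`), and the two-batch layout are assumptions, not the actual data.

```r
library(sva)
library(DESeq2)

# Simulated stand-in data (hypothetical): 1000 genes x 12 samples
set.seed(1)
n_genes <- 1000
n_samples <- 12
gene_means <- 2^runif(n_genes, 4, 10)
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = rep(gene_means, n_samples), size = 2),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", seq_len(n_samples)))
)
meta <- data.frame(
  batch     = factor(rep(c("paired", "single"), each = 6)),
  condition = factor(rep(c("ctrl", "treated"), times = 6)),
  row.names = colnames(counts)
)

# ComBat-seq takes raw counts and returns adjusted counts;
# `group` tells it which biological signal to preserve
adjusted <- ComBat_seq(counts, batch = meta$batch, group = meta$condition)
dimnames(adjusted) <- dimnames(counts)  # defensive: keep gene/sample names

# Adjusted counts go into DESeq2; rlog is used here only for the PCA plot
dds <- DESeqDataSetFromMatrix(countData = adjusted, colData = meta,
                              design = ~ condition)
rld <- rlog(dds, blind = FALSE)
plotPCA(rld, intgroup = c("batch", "condition"))
```

Colouring the PCA by both `batch` and `condition` makes it easier to see whether samples still separate by batch after adjustment.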

I also compiled additional paired-end RNA-seq data from 3 papers, regenerating the raw-count file from their raw reads. I now define 5 batches: (1) my paired-end data, (2) my single-end data, and (3-5) the data from the 3 different papers. Is that fine?

I am also trying other batch-effect removal methods to compare them. I tried limma::removeBatchEffect: I built the dds object in DESeq2 from raw counts, applied vst() to get vsd, ran limma::removeBatchEffect on it, and made a PCA plot.
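A rough sketch of this second route, again on simulated counts. The names `counts`, `meta`, `batch`, and `condition` are placeholders, and passing a `design` to protect the condition effect is an assumption about how one might call it, not necessarily what was run here.

```r
library(DESeq2)
library(limma)

# Simulated stand-in data (hypothetical): 2000 genes x 12 samples
set.seed(1)
n_genes <- 2000
n_samples <- 12
gene_means <- 2^runif(n_genes, 4, 10)
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = rep(gene_means, n_samples), size = 2),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", seq_len(n_samples)))
)
meta <- data.frame(
  batch     = factor(rep(c("paired", "single"), each = 6)),
  condition = factor(rep(c("ctrl", "treated"), times = 6)),
  row.names = colnames(counts)
)

# Raw (unadjusted) counts go into DESeq2, with batch in the design
dds <- DESeqDataSetFromMatrix(countData = counts, colData = meta,
                              design = ~ batch + condition)
vsd <- vst(dds, blind = FALSE)

# removeBatchEffect operates on the log-scale VST values;
# the design matrix protects the condition effect from being removed
design <- model.matrix(~ condition, data = meta)
assay(vsd) <- removeBatchEffect(assay(vsd), batch = vsd$batch, design = design)
plotPCA(vsd, intgroup = c("batch", "condition"))
```

Note that this route differs from the ComBat-seq one in more than the correction method: the transformation (vst versus rlog) and the scale on which the correction is applied (log-scale values versus counts) also differ, so the two PCA plots are not expected to match exactly. The limma documentation also describes removeBatchEffect as a tool for visualization rather than a step before differential expression testing.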

The two batch-effect removal methods gave me two completely different PCA plots.

I am very new to bioinformatics. Please give me some advice or point me to relevant materials.

Thank you in advance!

Batch-effect • 905 views

updated 12 months ago by Ram 45k • written 2.8 years ago by hanny • 0

score 0 · Answer 1 · 2022-10-17

2.8 years ago

tomas4482 ▴ 430

Sequencing technique, platform, and data source should each be treated as a separate confounder. Hence your model formula should look something like: count ~ end + platform + cohort.
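One thing worth checking before fitting such a formula is whether the covariates can actually be separated. Below is a minimal base-R sketch, assuming a hypothetical sample sheet that follows the five batches from the question (each batch is a single cohort with a single library layout). The `condition` column and the sample counts are invented for illustration, and platform is left out because it is not described in the question.

```r
# Hypothetical sample sheet: 5 cohorts (batches), 4 samples each
meta <- data.frame(
  cohort    = factor(rep(paste0("batch", 1:5), each = 4)),
  condition = factor(rep(c("ctrl", "treated"), times = 10))
)
# Assumption from the question: only batch 2 is single-end
meta$end <- factor(ifelse(meta$cohort == "batch2", "single", "paired"))

full    <- model.matrix(~ end + cohort + condition, data = meta)
reduced <- model.matrix(~ cohort + condition, data = meta)

# A design is estimable only if its columns are linearly independent
is_full_rank <- function(m) qr(m)$rank == ncol(m)

is_full_rank(full)     # FALSE: `end` is fully determined by `cohort`
is_full_rank(reduced)  # TRUE
```

Under this layout the single-end indicator is identical to the batch 2 indicator, so `end` and `cohort` cannot both be estimated, and DESeq2 would be expected to reject the full formula as not full rank. In that situation the cohort term alone already absorbs the library-layout difference.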
