Batch effect consideration (re-seq the same sample twice)
14 months ago
jkim ▴ 190

Hello,

I would like to know how you handle batch effects when the same samples are re-sequenced (FASTQ files).

Our client targeted 20 million reads for each of her samples. However, the first run yielded fewer than 20 million reads for a few samples (sample_2, sample_3 and sample_7), so we re-sequenced those samples.

For the 1st run

sample_id  reads_obtained (millions)
sample_1   21.4
sample_2   11
sample_3   12
sample_4   35.5
sample_5   23.8
sample_6   29.4
sample_7   10
sample_8   23.8
sample_9   24.3
sample_10  18.6

For the 2nd run

sample_id    reads_obtained (millions)
sample_2     9
sample_3     8
sample_7     10

When it comes to downstream analysis, how would you handle those samples (sample_2, sample_3 and sample_7)? Would you just merge them? E.g.

cat sample_2.fastq.gz (from the 1st run) sample_2.fastq.gz (from the 2nd run) > sample_2.merged.fastq.gz ?

Or would you visualize PCA or hclustering to see if they cluster together or not, and then decide to drop/merge the samples from the 2nd run?

RNA-seq batch-effect • 778 views
14 months ago
ATpoint 85k

Would you just merge them?

Yes, it's standard to merge sequencing replicates.
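
If you want the merge to be scripted and checked rather than typed by hand, here is a minimal Python sketch. It relies on the fact that concatenated gzip streams form a valid gzip file, so a byte-for-byte concatenation is equivalent to `cat run1.fastq.gz run2.fastq.gz > merged.fastq.gz`. The function names `count_reads` and `merge_fastq_gz` are made up for illustration, and the read count assumes well-formed 4-line FASTQ records.

```python
import gzip
import shutil
from pathlib import Path


def count_reads(fastq_gz):
    """Count reads in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4


def merge_fastq_gz(inputs, output):
    """Concatenate gzipped FASTQ files byte-for-byte and return the merged read count."""
    expected = sum(count_reads(path) for path in inputs)
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    merged = count_reads(output)
    # Sanity check: the merged file must hold exactly the reads of all inputs.
    if merged != expected:
        raise ValueError(f"expected {expected} reads, found {merged} in {Path(output).name}")
    return merged
```

For paired-end data, merge the R1 files and the R2 files separately and list the runs in the same order for both, so that mates stay in sync.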

Or would you visualize PCA or hclustering to see if they cluster together or not, and then decide to drop/merge the samples from the 2nd run?

You can do that for the sake of checking, but sequencing itself is generally not expected to generate batch effects unless the sequencers use different technologies, e.g. Illumina vs. another platform. A simple PCA will tell.
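
A rough sketch of that check, assuming you quantify each run separately first so that every library (run 1 and run 2 of the same sample) is its own column in a genes x samples count matrix. It uses NumPy only; the function name `pca_scores` and the log2(CPM + 1) transform are my choices for illustration, not a prescribed method. In practice you would more likely use the PCA plotting of whatever differential expression package you already work with.

```python
import numpy as np


def pca_scores(counts, n_components=2):
    """Return PCA scores (one row per sample) from a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Normalize for sequencing depth, then log-transform.
    cpm = counts / counts.sum(axis=0, keepdims=True) * 1e6
    logcpm = np.log2(cpm + 1.0)
    # Center each gene across samples before the decomposition.
    centered = logcpm - logcpm.mean(axis=1, keepdims=True)
    # SVD of the samples x genes matrix; scores are U scaled by the singular values.
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Toy counts, invented purely for illustration (4 genes x 3 libraries):
# columns are sample_A run 1, sample_A run 2 (half the depth), sample_B run 1.
toy_counts = [
    [100, 50, 600],
    [200, 100, 50],
    [50, 25, 300],
    [650, 325, 50],
]
scores = pca_scores(toy_counts)
same_sample = np.linalg.norm(scores[0] - scores[1])
different_sample = np.linalg.norm(scores[0] - scores[2])
```

If the two runs of a library sit on top of each other relative to the distances between different samples (here `same_sample` versus `different_sample`), merging is safe.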


Thanks, ATpoint. I have another quick question: does this also apply to scRNA-seq (10x) data? Should sequencing replicates be merged just like for bulk RNA-seq?


Running the exact same library on the same kind of instrument (assuming no instrument glitch) will not add any technical artifacts.

You absolutely should merge any kind of data with UMIs, because you don't want two reads of the same molecule (same cell, same gene, same UMI) being counted separately just because they were sequenced in different runs.
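
To make the double counting concrete, here is a toy Python sketch. The barcodes, UMIs and the helper `umi_counts` are invented for illustration, and real pipelines also correct sequencing errors in barcodes and UMIs, which this ignores.

```python
def umi_counts(reads):
    """Count unique UMIs per (cell, gene) from (cell_barcode, umi, gene) tuples."""
    counts = {}
    # A set collapses reads that share cell barcode, UMI and gene into one molecule.
    for cell, _umi, gene in set(reads):
        counts[(cell, gene)] = counts.get((cell, gene), 0) + 1
    return counts


# Same library sequenced twice; the molecule with UMI "AACG" shows up in both runs.
run1 = [("CELL_A", "AACG", "GAPDH"), ("CELL_A", "TTGC", "GAPDH")]
run2 = [("CELL_A", "AACG", "GAPDH"), ("CELL_A", "GGAT", "GAPDH")]

key = ("CELL_A", "GAPDH")
merged_first = umi_counts(run1 + run2)[key]            # 3 distinct molecules
counted_apart = umi_counts(run1)[key] + umi_counts(run2)[key]  # 4, "AACG" counted twice
```

Deduplicating after pooling the reads gives 3 molecules, while counting each run on its own and summing gives 4, so the FASTQ files from both runs should go into a single counting step.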
