Question

Merging two sequencing datasets

0

3 months ago

laura.bertola • 0

Hi all,

I sequenced some samples years ago on a NovaSeq 6000. I now want to explore some other analyses that need higher depth, so we repeated the library prep and resequenced the samples on a NovaSeq X. I'd like to combine both datasets, but I'm unsure at what stage to do this.

Should I merge the FASTQ files and run them through the pipeline together, or should I map and remove duplicates for each dataset first, and then merge the cleaned BAM files? Any insight into which approach is better would be much appreciated!

Thanks!

NovaSeq genomics • 533 views

score 1 · Answer 1 · 2024-07-26

3 months ago

rfran010 ★ 1.3k

I would map and QC them separately to check for batch differences, then decide whether to merge them (you can merge the BAMs). The details may depend on the type of library and your analysis goals.
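As a rough illustration of the "QC them separately" step, here is a minimal Python sketch that compares per-sample QC metrics between the two runs, pairing each sample with itself. The metric names (`mapped_frac`, `dup_frac`, `mean_depth`), the sample names, and all values are made up for illustration; in practice they would come from whatever QC tool you already use.

```python
from statistics import mean

# Hypothetical per-sample QC metrics for each run (illustrative values only).
run_6000 = {
    "sampleA": {"mapped_frac": 0.985, "dup_frac": 0.08, "mean_depth": 9.5},
    "sampleB": {"mapped_frac": 0.981, "dup_frac": 0.10, "mean_depth": 12.0},
}
run_x = {
    "sampleA": {"mapped_frac": 0.987, "dup_frac": 0.15, "mean_depth": 14.5},
    "sampleB": {"mapped_frac": 0.979, "dup_frac": 0.18, "mean_depth": 15.0},
}


def paired_differences(first, second):
    """Mean of (second - first) per metric, over samples present in both runs."""
    shared = sorted(set(first) & set(second))
    if not shared:
        raise ValueError("no samples shared between the two runs")
    # Only compare metrics that were recorded for every shared sample in both runs.
    metrics = sorted(
        set.intersection(*(set(first[s]) & set(second[s]) for s in shared))
    )
    return {m: mean(second[s][m] - first[s][m] for s in shared) for m in metrics}


diffs = paired_differences(run_6000, run_x)
for metric, delta in diffs.items():
    print(f"{metric}: mean difference (run X - run 6000) = {delta:+.3f}")
```

A consistent shift in one metric across all samples (for example, a higher duplicate fraction in one run) would be a hint of a batch difference worth looking into before merging.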

0

We're planning to look at load and do some (recent) demographic analyses. If we QC them separately, are there specific things we should look for, e.g. in mapping quality or the distribution of reads?

And if there are no pronounced batch effects, I assume we would not gain much by redoing the mapping on merged FASTQ files (in terms of improved mapping from having more reads), since the individual datasets should already be at 7-15x depth? In that case, could we just as well merge the BAMs?

Thanks again!

Reply • 3 months ago by laura.bertola • 0

0

Look at the results from both dates on a PCA. Ideally, the results from each date should virtually overlap; at the very least, there should be more difference between treatments than between dates.
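A minimal sketch of that PCA check, assuming you have a samples-by-variants genotype matrix for each run. The data here are synthetic (random genotypes plus a little noise), and names such as `run_a` and `run_b` are placeholders, so treat this as an illustration of the idea rather than a recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 6 samples x 200 variants, each sample "sequenced" in both runs.
n_samples, n_variants = 6, 200
true_genotypes = rng.integers(0, 3, size=(n_samples, n_variants)).astype(float)
run_a = true_genotypes + rng.normal(0, 0.05, size=true_genotypes.shape)
run_b = true_genotypes + rng.normal(0, 0.05, size=true_genotypes.shape)

# Stack both runs so rows 0..5 are run A and rows 6..11 are run B.
data = np.vstack([run_a, run_b])

# PCA via SVD of the column-centered matrix; keep the first two components.
centered = data - data.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
pcs = u[:, :2] * s[:2]

# Distance between the two runs of the same sample (should be small).
within = np.linalg.norm(pcs[:n_samples] - pcs[n_samples:], axis=1).mean()

# Mean pairwise distance between different samples within run A.
pairwise = np.linalg.norm(
    pcs[:n_samples, None, :] - pcs[None, :n_samples, :], axis=2
)
between = pairwise[np.triu_indices(n_samples, k=1)].mean()

print(f"same sample, different run: {within:.3f}")
print(f"different samples, same run: {between:.3f}")
```

If the same-sample distance across runs is small compared with the distance between different samples, the two runs largely overlap on the PCA; if samples instead cluster by run, that points to a batch effect.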

Reply • 3 months ago by swbarnes2 14k

score 0 · Answer 2 · 2024-07-26

3 months ago

swbarnes2 14k

I would be wary of combining data from two separate library preps.