Hi all,
I sequenced some samples years ago on a NovaSeq 6000. I now want to explore some other options that need higher depth, so we repeated the library prep and resequenced the samples on a NovaSeq X. I'd like to combine both datasets, but I'm unsure at what stage to do this.
Should I merge the FASTQ files and then put them through the pipeline together, or should I map and remove duplicates first, and then merge the clean BAM files? Any insight into which would be better would be much appreciated!
Thanks!
We're planning to look at load and do some (recent) demographic analyses. If we QC the two datasets separately, are there specific things we should look for, e.g. in mapping quality or the distribution of reads?
And if it looks like there are no pronounced batch effects, I assume we would not gain much by remapping the merged FASTQ files (the idea being that having more reads might improve mapping, even though the individual datasets should already be at 7-15x depth)? Could we just as well merge the BAMs at that point?
Thanks again!
Look at the results from both dates on a PCA. Ideally, the results from each date should virtually overlap each other, or at least there should still be more difference between treatments than between dates.
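
If you want a number to go with the PCA plot, here is a minimal sketch of one way to do it. It assumes you already have PC coordinates for each library from whatever tool you use, and it compares how far apart the run centroids are with how far apart the treatment/group centroids are. The function names and the toy coordinates are made up for illustration, and the ratio is only a rough heuristic, not a formal test for batch effects.

```python
from math import dist
from statistics import mean


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    return tuple(mean(axis) for axis in zip(*points))


def mean_centroid_gap(coords, labels):
    """Average pairwise distance between the centroids of each label."""
    by_label = {}
    for point, label in zip(coords, labels):
        by_label.setdefault(label, []).append(point)
    centroids = [centroid(pts) for pts in by_label.values()]
    gaps = [dist(a, b)
            for i, a in enumerate(centroids)
            for b in centroids[i + 1:]]
    return mean(gaps) if gaps else 0.0


def run_to_group_ratio(coords, runs, groups):
    """Between-run centroid gap divided by between-group centroid gap.

    Values well below 1 mean the runs overlap much more than the groups do.
    """
    between_runs = mean_centroid_gap(coords, runs)
    between_groups = mean_centroid_gap(coords, groups)
    return between_runs / between_groups if between_groups else float("inf")


# Toy (PC1, PC2) coordinates, invented for illustration only.
coords = [(-1.0, 0.1), (-1.1, 0.0), (1.0, 0.1), (0.9, 0.0),
          (-0.9, -0.1), (-1.0, 0.0), (1.1, -0.1), (1.0, 0.0)]
runs = ["run1"] * 4 + ["run2"] * 4
groups = ["A", "A", "B", "B", "A", "A", "B", "B"]

print(run_to_group_ratio(coords, runs, groups))
```

A ratio well below 1 matches the "more difference between treatments than between dates" case; a ratio near or above 1 would suggest looking harder at the two runs before merging.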