Is it fair to compare 2x75 bp and 2x150 bp exome performance after downsampling to the same number of reads?
6.1 years ago

Hello, I am performing exome comparisons between different technologies and different sequencing conditions.

I have downsampled my datasets to 5M, 10M, ... 60M reads (M = million reads) so that the comparison is fair.

Among other things, I am comparing 2 x 150 bp exomes with 2 x 75 bp exomes. In both cases I have normalized to 5M reads. But for a fair comparison, should I not compare 2 x 150 bp downsampled to 5M reads with 2 x 75 bp downsampled to 10M reads?

What do you think?

Thank you! Henri

ngs exome sequencing • 1.8k views

Hello pegeot.henri,

why do you think downsampling is necessary for a comparison? What are you trying to compare?

fin swimmer


I am interested in comparing standard QC metrics. Among other things, I am particularly interested in target coverage efficiency as a function of the number of reads. Downsampling is usually done in such cases because it allows a comparison at the same amount of sequencing.
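As a rough illustration of what "target coverage efficiency" could mean at a given downsampling level, here is a minimal sketch. The thread does not define the metric, so the definitions below (on-target rate as on-target bases over sequenced bases, and mean target depth as on-target bases over target size) are assumptions, and the function name and the example numbers are hypothetical.

```python
def on_target_metrics(on_target_bases, total_sequenced_bases, target_size_bp):
    """Return (on_target_rate, mean_target_depth) for one downsampled dataset.

    Assumed definitions (not taken from the thread):
      on_target_rate    = aligned bases inside the target / all sequenced bases
      mean_target_depth = aligned bases inside the target / target size in bp
    """
    if total_sequenced_bases <= 0 or target_size_bp <= 0:
        raise ValueError("total_sequenced_bases and target_size_bp must be positive")
    on_target_rate = on_target_bases / total_sequenced_bases
    mean_target_depth = on_target_bases / target_size_bp
    return on_target_rate, mean_target_depth


if __name__ == "__main__":
    # Made-up numbers purely to show the call; replace with values from your own QC output.
    rate, depth = on_target_metrics(
        on_target_bases=900_000_000,
        total_sequenced_bases=1_500_000_000,
        target_size_bp=30_000_000,
    )
    print(f"on-target rate: {rate:.2f}, mean target depth: {depth:.1f}x")
```

Computing both values at each downsampling level (5M, 10M, ...) gives one curve per kit that can be compared directly.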


Downsampling to roughly the same number of bases is probably not how this needs to be done. A 150 bp read (or a large part of it) is likely to map more specifically than a 75 bp read.

What sort of difference did you see in the alignments before you did any downsampling?


To give more context, I work in a hospital and I am looking for the best-performing sequencing kit. This will lead to the choice of a technology for routine use in patient analysis. One of the key metrics I want to investigate is the target coverage efficiency for the same sequencing effort.

Concerning 2x150 bp and 2x75 bp: after downsampling, without even looking at the alignment, I see that the FASTQ files for 2x150 bp sequencing are twice as large as those for 2x75 bp, which is to be expected. The comparison will be biased if I go further like this. Should I double the number of reads for 2x75 bp? I am fine with differences in the alignment between 2x75 bp and 2x150 bp. For me, this is part of the technological comparison.
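The arithmetic behind the "double the reads" question can be sketched as follows. This assumes the 5M figure counts read pairs (the thread does not say whether it counts pairs or single reads; the 2:1 ratio is the same either way), and the function names are hypothetical.

```python
def total_bases(n_pairs, read_length, paired=True):
    """Number of sequenced bases for n_pairs fragments at a given read length."""
    reads_per_fragment = 2 if paired else 1
    return n_pairs * reads_per_fragment * read_length


def pairs_for_equal_bases(n_pairs_ref, read_length_ref, read_length_other):
    """Fragments needed at read_length_other to match the base yield of the reference run."""
    return n_pairs_ref * read_length_ref / read_length_other


if __name__ == "__main__":
    bases_150 = total_bases(5_000_000, 150)          # 2x150 bp at 5M pairs
    bases_75_same_reads = total_bases(5_000_000, 75)  # 2x75 bp at the same read count
    matched_pairs = pairs_for_equal_bases(5_000_000, 150, 75)
    print(f"2x150 bp, 5M pairs: {bases_150:,} bases")
    print(f"2x75 bp,  5M pairs: {bases_75_same_reads:,} bases")
    print(f"2x75 bp pairs needed for equal bases: {matched_pairs:,.0f}")
```

Matching bases equalizes the raw sequencing yield only; it says nothing about how the two read lengths differ in mappability.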


Hello pegeot.henri,

comparing file sizes is never a good idea, because they tell you nothing. Doubling the file size doesn't automatically mean you have twice as much information.

If you are looking for the best technology/sequencing kit, you should look at how even the coverage distribution is across your target regions, to get a sense of how many samples you can sequence in parallel.

But the most important thing you should do is compare the results of your variant calling pipeline.
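One possible way to put a number on coverage evenness, given per-base depths over the target regions, is sketched below. The two summaries used here (fraction of target bases at or above 20% of the mean depth, and a fold-80-style ratio of mean depth to the 20th-percentile depth) are common choices but are assumptions, not something prescribed in this thread; the function name and the toy depths are hypothetical.

```python
import statistics


def uniformity_summary(depths, fraction_of_mean=0.2):
    """Summarize how evenly a list of per-base target depths is distributed."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = statistics.fmean(depths)
    threshold = fraction_of_mean * mean_depth
    # Share of target bases whose depth reaches the chosen fraction of the mean.
    frac_above = sum(d >= threshold for d in depths) / len(depths)
    # 20th percentile by the nearest-rank method, using integer arithmetic.
    ordered = sorted(depths)
    rank = max(1, (len(ordered) * 20 + 99) // 100)
    p20 = ordered[rank - 1]
    # Fold-80-style penalty: values close to 1 indicate more even coverage.
    fold80 = mean_depth / p20 if p20 > 0 else float("inf")
    return {
        "mean_depth": mean_depth,
        "frac_above_threshold": frac_above,
        "p20_depth": p20,
        "fold80": fold80,
    }


if __name__ == "__main__":
    toy_depths = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # made-up values
    print(uniformity_summary(toy_depths))
```

The lower the fold-80-style value, the less extra sequencing is needed to bring poorly covered bases up to a usable depth, which is what determines how many samples can share a run.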

fin swimmer
