Question

Coverage histogram query

17 months ago

prasundutta87 ▴ 670

Hi,

I was wondering if anyone has seen a genome coverage histogram like the green one and has any explanation for it? This plot was generated after aligning ONT long reads to the human genome. It shows the distribution of the number of locations in the reference genome with a given depth of coverage.

Regards, Prasun

Alignment • 1.7k views

ADD COMMENT • link 17 months ago by prasundutta87 ▴ 670

What is represented on the X axis?

ADD REPLY • link 17 months ago by GenoMax 148k

Depth of coverage.

I had actually added a legend, but it never appeared. I probably did it incorrectly.

ADD REPLY • link 17 months ago by prasundutta87 ▴ 670

Why is it plotted in that spiky way, whereas the other curves are smooth?

ADD REPLY • link 17 months ago by GenoMax 148k

Exactly, that's my question. It is actually a MultiQC output plot of multiple Qualimap reports created after minimap2 alignment of ONT reads.

ADD REPLY • link 17 months ago by prasundutta87 ▴ 670

There is something systematic here: odd depth values are not found in any bins.

Could it be that, for some reason, the Qualimap report only summarizes even coverage values, maybe for simpler presentation? There seem to be more bins with high (>40) coverage than in any other histogram, so I could see the program simplifying the report by only reporting bin counts for read depths 0, 2, 4, 6, 8, ... instead. MultiQC wouldn't know the difference, so it would report 0 bins for the missing depths, since it is not an XY plot.

I had a similar thing happen once in a different context (not long reads). I cannot remember exactly what the issue was, but I remember it was something simple like that.
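One way to test this idea is to look at the raw histogram behind the plot and check whether odd depths are absent from the table or present with tiny counts. Below is a minimal sketch, assuming the histogram can be exported as two whitespace-separated columns (depth, number of locations) with optional `#` comment lines. The function names and the toy numbers are made up for illustration and are not taken from the reports in this thread.

```python
def read_histogram(lines):
    """Parse 'depth count' lines into a dict, skipping blanks and '#' comments."""
    hist = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        depth, count = line.split()[:2]
        hist[int(float(depth))] = int(float(count))
    return hist


def parity_summary(hist):
    """Summarise how covered positions (depth >= 1) split between odd and even depths."""
    covered = {d: n for d, n in hist.items() if d > 0}
    total = sum(covered.values())
    odd = sum(n for d, n in covered.items() if d % 2 == 1)
    max_depth = max(covered, default=0)
    # Odd depths that have no row at all (as opposed to a row with a small count).
    missing_odd = [d for d in range(1, max_depth + 1, 2) if d not in hist]
    return {
        "odd_fraction": odd / total if total else 0.0,
        "missing_odd_depths": missing_odd,
    }


# Toy data (hypothetical): odd depths are listed, but nearly empty.
example = """#depth locations
0 50
1 2
2 400
3 1
4 900
5 3
6 300
"""
summary = parity_summary(read_histogram(example.splitlines()))
print(summary)
```

If `missing_odd_depths` is long, the report is probably only listing even depths. If the odd rows exist but `odd_fraction` is close to zero, the pattern is more likely in the alignments themselves.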

ADD REPLY • link 17 months ago by rfran010 ★ 1.3k

I don't think that it is Qualimap's issue. Other samples look fine. Additionally, I have never seen this issue with ONT long reads for other samples. I am waiting for some other samples to check if it is related to the type of flowcell used for sequencing.

ADD REPLY • link 17 months ago by prasundutta87 ▴ 670

If you go directly to the qualimap report, does it show the same thing? Admittedly, I am naïve on the technical aspects, but I don't understand how the flow cell would affect read counts over genomic bins.

ADD REPLY • link 17 months ago by rfran010 ★ 1.3k

It's the same in the Qualimap report. Even Mosdepth shows the same issue: odd depth values have near-zero counts. The R9 and R10 ONT flowcells have different chemistries, and the aligner settings may have to be changed for proper mapping. This is just speculation, because I am seeing this issue for the first time, so I don't have much background on it.
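To quantify the odd/even imbalance independently of the plotting tools, the per-base depth intervals can be tallied directly. This is only a sketch, assuming tab-separated rows of chrom, start, end, depth with 0-based half-open intervals (the layout of a per-base BED file). The helper names and the toy rows are hypothetical.

```python
from collections import Counter


def parse_bed_lines(lines):
    """Yield (chrom, start, end, depth) from tab-separated lines with at least 4 fields."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4:
            yield fields[0], fields[1], fields[2], fields[3]


def histogram_from_intervals(rows):
    """Count the number of bases at each depth, weighting each row by its interval length."""
    hist = Counter()
    for _chrom, start, end, depth in rows:
        hist[int(depth)] += int(end) - int(start)
    return hist


# Toy rows (hypothetical), mimicking a region where odd depths are rare.
toy = [
    "chr1\t0\t100\t0",
    "chr1\t100\t250\t2",
    "chr1\t250\t255\t3",
    "chr1\t255\t600\t4",
]
hist = histogram_from_intervals(parse_bed_lines(toy))
odd_bases = sum(n for d, n in hist.items() if d % 2 == 1)
even_bases = sum(n for d, n in hist.items() if d > 0 and d % 2 == 0)
print(dict(hist), odd_bases, even_bases)
```

For a gzipped file, the same functions can be fed from `gzip.open(path, "rt")`, where `path` is whatever per-base output you have.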

ADD REPLY • link 17 months ago by prasundutta87 ▴ 670

Very strange. So it seems like almost every fragment is counted twice, sort of like PE data or some strange duplication.

I'd be interested in a follow-up if you find the source.
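The "counted twice" idea can be illustrated with a small simulation: if every read appears twice in the alignments, every per-base depth is doubled and therefore even. This is a toy sketch with made-up read positions and lengths, not a model of the actual data in this thread.

```python
import random


def depth_profile(reads, genome_length):
    """Per-base depth from half-open (start, end) intervals, using a difference array."""
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:genome_length]:
        running += delta
        depth.append(running)
    return depth


random.seed(0)
genome_length = 10_000  # toy "genome", arbitrary size

# Simulate 200 reads with arbitrary start positions and lengths.
reads = []
for _ in range(200):
    start = random.randrange(0, genome_length - 1_000)
    reads.append((start, start + random.randint(200, 1_000)))

single = depth_profile(reads, genome_length)
doubled = depth_profile(reads + reads, genome_length)  # every read present twice

odd_single = sum(1 for d in single if d % 2 == 1)
odd_doubled = sum(1 for d in doubled if d % 2 == 1)
print(odd_single, odd_doubled)
```

With every read duplicated, no position has an odd depth. If only most reads were duplicated, odd depths would be depleted rather than completely absent, which would match odd bins that are near zero but not empty.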

ADD REPLY • link 17 months ago by rfran010 ★ 1.3k

Just wanted to add here that there was no difference in flowcell chemistry between any of the samples. All of them used R10 flowcells.

ADD REPLY • link 17 months ago by prasundutta87 ▴ 670