I'm using pycoQC to get some metrics on a nanopore sequencing run. I did the basecalling with dorado with BAM output, then ran dorado summary to get the sequencing_summary.txt file that I pass to pycoQC to generate the report. In the report I get this basecalled reads PHRED quality plot:
I thought it looked a bit funky, so I checked how it usually looks, and it doesn't normally have these spikes; it's usually a fairly smooth distribution. I also ran NanoPlot, and from the plots it generated I can see that all the read qualities are whole numbers.
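One way to confirm the whole-number observation directly from the summary file, rather than from the plots, is a small check like the sketch below. It assumes the file is tab-separated and that the per-read quality column is named `mean_qscore_template`; that name is an assumption, so change it to whatever the header of your file actually shows. The function name `fraction_whole_numbers` is just a placeholder for illustration.

```python
import csv


def fraction_whole_numbers(summary_path, column="mean_qscore_template"):
    """Return the fraction of non-empty values in `column` that are whole numbers.

    `column` is an assumed name; check the header of your own summary file.
    """
    total = 0
    whole = 0
    with open(summary_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(
                f"column {column!r} not found; header is {reader.fieldnames}"
            )
        for row in reader:
            raw = (row[column] or "").strip()
            if not raw:
                continue  # skip rows with no quality value
            total += 1
            if float(raw).is_integer():
                whole += 1
    # Avoid dividing by zero on an empty file.
    return whole / total if total else 0.0
```

Calling `fraction_whole_numbers("sequencing_summary.txt")` and getting a value at or near 1.0 would suggest the qualities are already whole numbers in the summary file itself, not something introduced by the plotting tools.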
I also got this plot for the read length, which looks too uniform.
I don't know whether this is normal (I don't think so), whether it's a common result of something I did wrong, or whether it's just weird. Any help would be appreciated.
Interesting. Do you have access to the original sequencing summary file from the sequencer (even if FAST basecalling was used)? What does that look like compared to this?
They didn't perform basecalling on the sequencer, if that's what you're asking. I found a file in the sequencer output folder called "sequencing_summary_FAR92635_867dc3b7.txt", but when I run pycoQC on it I get this error:
pycoQC.common.pycoQCError: Column read_len not found in the provided sequence_summary file
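The error message points at a missing read-length column, so it may help to compare the headers of the two summary files. The sketch below is only a rough check: it assumes both files are tab-separated, and the names in the `expected` list passed by the caller are assumptions (for example `sequence_length_template` and `mean_qscore_template`), not a confirmed list of what pycoQC requires. The helper name `missing_columns` is hypothetical.

```python
import csv


def missing_columns(summary_path, expected):
    """Return (header, missing), where `missing` lists the names in
    `expected` that do not appear in the file's header line.

    `expected` is supplied by the caller; the names are assumptions.
    """
    with open(summary_path, newline="") as handle:
        # Only the first line is needed; an empty file gives an empty header.
        header = next(csv.reader(handle, delimiter="\t"), [])
    missing = [name for name in expected if name not in header]
    return header, missing
```

Running it on both the dorado summary output and the file from the sequencer output folder, with the same `expected` list, would show whether the sequencer's file simply lacks the basecalling-derived columns.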