Good day,
I am wondering what the normal pass rate is for basecalling ONT data. I know the data are inherently noisy, but I am losing about half of my reads, with ~50% of them ending up in the 'fail' folder. Are there any settings that would increase the pass rate?
Here is the command I am using, with the associated output:
C:\Users\chris>"C:\Program Files\OxfordNanopore\ont-guppy-cpu\bin\guppy_basecaller.exe" --input_path F:\ProximaData\220916-CHE014\CHE014\20220916_1641_MN39611_FLO004_54e83c3b\fast5\ --save_path F:\ProximaData\220916-CHE014\CHE014\20220916_1641_MN39611_FLO004_54e83c3b\fast5\ONToutput -c dna_r9.4.1_450bps_fast.cfg --num_callers 7 --cpu_threads_per_caller 4
ONT Guppy basecalling software version 6.3.8+d9e0f648d, minimap2 version 2.22-r1101
config file: C:\Program Files\OxfordNanopore\ont-guppy-cpu\data\dna_r9.4.1_450bps_fast.cfg
model file: C:\Program Files\OxfordNanopore\ont-guppy-cpu\data\template_r9.4.1_450bps_fast.jsn
input path: F:\ProximaData\220916-CHE014\CHE014\20220916_1641_MN39611_FLO004_54e83c3b\fast5\
save path: F:\ProximaData\220916-CHE014\CHE014\20220916_1641_MN39611_FLO004_54e83c3b\fast5\ONToutput
chunk size: 2000
chunks per runner: 160
minimum qscore: 8
records per file: 4000
num basecallers: 7
cpu mode: ON
threads per caller: 4
Use of this software is permitted solely under the terms of the end user license agreement (EULA).By running, copying or accessing this software, you are demonstrating your acceptance of the EULA.
The EULA may be found in C:\Program Files\OxfordNanopore\ont-guppy-cpu\bin
Found 258 input read files to process.
Init time: 56 ms
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
Caller time: 9677368 ms, Samples called: 6817789498, samples/s: 704509
Finishing up any open output files.
Basecalling completed successfully.
Thanks in advance,
Chris
Although the ONT software puts reads into the "fastq_fail" directory because they fall below the quality "bar", it is worth looking through that directory, since a lot of the data there can still be usable, especially if you have a good reference available.
The "bar" is the minimum qscore, which can be changed at basecalling time (the --min_qscore option of guppy_basecaller). The default, shown in your Guppy output as "minimum qscore: 8", is 8, so every read whose mean qscore is below 8 is put in the 'fail' directory.
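To get a feel for how many reads would pass at a different threshold before re-running the basecaller, here is a minimal Python sketch. It assumes plain 4-line FASTQ records with Phred+33 quality strings, and it computes a read's mean qscore by averaging the per-base error probabilities and converting back to a Phred value. The function names and the example file name are hypothetical, and the numbers may not match Guppy's own filtering exactly.

```python
import math


def mean_qscore(quality, offset=33):
    """Mean Q-score of one read from its FASTQ quality string.

    Averages the per-base error probabilities, then converts the
    average back to the Phred scale (not a plain mean of Q values).
    """
    if not quality:
        return 0.0
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def fastq_qualities(path):
    """Yield the quality line of each 4-line FASTQ record in a file."""
    with open(path) as handle:
        for index, line in enumerate(handle):
            if index % 4 == 3:
                yield line.rstrip("\n")


def count_pass(quality_strings, min_qscore=8.0):
    """Return (passed, total) for reads with mean Q-score >= min_qscore."""
    passed = total = 0
    for quality in quality_strings:
        total += 1
        if mean_qscore(quality) >= min_qscore:
            passed += 1
    return passed, total
```

For example, count_pass(fastq_qualities("some_fail_reads.fastq"), min_qscore=7) on a file from the fail directory (hypothetical file name) would show how many of those reads a threshold of 7 would rescue. Because error probabilities are averaged, a few very poor bases pull the read's score down more than a simple average of the Q values would suggest.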