Hi reader,
I have some nanopore-sequenced fungal samples, sequenced with an R10.4 flow cell and the LSK114 kit.
Some initial information:
- Median read length (average): 8055 bp
- N50 (average): 11109
- Median quality (average): 16.7
Question 1: I am not sure what the term "error rate" means in sequencing. Can you point me to somewhere I can read about it, and explain how to calculate it for my data sets?
The sequencing company already basecalled the data with Dorado and provided FASTQ files split into pass/fail. I want to generate assemblies, so I am using the FASTQ pass files for downstream analysis.
I have used Porechop to remove adapters and Chopper to remove reads with Q<10 and length<2000.
I am using Flye v2.9.5-b1801 for genome assembly. I have a question about selecting the read-type option.
Question 2: Which of these two options is more suitable for my data?
--nano-raw path [path ...]
ONT regular reads, pre-Guppy5 (<20% error)
--nano-hq path [path ...]
ONT high-quality reads: Guppy5+ SUP or Q20 (<5% error)
Also, is the --scaffold option in Flye recommended?
Thank you.
Thank you for the clarification.
Based on my raw data I get error rates of around 2–2.2%. Should I use --nano-hq or --nano-raw? What do you suggest based on your experience?
If the data has been basecalled with dorado using the super accuracy (SUP) or high accuracy (HAC) models, then use --nano-hq.
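On the error-rate question, here is a minimal sketch of how a Phred quality score relates to an estimated error rate, using the standard definition Q = -10 * log10(p). The function names are just illustrative, and the FASTQ quality string is assumed to use the usual Phred+33 encoding. By this relation, the median quality of 16.7 mentioned above corresponds to roughly 2.1% error, in line with the 2–2.2% reported in the follow-up. Note that this is the basecaller's predicted error; a measured error rate would need the reads aligned to a trusted reference.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Convert an error probability back to a Phred quality score."""
    return -10 * math.log10(p)


def read_mean_error(quality_string, offset=33):
    """Mean per-base error probability of one read.

    Assumes Phred+33 encoded qualities (the common FASTQ convention).
    Probabilities are averaged, not Phred scores, because the Phred
    scale is logarithmic.
    """
    probs = [phred_to_error(ord(c) - offset) for c in quality_string]
    return sum(probs) / len(probs)


if __name__ == "__main__":
    # Median quality taken from the question above.
    print(f"Q16.7 -> estimated error rate {phred_to_error(16.7):.2%}")
```

With an estimated error of about 2%, the reads fall under the "<5% error" description that Flye gives for --nano-hq.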