Estimate error rate and assembly option for the Flye assembler
22 hours ago
Umer ▴ 130

Hi reader,

I have some Nanopore-sequenced fungal samples, sequenced with an R10.4 flow cell and the LSK114 kit.

Some initial information:

  • Average median read length: 8055 bp
  • Average N50: 11109 bp
  • Average median quality: 16.7

Question 1: I am not sure about the term "error rate" in sequencing. Can you point me to where I can read about it, and tell me how to calculate it for my data sets?

The sequencing company already basecalled the data with Dorado and provided FASTQ files split into pass/fail. I want to generate assemblies, so I am using the FASTQ pass files for downstream analysis.

I have used Porechop to remove adapters and Chopper to remove reads with Q<10 and reads shorter than 2000 bp.

I am using Flye v2.9.5-b1801 for genome assembly. I have a question about selecting the read-type option.

Question 2: Which of these two options is more suitable for my data?

--nano-raw path [path ...]
                        ONT regular reads, pre-Guppy5 (<20% error)
--nano-hq path [path ...]
                        ONT high-quality reads: Guppy5+ SUP or Q20 (<5% error)

Also, is the --scaffold option in the Flye assembler recommended?

Thank you.

flye assembly genome error-rate nanopore • 141 views
19 hours ago

Error rate is usually reported as a Phred-like quality score

https://en.wikipedia.org/wiki/Phred_quality_score

where the quality score Q is plugged into the formula

Error probability P = 10^(-Q/10)

so for example Q=20 gives 10^-2 --> P = 1/100 = 0.01, that is 1% error, one error every hundred basecalls.

Sometimes people call the score E or Q, and the error is sometimes shown as P (a probability, as a fraction) and sometimes as a percent, so it can be a bit confusing.
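
If it helps, here is a minimal Python sketch of that conversion, using the standard Phred relation P = 10^(-Q/10). The function names are made up for illustration, and the quality-string helper assumes the usual Phred+33 FASTQ encoding. Note that a read's error rate is the mean of the per-base error probabilities, not the error of the mean Q.

```python
import math


def phred_to_error(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P."""
    return 10 ** (-q / 10)


def error_to_phred(p: float) -> float:
    """Convert an error probability P back to a Phred quality score Q."""
    return -10 * math.log10(p)


def read_error_rate(quality_string: str, offset: int = 33) -> float:
    """Mean per-base error probability of one read.

    Assumes Phred+33 encoding (each character's code minus 33 is Q).
    Probabilities are averaged, because averaging Q values directly
    would understate the error.
    """
    probs = [phred_to_error(ord(ch) - offset) for ch in quality_string]
    return sum(probs) / len(probs)


if __name__ == "__main__":
    # Median quality reported in the question.
    print(f"Q16.7 -> {phred_to_error(16.7):.2%} error")
    # "5" encodes Q20 and "?" encodes Q30 under Phred+33.
    print(f"toy read -> {read_error_rate('5555????'):.2%} error")
```

With the median quality of 16.7 from the question, this gives an error of roughly 2.1%.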


Thank you for the clarification.

Based on my raw data I get error rates around 2–2.2%. Should I use --nano-hq or --nano-raw?

What do you suggest based on your experience?


If the data has been basecalled with Dorado using the super accuracy (SUP) or high accuracy (HAC) models, then use --nano-hq.
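
As a rough illustration only, the choice can also be written down using the thresholds from the Flye help text quoted in the question (<5% error for --nano-hq, <20% for --nano-raw). The function names, file name, and output directory below are hypothetical; --out-dir and --threads are standard Flye options. This is a sketch of the decision, not an official rule.

```python
def flye_read_type(error_rate: float) -> str:
    """Pick a Flye read-type flag from an estimated error rate (a fraction).

    Thresholds are taken from the Flye help text: --nano-hq is described
    as <5% error and --nano-raw as <20% error.
    """
    if error_rate < 0.05:
        return "--nano-hq"
    if error_rate < 0.20:
        return "--nano-raw"
    raise ValueError("error rate is above the range covered by Flye's ONT modes")


def build_flye_command(reads: str, out_dir: str, error_rate: float,
                       threads: int = 8) -> list[str]:
    """Assemble the argument list for a Flye run (the command is not executed)."""
    return [
        "flye", flye_read_type(error_rate), reads,
        "--out-dir", out_dir,
        "--threads", str(threads),
    ]


if __name__ == "__main__":
    # Hypothetical file names; 0.021 is about the 2-2.2% error from the question.
    print(" ".join(build_flye_command("reads.pass.fastq", "flye_out", 0.021)))
```

At about 2% error the sketch selects --nano-hq, which agrees with the advice above.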
