15 months ago
eebloom
I have long-read data from oxford nanopore. Because the sequencing and basecalling was performed a while ago (a few years now for some of the samples), I want to re-do basecalling before downstream analysis (alignment and variant calling etc.) I am also interested in getting information on DNA methylation.
If I run dorado with --modified-bases, can I use the same resulting uBAM for all my downstream analysis (genomic and epigenomic) or should I run dorado twice, once with --modified-bases and once without?
If a relevant model is not available, then I don't know if you can use dorado at all. There are separate methylation models available for dorado for different pore types.
Unless it's too ancient, if the flowcell matches, it should be OK. I recently did something similar: I had a rapid (RAD002) run, and Dorado only has Kit14 models for 9.4.1 flowcells. However, the results were still _much_ improved compared to the previous data I had basecalled with Guppy 4-something (I mapped both to the consensus sequence).
The original model used in Guppy was dna_r9.4.1_450bps_hac_prom, so I assumed I could use the dorado model dna_r9.4.1_e8_hac@v3.3. I know e8.2 is Kit 14, but I wasn't sure what e8 or e8.1 were, and I couldn't find it anywhere online.
From a reply on the dorado GitHub... apparently they are updating the README soon!
Assuming the correct model is available, which I think it is, you can add the --modified-bases flag and dorado will select the relevant modified base model. What I wondered was whether the resulting uBAM, which contains the methylation information, could also be used for all the other downstream analysis, to save compute time and storage.
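For anyone following along, here is a rough sketch of how the basecalling command could be put together. It just builds and prints the command line, it does not run dorado. The model name is the one mentioned above; the input directory `raw_reads/`, the output name `calls.bam` and the modification name `5mCG_5hmCG` are assumptions on my part, so check which modified base models actually exist for your basecalling model before running anything.

```python
import shlex


def dorado_basecall_cmd(model, reads_dir, modified_bases=None):
    """Build (but do not run) a dorado basecaller command as an argument list.

    modified_bases is an optional list of modification names passed to
    --modified-bases; leave it as None for plain basecalling.
    """
    cmd = ["dorado", "basecaller", model, reads_dir]
    if modified_bases:
        cmd += ["--modified-bases", *modified_bases]
    return cmd


# Hypothetical paths and modification name, for illustration only.
cmd = dorado_basecall_cmd(
    "dna_r9.4.1_e8_hac@v3.3",  # model named in this thread
    "raw_reads/",              # assumed directory of raw signal files
    ["5mCG_5hmCG"],            # assumed modification; availability depends on the model
)

# dorado writes unaligned BAM to stdout, so the shell redirect captures it.
print(shlex.join(cmd) + " > calls.bam")
```

Dropping the last argument gives the equivalent command without modified base calling, which makes it easy to compare the two on a small subset of reads if you want to convince yourself the canonical calls match.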
The dorado developers say that modified base calls should be the same/suitable for any downstream analyses
https://github.com/nanoporetech/dorado/issues/469#issuecomment-1808151652
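If you do go with a single uBAM, the one thing to watch is that the methylation tags (MM and ML) survive alignment. A minimal sketch of one common way to do that is below, assuming samtools and minimap2 are installed; the file names are hypothetical and this is not necessarily what the dorado developers recommend. Again, it only prints the pipeline rather than running it.

```python
import shlex


def align_keep_mods_pipeline(ubam, reference, out_bam):
    """Return a shell pipeline string that aligns a uBAM while keeping MM/ML tags.

    samtools fastq -T MM,ML writes the modification tags into the FASTQ
    header comments, and minimap2 -y copies those comments into the SAM output.
    """
    stages = [
        ["samtools", "fastq", "-T", "MM,ML", ubam],
        ["minimap2", "-ax", "map-ont", "-y", reference, "-"],
        ["samtools", "sort", "-o", out_bam, "-"],
    ]
    return " | ".join(shlex.join(stage) for stage in stages)


# Hypothetical file names, for illustration only.
pipeline = align_keep_mods_pipeline("calls.bam", "ref.fa", "aligned.sorted.bam")
print(pipeline)
```

The aligned, sorted BAM can then go to variant calling (which ignores the extra tags) and to methylation tools that read MM/ML, so a second basecalling run should not be needed.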