I am working on identifying bacterial communities in the fruit fly gut. In October we sequenced our samples using Nanopore technology, and the results showed no families belonging to a specific type of bacteria that we have studied in the lab for a while. I decided to reanalyze the data myself, but I am unsure what the correct way to analyze it would be. Apparently Nanopore has a new basecaller called Dorado, so I have converted my FAST5 files to POD5 to be able to use it. From what I found online, the output is supposed to be a .cram file.
Should I convert the CRAM files to FASTA or FASTQ and then import them into QIIME 2 for taxonomic classification and visualization, or should I skip the basecaller, clean the data with other tools, and then import it into QIIME 2?
To clean the data I would use the following:
- Remove the adapters using Porechop
- Trim and filter the reads with NanoFilt
- Filter sequences with fastp
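To make the read-filtering step in the list above concrete, here is a minimal Python sketch of quality and length filtering on FASTQ records. It is not NanoFilt or fastp itself, only an illustration of the idea. The function names and the thresholds (`min_quality=10.0`, `min_length=500`) are hypothetical placeholders, and the mean quality is computed by averaging error probabilities, which is one common convention for long reads.

```python
import math


def mean_phred(quality_string, offset=33):
    """Mean read quality, averaged over error probabilities (assumes Phred+33)."""
    if not quality_string:
        return 0.0
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def read_fastq(lines):
    """Yield (header, sequence, quality) from FASTQ lines, 4 lines per record."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_reads(records, min_quality=10.0, min_length=500):
    """Keep reads that meet both thresholds (placeholder values, not recommendations)."""
    for header, seq, qual in records:
        if len(seq) >= min_length and mean_phred(qual) >= min_quality:
            yield header, seq, qual
```

To apply it to a file, something like `filter_reads(read_fastq(open("reads.fastq")))` would work, where `reads.fastq` is a placeholder name. Suitable thresholds depend on the amplicon length and the run quality.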
You can just use guppy to do the basecalling and get FASTQ output; there is no need to use dorado.