Dear all, I am confused because I have seen some people run minimap2 after demultiplexing but before the assembly (CANU) [case 1], and others run minimap2/samtools/bcftools/medaka after the assembly (again CANU) [case 2]. In case 1, the reference that the fastq reads were aligned against was a publicly available sequence from NCBI (NC_003310), while in case 2 the reference was the set of contigs from CANU, used to obtain the polished consensus. To be honest, I don't understand when it is better to use the first or the second approach to get the best result. Does it depend only on whether a public reference genome is available? Could someone please help me? Thank you very much. Emilio
It depends entirely on what you want to do.
minimap2 is a sequence alignment tool and does not care whether the target you align to is a public reference genome or your own contigs. In your examples, Case 1 aligns the reads to a public reference genome, while Case 2 aligns the same reads to the CANU contigs so that they can be polished into a consensus.
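To make the difference between the two cases concrete, here is a minimal Python sketch that only builds the two minimap2 command lines and prints them. The file names (NC_003310.fasta, canu_contigs.fasta, barcode01.fastq and the .sam outputs) and the helper name minimap2_cmd are placeholders I made up for illustration, and I am assuming the map-ont preset because the reads come from a MinION. The only thing that changes between case 1 and case 2 is the target FASTA.

```python
import shlex


def minimap2_cmd(target_fasta, reads_fastq, out_sam, preset="map-ont"):
    """Build a minimap2 command that aligns reads to a target FASTA.

    -a  writes SAM output, -x selects the preset, -o names the output file.
    The target comes first and the reads second on the command line.
    """
    return ["minimap2", "-a", "-x", preset, "-o", out_sam,
            target_fasta, reads_fastq]


# Case 1: the target is a public reference (hypothetical local file name).
case1 = minimap2_cmd("NC_003310.fasta", "barcode01.fastq",
                     "reads_vs_reference.sam")

# Case 2: the target is the assembler's own contigs (hypothetical file name).
case2 = minimap2_cmd("canu_contigs.fasta", "barcode01.fastq",
                     "reads_vs_contigs.sam")

# Print the commands instead of running them, so they can be checked first.
for cmd in (case1, case2):
    print(shlex.join(cmd))
```

If the printed commands look right for your data, each list can be passed to subprocess.run(cmd, check=True). The resulting alignment against the contigs is the kind of input that a polishing step works from, whereas the alignment against the public reference tells you how your reads compare to a known genome.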
Why don't you start by listing your dataset, your research question, and your goal?
Then people can give you more specific help to reach that goal.
You are right, and I am sorry for not being clearer.
Dataset: a collection of fast5 files from a Nanopore MinION run, 10 barcodes in total.
Research question: metagenomic analysis, specifically identifying viral sequences as precisely as possible.
My goal: to improve the assembled contigs obtained with MegaHit/CANU in order to recover the whole genome.
Thank you very much