Dear all, Suppose I have a collection of viral reads from NGS (Illumina) sequencing in FASTQ format. After the usual pre-processing step (done with fastp), I need to remove the host sequences (contaminants) without having the host reference genome, so I cannot map the reads with bowtie2 and filter them with samtools. I have read about some approaches, but I am still not sure which one to use. Could someone please suggest an appropriate strategy or starting point? Thanks for your support.
Hello, The goal of my project is to identify the correct taxonomy of the viral reads I have. We always know the host of each sample (for example bat, rodent, human, or mosquito), even when its reference genome is not available.