Our research unit is trying to describe a virus in animals that is homologous to a human virus.
This virus is divided into several species, and the targeted animals are known to be infected by several of them. Therefore I'm treating each sample as a metagenomic sample containing different species. My first idea was to use a clustering tool to determine groups of reads that "look alike", and then to describe each group relative to a reference (a similarity approach using BLAST).
Does anyone know of another tool or methodology that could help me describe those viral species?
PS: I thought of another approach: mapping the reads against an exhaustive list of references (homologous human viruses) in order to group them and describe them relative to the references at the same time. But I'm not sure about the robustness of this method. Moreover, mapping is more stringent in that you choose the parameters up front (local or global alignment, aligned length, identity threshold), while BLAST is more permissive (and you have to interpret the output, i.e. the scores, e-value, etc., to set the threshold).
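To make the BLAST side of that comparison concrete, here is a minimal sketch of applying explicit thresholds to BLAST tabular output after the search. It assumes the hits were written with BLAST+ `-outfmt 6` and its default twelve columns; the function name `best_hits` and the cutoff values are purely illustrative, not recommendations for this data set.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, min_identity=90.0, min_length=100, max_evalue=1e-10):
    """Return {read_id: (reference_id, bitscore)} for the best passing hit.

    The three cutoffs are hypothetical defaults; tune them to your data.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        identity = float(hit["pident"])
        length = int(hit["length"])
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if identity < min_identity or length < min_length or evalue > max_evalue:
            continue  # hit fails at least one threshold
        read = hit["qseqid"]
        # Keep only the highest-scoring reference for each read.
        if read not in best or bitscore > best[read][1]:
            best[read] = (hit["sseqid"], bitscore)
    return best


# Made-up hits, only to show the expected input layout.
example = (
    "read1\trefA\t98.0\t140\t3\t0\t1\t140\t10\t149\t1e-60\t250\n"
    "read1\trefB\t92.0\t140\t11\t0\t1\t140\t10\t149\t1e-40\t180\n"
    "read2\trefB\t80.0\t140\t28\t0\t1\t140\t10\t149\t1e-20\t100\n"
    "read3\trefC\t95.0\t60\t3\t0\t1\t60\t10\t69\t1e-15\t90\n"
)
print(best_hits(io.StringIO(example)))
```

Filtering this way gives BLAST the same kind of explicit identity and length cutoffs that a mapper applies, so the two approaches can be compared on more equal terms.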
The idea behind clustering the reads is to describe the variability of the species in animal hosts.
It sounds as though you could use QIIME or mothur for what you want, although I am not sure, as your description of the experimental setup and samples is rather fuzzy.
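To illustrate what I mean by grouping reads that "look alike", here is a toy sketch of greedy centroid clustering. It is only an assumption-laden illustration: shared k-mer content (Jaccard similarity) stands in for a real alignment identity, and the names `greedy_cluster`, `threshold` and `k` are hypothetical. A dedicated clustering tool would be the sensible choice for real data.

```python
def kmers(seq, k=8):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two k-mer sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def greedy_cluster(reads, threshold=0.5, k=8):
    """Cluster reads greedily around centroids.

    reads: dict mapping read name -> sequence.
    Returns a list of clusters; each cluster is a list of read names
    whose first element is the centroid.
    """
    centroids = []  # k-mer set of each cluster's centroid
    clusters = []   # read names per cluster, same order as centroids
    # Longest reads first, so they are the ones that become centroids.
    for name, seq in sorted(reads.items(), key=lambda item: -len(item[1])):
        profile = kmers(seq, k)
        for index, centroid in enumerate(centroids):
            if jaccard(profile, centroid) >= threshold:
                clusters[index].append(name)  # join first matching cluster
                break
        else:
            # No centroid is similar enough: start a new cluster.
            centroids.append(profile)
            clusters.append([name])
    return clusters


# Made-up reads: "a" and "b" differ by one base, "c" is unrelated.
toy_reads = {
    "a": "ACGTTGCAAGGCTTAACCGGATCG",
    "b": "ACGTTGCAAGGCTTAACCGGATCT",
    "c": "TTTTTTTTTTTTTTTTTTTTTTTT",
}
print(greedy_cluster(toy_reads, threshold=0.5, k=4))
```

Each centroid could then be searched against the references with BLAST, so that one search describes the whole group.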
What is the size of the amplicon? How did you sequence them? How many samples? Are the samples barcoded or pooled together?
The sequencing (150 bp paired-end reads) was done on an Illumina MiSeq from a 150 bp amplicon. We sequenced 19 samples pooled together.