Hello,
I have 6 samples of near-full-length sequences (~8000 bp) of YU2-HIV viral RNA. Each sample has 70-120 sequences. I have performed a multiple sequence alignment and used DnaSP to get information on nucleotide diversity relative to the reference. Now I would like to get the abundance of each variant sequence (full-length sequences). DADA2 works with FASTQ files and will output all of the sequence variants and the number of times each one is present, but it is more tailored towards microbiome samples. Is there any program that could give me these counts by de-replicating the sequences?
If I could learn how to do this, I could then trim the sequences down to specific protein regions and repeat the same calculation.
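To make concrete what I mean by de-replicating, here is a minimal Python sketch of the count I am after. It assumes the sequences are in FASTA format, and it treats two sequences as the same variant only if they are identical after uppercasing and removing alignment gap characters (`-`). The function names `read_fasta` and `dereplicate` are hypothetical names made up for illustration, not part of any existing tool.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dereplicate(handle, strip_gaps=True):
    """Count exact duplicates; return (sequence, count) pairs, most abundant first."""
    counts = Counter()
    for _, seq in read_fasta(handle):
        seq = seq.upper()
        if strip_gaps:
            # Assumption: '-' is the only gap character used in the alignment.
            seq = seq.replace("-", "")
        counts[seq] += 1
    return counts.most_common()


# Toy input standing in for one sample; real data would be read with open(path).
example = StringIO(">a\nACGT\n>b\nacgt\n>c\nAC-GT\n>d\nACGA\n")
for seq, n in dereplicate(example):
    print(n, seq)
```

Exact matching like this counts every single-base difference, including sequencing errors, as a separate variant, so it would probably only be a starting point.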
Thank you, Sara