I want to calculate the relative abundances of all bacterial taxa present in the 16S reads of different samples from the Human Microbiome Project (HMP).
Note that these are processed reads in .fsa (FASTA) format, not raw FASTQ reads (sample file).
The page states that:
16S rRNA Trimmed Data Set: Raw 16S sequence reads must be processed before they can be used to infer useful taxonomic information. The HMP DACC performed baseline processing and analysis of all 16S variable region sequences generated from >10,000 samples from healthy human subjects, corresponding to 16S variable regions v1v3, v3v5 and v6v9. Here we provide access to all trimmed, deconvoluted fasta files, including a subset of >5000 samples described in a series of 2012 publications (see the 'Published Data' tab for more information), as well as all subsequently sequenced samples.
These trimmed datasets were then processed by a pipeline that ran the following analysis steps: a) 16S reference alignment via the NAST-iEr alignment tool; b) chimera identification via ChimeraSlayer; c) aberrant sequence identification via WigeoN; and d) taxonomic binning using RDP classifier.
I have set up the QIIME VirtualBox image, but I cannot figure out at which step of the workflow I should start, given that the HMP reads are already processed.
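To make clear what end result I am after, here is a minimal sketch in plain Python. It assumes I already have one taxon assignment per read (for example from a classifier); the function name `relative_abundances`, the input layout, and the toy sample and taxon names are all hypothetical and not part of QIIME or the HMP files.

```python
from collections import Counter, defaultdict


def relative_abundances(assignments):
    """Return {sample_id: {taxon: fraction of that sample's reads}}.

    `assignments` is an iterable of (sample_id, taxon) pairs, one pair per
    classified read. This layout is an assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for sample_id, taxon in assignments:
        counts[sample_id][taxon] += 1

    result = {}
    for sample_id, taxon_counts in counts.items():
        total = sum(taxon_counts.values())
        # Relative abundance = reads assigned to the taxon / reads in the sample
        result[sample_id] = {
            taxon: n / total for taxon, n in taxon_counts.items()
        }
    return result


# Hypothetical toy input: three reads in sample S1, one read in sample S2
example = [
    ("S1", "Bacteroides"),
    ("S1", "Bacteroides"),
    ("S1", "Prevotella"),
    ("S2", "Streptococcus"),
]
abund = relative_abundances(example)
for sample_id, fractions in abund.items():
    print(sample_id, fractions)
```

What I need help with is getting from the trimmed HMP FASTA files to per-read (or per-OTU) taxon assignments like the input above.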