This is a bit outside of my wheelhouse so I apologize if some terminology is incorrect.
I have a number of reference rbcL gene sequences for a taxonomic family, and I want to extract phylogenetically similar sequences from an NGS dataset (454) for downstream analysis. I have done this with BLAST previously, but I'm sure there are more accurate methods that account for non-functional mutations and sequencing errors.
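For illustration, a BLAST-based filter of this kind could be sketched roughly as below. This is only a sketch under assumptions, not the exact procedure I ran: it assumes the reads were used as the query against the reference sequences, that the results were saved in BLAST's standard 12-column tabular format (-outfmt 6), and the function names and the identity/length thresholds are made up for the example.

```python
import io


def read_fasta(handle):
    """Yield (read_id, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token as the ID,
            # which is what BLAST reports in its tabular output.
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def matching_read_ids(blast_tab, min_identity=90.0, min_length=200):
    """Collect query IDs with at least one hit passing both thresholds.

    blast_tab is an iterable of tab-separated lines in the standard
    tabular layout: qseqid, sseqid, pident, length, ...
    The default thresholds are illustrative assumptions only.
    """
    ids = set()
    for line in blast_tab:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        pident, length = float(fields[2]), int(fields[3])
        if pident >= min_identity and length >= min_length:
            ids.add(fields[0])
    return ids


def extract_reads(fasta_handle, keep_ids):
    """Return the (read_id, sequence) pairs whose ID is in keep_ids."""
    return [(rid, seq) for rid, seq in read_fasta(fasta_handle) if rid in keep_ids]


if __name__ == "__main__":
    # Tiny in-memory demo with made-up data instead of real files.
    hits = ["read1\tref1\t97.5\t350\t8\t1\t1\t350\t10\t359\t1e-150\t540\n"]
    reads = io.StringIO(">read1\nACGTACGT\n>read2\nGGGGCCCC\n")
    for rid, seq in extract_reads(reads, matching_read_ids(hits)):
        print(f">{rid}\n{seq}")
```

The weakness is that a single identity cutoff says nothing about where a read sits relative to the reference clades, which is why I am looking for something tree-aware.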
What I consider my current best option is to take all the NGS reads and create a massive alignment with the reference sequences, visually identify the relevant clades from a tree, and pull out the sequences that fall within them. However, this is going to be computationally intensive and time-consuming to repeat for different taxa or NGS datasets in the future.
Does anyone know of a tool/package that works on these principles? I am imagining something that operates similarly to BLAST, e.g. $tool -i NGS.fasta -r reference_seqs.fasta .......
Thanks