Hello!
I am very new to Biopython and I am trying to accomplish what I think is a simple task: I would like to remove sequences from a protein alignment that do not contain a particular residue at a specified position. I would like to input a protein alignment in FASTA format and output a new alignment from which all sequences that do not meet my criteria have been removed.
For example, my input protein alignment contains sequences that have a mixture of residues at position 137. I would like to output a new alignment that contains only the sequences that have either an arginine or a valine at position 137.
Just a bit of additional clarification: I am sequencing an amplicon of a functional gene and generating protein sequence alignments using RDP's FunGene pipeline. I want to further screen the alignment by eliminating any sequences that do not contain one of a selection of conserved residues at various positions.
Thank you very much for your time.
-J
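For anyone reading along, here is a minimal sketch of the kind of filter described in the question. It is not the answer that was given, and not the poster's own script. It assumes that "position 137" means the 1-based alignment column (gaps included), that the input is an aligned FASTA file readable by Biopython's Bio.SeqIO, and that residues are one-letter codes (R for arginine, V for valine). The function names are made up for illustration.

```python
def has_allowed_residue(sequence, column, allowed):
    """Return True if `sequence` has one of the `allowed` one-letter
    residues at the given 1-based alignment column.

    Assumption: `column` counts alignment columns, so gap characters
    count as positions too.
    """
    residues = str(sequence)
    if not 1 <= column <= len(residues):
        return False  # column falls outside this sequence
    return residues[column - 1].upper() in allowed.upper()


def filter_alignment(in_path, out_path, column, allowed):
    """Copy to `out_path` only the records from the aligned FASTA file
    `in_path` that pass has_allowed_residue. Returns the number kept."""
    # Imported here so the helper above can be used without Biopython.
    from Bio import SeqIO

    kept = [
        record
        for record in SeqIO.parse(in_path, "fasta")
        if has_allowed_residue(record.seq, column, allowed)
    ]
    # SeqIO.write returns the number of records written.
    return SeqIO.write(kept, out_path, "fasta")
```

With hypothetical file names, the example from the question would be filter_alignment("aligned.fasta", "filtered.fasta", 137, "RV"). To screen several positions, the same call can be repeated on the output of the previous one.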
Thank you so much for the answer!
It was just the advice I needed.
This is the Python file I made to accomplish my goal (it's probably really ugly to anyone who has actual Python experience).