I have a reference sequence and a multi-FASTA file. I'm trying to align each sequence in the file against my reference sequence. My goal is to find the ones with deletion mutations. All sequences are the same length as the reference (I believe there are some insertions too), so I can't filter them by length.
I want to filter the sequences with mutations by their alignment score. For example, I want to find the ones that score less than 1000 and remove them from my set of sequences.
Here's my alignment code:
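from Bio import SeqIO
from Bio import pairwise2

# Read the reference; the file holds a single record, so the loop runs once
ref_seq = SeqIO.parse("ref_seq.fasta", 'fasta')
for i in ref_seq:
    refseq = str(i.seq)

# Align every sequence in the multi-FASTA file against the reference
sequences = SeqIO.parse("deneme.fasta", 'fasta')
alignments = []
for i in sequences:
    seq = str(i.seq)
    # globalxx: match = 1, no mismatch or gap penalties
    alignment = pairwise2.align.globalxx(refseq, seq, one_alignment_only=True)
    alignments.append(alignment)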
The output of the alignment is like this:
Alignment(seqA='...', seqB='...', score=1269.0, start=0, end=1277)
I read the tutorial for the pairwise2 module but I couldn't find anything. How can I filter the sequences by their alignment score?
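For anyone landing here later, a minimal sketch of one way to do it. The alignment printed above is a named tuple, so its score can be read as an attribute, and with one_alignment_only=True each call returns a list holding a single alignment. To keep the sketch runnable without Biopython or my files, it uses a stand-in Alignment tuple with the same fields and made-up IDs and scores; split_by_score is a hypothetical helper name, not part of pairwise2.

```python
from collections import namedtuple

# Stand-in with the same fields as the Alignment shown in the output above.
# With Biopython installed, the real objects come from pairwise2 instead.
Alignment = namedtuple("Alignment", "seqA seqB score start end")


def split_by_score(results, threshold=1000):
    """Split (sequence_id, hits) pairs into kept and discarded IDs.

    `hits` is the list returned for one sequence; only its first
    alignment is inspected. Hypothetical helper for illustration.
    """
    kept, discarded = [], []
    for seq_id, hits in results:
        score = hits[0].score  # named tuple field, same as hits[0][2]
        if score >= threshold:
            kept.append(seq_id)
        else:
            discarded.append(seq_id)
    return kept, discarded


# Made-up example data: one sequence above the cutoff, one below it
results = [
    ("seq1", [Alignment("ACGT", "ACGT", 1269.0, 0, 1277)]),
    ("seq2", [Alignment("ACGT", "A-GT", 950.0, 0, 1277)]),
]

kept, discarded = split_by_score(results)
print("kept:", kept)
print("discarded:", discarded)
```

In the loop from the question, the same check would be alignment[0].score < 1000, and it is probably more useful to store the record itself (i) rather than the alignment, so the surviving sequences can be written back out afterwards.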
Oh... How did I not think of that? Thank you very much!