Pairwise alignment for multiple sequences in a file
7.8 years ago
biobudhan ▴ 20

I have about 10 protein/DNA sequences in a file in FASTA format and would like to do a pairwise alignment for all possible combinations in this file.

Example:

 Seq1 vs Seq2
 Seq1 vs Seq3
 Seq1 vs Seq4 and so on.

Are there tools that can perform this?

7.8 years ago
st.ph.n ★ 2.7k

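You can use itertools from the Python standard library together with pairwise2 from Biopython. itertools.combinations generates every unordered pair of sequences exactly once, so each pair is aligned a single time and no sequence is aligned against itself.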

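#!/usr/bin/env python
import itertools
import sys

from Bio import SeqIO, pairwise2

fasta = sys.argv[1]

# Parse with SeqIO so sequences wrapped over several lines stay in one record
records = list(SeqIO.parse(fasta, 'fasta'))

# Every unordered pair of records: (1,2), (1,3), ..., (n-1,n)
for a, b in itertools.combinations(records, 2):
    # localxx: local alignment, match = 1, no mismatch or gap penalties
    alns = pairwise2.align.localxx(str(a.seq), str(b.seq))
    print('%s vs %s' % (a.id, b.id))
    if alns:  # the list is empty when no alignment is found
        print(pairwise2.format_alignment(*alns[0]))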

Save as aln.py; run as python aln.py your_fasta_file.fasta
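
To get a feel for how many alignments this runs: itertools.combinations yields each unordered pair once, so n sequences give n(n-1)/2 pairs, which is 45 for 10 sequences. Here is a small standard-library sketch of just the pairing step. The IDs Seq1 to Seq4 are placeholders for illustration, not anything read from your file:

```python
import itertools

# Placeholder IDs for illustration; in practice these come from the FASTA headers
ids = ["Seq1", "Seq2", "Seq3", "Seq4"]

# Each unordered pair appears exactly once, and nothing is paired with itself
pairs = list(itertools.combinations(ids, 2))

for a, b in pairs:
    print(a, "vs", b)

# n sequences give n * (n - 1) / 2 pairs
n = len(ids)
assert len(pairs) == n * (n - 1) // 2
```

If you also need the reverse direction (Seq2 vs Seq1), itertools.permutations(ids, 2) gives ordered pairs instead.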
