Question

Biopython script to align multiple sequences


4.4 years ago

anasjamshed ▴ 140

This script is supposed to loop through all the sequences in a CSV file and compute a pairwise alignment of each one against a query sequence:

from Bio.SubsMat import MatrixInfo as matlist
from Bio import pairwise2
from Bio.pairwise2 import format_alignment
from Bio.Seq import Seq
import pandas as pd


main_df = pd.read_csv('seq_test2.csv')
seq_db = main_df['Sequence'].values


seq1 = 'CTCTACATTCACTTA'
matrix = matlist.blosum62


sequences = []
for seq in seq_db:
    sequences.append(seq)

for seq in sequences:
    seq = Seq(seq)
    align_query = pairwise2.align.globalds(seq1, seq,
                                           matrix, -1, -0.5,
                                           gap_char='-',
                                           one_alignment_only=True)
    print(format_alignment(*align_query[0]))

But it gives the following error:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\Bio\pairwise2.py in __call__(self, charA, charB)
   1287             charB, charA = charA, charB
-> 1288         return self.score_dict[(charA, charB)]
   1289 

KeyError: ('J', 'A')

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
SystemError: PyEval_EvalFrameEx returned a result with an error set

[The same SystemError block repeats eight more times.]

Please help me fix this error.

python biopython • 1.2k views

updated 4.4 years ago by Istvan Albert 102k • written 4.4 years ago by anasjamshed ▴ 140

score 0 · Answer 1 · 2020-10-19


4.4 years ago

Istvan Albert 102k

Verify the sequences that you are loading from the file.

It appears that at least one sequence you are loading has a J in it.

J is not a valid DNA nucleotide code and is not one of the 20 standard amino acid codes, so it has no entry in the scoring matrix. That missing entry is what raises KeyError: ('J', 'A').
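
One way to verify the loaded sequences before aligning is to scan each one for letters outside the expected alphabet. The following is a minimal sketch, not part of the original script: the helper name `find_invalid_letters` and the sample sequences are hypothetical, and the allowed alphabet is assumed to be plain unambiguous DNA.

```python
# Letters accepted for plain DNA (assumption: no ambiguity codes such as N).
DNA_LETTERS = set("ACGT")


def find_invalid_letters(sequence, allowed):
    """Return a sorted list of characters in `sequence` that are not in `allowed`."""
    return sorted(set(sequence.upper()) - allowed)


# Hypothetical stand-in for the 'Sequence' column read from the CSV file.
sequences = ["CTCTACATTCACTTA", "CTCJACATT", "ACGTN"]

for index, seq in enumerate(sequences):
    bad = find_invalid_letters(seq, DNA_LETTERS)
    if bad:
        print(f"row {index}: unexpected letters {bad}")
```

In the script from the question, the same loop could run over `seq_db` to report which rows contain a J before any alignment is attempted.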



Yes, my sequence contains J.

4.4 years ago by anasjamshed ▴ 140


What does J mean?

I believe the Biopython aligner can accept a J as long as your scoring matrix contains match and mismatch scores for it.

So if you really want to align a J to an A, generate a custom scoring matrix and load it into the aligner.
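
As a rough sketch of what building such a custom matrix could look like, the code below extends a dictionary-style matrix (keys are letter pairs, as in the traceback's `score_dict[(charA, charB)]` lookup) with entries for J. It assumes J stands for "leucine or isoleucine" and scores it as the mean of the L and I scores; that averaging rule, the helper names, and the three-residue toy matrix are illustrative assumptions, not a Biopython feature.

```python
def lookup(matrix, a, b):
    """Return the score for the pair (a, b), trying both key orders."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def add_ambiguity_code(matrix, new_code, stands_for):
    """Return a copy of `matrix` with scores added for `new_code`.

    Each new score is the mean of the scores of the residues that
    `new_code` stands for (an assumed rule, chosen for illustration).
    """
    letters = sorted({ch for pair in matrix for ch in pair})
    extended = dict(matrix)
    for other in letters:
        scores = [lookup(matrix, res, other) for res in stands_for]
        extended[(new_code, other)] = sum(scores) / len(scores)
    # Score of the new code against itself: mean over all residue pairs.
    self_scores = [lookup(matrix, x, y) for x in stands_for for y in stands_for]
    extended[(new_code, new_code)] = sum(self_scores) / len(self_scores)
    return extended


# Toy matrix covering only A, I and L (a real matrix covers every residue).
toy_matrix = {
    ("A", "A"): 4, ("A", "I"): -1, ("A", "L"): -1,
    ("I", "I"): 4, ("I", "L"): 2, ("L", "L"): 4,
}

custom_matrix = add_ambiguity_code(toy_matrix, "J", "IL")
print(custom_matrix[("J", "A")])
```

A dictionary extended this way from the full matrix could then be passed in place of `matrix` in the `globalds` call from the question, assuming the aligner accepts a plain dictionary there as it does for `matlist.blosum62`.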

4.4 years ago by Istvan Albert 102k