Pairwise2 for seq_record in list
4.5 years ago
adrian18_07 ▴ 10

I would like to do a global alignment of two sequences. I have SeqRecord objects in two lists. I would like to align the sequence of the first record in one list with the first record in the second list, then the second with the second, the third with the third, and so on, in a loop.

Here is the content of my first list:

[
SeqRecord(
    seq=Seq('CCCGGGKKGGGKACTGCGGGAGGWCATTGTCGAACCTGCCCGACAGAGCGACCC...GAA', IUPACAmbiguousDNA()),
    id='BIE-1_ITS5',
    name='BIE-1_ITS5_F',
    description='',
    dbxrefs=[]),
SeqRecord(
    seq=Seq('GCGTGKGRAKACTGCGAGAGGWCATTGTCGAACCTGCCCGACAGAGCGACCCGC...AAA', IUPACAmbiguousDNA()),
    id='BIE-2_ITS5',
    name='BIE-2_ITS5_F',
    description='',
    dbxrefs=[]),
SeqRecord(
    seq=Seq('GCGGGTGGAKACTGCGGAGGWCATTGTCGAACCTGCCCGACAGAGCGACCCGCG...AAA', IUPACAmbiguousDNA()),
    id='BIE-3_ITS5',
    name='BIE-3_ITS5_F',
    description='',
    dbxrefs=[])
]

And here the second list:

[
SeqRecord(
    seq=Seq('GGAAGTAAATGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTG...CCT', IUPACAmbiguousDNA()),
    id='BIE-1_ITS4',
    name='BIE-1_ITS4_R',
    description='',
    dbxrefs=[]),
SeqRecord(
    seq=Seq('GTGAAGTATAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCAT...CCC', IUPACAmbiguousDNA()),
    id='BIE-2_ITS4',
    name='BIE-2_ITS4_R',
    description='',
    dbxrefs=[]),
SeqRecord(
    seq=Seq('TGGAAGTAAAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCAT...TGC', IUPACAmbiguousDNA()),
    id='BIE-3_ITS4',
    name='BIE-3_ITS4_R',
    description='',
    dbxrefs=[])
]

Thanks for any answer.

biopython pairwise2 • 1.3k views

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

So if I understand you correctly, you have two lists with an equal number of sequences, and you want to perform a pairwise alignment between the two lists wherever the positions match, i.e. index(list_1_sequence) == index(list_2_sequence)? Have you considered a classic for-loop approach, where you iterate over the indices 0 to number_of_sequences - 1 and perform the pairwise alignment for each index inside the loop?

A simple example:

from Bio import pairwise2

list_1 = ["aa", "bb", "cc"]
list_2 = ["ac", "bb", "dc"]

for item in list_1:
    # position of the current item in list_1
    ind = list_1.index(item)
    # align the two items that share this position
    alignment = pairwise2.align.globalxx(list_1[ind], list_2[ind])

Just a note if you follow this: in your case the lists do not contain bare sequences but SeqRecord objects, which carry additional metadata for every sequence. To extract only the sequence, use the .seq attribute.
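A minimal sketch of the same idea that avoids looking up indices: `zip` walks both lists in step, so each pair shares the same position. To keep the sketch runnable without Biopython, `Record` is a hypothetical stand-in for `SeqRecord` (only the `id` and `seq` attributes are mimicked), and `count_matches` and `align_in_pairs` are illustrative names, not Biopython functions.

```python
from collections import namedtuple

# Hypothetical stand-in for Bio.SeqRecord.SeqRecord: only .id and .seq are mimicked.
Record = namedtuple("Record", ["id", "seq"])


def count_matches(a, b):
    """Placeholder for a real aligner: count identical positions, no gaps."""
    return sum(x == y for x, y in zip(a, b))


def align_in_pairs(records_1, records_2, align):
    """Apply align() to the sequences of records that share the same position."""
    if len(records_1) != len(records_2):
        raise ValueError("both lists must contain the same number of records")
    results = {}
    for rec_1, rec_2 in zip(records_1, records_2):
        # .seq pulls out the sequence and leaves the metadata behind
        results[(rec_1.id, rec_2.id)] = align(rec_1.seq, rec_2.seq)
    return results


forward = [Record("F1", "ACGT"), Record("F2", "GGCC")]
reverse = [Record("R1", "ACGA"), Record("R2", "GGCC")]
scores = align_in_pairs(forward, reverse, count_matches)
```

With Biopython installed, passing `pairwise2.align.globalxx` as the `align` argument and the two SeqRecord lists as input should give the alignments for each pair; depending on the Biopython version, the sequences may need to be wrapped in `str()` first.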


I get an error on the line `ind = list_1.index(item)`: NotImplementedError: SeqRecord comparison is deliberately not implemented. Explicitly compare the attributes of interest.


I think it would help if you shared your code here. What I suspect is going wrong is that you are performing the pairwise alignment on the SeqRecord objects, while you should be doing it on SeqRecord.seq (as I mentioned in the last part of my comment).

For example what you could do in the for-loop is:

for item in list_1:
    # access the sequence of the SeqRecord in list_1
    seq1 = item.seq
    # get the index of this SeqRecord
    ind = list_1.index(item)
    # get the sequence at the same index in list_2
    seq2 = list_2[ind].seq
    # perform the alignment
    align = pairwise2.align.globalxx(seq1, seq2)

After entering this code:

for item in f:
    seq1 = item.seq
    ind = f.index(item)
    print(ind)

I get this:

0

Traceback (most recent call last):
  File "C:\Users\Adrian\Desktop\Sekwencje\skrypt 1.py", line 44, in <module>
    ind = f.index(item)
  File "C:\Users\Adrian\anaconda3\lib\site-packages\Bio\SeqRecord.py", line 803, in __eq__
    raise NotImplementedError(_NO_SEQRECORD_COMPARISON)
NotImplementedError: SeqRecord comparison is deliberately not implemented. Explicitly compare the attributes of interest.
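
The output is consistent with how `list.index` searches: it compares the target with each element in turn. At least in CPython, an element that is the very same object counts as a match before `==` is called, which would explain why the first record prints `0`. For the second record, the search first compares it with the first element using `==`, and that is where `SeqRecord.__eq__` raises. The sketch below reproduces this with a hypothetical stand-in class (`FakeRecord` is not part of Biopython; it only mimics the `__eq__` behaviour shown in the traceback) and then gets the positions with `enumerate`, which never compares records at all.

```python
class FakeRecord:
    """Hypothetical stand-in that mimics SeqRecord's refusal to be compared."""

    def __init__(self, id, seq):
        self.id = id
        self.seq = seq

    def __eq__(self, other):
        raise NotImplementedError("SeqRecord comparison is deliberately not implemented.")


f = [FakeRecord("BIE-1", "ACGT"), FakeRecord("BIE-2", "ACGA"), FakeRecord("BIE-3", "ACGG")]

# list.index finds the first record by identity, then fails on the second one
found = []
try:
    for item in f:
        found.append(f.index(item))
except NotImplementedError:
    pass

# enumerate yields the position directly, so no comparison is ever made
positions = [ind for ind, item in enumerate(f)]
```

In the loop from the thread, replacing `for item in f:` plus `f.index(item)` with `for ind, item in enumerate(f):` should give the same `ind` values without triggering the comparison.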
