4.5 years ago
adrian18_07
I would like to do a global alignment of two sequences. I have SeqRecord objects in two lists. I would like to align the sequence of the first record in one list with the first record in the second list, then the second with the second, the third with the third, and so on, in a loop.
Here is the content of my first list:
[
SeqRecord(
seq=Seq('CCCGGGKKGGGKACTGCGGGAGGWCATTGTCGAACCTGCCCGACAGAGCGACCC...GAA', IUPACAmbiguousDNA()),
id='BIE-1_ITS5',
name='BIE-1_ITS5_F',
description='',
dbxrefs=[]),
SeqRecord(
seq=Seq('GCGTGKGRAKACTGCGAGAGGWCATTGTCGAACCTGCCCGACAGAGCGACCCGC...AAA', IUPACAmbiguousDNA()),
id='BIE-2_ITS5',
name='BIE-2_ITS5_F',
description='',
dbxrefs=[]),
SeqRecord(
seq=Seq('GCGGGTGGAKACTGCGGAGGWCATTGTCGAACCTGCCCGACAGAGCGACCCGCG...AAA', IUPACAmbiguousDNA()),
id='BIE-3_ITS5',
name='BIE-3_ITS5_F',
description='',
dbxrefs=[])
]
And here the second list:
[
SeqRecord(
seq=Seq('GGAAGTAAATGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTG...CCT', IUPACAmbiguousDNA()),
id='BIE-1_ITS4',
name='BIE-1_ITS4_R',
description='',
dbxrefs=[]),
SeqRecord(
seq=Seq('GTGAAGTATAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCAT...CCC', IUPACAmbiguousDNA()),
id='BIE-2_ITS4',
name='BIE-2_ITS4_R',
description='',
dbxrefs=[]),
SeqRecord(
seq=Seq('TGGAAGTAAAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCAT...TGC', IUPACAmbiguousDNA()),
id='BIE-3_ITS4',
name='BIE-3_ITS4_R',
description='',
dbxrefs=[])
]
Thanks for any answer.
Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code, or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.
So if I understand you correctly, you have two lists with an equal number of sequences, and you want to perform pairwise alignment between them so that the sequence at a given index in list 1 is aligned with the sequence at the same index in list 2? Have you considered a classic for-loop approach, where you iterate over the indices from 0 up to the number of sequences and, inside the loop, perform the pairwise alignment on the two sequences at that index?
A simple example:
Just a note if you follow this: in your case the lists do not contain bare sequences but records that also carry other metadata for every sequence. To extract only the sequence, use the record's `.seq` attribute.
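A minimal sketch of the index-matched loop described above, not the code originally posted in this thread. It pairs the lists with `zip` and scores each pair with a small pure-Python Needleman-Wunsch routine, swapped in here so the example runs without any library. The names `global_score` and `align_by_position`, the scoring values, and the toy sequences are all assumptions made for illustration.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score (score only, no traceback).

    Hypothetical helper: the match/mismatch/gap values are illustrative.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def align_by_position(list_1, list_2, score=global_score):
    """Score list_1[i] against list_2[i] for every index i."""
    if len(list_1) != len(list_2):
        raise ValueError("both lists must hold the same number of sequences")
    # zip pairs first with first, second with second, and so on
    return [(i, score(str(a), str(b)))
            for i, (a, b) in enumerate(zip(list_1, list_2))]


# Toy sequences standing in for the real reads
forward = ["GATTACA", "ACGT", "AAAA"]
reverse = ["GATTTACA", "ACGT", "TTTT"]
results = align_by_position(forward, reverse)
print(results)  # [(0, 6), (1, 4), (2, -4)]
```

With SeqRecord lists, one would pass the sequences rather than the records, for example `[rec.seq for rec in list_1]`. A library aligner can replace `global_score` through the `score` argument. In recent Biopython versions, `Bio.Align.PairwiseAligner` with `mode = "global"` and its `score` method should fit, though that depends on the installed version. Note that this sketch treats ambiguity codes such as K, W, and R as ordinary mismatching letters.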
I get an error on the line `ind = list_1.index(item)`:
NotImplementedError: SeqRecord comparison is deliberately not implemented. Explicitly compare the attributes of interest.
I think it would be helpful if you shared your code here, but what I suspect is going wrong is that you are performing pairwise alignment on the SeqRecord objects, while you should be doing this on SeqRecord.seq (as I mentioned in the last part of my comment).
For example, what you could do in the for-loop is:
After entering this code:
I get this:
0
Traceback (most recent call last):
File "C:\Users\Adrian\Desktop\Sekwencje\skrypt 1.py", line 44, in <module> ind = f.index(item)
File "C:\Users\Adrian\anaconda3\lib\site-packages\Bio\SeqRecord.py", line 803, in __eq__ raise NotImplementedError(_NO_SEQRECORD_COMPARISON)
NotImplementedError: SeqRecord comparison is deliberately not implemented. Explicitly compare the attributes of interest.
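The traceback points at `f.index(item)` rather than at an alignment call: `list.index` looks for the item by comparing it with each element, and comparing two SeqRecord objects with `==` raises. That would also explain why `0` is printed first, since Python's list lookup checks object identity before calling `==`, so the first record is found without a comparison. A minimal sketch of the behaviour and of an `enumerate`/`zip` loop that avoids the lookup altogether follows. `Record` is a hypothetical stand-in for SeqRecord so the example runs without Biopython, and the sequences are toy values.

```python
class Record:
    """Hypothetical stand-in for SeqRecord: == is deliberately unsupported."""

    def __init__(self, id, seq):
        self.id = id
        self.seq = seq

    def __eq__(self, other):
        raise NotImplementedError("Record comparison is deliberately not implemented.")


f = [Record("BIE-1_ITS5", "ACGT"), Record("BIE-2_ITS5", "ACGA")]
r = [Record("BIE-1_ITS4", "TTGT"), Record("BIE-2_ITS4", "TTGA")]

# The first item is found by identity, so __eq__ is never called
first_index = f.index(f[0])

# The second lookup has to compare against f[0] first, which raises
try:
    f.index(f[1])
    failed = False
except NotImplementedError:
    failed = True

# enumerate + zip yields the index and both records without any comparison
pairs = [(i, fwd.id, rev.id) for i, (fwd, rev) in enumerate(zip(f, r))]
print(pairs)
```

Inside such a loop, `fwd.seq` and `rev.seq` are the values to hand to the aligner.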