Hi all, I'm trying to use the Restriction module of Biopython because I need to create a report of the fragments generated by a double in silico digestion. Briefly, if I digest a FASTA sequence with, say, EcoRI and PstI, I would like to know how many Eco-Pst, Pst-Eco, Eco-Eco and Pst-Pst fragments are generated. Does anybody know how to do this with Biopython?
EDIT
>Sequence
CTAGCGTAATAATAGGTACACTGAATAGTAGTACAGTACGTACAGCTTTTCCTGGGGATC
CTATCGCAATCGCGAATGCGACTTCACGTGAATAGATCTCATTCTGAGCTCCCTTATACG
TTATAGTTCGACTGTGCTTGATACAAAACGTTTTACTGACTATAACGTGGGGGCACGGGA
ATTCAACAGAACTCTCCAAGCTGTCGATTTCTGTATGTTTGAGATTAGATCAGACCTCAC
AAGACTCCCTAAACCATCCAGCCCACTTTATATCCCCTCTTCTCCGCCGGAGGTGAATTC
AATCCGGCACCAAGGGACTGACAATTTAGCGCAGATACGAGGCAGAACACCGGAAAGACC
AGCGGCACTCGCGGGGATCTGGCCCGGTGGGCCCCGGTCCGTGAGCCCGAAGACCCCCTC
CCCGAAGATTGGAGGTGCCAGGCAACTGAGGGAGGTGGCTGTCGACGCGCGCCCGGTGCC
CGGCCGAGATGTGGGGCCTCCCGGACGGGTCGACCAGCAGCCGGCCGGTGCCCCCTCCGT
In this sequence there are 2 EcoRI sites and 4 TaqI sites. I want to know how many Eco-Taq, Eco-Eco, Taq-Eco and Taq-Taq fragments the digestion produces.
The answer from ALchEmiXt seems reasonable, but I can't understand the subtraction step: there could be many contiguous sites of the same enzyme. How can I deal with that?
Thank you for your help.
This all depends on what your data currently looks like. Edit your question and post a small snippet of the data that you are trying to process.
Basically, you generate a list of cuts sorted by cut position. To get the fragment lengths, subtract item 1 from item 2, item 2 from item 3, and so forth. Since you have different enzymes, you need to track which enzyme generated each cut; hence the use of dictionaries (in Python, I believe) or hashes/arrays, whichever you prefer, to keep each cut position associated with its enzyme.
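To make the idea concrete, here is a minimal sketch in plain Python of the sorted-cut-list approach. It is not the Biopython way of doing it. The helper names `find_cuts` and `classify_fragments`, the toy sequence, and the decision to ignore the two terminal fragments (which have an enzyme at only one end in a linear sequence) are all assumptions made for illustration. Sites are found by simple string search on one strand; with Biopython, the cut positions could presumably be taken from the `search` method of the `Bio.Restriction` enzymes instead.

```python
from collections import Counter

def find_cuts(seq, site, offset):
    """Return 0-based cut positions for every occurrence of `site` in `seq`.

    `offset` is how many bases into the site the enzyme cuts
    (1 for EcoRI, G^AATTC; 1 for TaqI, T^CGA).
    """
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + offset)
        i = seq.find(site, i + 1)
    return cuts

def classify_fragments(cuts_by_enzyme):
    """Walk neighbouring cuts and label each internal fragment by its two ends."""
    # Flatten the dictionary into (position, enzyme) pairs sorted by position.
    cuts = sorted(
        (pos, enzyme)
        for enzyme, positions in cuts_by_enzyme.items()
        for pos in positions
    )
    counts = Counter()
    fragments = []
    # Each pair of neighbouring cuts delimits one fragment; its length is the
    # "subtraction step" (right position minus left position).
    for (left_pos, left_enz), (right_pos, right_enz) in zip(cuts, cuts[1:]):
        fragments.append((left_enz, right_enz, right_pos - left_pos))
        counts[f"{left_enz}-{right_enz}"] += 1
    return counts, fragments

# Toy sequence (hypothetical): TaqI site, two EcoRI sites, then a TaqI site.
seq = "AAAATCGACCCCGAATTCGGGGGAATTCTTTTTCGAAA"
cuts = {
    "EcoRI": find_cuts(seq, "GAATTC", 1),
    "TaqI": find_cuts(seq, "TCGA", 1),
}
counts, fragments = classify_fragments(cuts)
for label in ("EcoRI-EcoRI", "EcoRI-TaqI", "TaqI-EcoRI", "TaqI-TaqI"):
    print(label, counts[label])
```

Contiguous sites of the same enzyme need no special handling here: two neighbouring EcoRI cuts simply produce one Eco-Eco fragment, because every fragment is defined only by the pair of cuts that flank it.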