Hi All,
I am able to find a matching motif in my sequences, and I would now like to find overlapping motifs. Basically, after matching my motif, I want to retrieve the 6 amino acids that follow it. This is the code I used to find the motif:
import Bio
import regex
from Bio import SeqIO

input_file = 'sequences.fasta'
fasta_sequences = SeqIO.parse(open(input_file), 'fasta')
for fasta in fasta_sequences:
    name, sequence = fasta.id, str(fasta.seq)
    result = regex.finditer(r"[YFWLIMVA]..[LMALVN]..[AGSTCD].[LAIVNFYMW]", sequence)
    for x in result:
        print(name, x.start(), x.end(), x.group())
The above code works perfectly because it gives me the sequence ID, the positions, and the motif. The output is below:
P1 33 41 VTLLPAADL
What I want now is to also get the 6 amino acids that follow the matched motif, so that I get output like the one below.
P1 33 47 VTLLPAADLLMAIID
The code that I have tried to get the 6 amino acids after my match is below.
import Bio
import regex
from Bio import SeqIO

input_file = 'sequences.fasta'
fasta_sequences = SeqIO.parse(open(input_file), 'fasta')
for fasta in fasta_sequences:
    name, sequence = fasta.id, str(fasta.seq)
    result = regex.finditer(r"[YFWLIMVA]..[LMALVN]..[AGSTCD].[LAIVNFYMW]", sequence)
    for x in result:
        print(name, x.start(), x.end() + 6, x.group())
This is the output it gives me:
#It does not extend my motif by 6 amino acids, after getting the match.
P1 33 47 VTLLPAADL
#My desired output is this, which includes the overlapping LMAIID motif
P1 33 47 VTLLPAADLLMAIID
I also tried the code below, but it returns an error.
import Bio
import regex
from Bio import SeqIO

input_file = 'sequences.fasta'
fasta_sequences = SeqIO.parse(open(input_file), 'fasta')
for fasta in fasta_sequences:
    name, sequence = fasta.id, str(fasta.seq)
    result = regex.finditer(r"[YFWLIMVA]..[LMALVN]..[AGSTCD].[LAIVNFYMW]", sequence)
    for x in result:
        print(name, x.start(), x.end() + 6, x.group() +6)
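Your regex result x holds only the matched text, not the whole FASTA record, so x.group() can never return more than the motif itself. Adding 6 to x.end() changes only the number that is printed, and x.group() + 6 fails because a string and an integer cannot be added. To get the extra residues, slice the full sequence with the extended coordinates, i.e. sequence[x.start():x.end() + 6].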
Thank you Michael, how do I do that? I am still new to this; could you perhaps provide some example code showing how to do it?
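A minimal sketch of the slicing approach suggested above, offered as one possible way to do it rather than a definitive answer. The helper names `extended_matches` and `scan_fasta` and the `extra` parameter are made up for this example. It uses the standard-library `re` module instead of `regex`; this particular pattern behaves the same in both. The reported positions are Python's 0-based start and exclusive end, and the end is clamped so that a motif near the end of a sequence simply returns fewer than 6 extra residues.

```python
import re

# Motif pattern copied from the question.
MOTIF = re.compile(r"[YFWLIMVA]..[LMALVN]..[AGSTCD].[LAIVNFYMW]")


def extended_matches(sequence, extra=6):
    """Yield (start, end, text) for each motif hit plus `extra` residues after it.

    `extended_matches` and `extra` are hypothetical names for this sketch.
    """
    for m in MOTIF.finditer(sequence):
        # Clamp so the reported end never runs past the sequence.
        end = min(m.end() + extra, len(sequence))
        # Slice the full sequence string, not the match object.
        yield m.start(), end, sequence[m.start():end]


def scan_fasta(path):
    """Print extended hits for every record in a FASTA file (needs Biopython)."""
    from Bio import SeqIO  # imported here so the helper above works without it

    for record in SeqIO.parse(path, "fasta"):
        for start, end, text in extended_matches(str(record.seq)):
            print(record.id, start, end, text)
```

Calling `scan_fasta('sequences.fasta')` should print one line per hit with the sequence ID, the start, the extended end, and the motif followed by up to 6 residues.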