6.0 years ago
asidhu
I'm struggling to create a sliding window that loops over the first 30 nucleotides of each sequence and identifies the forward primer, so that I can later trim the primer from the sequences. The input is a FASTA file with around 3400 sequences.
import numpy as np

# reverse_complement(), hamming() and good_reads are assumed to be defined elsewhere in the script (not shown here)
filename = "paired.fasta"
min_length = 150
mismatches = 3
fprimer = np.asarray(list("GTGCCAGCMGCCGCGGTAA"))
f_len = len(fprimer)
rprimer = "ACAGCCATGCANCACCT"
rprimer = np.asarray(list(reverse_complement(rprimer)))
r_len = len(rprimer)
for header, sequence in good_reads.items():
    # Search only the first and last 30 nucleotides of each read
    forward_region = np.asarray(list(sequence[:30]))
    reverse_region = np.asarray(list(sequence[-30:]))
    winSizeF = len(forward_region) - f_len + 1
    winSizeR = len(reverse_region) - r_len + 1
    for i in range(winSizeF):
        start = i
        end = i + f_len
        target = forward_region[start:end]
        # Compare the primer with the current window, not with the window count
        distance = hamming(fprimer, target)
        if distance == 0:
            break
When I run the for loop I get the following error:
line 103
""" for header, data in sequences.items():
^
SyntaxError: invalid syntax
Any help with this sliding window would be appreciated.
The code with the error is not shown.
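Line 103 is not in your example. Maybe you missed a parenthesis or a bracket before line 103. The """ at the start of the reported line may also point to a triple-quoted string that was never closed.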
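For the sliding window itself, here is a minimal sketch in plain Python, assuming each read is an ordinary string. The names IUPAC, count_mismatches, find_primer and trim_forward are hypothetical helpers written for this illustration, not part of the original script or of any library. Because the forward primer contains the ambiguity code M, the sketch treats a primer position as matching when the read base is one of the bases that code allows, rather than using a strict Hamming distance.

```python
# Standard IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, window):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(base not in IUPAC.get(code, code)
               for code, base in zip(primer, window))


def find_primer(sequence, primer, search_len=30, max_mismatches=3):
    """Slide the primer over the first search_len bases of the read.

    Returns (start, mismatch_count) for the best window, or None when no
    window has at most max_mismatches mismatches.
    """
    region = sequence[:search_len].upper()
    best = None
    for start in range(len(region) - len(primer) + 1):
        window = region[start:start + len(primer)]
        d = count_mismatches(primer, window)
        if best is None or d < best[1]:
            best = (start, d)
        if d == 0:
            break  # perfect match, no need to look further
    if best is not None and best[1] <= max_mismatches:
        return best
    return None


def trim_forward(sequence, primer, **kwargs):
    """Return the read with everything up to the end of the primer removed."""
    hit = find_primer(sequence, primer, **kwargs)
    if hit is None:
        return sequence  # primer not found, leave the read unchanged
    return sequence[hit[0] + len(primer):]


if __name__ == "__main__":
    fprimer = "GTGCCAGCMGCCGCGGTAA"
    read = "ACGT" + "GTGCCAGCAGCCGCGGTAA" + "TTTTGGGGCCCC"
    print(find_primer(read, fprimer))
    print(trim_forward(read, fprimer))
```

The same function could be applied to the reverse primer by searching the last 30 bases of each read with the reverse-complemented primer. Whether reads without a detectable primer should be kept or discarded is a separate decision that the sketch leaves open.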