Sliding window on FASTA sequence
2.9 years ago
pablo ▴ 310

Hello,

I want to run a sliding window over a FASTA sequence, once from the beginning and once from the end, and save the results in two separate files.

I have:

from Bio import SeqIO

with open("splicing_window.beginning.fasta","w") as f1, open("splicing_window.end.fasta","w") as f2:
    for seq_record in SeqIO.parse("my_sequence.test.fasta", "fasta"):
        for i in range(len(seq_record.seq) - 9):
            f1.write(">" + str(seq_record.id) + "\n")
            f1.write(str(seq_record.seq[i:i+10]) + "\n")
            f2.write(">" + str(seq_record.id) + "\n")
            f2.write(str(seq_record.seq[:-i+10]) + "\n")

The problem is with the window from the end. For example, given the sequence TCCGCCGGAAGG, I'd like the f2 FASTA file (window of 10 nt) to look like this:

>1
CGCCGGAAGG
>2
CCGCCGGAAG
>3
TCCGCCGGAA

Any help? Best

biopython python • 953 views
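from Bio import SeqIO

window_size = 10
step_size = 1

for record in SeqIO.parse("test.fa", "fasta"):
    # j is the 1-based end position of each window; start at the last base and walk back
    for j in range(len(record.seq), window_size - 1, -step_size):
        start = j - window_size  # 0-based slice start
        # the slice end is exclusive, so [start:j] is exactly window_size bases long
        print(">" + record.description + "_" + str(start + 1) + "_" + str(j) + "\n" + str(record.seq[start:j]))


$ /bin/python3 sliding_window.py

>seq1_3_12
CGCCGGAAGG
>seq1_2_11
CCGCCGGAAG
>seq1_1_10
TCCGCCGGAA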
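
If it helps, the same idea can be wrapped in a small helper that works on plain strings, so it can be checked without a FASTA file. This is only a sketch: the names sliding_windows and windows_to_fasta are made up for illustration and are not part of Biopython. The key point is that a Python slice end is exclusive, so a window of 10 starting at 0-based position start is seq[start:start+10], and walking from the end just means visiting the start positions in reverse.

```python
def sliding_windows(seq, window_size=10, step_size=1, from_end=False):
    """Yield (start, end, window) tuples with 1-based, inclusive coordinates.

    Hypothetical helper for illustration; `seq` is a plain string.
    """
    last_start = len(seq) - window_size
    if last_start < 0:
        return  # sequence is shorter than one window: nothing to yield
    if from_end:
        # Begin at the last full window and walk back towards position 0.
        starts = range(last_start, -1, -step_size)
    else:
        starts = range(0, last_start + 1, step_size)
    for start in starts:
        end = start + window_size  # exclusive slice end
        yield start + 1, end, seq[start:end]


def windows_to_fasta(name, seq, **kwargs):
    """Format every window as a FASTA record named <name>_<start>_<end>."""
    return "".join(
        f">{name}_{start}_{end}\n{window}\n"
        for start, end, window in sliding_windows(seq, **kwargs)
    )


# Windows from the end of the example sequence in the question.
print(windows_to_fasta("seq1", "TCCGCCGGAAGG", from_end=True), end="")
```

With Biopython records you would pass seq_record.id and str(seq_record.seq), then write the result of windows_to_fasta(...) to f1 and the from_end=True result to f2. Note that with a step size above 1 the two directions are anchored at opposite ends, so they may not cover the same positions.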

Thanks a lot.


Updated the code again for better understanding.
