4.8 years ago
Chvatil
Hello,
I have a sequence such as:
>sequence1
AAACCCGGGTTTAAACCCGGGTTTGGGTTTGGG
which I load with:
from Bio import SeqIO
record_dict = SeqIO.to_dict(SeqIO.parse("sequence.fasta", "fasta"))
and I know how to select a specific part of this sequence by its coordinates with:
print(record_dict["sequence1"].seq[coordinate_start:coordinate_end])
# Python slices exclude the end index, so 3:8 covers positions 3 to 7
print(record_dict["sequence1"].seq[3:8])
and I get :
CCCGG
But what if I would like to remove this part from the sequence
>sequence1
AAACCCGGGTTTAAACCCGGGTTTGGGTTTGGG
and get
>sequence1
AAAGTTTAAACCCGGGTTTGGGTTTGGG
Does someone have an idea?
Thanks for your help
Here is a better example:
ACCGCTTTGAATCCGAGCTAG
           ---- ----
I want to remove 2 parts: TCCG and GCTA, which correspond to the coordinates 11:14 and 16:19 (0-based, both ends included).
In the end I would like to remove both and get:
>seq
ACCGCTTTGAAAG
If it were a string I'd say
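For a plain string, a minimal sketch of that idea might look like the following. It assumes the coordinates are 0-based with both ends included (hence the `end + 1`), and the variable names are only illustrative:

```python
# Second example from the question, as a plain Python string
seq = "ACCGCTTTGAATCCGAGCTAG"

# Region to drop: TCCG at positions 11 to 14 (0-based, inclusive)
start, end = 11, 14

# Keep everything before the region and everything after it
trimmed = seq[:start] + seq[end + 1:]
print(trimmed)
```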
But I don't know if + works for SeqIO records.
I see what you mean, but this is an easy example. In the real data I can have thousands of coordinates, so I added another example to show you.
Assuming a list of tuples containing the coordinates you wish to remove:
Again assuming you can just add seq records like that.
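One way this could be sketched is shown below. It is an illustration under stated assumptions, not necessarily the snippet originally posted: the helper name `remove_regions` is made up, and coordinates are assumed to be 0-based with both ends included, as in the question. It only relies on slicing and +, which Biopython's Seq objects support as well as plain strings.

```python
def remove_regions(seq, regions):
    """Return seq with every (start, end) region cut out.

    Assumption: coordinates are 0-based and inclusive on both ends.
    Works on anything that supports slicing and +, e.g. str.
    """
    kept = seq[:0]  # empty sequence of the same type as seq
    pos = 0         # first position not yet handled
    for start, end in sorted(regions):
        if start > pos:
            kept += seq[pos:start]  # keep the stretch before this region
        pos = max(pos, end + 1)     # skip the region (also handles overlaps)
    kept += seq[pos:]               # keep whatever follows the last region
    return kept


to_remove = [(11, 14), (16, 19)]
result = remove_regions("ACCGCTTTGAATCCGAGCTAG", to_remove)
print(result)  # ACCGCTTTGAAAG
```

Sorting the regions first means the sequence is walked once from left to right, so thousands of coordinates should not be a problem. With a parsed FASTA record, passing record_dict["sequence1"].seq instead of the string should behave the same way.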
Thank you for your help.