3.1 years ago
PolDE
I have reads that contain repeats of a known, conserved 10 nt sequence. I want to split each read into subunits, using that 10 nt sequence as a marker for where to cut, so that each subunit ends with the marker.
For example (the conserved sequence is cccgggttta):
>
acagtacccgggtttaatcgatcgatcgtacccgggtttagtacgtacgatcgtcccgggtttatgctgtcgtc
To get:
>
acagtacccgggttta
>
atcgatcgatcgtacccgggttta
>
gtacgtacgatcgtcccgggttta
>
tgctgtcgtc
Help is appreciated, thank you
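
For reference, the split described above could be sketched in Python roughly as follows. This is only a minimal sketch under some assumptions: the marker is matched exactly (no mismatches or case differences), the cut is made immediately after each occurrence, and the function name `split_after_marker` is made up for illustration.

```python
def split_after_marker(read, marker):
    """Split `read` immediately after every exact occurrence of `marker`.

    Hypothetical helper: assumes exact, case-sensitive matches only.
    """
    # str.split removes the marker, so it is re-appended to every piece
    # that was followed by one (all pieces except the last).
    pieces = read.split(marker)
    units = [piece + marker for piece in pieces[:-1]]
    # Keep the trailing remainder only if the read does not end with the marker.
    if pieces[-1]:
        units.append(pieces[-1])
    return units


read = "acagtacccgggtttaatcgatcgatcgtacccgggtttagtacgtacgatcgtcccgggtttatgctgtcgtc"
marker = "cccgggttta"

for unit in split_after_marker(read, marker):
    print(">")
    print(unit)
```

A read that does not contain the marker comes back unchanged as a single unit. Reads with sequencing errors inside the marker would need approximate matching, which this sketch does not attempt.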