How to slice or trim a DNA sequence wherever it contains NNNN or other letters that represent ambiguity?
1
0
Entering edit mode
8.1 years ago

How can I slice a DNA sequence wherever a run of N occurs, and also at any other letter that represents ambiguity? Are there any software tools available that I can use to do this?

For example:

>chr1:0-45
ATCGCTAGCTAGCTRCGAGCGTAGCNNNNNNCGATCGATCGATCAG

into

>chr1:0-13
ATCGCTAGCTAGCT
>chr1:15-24
CGAGCGTAGC
>chr1:31-45
CGATCGATCGATCAG
8.1 years ago
Eric Lim ★ 2.2k

I don't know of an existing tool for this, but I would write a very simple script to accomplish it. In Python, one can do

[(m.start(0), m.end(0)-1) for m in re.finditer('[ACTG]+', seq)]

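where seq = 'ATCGCTAGCTAGCTRCGAGCGTAGCNNNNNNCGATCGATCGATCAG' and the re module has been imported (import re). This returns the 0-based, inclusive (start, end) coordinates of every run of unambiguous bases: [(0, 13), (15, 24), (31, 45)].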
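To go from coordinates to the FASTA output shown in the question, the same regex idea can be wrapped in a small script. This is only a sketch under a few assumptions: the helper names `read_fasta` and `split_on_ambiguity` and the `min_len` parameter are made up for illustration, the text before the first `:` in the header is taken as the sequence name, coordinates are 0-based and inclusive relative to the start of each record (as in the example), and lowercase (soft-masked) bases are treated as unambiguous.

```python
import re

# Runs of unambiguous bases; IGNORECASE keeps soft-masked (lowercase) bases.
UNAMBIGUOUS = re.compile(r"[ACGT]+", re.IGNORECASE)


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_on_ambiguity(header, seq, min_len=1):
    """Yield (new_header, subsequence) for each unambiguous run in seq."""
    # Assumption: the sequence name is everything before the first ':'.
    name = header.split(":")[0]
    for m in UNAMBIGUOUS.finditer(seq):
        if m.end() - m.start() >= min_len:
            # m.end() is exclusive, so subtract 1 for an inclusive end.
            yield f"{name}:{m.start()}-{m.end() - 1}", m.group()


fasta = """>chr1:0-45
ATCGCTAGCTAGCTRCGAGCGTAGCNNNNNNCGATCGATCGATCAG
"""
for header, seq in read_fasta(fasta.splitlines()):
    for new_header, piece in split_on_ambiguity(header, seq):
        print(f">{new_header}\n{piece}")
```

For a real file, pass an open file handle to `read_fasta` instead of `fasta.splitlines()`. Raising `min_len` drops very short fragments, which may be useful when ambiguity codes are densely scattered.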


This is indeed very helpful. Thanks.


You're welcome. I hope it points you in the right direction. Feel free to ask if you need additional help.
