Convert regex into DNAString object
5.4 years ago
elisheva ▴ 120

Hi,
I am trying to search for a pattern in a sequence such that a specific nucleotide does not flank the match on either side.
For example, given the following sequence:

x <- DNAString("TGCTTGCGCA")

I want to extract all the occurrences of GC that have no T immediately before or after them.
Only one occurrence fits: the sequence contains TGCT, TGCG, and finally CGCA, and only the last meets the condition.
In other words, the matching pattern is: {T}GC{T}, where {T} stands for any nucleotide other than T.
But I can't find a way to implement this using the Biostrings package.

I really hope you can help me figure it out.
Thanks for your help.

R bioconductor Biostrings • 1.3k views

What is the problem with just converting the DNAString to a character and doing your regex with that?
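For illustration, here is a minimal base R sketch of that suggestion. The helper name `find_unflanked` is hypothetical, and the sequence is assumed to already be a plain character string (for example via `as.character(x)`). It uses Perl-style lookarounds so that overlapping contexts are still found. Note one assumption: with negative lookarounds, a GC at the very start or end of the sequence also counts as a match, which may or may not be what is wanted.

```r
# Hypothetical helper: start positions of `core` that are not
# immediately preceded or followed by `excluded`.
find_unflanked <- function(seq_chr, core = "GC", excluded = "T") {
  # Lookbehind/lookahead are zero-width, so the flanking letters are
  # checked but not consumed by the match.
  rx <- sprintf("(?<!%s)%s(?!%s)", excluded, core, excluded)
  hits <- gregexpr(rx, seq_chr, perl = TRUE)
  # gregexpr() reports -1 when there is no match; drop those entries.
  lapply(hits, function(h) {
    s <- as.integer(h)
    s[s > 0L]
  })
}

# Start positions of the GC itself, one list element per input string.
find_unflanked("TGCTTGCGCA")
```

The function is vectorised over `seq_chr`, so a whole character vector can be passed in a single call.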


Because I am working with a StringSet and I want the analysis to be as fast as possible. If I convert every single interval to character, I expect it will be much slower.
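One possible way to stay inside Biostrings, sketched here under assumptions rather than as a definitive answer: the IUPAC code V stands for A, C, or G (that is, "not T"), so the pattern VGCV with `fixed = FALSE` should express the same condition without any conversion to character. Two caveats: this form requires a letter on both sides, so a GC at the very edge of a sequence is not reported, and with `fixed = FALSE` ambiguity codes in the subject (such as N) are also treated as wildcards. The second sequence in the set is made up for the example.

```r
library(Biostrings)

x <- DNAString("TGCTTGCGCA")

# V = A, C or G; fixed = FALSE makes the ambiguity code act as a wildcard.
hits <- matchPattern("VGCV", x, fixed = FALSE)
gc_start <- start(hits) + 1L  # shift by one to get the position of the GC itself

# Same idea for a whole DNAStringSet, in one vectorised call.
# "AGCGCA" is an illustrative extra sequence, not from the question.
set <- DNAStringSet(c("TGCTTGCGCA", "AGCGCA"))
set_hits <- vmatchPattern("VGCV", set, fixed = FALSE)
startIndex(set_hits)  # list of match start positions, one element per sequence
```

If the subject may contain ambiguity codes, `fixed = "subject"` is worth checking in the documentation, since it is meant to interpret wildcards in the pattern only.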
