Matching RNA motifs to cDNA sequences
6.9 years ago
sgwahls • 0

I am interested in matching RNA motifs (position weight matrices, PWMs) to cDNA sequences, but I am confused about the correct order of operations.

I am using the R package Biostrings, which only takes DNA sequences. I downloaded cDNA sequences from Ensembl. As I understand it, the cDNA sequence is the first-strand cDNA (i.e. equivalent to the template strand of the genomic DNA, and the reverse complement of the mRNA sequence).

If my understanding is correct, then since the RNA sequence is the reverse complement of the cDNA sequence, I should complement my RNA motifs into DNA (A -> T, C -> G, G -> C, U -> A) and then reverse them (reverse the column order of the PWM).

Does that mean my RNA motif has been converted into a cDNA motif that I can simply match against the cDNA sequence?

R code example:

    ## RNA motif "CCAU" and cDNA sequence "ATGG"
    motif = matrix(c(0,1,0,0,0,1,0,0,1,0,0,0,0,0,0,1), nrow = 4)
    rownames(motif) = c("A","C","G","U")
    motif
    #   [,1] [,2] [,3] [,4]
    # A    0    0    1    0
    # C    1    1    0    0
    # G    0    0    0    0
    # U    0    0    0    1

    ## complement, then reverse, to get the cDNA motif
    rownames(motif) = c("T","G","C","A")
    motif = motif[ ,ncol(motif):1]
    ## reorder rows for consistency
    motif = motif[sort(rownames(motif)), ]
    motif
    #   [,1] [,2] [,3] [,4]
    # A    1    0    0    0
    # C    0    0    0    0
    # G    0    0    1    1
    # T    0    1    0    0

    Biostrings::countPWM(motif, "ATGG")
    # [1] 1
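To sanity-check the conversion without relying on Biostrings, here is a minimal base-R sketch of the same steps. The helper names `rna_pwm_to_revcomp_dna` and `score_window` are hypothetical, made up for this illustration and not part of any package. It assumes the PWM rows are named A, C, G, U, and it only confirms the arithmetic of complementing and reversing; it does not tell me whether the downloaded cDNA sequences are actually in the antisense orientation I am assuming.

```r
# Hypothetical helper (not from Biostrings): reverse-complement an RNA PWM
# so it can be scored against the opposite DNA strand.
# Assumes the rows are named A, C, G, U.
rna_pwm_to_revcomp_dna <- function(pwm) {
  stopifnot(setequal(rownames(pwm), c("A", "C", "G", "U")))
  comp <- c(A = "T", C = "G", G = "C", U = "A")
  rownames(pwm) <- unname(comp[rownames(pwm)])         # complement the bases
  pwm <- pwm[, rev(seq_len(ncol(pwm))), drop = FALSE]  # reverse the positions
  pwm[c("A", "C", "G", "T"), , drop = FALSE]           # conventional row order
}

# Hypothetical helper: score a single window by summing the matrix entry
# for each base at its position.
score_window <- function(pwm, s) {
  bases <- strsplit(s, "")[[1]]
  stopifnot(length(bases) == ncol(pwm))
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_along(bases))])
}

rna_motif <- matrix(c(0,1,0,0, 0,1,0,0, 1,0,0,0, 0,0,0,1), nrow = 4,
                    dimnames = list(c("A", "C", "G", "U"), NULL))
dna_motif <- rna_pwm_to_revcomp_dna(rna_motif)

rna_hit  <- score_window(rna_motif, "CCAU")  # 4: RNA motif on the RNA site
dna_hit  <- score_window(dna_motif, "ATGG")  # 4: converted motif on the reverse complement
dna_miss <- score_window(dna_motif, "CCAT")  # 0: converted motif on the sense-strand DNA
```

The converted motif scores the reverse-complement site ("ATGG") exactly as the original scores "CCAU", and scores zero on the sense-strand DNA "CCAT". So if the sequences turn out to be in the sense orientation, the converted motif would miss the site, and only U -> T would be needed instead.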
RNA-Seq • 4.5k views