10.2 years ago
marina-orlova
Hi
I need to compare a motif against genomic regions using matchPWM(PWM, DNAstring)
and get the match scores.
I have the following MEME output:
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 47985 bayes= 11.8274 E= 5.3e-108
-123 -290 159 -149
-36 -44 -1250 136
-36 -390 162 -1250
-1250 -1250 -9 164
-64 -158 123 -64
-149 -44 -1250 158
-181 -1250 174 -181
-381 -109 -44 147
-11 -390 156 -1250
-101 -90 -190 147
-281 -190 162 -123
-223 -231 -290 183
-36 -1250 165 -1250
-1250 -1250 -1250 204
-36 -90 127 -223
-101 -231 -73 147
-101 -109 153 -381
-281 -231 -1250 191
-1250 -390 193 -1250
-1 -131 -1250 143
-64 -1250 149 -101
My code:
> library(Biostrings)
> A <- c(-123,-36,-36,-1250,-64,-149,-181,-381,-11,-101,-281,-223,-36,-1250,-36,-101,-101,-281,-1250,-1,-64)
> C <- c(-290,-44,-390,-1250,-158,-44,-1250,-109,-390,-90,-190,-231,-1250,-1250,-90,-231,-109,-231,-390,-131,-1250)
> G <- c(159,-1250,162,-9,123,-1250,174,-44,156,-190,162,-290,165,-1250,127,-73,153,-1250,193,-1250,149)
> T <- c(-149,136,-1250,164,-64,158,-181,147,-1250,147,-123,183,-1250,204,-223,147,-381,191,-1250,143,-101)
> df <- rbind(A, C, G, T)
> pm_matrix <- data.matrix(df)
> print(mcols(matchPWM(pm_matrix, seq[[295]], min.score="80%", with.score=TRUE))$score)
[1] 2696 3061 3343 3343 3343 3343 3199 2871
Here seq[[295]] is a 141-letter DNAString subject.
Why are the scores four-digit numbers? I expected floats between 0.8 and 1.
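
For reference, here is a minimal base-R sketch of how the reported scores relate to the 0.8 to 1 range I expected. It assumes that matchPWM() reports the raw sum of the matrix entries along each match, and that min.score="80%" means 80% of the highest score the matrix can produce (the sum of the per-column maxima). Under that assumption the scores inherit the scale of the MEME log-odds values, and dividing by the maximum gives a relative score. The names pwm, max_score, raw_scores and rel_scores are only for illustration.

```r
# MEME log-odds matrix from the post: one row per base, one column per position.
pwm <- rbind(
  A = c(-123,-36,-36,-1250,-64,-149,-181,-381,-11,-101,-281,-223,-36,-1250,-36,-101,-101,-281,-1250,-1,-64),
  C = c(-290,-44,-390,-1250,-158,-44,-1250,-109,-390,-90,-190,-231,-1250,-1250,-90,-231,-109,-231,-390,-131,-1250),
  G = c(159,-1250,162,-9,123,-1250,174,-44,156,-190,162,-290,165,-1250,127,-73,153,-1250,193,-1250,149),
  T = c(-149,136,-1250,164,-64,158,-181,147,-1250,147,-123,183,-1250,204,-223,147,-381,191,-1250,143,-101)
)

# Highest achievable score: take the best-scoring base at every position.
max_score <- sum(apply(pwm, 2, max))

# Raw scores copied from the matchPWM() output in the post.
raw_scores <- c(2696, 3061, 3343, 3343, 3343, 3343, 3199, 2871)

# Relative scores (assumed interpretation: fraction of the maximum score).
rel_scores <- raw_scores / max_score
print(max_score)
print(round(rel_scores, 3))
```

With this matrix the maximum comes out as 3343, which is also the largest score in the output above, so every reported match lies between 80% and 100% of the maximum. If I read the Biostrings documentation correctly, maxScore(pm_matrix) should return the same maximum, but I have not checked that here.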