Probabilities and statistical significance of a specific kmer frequency
8.5 years ago
arronar ▴ 290

Hello.

I'm wondering whether there is a way to determine if the frequency of a specific k-mer within a sequence of a given length is statistically significant.

For example, let's say that we have the following sequence:

ATAGATCATAGATAGATGGAGTTACT

The 5-mer ATAGA occurs 3 times in it (counting overlapping occurrences).
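
For reference, the count above can be reproduced with a short sketch. The function name `count_kmer` is only illustrative; the point is that occurrences are allowed to overlap, which a sliding window handles and Python's built-in `str.count` does not.

```python
def count_kmer(sequence: str, kmer: str) -> int:
    """Count occurrences of kmer in sequence, allowing overlaps."""
    k = len(kmer)
    # Slide a window of width k over every possible start position.
    return sum(
        1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == kmer
    )


sequence = "ATAGATCATAGATAGATGGAGTTACT"
print(count_kmer(sequence, "ATAGA"))  # 3 (start positions 0, 7 and 11)
print(sequence.count("ATAGA"))        # 2, because str.count skips overlapping matches
```

The occurrences starting at positions 7 and 11 share one base, so whether overlaps are counted changes the observed value and has to be stated before any probability is computed.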

1) How can we determine the probability that this 5-mer appears 3 times in that specific sequence?
2) Is this probability statistically significant? Could it mean something?

I'm not looking for ready-made R libraries that can calculate this, but for mathematical/statistical models or ideas for approaching it.
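
To make the question concrete, here is a minimal sketch of one possible framing, not a definitive method. It assumes a null model in which every base is drawn independently with the base frequencies observed in the sequence itself; that null model, the function names, the number of simulations and the seed are all assumptions made for illustration. Under this model the expected count is (n - k + 1) times the product of the base probabilities, and the chance of seeing at least the observed count can be estimated by simulation.

```python
import random
from collections import Counter


def count_kmer(sequence: str, kmer: str) -> int:
    """Count occurrences of kmer in sequence, allowing overlaps."""
    k = len(kmer)
    return sum(
        1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == kmer
    )


def expected_count(n: int, kmer: str, base_probs: dict) -> float:
    """Expected number of occurrences in a length-n sequence of independent bases."""
    p = 1.0
    for base in kmer:
        p *= base_probs[base]
    # There are n - k + 1 possible start positions, each matching with probability p.
    return (n - len(kmer) + 1) * p


def empirical_p_value(sequence: str, kmer: str, n_sim: int = 10000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(count >= observed) under the assumed null model."""
    rng = random.Random(seed)
    observed = count_kmer(sequence, kmer)
    freqs = Counter(sequence)
    bases = sorted(freqs)
    weights = [freqs[b] for b in bases]
    hits = 0
    for _ in range(n_sim):
        # Draw a random sequence of the same length with the same base frequencies.
        simulated = "".join(rng.choices(bases, weights=weights, k=len(sequence)))
        if count_kmer(simulated, kmer) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


sequence = "ATAGATCATAGATAGATGGAGTTACT"
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(expected_count(len(sequence), "ATAGA", uniform))
print(empirical_p_value(sequence, "ATAGA"))
```

The simulation is used instead of a plain binomial formula because overlapping start positions are not independent, so the count is not exactly binomial. A different null model, for example a Markov chain that preserves dinucleotide frequencies, would give a different answer, and testing many k-mers at once would also call for a multiple-testing correction.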

Thank you

genome sequence • 1.6k views
8.5 years ago
natasha.sernova ★ 4.0k

I started writing an explanation, but this has already been covered elsewhere.

See this post:

A: How To Interpret A T-Test Output Produced By R
