Question

Statistical analysis to compare three different variables

0

8.9 years ago

MAPK ★ 2.1k

I have three sets of multiple samples to compare. These are DNA sequences from three samples, and I need to compare them letter by letter. Is there any way to show concordance between them using a statistical method or an R package? I don't have a strong background in statistics and would really appreciate it if anyone could explain how to compare them.

Say I have three samples: sampleA, sampleB, and sampleC. I want to measure the sequence concordance for each pair of samples and then for all three together (overall concordance), and get the statistical significance of that concordance.

sampleA:   AATGCCTGGAAA
sampleB:   AATGCTTGGAAA 
sampleC:   TTTGCCTGG
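
To make "letter by letter" concrete, here is a minimal Python sketch of a positional identity score for each pair and for all three samples. It assumes the sequences are already in register and compares only the shared prefix (the length of the shortest sequence); no alignment is performed, and the `identity` helper is a hypothetical name, not from any package.

```python
from itertools import combinations

samples = {
    "sampleA": "AATGCCTGGAAA",
    "sampleB": "AATGCTTGGAAA",
    "sampleC": "TTTGCCTGG",
}


def identity(*seqs):
    """Fraction of positions at which all given sequences carry the same letter.

    Assumption: only the shared prefix (length of the shortest sequence)
    is compared; this is not an alignment.
    """
    n = min(len(s) for s in seqs)
    if n == 0:
        raise ValueError("cannot compare empty sequences")
    # zip stops at the shortest sequence, so exactly n columns are checked
    matches = sum(1 for column in zip(*seqs) if len(set(column)) == 1)
    return matches / n


# Concordance for each pair of samples
pairwise = {
    (a, b): identity(samples[a], samples[b])
    for a, b in combinations(samples, 2)
}

# Overall concordance across all three samples
overall = identity(*samples.values())

for pair, value in pairwise.items():
    print(pair, round(value, 3))
print("all three:", round(overall, 3))
```

This gives only a descriptive score. It says nothing about significance, which would need a model of what is expected by chance.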

statistical-modelling • 2.7k views

ADD COMMENT • link updated 2.3 years ago by Ram 44k • written 8.9 years ago by MAPK ★ 2.1k

0

Define sequence concordance (identity? conservation?). To me this looks like a common multiple/global alignment problem, or maybe a question of finding regions under selection, e.g. Ka/Ks. To calculate significance you need a null model, that is, a model that describes what happens just by chance. You need to define that too.

As you can see, you need to present your problem in full detail, including where your samples come from and what the biological question really is. Also avoid imposing a potential solution in the problem description ("I need to compare them letter by letter", which implies an end-to-end global alignment; "R package").

Also, your example is most likely not representative.

Most likely there is already a well established framework in genetics that covers it.

Regarding statistics of alignments maybe this helps: https://www.ncbi.nlm.nih.gov/BLAST/tutorial/

Accordingly, approximate p-values for global alignments are best determined from simulations. Is that what you are aiming at?
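
As a rough illustration of the simulation idea, the Python sketch below estimates an empirical p-value for a pairwise identity score. The null model here is an assumption chosen for simplicity: the letters of one sequence are shuffled, which preserves its base composition but destroys the order. The function name `shuffle_p_value` and its parameters are hypothetical, and a real analysis would score alignments rather than fixed positions.

```python
import random


def identity(a, b):
    """Fraction of matching positions over the shared prefix of a and b."""
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("cannot compare empty sequences")
    return sum(x == y for x, y in zip(a, b)) / n


def shuffle_p_value(a, b, n_sim=10_000, seed=0):
    """Empirical p-value for identity(a, b) under a shuffling null model.

    Assumed null model: b is a random permutation of its own letters.
    Returns (observed identity, p-value).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = identity(a, b)
    letters = list(b)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(letters)
        if identity(a, letters) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return observed, (hits + 1) / (n_sim + 1)


observed, p = shuffle_p_value("AATGCCTGGAAA", "AATGCTTGGAAA")
print(f"identity = {observed:.3f}, empirical p = {p:.4g}")
```

The smallest p-value this can report is 1 / (n_sim + 1), so the number of simulations limits the resolution. A different null model (for example, one based on a substitution model) would give different p-values.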

ADD REPLY • link updated 4.9 years ago by Ram 44k • written 8.9 years ago by Michael 55k

0

Ok, Thanks.

ADD REPLY • link updated 4.9 years ago by Ram 44k • written 8.9 years ago by MAPK ★ 2.1k

0

Why did you delete the original content of that comment? It looked like it was more informative...

ADD REPLY • link 8.9 years ago by Michael 55k

score 0 · Answer 1 · 2016-01-23

0

8.9 years ago

moranr ▴ 290

Could you just assess the evolutionary distance between each pair of samples to see how similar they are to one another?
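
One simple way to do that, sketched in Python under strong assumptions: compute the p-distance (proportion of differing sites) and apply the Jukes-Cantor correction, d = -3/4 ln(1 - 4p/3). The sketch assumes the sequences are already aligned and compares only the shared prefix; the function names are hypothetical, and Jukes-Cantor is just one of several possible substitution models.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites over the shared prefix of a and b."""
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("cannot compare empty sequences")
    return sum(x != y for x, y in zip(a, b)) / n


def jukes_cantor(a, b):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3).

    The correction is undefined for p >= 0.75 (saturation), so infinity
    is returned in that case.
    """
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


sample_a = "AATGCCTGGAAA"
sample_b = "AATGCTTGGAAA"
print(p_distance(sample_a, sample_b), jukes_cantor(sample_a, sample_b))
```

With sequences this short the estimates are very noisy, so they are only meaningful on realistic data.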

ADD COMMENT • link 8.9 years ago by moranr ▴ 290