Hi,
I have datasets of polymorphisms (runs of Gs of varying length) in DNA sequences from a number of clones, grouped by phenotypic trait. Sequence reads from each clone carry different numbers of Gs (denoted L4, L5, L6).
Data example for the wild-type (WT) phenotype:
Clone     L4  L5  L6  Phenotype
Clone_B1   2   2   3  WT phenotype
Clone_B2   1   4   5  WT phenotype
Clone_B3   2   2   4  WT phenotype
Clone_B4   4   3   3  WT phenotype
Clone_B5   2   2   2  WT phenotype
Data example for the phenotype under investigation:
Clone     L4  L5  L6  Phenotype
Clone_A1   2   3   3  Phenotype_M
Clone_A2   3   4   5  Phenotype_M
Clone_A3   1   2   4  Phenotype_M
Clone_A4   6   3   3  Phenotype_M
Clone_A5   4   1   2  Phenotype_M
Data explanation: each entry is a number of sequence reads. In the WT phenotype data, 2 sequence reads of Clone_B1 (row 1, column 1) have 4 repeated Gs (L4), 2 sequence reads of Clone_B1 (row 1, column 2) have 5 repeated Gs (L5), 3 sequence reads of Clone_B1 (row 1, column 3) have 6 repeated Gs (L6), and so on.
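In case it makes the layout clearer, here is a minimal Python sketch of how the two tables could be encoded. The names `wt`, `phenotype_m` and `pooled_counts` are only for illustration, and pooling reads across clones is an assumption that ignores clone-to-clone variation.

```python
# Read counts per clone; the three values are the number of reads
# with 4, 5 and 6 repeated Gs (L4, L5, L6), copied from the tables above.
LENGTHS = ("L4", "L5", "L6")

wt = {
    "Clone_B1": (2, 2, 3),
    "Clone_B2": (1, 4, 5),
    "Clone_B3": (2, 2, 4),
    "Clone_B4": (4, 3, 3),
    "Clone_B5": (2, 2, 2),
}

phenotype_m = {
    "Clone_A1": (2, 3, 3),
    "Clone_A2": (3, 4, 5),
    "Clone_A3": (1, 2, 4),
    "Clone_A4": (6, 3, 3),
    "Clone_A5": (4, 1, 2),
}


def pooled_counts(group):
    """Sum read counts over all clones in a group, one total per L."""
    # zip(*rows) transposes the per-clone rows into per-L columns.
    columns = zip(*group.values())
    return dict(zip(LENGTHS, (sum(col) for col in columns)))


print("WT:", pooled_counts(wt))
print("Phenotype_M:", pooled_counts(phenotype_m))
```

The pooled totals are only a quick summary of each group; any real model would presumably need to keep the per-clone rows.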
My question is: is it a good idea to use a Bayesian approach to determine which of the Ls might be responsible for the 'phenotype under investigation' compared to the 'wild-type phenotype'? Which Bayesian algorithms and R packages may be useful for this purpose?
Thanks
Thanks hanguangchun. I think the two questions are covered. I would be glad for advice on appropriate tutorials and algorithms.