Hi all,
I posted this question in the PAML discussion group twice and haven't heard anything for months, so I'm hoping someone here can explain this to me:
I have a gene dataset with two partitions, A and B. I split the dataset into A only and B only and run random-sites models on each to see whether there is any positive selection in either partition. I also want to see whether there is divergent selection, so I use CmC (Clade model C). What I notice is that across the roughly 40 genes I'm analyzing, the random-sites models show more support for positive selection (i.e., M2a vs. M1a is significant) in partition A, about 30 genes, than in partition B, only about 15.
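For anyone following along, here is a minimal sketch of the likelihood ratio test usually meant by "M2a vs. M1a is significant". It assumes the standard setup of 2 degrees of freedom for M2a vs. M1a, and also handles 1 degree of freedom in case a CmC vs. M2a_rel comparison is wanted (that null model is an assumption, it may not match every setup). The function name and the lnL values are made-up placeholders, not results from my data.

```python
import math

def lrt_pvalue(lnl_null, lnl_alt, df):
    """Likelihood ratio test for nested codon models.

    lnl_null, lnl_alt: log-likelihoods (lnL) reported by codeml for the
    null and alternative model. df: difference in free parameters.
    Only df = 1 or 2 is handled, using closed-form chi-square tails.
    """
    # Test statistic 2 * delta lnL; clamp at 0 in case the alternative
    # converged to a slightly worse optimum than the null.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        # Chi-square survival function with 1 df
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        # Chi-square survival function with 2 df
        p = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p

# Placeholder lnL values (hypothetical, for illustration only)
stat, p = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
print(f"M2a vs. M1a: 2*dlnL = {stat:.3f}, p = {p:.4g}")
```

Note that a significant test here only says the alternative model fits better; whether that reflects positive selection still depends on the estimated dN/dS for the extra site class.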
However, in the CmC analysis, all the genes show only purifying but divergent selection (even when the test is significant, both partitions have dN/dS < 1). I am wondering whether the random-sites models are affected more by sample size: partition A has 40 species, but for partition B I only had data for 15 species. If this is indeed a discrepancy where more species = more variation detected in the random-sites models, why would the two partitions appear so similar in CmC, when I use the combined dataset and label my partitions, with partition B set as the foreground?
Hope this makes sense! Thank you!