Hi, my statistics background is weak. I have two datasets:
Cell count fold change depending on the knocked-out gene (quantile normalized):
Model GeneA_KO GeneB_KO GeneC_KO
ModelM 0.075638312 -0.870157286 -2.009107843
ModelO -0.162841966 -0.359008725 -0.826469109
ModelS -0.1107031 -0.990406888 -1.118059869
ModelA -0.457154616 -0.120656982 -0.23181682
ModelI -0.505815194 -1.414276318 -0.875524634
Presence (1) or absence (0) of a set of known cancer driver mutations:
Mutation ModelM ModelO ModelS ModelA ModelI
Gene1_mut 0 0 0 0 0
Gene2_mut 0 0 0 0 0
Gene3_mut 0 0 0 0 0
Gene4_mut 1 1 1 1 1
How do I figure out, for each gene-mutation pair, whether the presence of that mutation is associated with a change in the outcome of the CRISPR knock-out experiment for that gene, across all cell lines?
Edit: or is there any tool in Python that does this kind of analysis?
Thanks, Daniel, for your reply.
I found the point-biserial correlation, which measures the correlation between a binary (0/1) variable and a continuous variable. It is mathematically equivalent to the Pearson correlation computed with the binary variable coded as 0/1.
The x could be the mutation status (0/1), but I'm not quite sure about the y. Should I average the cell count fold change?
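One possible reading of the setup, sketched below: for each gene-mutation pair, x is the 0/1 mutation status across cell lines and y is the fold change of one knock-out across the same cell lines, with no averaging. The sketch uses `scipy.stats.pointbiserialr`; the function name `pairwise_point_biserial` and the `Hypothetical_mut` row are made up for illustration and are not part of the original data. Note that in the excerpt above every mutation is constant across the five models (all 0 or all 1), so no correlation can be computed for those rows and the sketch skips them.

```python
import numpy as np
import pandas as pd
from scipy.stats import pointbiserialr

models = ["ModelM", "ModelO", "ModelS", "ModelA", "ModelI"]

# Fold-change table from the post: rows = models, columns = knock-outs.
fold_change = pd.DataFrame(
    {
        "GeneA_KO": [0.075638312, -0.162841966, -0.1107031, -0.457154616, -0.505815194],
        "GeneB_KO": [-0.870157286, -0.359008725, -0.990406888, -0.120656982, -1.414276318],
        "GeneC_KO": [-2.009107843, -0.826469109, -1.118059869, -0.23181682, -0.875524634],
    },
    index=models,
)

# Mutation table from the post: rows = mutations, columns = models.
mutations = pd.DataFrame(
    [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [1, 1, 1, 1, 1]],
    index=["Gene1_mut", "Gene2_mut", "Gene3_mut", "Gene4_mut"],
    columns=models,
)


def pairwise_point_biserial(fold_change, mutations):
    """Point-biserial correlation for every (mutation, knock-out) pair."""
    # Keep only the models present in both tables, in the same order.
    shared = fold_change.index.intersection(mutations.columns)
    rows = []
    for mut in mutations.index:
        x = mutations.loc[mut, shared].astype(int)
        # A mutation that is all 0 or all 1 has no variance: skip it.
        if x.nunique() < 2:
            continue
        for ko in fold_change.columns:
            y = fold_change.loc[shared, ko]
            r, p = pointbiserialr(x.to_numpy(), y.to_numpy())
            rows.append(
                {"mutation": mut, "knockout": ko, "r": r,
                 "p_value": p, "n_mutated": int(x.sum())}
            )
    return pd.DataFrame(
        rows, columns=["mutation", "knockout", "r", "p_value", "n_mutated"]
    )


# With the excerpt as posted, every mutation is constant, so nothing is testable.
result = pairwise_point_biserial(fold_change, mutations)

# Hypothetical mutation row (invented values) just to show the output shape.
toy = mutations.copy()
toy.loc["Hypothetical_mut"] = [1, 0, 1, 0, 1]
toy_result = pairwise_point_biserial(fold_change, toy)
print(toy_result)
```

With only five cell lines any such test has very little power, and running one test per gene-mutation pair calls for a multiple-testing correction on the resulting p-values before interpreting them.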