Sorry if this sounds n00b-ish. But let's say you have a drug, and you treat one set of cells with it and leave the other untreated. You do RNA-seq and find some DEGs that change when the drug is added. Is there a way to score how good that signature is?
I assume it's something related to how many samples you have. If you have 6 cell lines in the drug-treated group and 8 in the non-treated group, that n would impact it. You could create some sort of z-score for the gene set, score each sample, and then see what percent of the treated samples score significantly higher than the mean of the non-treated group? Although the n is already taken into account a little by the DGE analysis itself.
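To make the z-score idea concrete, here is a rough sketch of one way it could look, not an established method. Everything in it is an assumption for illustration: the gene names and expression values are made up, each gene is z-scored against the untreated group, the per-sample score is the direction-signed average over the signature genes, and "statistically higher" is taken to mean more than 1.645 SD above the untreated mean score.

```python
from statistics import mean, stdev

def signature_scores(expr, signature, control_idx):
    """Average signed z-score per sample.

    expr:        dict gene -> list of expression values, one per sample
    signature:   dict gene -> +1 (up with drug) or -1 (down with drug)
    control_idx: indices of the untreated samples (used as the reference)
    """
    n_samples = len(next(iter(expr.values())))
    scores = [0.0] * n_samples
    for gene, direction in signature.items():
        values = expr[gene]
        ctrl = [values[i] for i in control_idx]
        mu, sd = mean(ctrl), stdev(ctrl)  # note: sd == 0 would need special handling
        for i, v in enumerate(values):
            scores[i] += direction * (v - mu) / sd
    return [s / len(signature) for s in scores]

def fraction_above_control(scores, control_idx, treated_idx, z_cut=1.645):
    """Fraction of treated samples scoring above mean + z_cut * SD of the untreated scores."""
    ctrl = [scores[i] for i in control_idx]
    threshold = mean(ctrl) + z_cut * stdev(ctrl)
    hits = sum(scores[i] > threshold for i in treated_idx)
    return hits / len(treated_idx)

# Toy data (hypothetical): 4 untreated samples followed by 3 treated samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 2.0, 6.0, 7.0, 5.0],  # goes up with drug
    "GENE_B": [5.0, 4.0, 6.0, 5.0, 1.0, 2.0, 0.0],  # goes down with drug
}
signature = {"GENE_A": 1, "GENE_B": -1}
control_idx = [0, 1, 2, 3]
treated_idx = [4, 5, 6]

scores = signature_scores(expr, signature, control_idx)
frac = fraction_above_control(scores, control_idx, treated_idx)
print(f"fraction of treated samples above threshold: {frac:.2f}")
```

One caveat: if the signature genes were picked from these same samples, this fraction will look better than it really is, so ideally the scoring is done on samples that were not used to find the DEGs.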
What would be the best way to say that signature A is more informative of an outcome (or more predictive) than signature B, if you had two signatures?
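One possible way to frame the A-versus-B comparison for a binary outcome is to give every sample a score from each signature, compute the AUC for each, and bootstrap the difference. The sketch below assumes both signatures are scored on the same samples against the same outcome; the scores are invented, and `auc` and `bootstrap_auc_difference` are hypothetical helper names rather than functions from any library.

```python
import random

def auc(scores, labels):
    """Chance that a random positive sample outscores a random negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_difference(scores_a, scores_b, labels, n_boot=2000, seed=0):
    """Approximate 95% percentile interval for AUC(A) - AUC(B), resampling the same samples for both."""
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    diffs = []
    for _ in range(n_boot):
        # Resample within each class so both classes are always present.
        idx = [rng.choice(pos_idx) for _ in pos_idx] + [rng.choice(neg_idx) for _ in neg_idx]
        y = [labels[i] for i in idx]
        a = auc([scores_a[i] for i in idx], y)
        b = auc([scores_b[i] for i in idx], y)
        diffs.append(a - b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]

# Toy data (hypothetical): 8 untreated (0) and 6 treated (1) samples.
labels = [0] * 8 + [1] * 6
scores_a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
scores_b = [0.1, 0.9, 0.3, 1.2, 0.5, 0.6, 0.7, 0.8, 0.4, 1.0, 0.2, 1.4, 1.5, 0.65]

auc_a = auc(scores_a, labels)
auc_b = auc(scores_b, labels)
ci_low, ci_high = bootstrap_auc_difference(scores_a, scores_b, labels)
print(f"AUC A = {auc_a:.2f}, AUC B = {auc_b:.2f}")
print(f"bootstrap interval for the difference: ({ci_low:.2f}, {ci_high:.2f})")
```

If the interval for the difference excludes zero, that is some evidence one signature separates the groups better than the other. With only 6 vs 8 samples the interval will be wide, and AUCs computed on the same samples used to derive the signatures are optimistically biased, so held-out samples or cross-validation would be the safer setup.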
Apologies again for the n00b question. I have a feeling I'm thinking about this the wrong way, but it was a late-night discussion and I haven't got a solid answer in my mind. If this were a continuous variable I think I'd have some more ideas, but it's a binary situation... drug or no drug, disease or no disease, survivor or non-survivor type of thing...
thanks in advance!
Yeah, I was talking more about a collection of DE genes (a gene set, if you will) and how likely that gene set is to be associated with an outcome. Say I have two gene sets for two different outcomes and I want to test which gene set is better at predicting, or even just associated with, a specific outcome, where every gene in the gene set is already DE. But obviously some genes are driven more by specific samples or patients, etc., and some phenotypes have more power and more samples. Surely that will impact how informative a gene signature is, and I just need a way to sort of weigh two or more signatures statistically in this thought experiment.