How to find key apoptotic markers across subtypes of a colorectal cancer dataset?
6.3 years ago
bio94 ▴ 60

I performed CMS and CRIS classification on the dataset GSE14333 and added the resulting CMS and CRIS labels to my phenotype data.

I am wondering how to proceed from here to find the key apoptotic markers across subtypes. I am fairly new to this, so I would appreciate any help.

Many thanks

head(GSE14333_pheno_new)
          X Location DukesStage Age Gender DFSTime DFS_group DFSCens AdjXRT AdjCTX
1 GSM358387   Rectum          B  54      M    9.96      poor       0      Y      Y
2 GSM358392    Right          B  38      F   17.95      poor       1      N      Y
3 GSM358395    Right          B  78      F   22.02      poor       1      N      Y
4 GSM358396     Left          B  65      F   22.38      poor       0      Y      Y
5 GSM358397     Left          B  65      F   22.38      poor       0      Y      Y
6 GSM358399     Left          B  56      F   25.21      poor       0      Y      Y
  RF.CMS1.posteriorProb RF.CMS2.posteriorProb RF.CMS3.posteriorProb RF.CMS4.posteriorProb
1                  0.20                  0.34                  0.40                  0.06
2                  0.46                  0.06                  0.03                  0.45
3                  0.76                  0.02                  0.03                  0.19
4                  0.10                  0.78                  0.00                  0.12
5                  0.01                  0.95                  0.04                  0.00
6                  0.35                  0.42                  0.22                  0.01
  RF.nearestCMS RF.predictedCMS predict.label2 dist.to.template dist.to.cls1.rank  nominal.p
1          CMS3            <NA>         CRIS-B        0.7331209                68 0.00019996
2          CMS1            <NA>         CRIS-A        0.8965833                52 0.00739852
3          CMS1            CMS1         CRIS-B        0.8559375                80 0.00019996
4          CMS2            CMS2         CRIS-C        0.7944693               111 0.00019996
5          CMS2            CMS2         CRIS-C        0.8465627               120 0.00179964
6          CMS2            <NA>         CRIS-D        0.9366855               148 0.00719856
       BH.FDR Bonferroni.p
1 0.000672593    0.0369926
2 0.010214375    1.0000000
3 0.000672593    0.0369926
4 0.000672593    0.0369926
5 0.002684947    0.3329334
6 0.010013035    1.0000000
6.3 years ago

You could build predictive models for each CMS and CRIS classifier. Please take a look here:

Essentially, it would be a multinomial logistic regression analysis where you test each gene's ability to 'predict' the outcome, i.e., CMS or CRIS:

# glm(family="binomial") expects a two-class outcome; nnet::multinom() handles more than two classes
nnet::multinom(predict.label2 ~ gene1)
nnet::multinom(predict.label2 ~ gene2)
et cetera
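
As a rough sketch of how the per-gene screen could be looped rather than typed out gene by gene, the example below fits one multinomial model per gene and compares it with an intercept-only model using a likelihood-ratio test. The data are simulated stand-ins: the names `subtype`, `expr`, and `gene1` to `gene5` are hypothetical, and in practice `expr` would hold the expression values of the candidate (for example, apoptosis-related) genes with samples in rows. It assumes the `nnet` package, which ships with standard R installations.

```r
library(nnet)

set.seed(1)
# Hypothetical stand-in data: 120 samples, 5 genes, 3 subtype labels
n <- 120
subtype <- factor(sample(c("CRIS-A", "CRIS-B", "CRIS-C"), n, replace = TRUE))
expr <- matrix(rnorm(n * 5), nrow = n,
               dimnames = list(NULL, paste0("gene", 1:5)))
# Make gene1 informative so the screen has something to find
expr[, "gene1"] <- expr[, "gene1"] + 1.5 * (subtype == "CRIS-A")

# Null model: intercept only
null_fit <- multinom(subtype ~ 1, trace = FALSE)

# Likelihood-ratio test of each single-gene model against the null model
pvals <- sapply(colnames(expr), function(g) {
  x <- expr[, g]
  fit <- multinom(subtype ~ x, trace = FALSE)
  lr <- null_fit$deviance - fit$deviance   # drop in residual deviance
  df <- fit$edf - null_fit$edf             # extra parameters in the gene model
  pchisq(lr, df = df, lower.tail = FALSE)
})

# Adjust for multiple testing (Benjamini-Hochberg)
padj <- p.adjust(pvals, method = "BH")
print(data.frame(p = pvals, padj = padj))
```

Genes that pass the adjusted threshold would then be the candidates carried forward into a combined model.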

Once you have a final list of statistically significant genes from this, include them in a combined model and test it via R² shrinkage and ROC analysis, as shown in: A: Resources for gene signature creation
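
To illustrate the combined-model step, here is a minimal sketch that fits one multinomial model on several genes and computes a one-vs-rest AUC per subtype from the predicted probabilities. It is not the code from the linked answer: the data are simulated, the names `dat`, `geneA`, `geneB`, and the helper `auc()` are hypothetical, and the AUC is computed in-sample, so it will be optimistic compared with a cross-validated or independent-cohort estimate. R² shrinkage is not covered here.

```r
library(nnet)

set.seed(2)
# Hypothetical stand-in data: 150 samples, 2 informative genes, 3 subtypes
n <- 150
subtype <- factor(sample(c("CMS1", "CMS2", "CMS3"), n, replace = TRUE))
dat <- data.frame(
  subtype = subtype,
  geneA = rnorm(n) + 1.2 * (subtype == "CMS1"),
  geneB = rnorm(n) + 1.2 * (subtype == "CMS2")
)

# Combined multinomial model on the selected genes
fit <- multinom(subtype ~ geneA + geneB, data = dat, trace = FALSE)
prob <- predict(fit, type = "probs")  # samples x subtypes matrix

# AUC via the Mann-Whitney rank statistic (equivalent to the area under the ROC curve)
auc <- function(score, is_pos) {
  r <- rank(score)
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# One-vs-rest AUC for each subtype
aucs <- sapply(levels(dat$subtype), function(k) auc(prob[, k], dat$subtype == k))
print(round(aucs, 3))
```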

Note that you technically don't have to use gene expression as the predictors. You can also use other clinical parameters, e.g.:

nnet::multinom(predict.label2 ~ DukesStage)
nnet::multinom(predict.label2 ~ DFSTime)
nnet::multinom(predict.label2 ~ Location)

Kevin


Thanks very much Kevin.

