Predict Probability Of Cancer From Microarray Data
13.7 years ago
Sara • 0

Are there statistical methods that aim to predict the probability of somebody being affected by cancer in the future based on microarray data?

What do you know about the performance of these methods?

cancer statistics • 2.7k views

Sara, it would be smart if you used the same OpenID each time you log in. You now have your questions spread over two different accounts in the BioStar system.


The question is unclear to me. Predictions based on what? Tissue samples of that somebody? What would you expect? Some pre-cancerous changes in expression levels? Would make much more sense to look at sequencing data here, IMHO.

13.7 years ago

Predictive or diagnostic profiles derived from microarrays are indeed often used, but alas they are not very successful. That is in large part a result of the nature of the arrays and, to some extent, of the nature of the diseases.

  • Microarrays in general measure the expression of many, many genes, so the chance that some individual gene gives a false positive result is relatively high. The chance of repeatedly finding the same gene regulated in multiple samples as a false positive is slim when calculated per gene, but still relevant when calculated per array. Such false positive genes will normally not survive validation, but that means you have to reject the whole profile, not just the gene. Performing rigorous false discovery rate corrections would of course help, but it also lowers the chance of finding a meaningful profile in the first place.
  • Complex diseases like cancer do indeed often result from aberrant gene expression, but that can also be caused by copy number variations that include the involved genes. Such copy number variations yield a larger number of affected genes, which will obscure your analysis. This can be addressed by checking the individual array results for such copy number variations (a number of strongly regulated genes in the same genomic location).
  • In general, lowly expressed genes that are measured around the detection limit of your array will yield highly variable results that often obscure the analysis. It helps to remove such genes. Unfortunately, transcription factors and other regulatory genes mostly have low expression, and these are often important for disease regulation.
  • Some genes show high biological variation. For instance, peripheral blood mononuclear cells (PBMCs) are often used for biomarker development since they can be easily obtained in humans, but they are important in immune and inflammatory responses and thus show highly dynamic expression of the genes involved. This variation may be more the result of a common cold than of a tumor. This can be improved by removing such highly dynamic genes from your analysis.
  • Cellular contaminants can also cause variation. The presence of a small number of reticulocytes in your sample can, for instance, result in large variations in hemoglobin expression.
  • Tumors (and to a lesser extent other diseased tissues) are highly variable in composition and microenvironment, for example in oxygenation. This will result in large differences in gene expression even between otherwise identical tumor cells. This is almost impossible to prevent and by itself probably disqualifies the approach in most cases.
  • One thing you can try to overcome these problems, apart from the cleaning procedures suggested, is to build the profiles not from individual genes but from affected pathways or functional gene classes.
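To make the multiple-testing point from the first bullet concrete, here is a minimal sketch in plain Python. The array size (20,000 genes), the 0.05 threshold, the toy p-values, and the function names are all assumptions for illustration; the Benjamini-Hochberg step-up procedure is used here as one common false discovery rate correction, not necessarily the one any particular study applied.

```python
def expected_false_positives(n_genes, alpha):
    """Expected number of genes called 'significant' by chance alone
    when every gene is tested at level alpha and none is truly changed."""
    return n_genes * alpha


def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at the given FDR."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical array with 20,000 probes, each tested at p <= 0.05.
chance_hits = expected_false_positives(20000, 0.05)

# Toy p-values for ten genes (made up for this example).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
naive = [p <= 0.05 for p in pvals]
kept = benjamini_hochberg(pvals, fdr=0.05)

print(chance_hits)   # about 1000 genes expected by chance alone
print(sum(naive))    # 5 genes pass the uncorrected threshold
print(sum(kept))     # 2 genes survive the correction
```

The last two numbers show the trade-off described above: the correction removes likely false positives, but it also shrinks the candidate profile from five genes to two.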
13.7 years ago
Laurent ★ 1.7k

Cancer is a complex disease that depends on genetic and environmental factors. In terms of genetic background, some mutations will increase the risk of developing cancer, and DNA microarrays can be used to assess the association between mutations and cancer through genome-wide association studies (GWAS). Searching 'GWAS cancer' should get you enough reading material.

Expression arrays have also been used quite a bit as prognostic or diagnostic assays. Essentially, expression signatures are used to predict virulence or treatment effect, or to classify cancers. Such studies always use a training set to define the signature (i.e. the features of interest) and a test set to validate it. There is a wide range of publications out there: searching 'microarray cancer signature', for example, should get you on the path.
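
The training-set / test-set idea can be sketched roughly as follows. This is only a toy illustration in plain Python: the expression values are made up, the gene ranking (absolute difference in class means) and the nearest-centroid classifier are stand-ins chosen for brevity, and all function names are hypothetical. The point it tries to show is that the signature is chosen from the training samples only, and the test samples are used solely to check it.

```python
from statistics import mean


def select_signature(train_x, train_y, n_genes):
    """Pick the n_genes with the largest absolute difference in class
    means, using the training samples only."""
    scores = []
    for g in range(len(train_x[0])):
        pos = [x[g] for x, y in zip(train_x, train_y) if y == 1]
        neg = [x[g] for x, y in zip(train_x, train_y) if y == 0]
        scores.append((abs(mean(pos) - mean(neg)), g))
    scores.sort(reverse=True)
    return sorted(g for _, g in scores[:n_genes])


def fit_centroids(train_x, train_y, signature):
    """Mean expression of the signature genes for each class."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(r[g] for r in rows) for g in signature]
    return centroids


def predict(sample, signature, centroids):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def sqdist(centroid):
        return sum((sample[g] - c) ** 2 for g, c in zip(signature, centroid))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


# Made-up log-expression values: rows are samples, columns are 4 genes.
train_x = [
    [8.1, 5.0, 2.0, 6.1], [7.9, 5.2, 2.2, 5.9], [8.3, 4.9, 1.9, 6.0],  # class 1
    [5.0, 5.1, 4.1, 6.0], [5.2, 5.0, 3.9, 6.2], [4.9, 4.8, 4.0, 5.8],  # class 0
]
train_y = [1, 1, 1, 0, 0, 0]
test_x = [[8.0, 5.1, 2.1, 6.0], [5.1, 4.9, 4.2, 6.1]]
test_y = [1, 0]

# Define the signature on the training set, then validate on the test set.
signature = select_signature(train_x, train_y, n_genes=2)
centroids = fit_centroids(train_x, train_y, signature)
predictions = [predict(x, signature, centroids) for x in test_x]
acc = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)

print(signature, predictions, acc)
```

Selecting the genes on all samples first and splitting afterwards would leak information from the test set into the signature and make the reported accuracy look better than it is.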

My answer might be a bit vague, but there is too much out there and your question is very general.

13.7 years ago
Bio_X2Y ★ 4.4k

To add to Laurent's answer, microarrays are also often used to predict the risk of metastasis in cancer patients. For example, the following paper describes a microarray-based method for predicting metastasis risk for breast cancer patients. The paper describes the clustering/statistical methods involved.


This is indeed a big paper in the field of breast cancer microarray signatures.

