Machine learning for survival analysis: A case study on recurrence of prostate cancer
Journal Article


Authors: Zupan, B.; Demšar, J.; Kattan, M. W.; Beck, J. R.; Bratko, I.
Article Title: Machine learning for survival analysis: A case study on recurrence of prostate cancer
Abstract: Machine learning techniques have recently received considerable attention, especially when used for the construction of prediction models from data. Despite their potential advantages over standard statistical methods, like their ability to model non-linear relationships and construct symbolic and interpretable models, their applications to survival analysis are at best rare, primarily because of the difficulty to appropriately handle censored data. In this paper we propose a schema that enables the use of classification methods - including machine learning classifiers - for survival analysis. To appropriately consider the follow-up time and censoring, we propose a technique that, for the patients for which the event did not occur and have short follow-up times, estimates their probability of event and assigns them a distribution of outcome accordingly. Since most machine learning techniques do not deal with outcome distributions, the schema is implemented using weighted examples. To show the utility of the proposed technique, we investigate a particular problem of building prognostic models for prostate cancer recurrence, where the sole prediction of the probability of event (and not its probability dependency on time) is of interest. A case study on preoperative and postoperative prostate cancer recurrence prediction shows that by incorporating this weighting technique the machine learning tools stand beside modern statistical methods and may, by inducing symbolic recurrence models, provide further insight to relationships within the modeled data. (C) 2000 Elsevier Science B.V.
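Illustrative note: the weighting idea in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not say how the probability of event is estimated for censored patients; the sketch assumes a Kaplan-Meier estimate and a fixed prediction horizon. The names `kaplan_meier`, `survival_at`, `weighted_examples`, and `horizon` are hypothetical. A patient censored at time t before the horizon T is split into two weighted examples: "no_recurrence" with weight S(T)/S(t) and "recurrence" with weight 1 - S(T)/S(t).

```python
from bisect import bisect_right


def kaplan_meier(times, events):
    """Kaplan-Meier estimate (assumed here; the abstract names no estimator).

    times  -- follow-up time of each patient
    events -- True if recurrence was observed at that time, False if censored
    Returns (event_times, survival), the steps of the estimated curve S(t).
    """
    event_times, survival = [], []
    s = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        observed = sum(1 for u, e in zip(times, events) if e and u == t)
        s *= 1.0 - observed / at_risk
        event_times.append(t)
        survival.append(s)
    return event_times, survival


def survival_at(t, event_times, survival):
    """Step-function lookup of S(t); S(t) = 1 before the first event."""
    i = bisect_right(event_times, t)
    return 1.0 if i == 0 else survival[i - 1]


def weighted_examples(times, events, horizon):
    """Expand patients into (patient_index, class_label, weight) triples."""
    event_times, survival = kaplan_meier(times, events)
    s_horizon = survival_at(horizon, event_times, survival)
    rows = []
    for i, (t, e) in enumerate(zip(times, events)):
        if e and t <= horizon:
            # Recurrence observed within the horizon: a certain positive.
            rows.append((i, "recurrence", 1.0))
        elif t >= horizon:
            # Followed to the horizon without recurrence: a certain negative.
            rows.append((i, "no_recurrence", 1.0))
        else:
            # Censored early: P(no recurrence by horizon | free at t).
            p_free = s_horizon / survival_at(t, event_times, survival)
            rows.append((i, "no_recurrence", p_free))
            rows.append((i, "recurrence", 1.0 - p_free))
    return rows


# Hypothetical toy data: follow-up in years, True = recurrence observed.
rows = weighted_examples(
    times=[2, 3, 4, 5, 8, 10],
    events=[True, False, True, False, True, False],
    horizon=7,
)
```

The resulting triples could then be passed to any classifier that accepts per-example weights, with each patient's attributes repeated once per row.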
Keywords: survival; controlled study; survival analysis; major clinical study; cancer recurrence; follow up; sensitivity and specificity; reproducibility of results; bayes theorem; classification; recurrence; oncology; information processing; prediction; prostate cancer; prostatic neoplasms; probability; prostatectomy; artificial intelligence; outcomes research; radical prostatectomy; computer simulation; decision making; medical computing; decision trees; model; prostate cancer recurrence; mathematical models; prognostic model; statistical methods; machine learning; probability distributions; data structures; learning systems; censored data; machine; hospital data processing; humans; prognosis; human; male; priority journal; article; data weighting; outcome prediction after radical prostatectomy; prognostic models in medicine
Journal Title: Artificial Intelligence in Medicine
Volume: 20
Issue: 1
ISSN: 0933-3657
Publisher: Elsevier Inc.  
Date Published: 2000-09-01
Start Page: 59
End Page: 75
Language: English
DOI: 10.1016/s0933-3657(00)00053-1
PUBMED: 11185421
PROVIDER: scopus
Notes: Export Date: 18 November 2015 -- Source: Scopus
MSK Authors
  1. Michael W Kattan