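scikit-survival 0.8 adds some nice enhancements for validating survival models. Harrell's concordance index, the measure commonly used to assess the performance of survival models, has two shortcomings: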
- it has been shown to be too optimistic as the amount of censoring increases [1],
- it is not a useful measure of performance if a specific time point is of primary interest (e.g. predicting 2-year survival).
The first point is addressed by the newly added function concordance_index_ipcw, which implements an alternative estimator of the concordance index based on inverse probability of censoring weights [1].
The second point can be addressed by extending the well-known receiver operating characteristic (ROC) curve to possibly censored survival times. Given a time point t, we can estimate how well a predictive model distinguishes subjects who will experience an event by time t (sensitivity) from those who will not (specificity). The newly added function cumulative_dynamic_auc implements an estimator of the cumulative/dynamic area under the ROC curve for a given list of time points [3].
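To make the cumulative/dynamic definition concrete, here is a minimal pure-Python sketch for the simplified case where every event time is observed, i.e. there is no censoring. It is not the estimator implemented by cumulative_dynamic_auc, which additionally has to account for censored subjects; the function names and the toy data are made up for illustration.

```python
def split_cases_controls(times, risk_scores, t):
    """Cases had an event by time t, controls are still event-free after t."""
    cases = [r for time, r in zip(times, risk_scores) if time <= t]
    controls = [r for time, r in zip(times, risk_scores) if time > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")
    return cases, controls


def sensitivity_specificity(times, risk_scores, t, threshold):
    """Sensitivity and specificity at time t for a single score threshold."""
    cases, controls = split_cases_controls(times, risk_scores, t)
    sensitivity = sum(1 for r in cases if r > threshold) / len(cases)
    specificity = sum(1 for r in controls if r <= threshold) / len(controls)
    return sensitivity, specificity


def auc_at_time(times, risk_scores, t):
    """Fraction of case/control pairs where the case has the higher score.

    Ties in the risk score count as half a concordant pair.
    Assumes all event times are observed (no censoring).
    """
    cases, controls = split_cases_controls(times, risk_scores, t)
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical): event times in days and predicted risk scores.
event_times = [30, 90, 200, 400, 700, 900]
risk = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]

auc_at_100 = auc_at_time(event_times, risk, t=100)
auc_at_365 = auc_at_time(event_times, risk, t=365)
sens, spec = sensitivity_specificity(event_times, risk, t=365, threshold=0.45)
print(auc_at_100, auc_at_365, sens, spec)
```

On this toy data the risk scores separate cases from controls better at one year than at 100 days, which is the kind of time-specific statement a single overall measure cannot provide.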
Both estimators rely on inverse probability of censoring weighting, which means they need access to the training data to estimate the censoring distribution. If the amount of censoring is high, this estimate is based on only a few subjects at late time points, so some care must be taken in selecting a suitable time range for evaluation.
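The idea behind inverse probability of censoring weighting can be sketched in a few lines of pure Python. This is a simplified illustration, not scikit-survival's implementation: it assumes there are no tied times, the function names are hypothetical, and the data is made up. The censoring survival function G is estimated on the training data with a Kaplan-Meier estimator in which censoring plays the role of the event; each test subject with an observed event then receives the weight 1/G(time), and censored subjects receive weight zero.

```python
from bisect import bisect_right


def fit_censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G.

    Roles are swapped compared to the usual estimator: a censored record
    (event is False) counts as an "event" of the censoring process.
    Returns the step times and the value of G from each step onwards.
    """
    step_times, step_values = [], []
    g = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        censored = sum(1 for u, e in zip(times, events) if u == t and not e)
        g *= 1.0 - censored / at_risk
        step_times.append(t)
        step_values.append(g)
    return step_times, step_values


def ipcw_weights(step_times, step_values, times, events):
    """Weight 1/G(time) for subjects with an event, 0 for censored ones."""
    weights = []
    for t, e in zip(times, events):
        if not e:
            weights.append(0.0)
            continue
        idx = bisect_right(step_times, t)
        g = 1.0 if idx == 0 else step_values[idx - 1]
        if g <= 0.0:
            # G dropped to zero: the training data cannot support this time.
            raise ValueError("time %r is outside the usable time range" % t)
        weights.append(1.0 / g)
    return weights


# Hypothetical training data: half of the subjects are censored.
train_times = [1, 2, 3, 4, 5, 6, 7, 8]
train_events = [True, False, True, False, True, False, False, True]

# Hypothetical test data, weighted with the training censoring distribution.
test_times = [2.5, 4.5, 6.5, 7.5]
test_events = [True, True, False, True]

steps, values = fit_censoring_survival(train_times, train_events)
weights = ipcw_weights(steps, values, test_times, test_events)
print(weights)
```

In this example the event at time 7.5 receives a weight of about 4.4, compared to about 1.2 for the event at time 2.5. With heavy censoring the weights at late time points grow quickly, which is why the evaluation time range should be chosen with care.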
For a complete list of changes see the release notes.
Pre-built conda packages are available for Linux, OSX and Windows:
conda install -c sebp scikit-survival
Alternatively, scikit-survival can be installed from source via pip:
pip install -U scikit-survival
1. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B., & Wei, L. J. (2011). On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Statistics in Medicine, 30(10), 1105–1117.
2. Hung, H., & Chiang, C. T. (2010). Estimation methods for time-dependent AUC models with survival data. Canadian Journal of Statistics, 38(1), 8–26.
3. Uno, H., Cai, T., Tian, L., & Wei, L. J. (2007). Evaluating prediction rules for t-year survivors with censored regression models. Journal of the American Statistical Association, 102, 527–537.