scikit-survival 0.3 released

Today, I released a new version of scikit-survival, a Python module for survival analysis built on top of scikit-learn.

This release adds predict_survival_function and predict_cumulative_hazard_function to sksurv.linear_model.CoxPHSurvivalAnalysis, which return the survival function and cumulative hazard function using Breslow's estimator.
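To give a rough idea of what Breslow's estimator computes, here is a minimal pure-Python sketch. It is not the code used inside scikit-survival; the function names `breslow_cumulative_hazard` and `survival_probability` and the toy data are made up for illustration, and the risk scores are assumed to be the linear predictors of an already fitted Cox model.

```python
import math


def breslow_cumulative_hazard(times, events, risk_scores):
    """Breslow estimate of the baseline cumulative hazard H0.

    Returns the sorted distinct event times and the value of H0 at each of them.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    cum_hazard = []
    total = 0.0
    for t in event_times:
        # Number of events observed exactly at time t (handles ties).
        n_events = sum(1 for ti, ei in zip(times, events) if ei and ti == t)
        # Risk set: everyone still under observation at time t.
        denom = sum(math.exp(r) for ti, r in zip(times, risk_scores) if ti >= t)
        total += n_events / denom
        cum_hazard.append(total)
    return event_times, cum_hazard


def survival_probability(t, event_times, cum_hazard, risk_score):
    """S(t | x) = exp(-H0(t) * exp(risk_score)), with H0 treated as a step function."""
    h0 = 0.0
    for event_time, value in zip(event_times, cum_hazard):
        if event_time > t:
            break
        h0 = value
    return math.exp(-h0 * math.exp(risk_score))


# Toy data (hypothetical): observed times, event indicators
# (True = event, False = censored), and linear predictors.
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]
risk_scores = [0.4, -0.3, 0.1, 0.0, -0.8]

event_times, cum_hazard = breslow_cumulative_hazard(times, events, risk_scores)
print(survival_probability(4, event_times, cum_hazard, risk_score=0.4))
```

If all risk scores are zero, the baseline cumulative hazard reduces to the Nelson-Aalen estimate, which is a handy sanity check.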

Moreover, it fixes a build error on Windows (#3) and adds the sksurv.preprocessing.OneHotEncoder class, which can be used in a scikit-learn pipeline.

Download

You can install the latest version via pip:

pip install -U scikit-survival

or download the source from GitHub.

Unfortunately, I was not able to get the recently released conda-build 3 to create Anaconda packages, so for the time being you need to install from source.

Introduction to Survival Analysis with scikit-survival

Finally, I created a notebook that introduces survival analysis (based on my previous post) and shows you how to use the Kaplan-Meier estimator and Cox's proportional hazards model.

Comments

Hi Sebastian,

First of all, thank you for making this! I'm using the GradientBoostingSurvivalAnalysis algorithm and it's getting great concordant results, but I'm at a loss as to how to interpret the float predictions it outputs. I'm very new to survival analysis and am coming to this from a more general classification machine learning background. I'm trying to use this to calculate the probability of an event 6 time periods in the future. I don't know how to do this with the GradientBoostingSurvivalAnalysis despite reading your docs. For now, I'll use the predict_survival_function from the Cox algorithm to get the hazard probability.

Thanks!
Nate

In reply to Nate

Not all algorithms are able to predict full survival curves. Often, they only provide a relative risk score, which is only meaningful when compared with the scores of other patients: a higher score indicates a higher risk, i.e., an earlier predicted event. For instance, if the predicted risk scores are 0.3 (A), -1.2 (B), 0.6 (C), then this would mean that (C) is predicted to experience the event before (A), and (B) is predicted to experience it last.
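As a small sketch of how such scores can be read, the snippet below sorts the three example patients by their score. It assumes the convention that a higher score means a higher risk and therefore an earlier predicted event; the dictionary and variable names are just for illustration.

```python
# Risk scores from the example; higher means higher predicted risk.
risk_scores = {"A": 0.3, "B": -1.2, "C": 0.6}

# Sort patients from highest to lowest risk,
# i.e. from earliest to latest predicted event.
predicted_order = sorted(risk_scores, key=risk_scores.get, reverse=True)
print(predicted_order)  # ['C', 'A', 'B']
```

Only the ordering carries information here; the scores themselves are not probabilities or survival times.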