# scikit-survival 0.3 released

Today, I released a new version of scikit-survival, a Python module for survival analysis built on top of scikit-learn.

This release adds predict_survival_function and predict_cumulative_hazard_function to sksurv.linear_model.CoxPHSurvivalAnalysis, which return the survival function and cumulative hazard function using Breslow's estimator.
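To give a feel for what Breslow's estimator computes, here is a minimal pure-Python sketch. It is not the code used inside scikit-survival; the function names `breslow_cumulative_hazard` and `survival_probability` and the toy data are made up for illustration, and the risk scores stand in for the linear predictor of an already fitted Cox model.

```python
import math


def breslow_cumulative_hazard(times, events, risk_scores):
    """Breslow estimate of the baseline cumulative hazard H0.

    times: observed time per sample
    events: True if the event occurred, False if the sample was censored
    risk_scores: linear predictor (x @ beta) per sample
    Returns a list of (event_time, H0(event_time)) steps.
    """
    hazards = [math.exp(r) for r in risk_scores]
    event_times = sorted({t for t, e in zip(times, events) if e})
    cumulative = 0.0
    steps = []
    for t in event_times:
        # number of events observed exactly at time t
        n_events = sum(1 for ti, e in zip(times, events) if e and ti == t)
        # sum of exp(risk score) over everyone still at risk at time t
        at_risk = sum(h for ti, h in zip(times, hazards) if ti >= t)
        cumulative += n_events / at_risk
        steps.append((t, cumulative))
    return steps


def survival_probability(steps, risk_score, t):
    """S(t | x) = exp(-H0(t) * exp(risk_score)), with H0 a step function."""
    h0 = 0.0
    for time, value in steps:
        if time <= t:
            h0 = value
    return math.exp(-h0 * math.exp(risk_score))


# Toy data (hypothetical): four samples, the second one is censored.
times = [2, 3, 5, 7]
events = [True, False, True, True]
risk_scores = [0.5, -0.2, 0.1, -0.4]

steps = breslow_cumulative_hazard(times, events, risk_scores)
# Survival probability at time 6 for a new sample with risk score 0.0
print(survival_probability(steps, 0.0, 6))
```

The probability of having experienced the event by time t is then 1 - S(t | x).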

Moreover, it fixes a build error on Windows (#3) and adds the sksurv.preprocessing.OneHotEncoder class, which can be used in a scikit-learn pipeline.

## Download

You can install the latest version via pip:

    pip install -U scikit-survival

or download the source from GitHub.

Unfortunately, I was not able to get the recently released conda-build 3 to create Anaconda packages, so for the time being you need to install from source.

## Introduction to Survival Analysis with scikit-survival

Finally, I created a notebook that introduces survival analysis (based on my previous post) and shows you how to use the Kaplan-Meier estimator and Cox's proportional hazards model.

## Comments

## GradientBoostingSurvivalAnalysis use case + predict_survival_fnc

Hi Sebastian,

First of all, thank you for making this! I'm using the GradientBoostingSurvivalAnalysis algorithm and it's getting great concordant results, but I'm at a loss as to how to interpret the float predictions it outputs. I'm very new to survival analysis and am coming to this from a more general classification machine learning background. I'm trying to use this to calculate the probability of an event 6 time periods in the future. I don't know how to do this with the GradientBoostingSurvivalAnalysis despite reading your docs. For now, I'll use the predict_survival_function from the Cox algorithm to get the hazard probability.

Thanks!

Nate

## Not all algorithms are able

In reply to GradientBoostingSurvivalAnalysis use case + predict_survival_fnc by Nate

Not all algorithms are able to predict full survival curves. Often, they only provide a relative risk score, which is only meaningful when compared between multiple patients: a higher score indicates a higher risk, i.e., an earlier predicted event. For instance, if the predicted risk scores are 0.3 (A), -1.2 (B), 0.6 (C), then (C) is predicted to experience the event first, followed by (A), and (B) is predicted to experience it last.
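As a small sketch of how such scores can be read, the snippet below orders the three example patients by predicted risk. It assumes the convention that a higher score means a higher risk and therefore an earlier predicted event, as is the case for Cox-type models. The dictionary is just the example from this reply, not the output of a real model.

```python
# Risk scores from the example above; higher is assumed to mean higher risk.
risk_scores = {"A": 0.3, "B": -1.2, "C": 0.6}

# Order patients from highest to lowest predicted risk,
# i.e. from earliest to latest predicted event.
predicted_order = sorted(risk_scores, key=risk_scores.get, reverse=True)
print(predicted_order)  # ['C', 'A', 'B']
```

Only the ordering carries information here. The absolute values and the gaps between them cannot be read as times or probabilities.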