PyCon UK, Cardiff
Sebastian Pölsterl
29 October 2017
Survival Analysis is often used when studying:
Formally, each record consists of
The observable time y is defined as:
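A minimal sketch of the usual right-censoring definition, assuming each record has a true event time and a censoring time (the names `event_time` and `censor_time` and the toy values are hypothetical, not from the talk): the observable time is the smaller of the two, and the event indicator records whether the event was seen before follow-up ended.

```python
# Hypothetical toy values: true event times and censoring times, in days.
event_time = [72, 16, 120]
censor_time = [100, 30, 83]

# The observable time is whichever comes first: the event or the end of follow-up.
observed_time = [min(t, c) for t, c in zip(event_time, censor_time)]

# The event indicator is True only when the event happened before censoring.
event_observed = [t <= c for t, c in zip(event_time, censor_time)]

print(observed_time)
print(event_observed)
```

In practice only the observed time and the indicator are available; the third record here is censored, so all that is known is that its event time exceeds 83 days.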
from sksurv.datasets import load_veterans_lung_cancer
data_x, data_y = load_veterans_lung_cancer()
Age | Cell type | Karnofsky score | Months from Diagnosis | Prior therapy? | Treatment | Survival in days | Dead? |
---|---|---|---|---|---|---|---|
69 | 'squamous' | 60 | 7 | 'no' | 'standard' | 72 | True |
53 | 'smallcell' | 39 | 4 | 'yes' | 'standard' | 16 | True |
57 | 'adeno' | 99 | 3 | 'no' | 'test' | 83 | False |
import matplotlib.pyplot as plt
from sksurv.nonparametric import kaplan_meier_estimator

# Estimate and plot one survival curve per treatment group
for group in ("standard", "test"):
    mask = data_x["Treatment"] == group
    time, surv_prob = kaplan_meier_estimator(
        data_y["Status"][mask],
        data_y["Survival_in_days"][mask])
    plt.step(time, surv_prob, where="post",
             label="Treatment = {}".format(group))
plt.legend()  # show the per-group labels
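To make the estimate above less of a black box, here is a rough pure-Python sketch of the Kaplan-Meier product-limit estimator. It is an illustration under simple assumptions (small lists, right-censored data), not the library's implementation; the function name `kaplan_meier` and the toy data are made up for this example.

```python
def kaplan_meier(events, times):
    """Return (distinct_times, survival_probabilities) for right-censored data."""
    distinct_times = sorted(set(times))
    surv_prob = []
    current = 1.0
    for t in distinct_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Observed events (not censored records) exactly at time t.
        deaths = sum(1 for e, u in zip(events, times) if e and u == t)
        # Multiply by the conditional probability of surviving past t.
        current *= 1.0 - deaths / at_risk
        surv_prob.append(current)
    return distinct_times, surv_prob


# Hypothetical toy data: the third record is censored at time 3.
events = [True, True, False, True]
times = [2, 3, 3, 5]
km_time, km_prob = kaplan_meier(events, times)
```

Censored records never count as deaths, but they stay in the at-risk set until their censoring time, which is how the estimator uses partial information.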
import pandas as pd
from sksurv.preprocessing import OneHotEncoder
from sksurv.linear_model import CoxPHSurvivalAnalysis

# Encode categorical columns, then fit a Cox proportional hazards model
encoder = OneHotEncoder()
estimator = CoxPHSurvivalAnalysis()
estimator.fit(encoder.fit_transform(data_x), data_y)

# New patients must be encoded with the already fitted encoder
data_new_raw = pd.DataFrame(…)
data_new = encoder.transform(data_new_raw)

# One predicted survival curve per new patient
pred_curves = estimator.predict_survival_function(data_new)
for curve in pred_curves:
    plt.step(curve.x, curve.y, where="post")
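As a sketch of what such predicted curves mean under the proportional hazards assumption: each patient's curve is the baseline survival function raised to the power of that patient's hazard ratio, S(t | x) = S0(t) ^ exp(x · β). The function `cox_survival_curve`, the baseline values, and the coefficients below are hypothetical and only illustrate the formula; they are not fitted values or library internals.

```python
import math


def cox_survival_curve(baseline_survival, coefficients, features):
    """Apply S(t | x) = S0(t) ** exp(x . beta) at each baseline time point."""
    risk_score = sum(b * x for b, x in zip(coefficients, features))
    hazard_ratio = math.exp(risk_score)
    return [s0 ** hazard_ratio for s0 in baseline_survival]


# Hypothetical baseline survival probabilities at three time points.
baseline = [1.0, 0.8, 0.5]
# Hypothetical coefficients: the second feature doubles the hazard.
coef = [0.0, math.log(2.0)]

curve = cox_survival_curve(baseline, coef, [1.0, 1.0])
```

A hazard ratio above 1 pushes the whole curve below the baseline, and the curves of two patients never cross, which is the defining restriction of the model.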
from sksurv.datasets import load_breast_cancer
from sksurv.preprocessing import OneHotEncoder
from sksurv.linear_model import CoxnetSurvivalAnalysis
from sklearn.model_selection import GridSearchCV, KFold
X, y = load_breast_cancer()
Xt = OneHotEncoder().fit_transform(X)
cv = KFold(n_splits=5, shuffle=True, random_state=328)
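# Fit once on the full data to obtain the sequence of alphas to search over
coxnet = CoxnetSurvivalAnalysis(n_alphas=100,
    l1_ratio=1.0, alpha_min_ratio=0.01).fit(Xt, y)
# Evaluate each alpha separately with 5-fold cross-validation
gcv = GridSearchCV(coxnet,
    {"alphas": [[v] for v in coxnet.alphas_]},
    cv=cv).fit(Xt, y)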
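Without a `scoring` argument, the grid search ranks candidates by the estimator's own `score` method, which for scikit-survival models is, as far as I know, Harrell's concordance index. The following is a simplified pure-Python sketch of that measure, assuming no special handling of tied times; the function name and toy data are hypothetical and this is not the library's code.

```python
def concordance_index(events, times, risk_scores):
    """Fraction of comparable pairs that the risk scores order correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if i had an observed event before j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk failed earlier: correct
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    return concordant / comparable


# Hypothetical toy data: the last record is censored.
events = [True, True, False]
times = [1, 2, 3]
risk = [3.0, 1.0, 2.0]
c_index = concordance_index(events, times, risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking; here two of the three comparable pairs are ordered correctly.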
scikit-survival includes implementations of more advanced methods:
scikit-survival is available for Python 3.4 and later on Linux, macOS, and Windows.
Install via Anaconda:
conda install -c sebp scikit-survival
or via pip:
pip install scikit-survival
Source code: github.com/sebp/scikit-survival