1 Introduction

The Kaplan-Meier estimator, often presented as a Kaplan-Meier survival plot, is a nonparametric method for estimating the probability that some entity will survive beyond a specified time. It is an example of survival analysis, where the response variable is the time until an event occurs.

In biomedical research, we are usually interested in things like patient survival following surgery or treatment, but this kind of analysis is generally important for any process that involves the time taken for some event to happen, including:

  • time to failure of electronic devices
  • time a player is at bat in baseball
  • time to recovery (or death) from illness

There are three main areas of application for survival analysis:

  • Estimating the survivor function and/or hazard function
  • Comparing survivor functions and/or hazard functions
  • Understanding the relationship between explanatory variables and survival time

The Kaplan-Meier method is widely used because it is nonparametric and copes well with censored data: data where the information is not complete for all individuals. In a biomedical setting, this usually means that the individual withdraws from the study or is lost to follow-up.

Nonparametric methods tend to cope better than parametric methods when data have been censored, or otherwise do not conform to strong assumptions about the shape of the data (i.e. the statistical model we use to represent the data).

1.1 The Survival Function

The survival function represents the probability that an entity (such as a patient) will survive beyond a specified time.

Figure 1.1 shows an exponential survival function.

Figure 1.1: An exponential survival function. The proportion of entities surviving for more than one month is 0.37.
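
As a small illustration of the figure, an exponential survival function can be written as S(t) = exp(-rate × t). The sketch below assumes a rate of 1 per month, chosen only because it reproduces the 0.37 quoted in the caption; the function name and the rate are illustrative assumptions, not values taken from the notebook.

```python
import math


def exponential_survival(t, rate=1.0):
    """Probability of surviving beyond time t under S(t) = exp(-rate * t)."""
    if t < 0:
        raise ValueError("time must be non-negative")
    return math.exp(-rate * t)


# Assumed rate of 1 per month: proportion surviving beyond one month.
print(round(exponential_survival(1.0), 2))  # 0.37
```

With a different rate the curve keeps the same shape but decays faster or slower.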

Survival functions need not be exponential. They can take any form, which is one reason why we might use the Kaplan-Meier approach rather than fit an exponential function (as described in notebook 04). The Kaplan-Meier method can be applied to any shape of survival function.

  1. What kinds of processes might give rise to a survival function that is not exponential?

2 Kaplan-Meier Plots

For a single sample, generating a Kaplan-Meier plot can be quite straightforward. Our dataset should contain three columns: an identifier for each entity (e.g. patient ID), their serial time, and their status at the end of their serial time.

Serial time is the time at which one of the following three things happens:

  • the study ends (the entity survives)
  • the event occurs (usually “death”)
  • the entity is censored (drops out of the study, is lost, etc.)
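
To make the data layout concrete, here is a minimal sketch of the standard product-limit (Kaplan-Meier) calculation: at each time where an event occurs, the running survival estimate is multiplied by (1 - events / number at risk). The function name, the (serial_time, event) record layout, and the toy data are hypothetical, and entities that survive to the end of the study are treated as censored at that time.

```python
from collections import Counter


def kaplan_meier(records):
    """Return [(time, survival)] steps from (serial_time, event) pairs.

    event is True when the event occurred at serial_time, and False when
    the entity was censored (including surviving to the end of the study).
    """
    events = Counter(t for t, event in records if event)
    exits = Counter(t for t, _ in records)  # everyone leaving the risk set
    at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: fraction of those at risk who survive t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # events and censored entities both leave
    return curve


# Hypothetical toy data: (serial time in days, event occurred?)
toy = [(5, True), (8, False), (12, True), (12, True), (20, False), (30, True)]
for time, surv in kaplan_meier(toy):
    print(time, round(surv, 3))
```

Censored entities never lower the curve directly; they only shrink the number at risk for later event times.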

The Kaplan-Meier method makes several assumptions:

  • Censored individuals have the same prospect of survival as individuals who continue to be followed
  • Survival prospects are independent of the date/time an individual is recruited to the study
  • The event recorded (often death) actually happens at the specified time

The table below shows survival data for patients with advanced lung cancer, obtained from the North Central Cancer Treatment Group. This is a dataset from the survival package in R. The data columns are as follows:

  • inst: Institution code
  • time: Survival time in days
  • status: censoring status (1 = censored, 2 = dead)
  • age: Age in years
  • sex: 1 = Male, 2 = Female
  • ph.ecog: ECOG performance score as rated by the physician (0 = asymptomatic, 1 = symptomatic but completely ambulatory, 2 = in bed <50% of the day, 3 = in bed >50% of the day but not bedbound, 4 = bedbound)
  • ph.karno: Karnofsky performance score (0 = bad, 100 = good) as rated by the physician
  • pat.karno: Karnofsky performance score as rated by patient
  • meal.cal: Calories consumed at meals
  • wt.loss: Weight loss in last six months

Loprinzi CL, Laurie JA, Wieand HS, Krook JE, Novotny PJ, Kugler JW, Bartel J, Law M, Bateman M, Klatt NE, et al. “Prospective evaluation of prognostic variables from patient-completed questionnaires. North Central Cancer Treatment Group.” Journal of Clinical Oncology. 12(3):601-7, 1994. https://doi.org/10.1200/jco.1994.12.3.601

Initially, we need only concern ourselves with the survival time and the status at the end of serial time. Each row of the dataset describes one patient; note that inst identifies the institution, not the individual patient.
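
Because status is coded as 1 (censored) or 2 (dead) rather than as a yes/no flag, it can help to recode it before doing any calculation by hand. The sketch below is illustrative only: the helper name and the three example rows are hypothetical, not values from the dataset.

```python
def to_event_flag(status):
    """Map the lung status coding (1 = censored, 2 = dead) to a boolean."""
    if status == 2:
        return True
    if status == 1:
        return False
    raise ValueError(f"unexpected status code: {status!r}")


# Hypothetical (time, status) rows, not taken from the real dataset.
rows = [(100, 2), (250, 1), (300, 2)]
records = [(time, to_event_flag(status)) for time, status in rows]
print(records)  # [(100, True), (250, False), (300, True)]
```

Rejecting unexpected codes explicitly avoids silently treating a missing or mistyped status as censored.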

Figure 2.1: Survival function for the complete lung dataset, as the solid black line. Censored data is shown by a cross for the corresponding individual, at the time of censoring. The grey ribbon shows the 95% confidence interval for the survival function. The median survival time is around 310 days.
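
The median survival time quoted in the caption is the time at which the estimated survival curve first drops to 0.5 or below. A minimal sketch of reading it from a step curve follows; the function name and the example curves are hypothetical, and conventions for a curve that sits exactly at 0.5 can vary between implementations.

```python
def median_survival(curve):
    """First time at which the step curve is at or below 0.5, else None.

    curve is a list of (time, estimated survival) pairs in time order.
    """
    for time, survival in curve:
        if survival <= 0.5:
            return time
    return None  # the curve never reaches 0.5, so the median is undefined


# Hypothetical step curves as (time in days, estimated survival) pairs.
print(median_survival([(5, 0.83), (12, 0.42), (30, 0.0)]))  # 12
print(median_survival([(5, 0.83), (12, 0.60)]))  # None
```

When heavy censoring keeps the curve above 0.5 for the whole study, no median can be reported.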

2.1 Comparing Kaplan-Meier Curves

The lung dataset includes both male and female patients, with sex recorded as 1 (male) or 2 (female). We can visualise survival curves for the two sexes separately on the same plot.

Figure 2.2: Survival function for the lung dataset, stratified by recorded sex (1 = Male; 2 = Female). Censored data is shown by a cross for the corresponding individual, at the time of censoring. 95% Confidence Intervals are shown as coloured ribbons. The median survival time (MST) for females is 426 days; for males, MST is 270 days.

Figure 2.2 suggests that, at most timepoints, a smaller proportion of female patients than male patients have died, i.e. that female patients tend to survive longer. The 95% confidence intervals for the two curves also overlap very little, which suggests that the curves may differ significantly. But this is just a visual observation. The summary below shows that the median survival time (MST) is 270 days for males and 426 days for females, and that the 95% confidence intervals for these estimates (212 to 310 days and 348 to 550 days) do not overlap. This supports our contention that median survival differs between the two sexes.

## Call: survfit(formula = Surv(time, status) ~ sex, data = lung)
## 
##         n events median 0.95LCL 0.95UCL
## sex=1 138    112    270     212     310
## sex=2  90     53    426     348     550
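
Comparing medians and their intervals is informal. The log-rank test, one of the formal approaches discussed next, compares the number of events observed in one group with the number expected if both groups shared a single survival process. The sketch below is a minimal two-sample version assuming (time, event) pairs; the function name and the toy groups are hypothetical, and it is not the implementation used by the survival package.

```python
import math


def logrank_test(group_a, group_b):
    """Two-sample log-rank test on lists of (time, event) pairs.

    Returns (chi_square, p_value), with 1 degree of freedom.
    """
    event_times = sorted({t for t, e in group_a + group_b if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for time, _ in group_a if time >= t)  # at risk in A
        n_b = sum(1 for time, _ in group_b if time >= t)  # at risk in B
        d_a = sum(1 for time, e in group_a if e and time == t)
        d_b = sum(1 for time, e in group_b if e and time == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected events in A under the null
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("groups cannot be compared (zero variance)")
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # chi-square tail, 1 df
    return chi_square, p_value


# Hypothetical toy groups: every event in A precedes every event in B.
group_a = [(1, True), (2, True), (3, True)]
group_b = [(4, True), (5, True), (6, False)]
chi_square, p_value = logrank_test(group_a, group_b)
print(round(chi_square, 2), round(p_value, 3))
```

A large statistic (small P-value) indicates that one group experiences events earlier than a shared survival process would predict.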

To test whether the difference between survival curves is significant, we can use the log-rank test or the Cox proportional hazards model. The output below shows the fitted Cox proportional hazards model, which also reports the score (log-rank) test. Both assess whether the survival curves differ for male and female patients.

## Call:
## coxph(formula = Surv(time, status) ~ sex, data = lung)
## 
##   n= 228, number of events= 165 
## 
##        coef exp(coef) se(coef)      z Pr(>|z|)   
## sex -0.5310    0.5880   0.1672 -3.176  0.00149 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##     exp(coef) exp(-coef) lower .95 upper .95
## sex     0.588      1.701    0.4237     0.816
## 
## Concordance= 0.579  (se = 0.021 )
## Likelihood ratio test= 10.63  on 1 df,   p=0.001
## Wald test            = 10.09  on 1 df,   p=0.001
## Score (logrank) test = 10.33  on 1 df,   p=0.001

This table describes the result of testing whether the curves for each sex differ. The output can be a little intimidating, but the important points to note are:

  • the difference between the curves for the two sexes is statistically significant. In the table, Pr(>|z|) = 0.00149 is the P-value for the null hypothesis that the coefficient for sex is zero, i.e. that the curves are produced by the same underlying survival process
  • the regression coefficient (coef) is the log of the hazard ratio for group 2 (females) relative to group 1 (males); it is -0.53, and the negative sign indicates that females have a lower hazard (risk of death) than males
  • the hazard ratio is the exponential of the regression coefficient: exp(coef). Here, it is 0.59, which means that being female is estimated to reduce the hazard by 41%. The 95% confidence interval (0.42 to 0.82) suggests the reduction is probably between 18% and 58%
  • three global significance tests are shown, and each returns a P-value of approximately 0.001, which is strong evidence against the null hypothesis that the two curves arise from the same survival process
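
The figures in these bullet points can be checked by hand from the coef and se(coef) columns of the coxph output. The sketch below assumes the usual normal (Wald) approximation with a critical value of 1.96; the variable names are illustrative.

```python
import math

coef, se = -0.5310, 0.1672  # coef and se(coef) read from the coxph output

hazard_ratio = math.exp(coef)
z = coef / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
z_crit = 1.96  # approximate 97.5% quantile of the standard normal
lower = math.exp(coef - z_crit * se)
upper = math.exp(coef + z_crit * se)

print(f"hazard ratio = {hazard_ratio:.3f}")
print(f"z = {z:.3f}, P = {p_value:.5f}")
print(f"95% CI = {lower:.4f} to {upper:.3f}")
print(f"reduction in hazard = {1 - hazard_ratio:.0%} "
      f"({1 - upper:.0%} to {1 - lower:.0%})")
```

The results agree with the table to the precision shown there; small differences in later decimal places are expected because the inputs are already rounded.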