Kidney failure is a frequently used clinical end point in CKD studies, and statistical analysis often focuses on the association between exposures, predictors, and the time from a designated baseline to that event. Other end points include doubling of serum creatinine in CKD studies and graft failure in kidney transplant studies. Censoring is a common analytical challenge with such outcomes. For example, if a CKD cohort study lasts for 5 years from launch to completion, and patients are enrolled during the first 3 years, then the maximum follow-up time of each patient is between 2 and 5 years, after which the data are censored. Censoring prevents the observation of kidney failure in the sense that there is always a chance that kidney failure occurs after censoring: had the follow-up been longer, more kidney failure events would have been observed. Survival analysis techniques, such as Kaplan–Meier analysis and Cox proportional hazards regression, properly account for censoring, and the result can be interpreted relative to a hypothetical but realistic situation where all patients are followed to the end point without censoring.

Death can occur in CKD and it introduces another analytical challenge. Like censoring, it also prevents the observation of kidney failure. However, it is unrealistic to imagine a CKD population without death, and there is no chance that kidney failure occurs after death. This makes death a fundamentally different statistical concept in comparison to censoring. In statistical terminology, death is a competing risk event for kidney failure. Specialized methods are needed to handle prediction problems with competing risks. If death is treated as censoring or excluded from the analysis, the result may be biased.

The recent study by Ravani *et al.*^{1} is a timely contribution to the renal literature. Competing risk is the norm rather than the exception in CKD epidemiology research, but it has not received enough attention and appropriate methods remain underutilized.^{2–4} Although the statistical literature has demonstrated theoretically and through computer simulation that not accounting for the competing risk of death leads to overestimation of the risk of the clinical event of interest, the work by Ravani *et al.* remains novel and impressive because it demonstrates this result through a large and relevant data set. The paper goes further to show that the size of the bias can be clinically important, and that the bias is of particular concern in stage 4 CKD, a subpopulation of substantial research interest. These latter conclusions are highly relevant to kidney research.

Why is the risk of kidney failure overestimated if we simply treat death as censoring? We offer an intuitive explanation here by using the “redistribution to the right” theory.^{5} Take the Kaplan–Meier estimation of kidney failure probability as an example. The theory says that the Kaplan–Meier estimation is equivalent to assigning every censored patient a non-zero probability of kidney failure in the future, breaking that probability into smaller pieces and redistributing them as probability weights at multiple time points beyond the censoring time. When we want to calculate, for example, the 3-year probability of kidney failure from baseline, the method sums up the probability weights among those who have kidney failure within 3 years. When death is treated as censoring, the algorithm assigns a positive probability of kidney failure beyond the time of death, which is obviously unrealistic. That is why the risk of kidney failure is overestimated when death is treated as censoring. A similar explanation carries over to more complicated analyses, such as regression models for kidney failure.
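The overestimation can be seen on a small example. The sketch below is illustrative only: the ten follow-up records are invented, the status coding (0 = censored, 1 = kidney failure, 2 = death) and the function names are our own assumptions, and the cumulative incidence is computed with the standard nonparametric (Aalen-Johansen type) estimator rather than any method specific to the study discussed here.

```python
# Toy data (hypothetical): follow-up time in years and event status.
# Status coding is an assumption: 0 = censored, 1 = kidney failure, 2 = death.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
status = [2, 1, 2, 1, 2, 1, 0, 1, 0, 0]


def naive_km_risk(times, status, horizon):
    """1 - Kaplan-Meier for kidney failure, with deaths treated as censored."""
    surv = 1.0
    for t in sorted(set(times)):
        if t > horizon:
            break
        at_risk = sum(1 for u in times if u >= t)
        failures = sum(1 for u, s in zip(times, status) if u == t and s == 1)
        surv *= 1.0 - failures / at_risk
    return 1.0 - surv


def cumulative_incidence(times, status, horizon):
    """Cumulative incidence of kidney failure with death as a competing event."""
    event_free = 1.0  # probability of being alive without kidney failure just before t
    cif = 0.0
    for t in sorted(set(times)):
        if t > horizon:
            break
        at_risk = sum(1 for u in times if u >= t)
        failures = sum(1 for u, s in zip(times, status) if u == t and s == 1)
        deaths = sum(1 for u, s in zip(times, status) if u == t and s == 2)
        # Only patients still alive and event-free can contribute new failures.
        cif += event_free * failures / at_risk
        # Both kidney failure and death remove patients from the event-free state.
        event_free *= 1.0 - (failures + deaths) / at_risk
    return cif


naive = naive_km_risk(times, status, horizon=8)
competing = cumulative_incidence(times, status, horizon=8)
print(f"death treated as censoring: {naive:.3f}")
print(f"death as a competing risk:  {competing:.3f}")
```

On these invented records the naive estimate of the 8-year risk is roughly 0.59, whereas the competing risk estimate is roughly 0.43. The two functions agree whenever the data contain no deaths, which is consistent with the redistribution argument: the gap arises only from probability mass that the naive method carries past the time of death.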

The cause-specific hazard model and subdistribution hazard model are two important methods for analyzing competing risk data. Typically, the cause-specific hazard model is specified as a Cox model for kidney failure, with death treated as censoring. When the goal is to investigate association, the cause-specific hazard model provides interpretable cause-specific hazard ratios, *i.e.*, how the instantaneous risk (hazard) of kidney failure changes with each unit increase in a covariate. This result may contribute to a misunderstanding that it is also reasonable to treat death as censoring when the goal is to predict future event probabilities. However, the cause-specific hazard model cannot produce predicted probabilities of the event of interest without additional models for the competing risk event, because the predicted probability of kidney failure must then take into account both the incidence of kidney failure and the incidence of death. Here is an intuitive explanation. If patient A has a high hazard for kidney failure and a low hazard for death, then the 3-year probability that this patient reaches kidney failure before they die is high. If patient B has the same high hazard for kidney failure but an even higher hazard for death, then this probability could be very low. The 3-year probabilities of kidney failure, death prior to kidney failure, and being alive without either clinical event sum up to 1. When the probability of death increases, it squeezes the room left for the probability of kidney failure. Therefore, under the cause-specific hazard framework, the cause-specific hazard model for kidney failure alone cannot determine the predicted probability. The subdistribution hazard model, in contrast, establishes a direct relationship between covariates and the probability of kidney failure, taking into account the attrition due to death. Therefore, this model can be used to predict the future risk of kidney failure.
However, the hazard ratio from the subdistribution hazard model can have a challenging interpretation when the analytic objective is to understand association, as a low subdistribution hazard ratio for kidney failure may reflect either a protective association of an exposure with kidney failure or an adverse association with death. Therefore, the cause-specific hazard model is often more suitable for studying etiology, *i.e.*, the association between covariates and the instantaneous risk of a clinical event, whereas the subdistribution hazard model is more suitable for prediction, *i.e.*, estimating the future probability of the event.
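The patient A and patient B comparison can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that both cause-specific hazards are constant over time, in which case the probability of kidney failure before death by time t has the closed form λ_KF / (λ_KF + λ_D) × (1 − exp(−(λ_KF + λ_D) t)). The hazard values and function names are hypothetical and are not taken from any study.

```python
import math


def kidney_failure_probability(hazard_kf, hazard_death, years):
    """P(kidney failure before death by `years`) under constant cause-specific hazards."""
    total = hazard_kf + hazard_death
    return hazard_kf / total * (1.0 - math.exp(-total * years))


def death_first_probability(hazard_kf, hazard_death, years):
    """P(death before kidney failure by `years`) under the same assumption."""
    total = hazard_kf + hazard_death
    return hazard_death / total * (1.0 - math.exp(-total * years))


def event_free_probability(hazard_kf, hazard_death, years):
    """P(alive without kidney failure at `years`)."""
    return math.exp(-(hazard_kf + hazard_death) * years)


def naive_probability(hazard_kf, years):
    """What one would get by ignoring death, i.e., treating it as censoring."""
    return 1.0 - math.exp(-hazard_kf * years)


# Hypothetical hazards per year: same kidney failure hazard, different death hazards.
patients = {"A": (0.3, 0.05), "B": (0.3, 1.5)}
for name, (h_kf, h_death) in patients.items():
    p_kf = kidney_failure_probability(h_kf, h_death, 3)
    p_naive = naive_probability(h_kf, 3)
    print(f"patient {name}: 3-year kidney failure risk {p_kf:.2f} (ignoring death: {p_naive:.2f})")
```

Both hypothetical patients share the same cause-specific hazard for kidney failure, so a model of that hazard alone would give them the same naive prediction, yet their 3-year probabilities of kidney failure differ substantially once the hazard of death enters the calculation. The three probabilities (kidney failure first, death first, and event-free) sum to 1 for each patient.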

In prediction model development, competing risks also demand measures of prediction accuracy specifically designed for such data.^{6,7} Conventional metrics designed for censored data may not apply. We also note that the need to adopt a competing risk analysis can sometimes be avoided by considering the composite of kidney failure and death, thus treating death as part of the outcome of interest. In some studies, it is of interest to study both kidney failure and death, irrespective of whether the death occurs before or after kidney failure. Here, death precludes subsequent kidney failure, but kidney failure does not preclude subsequent observation of death. Such data are termed semicompeting risk data and special methods are available.^{8}

In summary, competing risk data are common in CKD epidemiologic studies, but proper statistical methods are underutilized despite plentiful statistical literature and software for such data. The timely contribution of Ravani *et al.*^{1} shows that it is time to put competing risk modeling into our standard analytical toolbox.

## Disclosures

None.

## Funding

This research was supported by National Institutes of Health grant R01DK118079.

## Footnotes

Published online ahead of print. Publication date available at www.jasn.org.

See related article, “Influence of Mortality on Estimating the Risk of Kidney Failure in People with Stage 4 CKD,” in the November issue on pages 2219–2227.

Copyright © 2019 by the American Society of Nephrology