## Abstract

Recent recommendations emphasize the need to assess kidney function using creatinine-based predictive equations to optimize the care of patients with chronic kidney disease. The most widely used equations are the Cockcroft-Gault (CG) and the simplified Modification of Diet in Renal Disease (MDRD) formulas. However, they still need to be validated in large samples of subjects, including large non-U.S. cohorts. Renal clearance of ^{51}Cr-EDTA was compared with GFR estimated using either the CG equation or the MDRD formula in a cohort of 2095 adult Europeans (863 female and 1232 male; median age, 53.2 yr; median measured GFR, 59.8 ml/min per 1.73 m^{2}). When the entire study population was considered, the CG and MDRD equations showed very limited bias: the CG equation overestimated measured GFR by 1.94 ml/min per 1.73 m^{2}, whereas the MDRD equation underestimated it by 0.99 ml/min per 1.73 m^{2}. However, analysis of subgroups defined by age, gender, body mass index, and GFR level showed that the biases of the two formulas could be much larger in selected populations. Furthermore, analysis of the SD of the mean difference between estimated and measured GFR showed that both formulas lacked precision; the CG formula was less precise than the MDRD one in most cases. In the whole study population, the SD was 15.1 and 13.5 ml/min per 1.73 m^{2} for the CG and MDRD formulas, respectively. Finally, 29.2 and 32.4% of subjects were misclassified when the MDRD and CG formulas, respectively, were used to categorize subjects according to the Kidney Disease Outcomes Quality Initiative chronic kidney disease classification.

The prevalence and incidence of ESRD are continuously increasing in all Western countries. Data from the U.S. Renal Data System predict that the number of patients who were registered with ESRD in 1997 will have doubled by 2010, reaching approximately 700,000 patients with ESRD, and will rise to 2.2 million patients in 2030 (1); similar trends are anticipated in other countries (2–4). To level off these incidence rates, various initiatives, such as the Kidney Disease Outcomes Quality Initiative (K/DOQI), have provided physicians with guidelines to optimize the care of patients with chronic kidney disease (CKD). These guidelines emphasize the need to assess kidney function using predictive equations rather than serum creatinine alone (5). However, they also highlight that these equations still need to be validated in large samples of subjects, in particular in non-U.S. populations and in individuals with a mild decrease in kidney function or normal GFR (5). Validation of the predictive formulas is also particularly important for patients aged 65 yr and older, who have by far the highest incidence rates of ESRD (1,6,7).

The formulas that are most widely used to estimate kidney function and that are recommended in adults by the K/DOQI guidelines (5) are the Cockcroft-Gault (CG) formula (8) and the recently developed (9) and later simplified (10) Modification of Diet in Renal Disease (MDRD) formula. The CG formula is an estimate of creatinine clearance originally developed in a population of 236 Canadian patients, 209 of whom were male. The MDRD formulas were developed as an estimation of ^{125}I-iothalamate renal clearance–based GFR measurement in a population of 1628 patients with previously diagnosed CKD (9–11). The mean GFR in this population was 39.8 ± 21.2 ml/min per 1.73 m^{2}, and the mean age of the cohort was 50.6 ± 12.7 yr.

The K/DOQI CKD guidelines have established a five-stage classification of patients with CKD that is based solely on kidney function. These stages are defined by GFR ≥90 ml/min per 1.73 m^{2} (stage 1), 60 to 89 ml/min per 1.73 m^{2} (stage 2), 30 to 59 ml/min per 1.73 m^{2} (stage 3), 15 to 29 ml/min per 1.73 m^{2} (stage 4), and <15 ml/min per 1.73 m^{2} (stage 5) (5). The guidelines state that the stage of kidney disease should be determined for each CKD patient and that a clinical action plan should be developed on the basis of the stage of disease (5). Thus, inaccurate estimation of kidney function may be responsible for misclassification of some patients and lead to inappropriate evaluation or treatment of these patients (12). However, so far, few studies have assessed the applicability of the MDRD and CG formulas to large cohorts of subjects with wide ranges of renal function. One study compared various formulas with ^{125}I-iothalamate GFR in a cohort of 1703 blacks with presumed hypertensive nephrosclerosis and mean serum creatinine levels of 1.85 ± 0.88 mg/dl (13). All other studies focused on much smaller cohorts of subjects with or without CKD (14–18). Furthermore, with one exception (15), no particular attention was paid to calibration of serum creatinine measurements, although this has been shown to be of critical importance for individuals with normal or near normal serum creatinine values (19,20).
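The GFR thresholds listed above can be sketched as a small lookup. This is a minimal illustration that bins on GFR alone; the function name `kdoqi_stage` is hypothetical, and the requirement for markers of kidney damage in the higher-GFR stages is deliberately not modeled.

```python
def kdoqi_stage(gfr):
    """Return the K/DOQI CKD stage (1 to 5) for a GFR in ml/min per 1.73 m^2.

    Assumption: staging uses the GFR thresholds only; evidence of kidney
    damage, which the guidelines also consider, is ignored here.
    """
    if gfr >= 90:
        return 1
    if gfr >= 60:
        return 2
    if gfr >= 30:
        return 3
    if gfr >= 15:
        return 4
    return 5


if __name__ == "__main__":
    for value in (110.0, 72.5, 44.0, 21.0, 9.0):
        print(value, "-> stage", kdoqi_stage(value))
```

A patient whose true GFR lies close to one of these cutoffs can change stage with a small estimation error, which is why the precision of the estimating formula matters for staging.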

In this study, we compare renal clearance of ^{51}Cr-EDTA (measured GFR) with GFR estimated by the CG formula (CG GFR) or the MDRD equation (MDRD GFR) in a cohort of 2095 European subjects. Our findings support the preferential use of the MDRD formula but raise caution regarding its usage in some subgroups of individuals, such as young adults with normal renal function or stage 2 CKD or underweight individuals.

## Materials and Methods

### Patient Selection

Records of all patients who were referred to our Department of Physiology between January 1990 and April 2004 to perform GFR measurements were reviewed retrospectively. For patients who had more than one GFR measurement, only the first one was considered. Renal transplant patients and patients who were younger than 18 yr were excluded. Among the remaining 2178 independent patients, only 83 were black. Because ethnicity is one of the determinants of the MDRD equation, we decided to exclude black patients and restrict the analysis to the 2095 nonblack individuals to ensure statistical relevance of the study. Among them, 1933 had CKD and 162 were healthy potential kidney donors.

### GFR Measurements

Renal clearance of ^{51}Cr-EDTA was determined as described previously (21–23). Briefly, 3.5 MBq of ^{51}Cr-EDTA (Amersham Health SA, Pantin, France) was injected intravenously as a single bolus. The injected dose was reduced to 1.8 MBq in patients with an estimated GFR derived from the CG formula of <30 ml/min and in case of body weight <40 kg. After allowing 1 h for distribution of the tracer in the extracellular fluid, urine was collected and discarded. Then, average renal ^{51}Cr-EDTA clearance was determined on five consecutive 30-min clearance periods. Blood was drawn at the midpoint of each clearance period and up to 300 min after injection. The radioactivity measurements in 1-ml plasma and urine samples were carried out on a Packard Cobra 3-inch crystal γ-ray well counter (Boston, MA). When timed urine samples could not be obtained, plasma clearance of ^{51}Cr-EDTA was calculated according to a simplified method described by Brochner-Mortensen (24). This was performed in 219 (10.5%) patients. In our hands, the coefficients of variation of renal clearance of ^{51}Cr-EDTA and plasma clearance of ^{51}Cr-EDTA were 8.4 ± 5.0 and 9.0 ± 5.3%, respectively, whereas the coefficient of variation of inulin clearance was 9.1 ± 6.3% in the same 22 patients. When compared with inulin renal clearance, the mean bias of EDTA renal clearance was 4.0 ± 4.9 ml/min per 1.73 m^{2} (Froissart *et al.*, manuscript in preparation).

### Creatinine Assay

All creatinine measurements were performed in the same laboratory. Blood samples were obtained simultaneously with the GFR measurement. A modified kinetic Jaffé colorimetric method was used with a Bayer RA-XT and a Konelab 20 analyzer. A five-point calibration was applied in each assay. Before measurement, ultrafiltration of plasma through a 20-kD cutoff membrane (MPS-1; Amicon, Beverly, MA) was performed to discard chromogens that were linked to albumin and other heavy proteins. In the absence of an international standard for creatinine assay, the linearity of the measurements was verified by using plasma samples from normal subjects in which increasing amounts of desiccated creatinine hydrochloride (MW 149.6; Sigma Chemicals, Perth, Australia) had been added.

Linear regression analysis showed that the slope of the relationship between measured and expected creatinine concentrations was 1.008 ± 0.006 (95% confidence interval, 0.997 to 1.020) and that the Y-intercept was 0.014 ± 0.013 (95% confidence interval, −0.013 to 0.041; Figure 1). Squared Spearman rank coefficient of correlation was 0.998. Internal quality controls showed a coefficient of variation of 2.3% during the period. An indirect evaluation of the stability of the measurement was obtained from the ratio of MDRD GFR to measured GFR over time. No clear shift was observed during the entire study period, supporting the absence of variation in creatinine calibration (data not shown). Calibration of our creatinine measurements [HEGPcr.] against those of the MDRD laboratory [MDRDcr.] (Dr. F. Van Lente) showed a linear relationship defined by the following equation:

Thus, for serum creatinine ranging from 0.6 to 1.2 mg/dl, the difference between both measurements (MDRDcr. − HEGPcr.) is confined between −0.016 and 0.074 mg/dl.
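As a consistency check on these bounds, assume that the calibration line has the form MDRDcr. = a × HEGPcr. + b (the symbols a, b, and x are introduced here for illustration only) and that the 0.6 to 1.2 mg/dl range refers to HEGPcr. values x. The two bounds then determine the coefficients:

$$
\begin{aligned}
\Delta(x) &= \mathrm{MDRDcr.} - \mathrm{HEGPcr.} = (a - 1)\,x + b,\\
a - 1 &= \frac{0.074 - (-0.016)}{1.2 - 0.6} = 0.15,\\
b &= -0.016 - 0.15 \times 0.6 = -0.106.
\end{aligned}
$$

Under these assumptions, the stated bounds correspond to a slope of about 1.15 and an intercept of about −0.106 mg/dl. These values are back-calculated from the bounds and are not quoted from the calibration itself.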

### Creatinine-Based Estimation of GFR

The two formulas that we studied to predict GFR from serum creatinine were the one proposed by Cockcroft and Gault (8):

and the simplified form of the MDRD formula (10):

where PCr is plasma creatinine concentration.
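For reference, the commonly published forms of these two equations are sketched below, with PCr in mg/dl, age in years, and weight in kg. The constants are those of the widely cited versions (some publications give 186.3 rather than 186 for the MDRD coefficient); the units and rounding used for this cohort are an assumption.

$$
\begin{aligned}
\text{CG (ml/min)} &= \frac{(140 - \text{age}) \times \text{weight}}{72 \times \text{PCr}} \times (0.85 \text{ if female}),\\
\text{MDRD (ml/min per 1.73 m}^{2}\text{)} &= 186 \times \text{PCr}^{-1.154} \times \text{age}^{-0.203} \times (0.742 \text{ if female}).
\end{aligned}
$$

The MDRD equation carries an additional multiplier for black patients, which does not apply to the nonblack cohort studied here.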

A correction for body surface area (BSA) was necessary for the CG formula. This was performed using estimated BSA according to Du Bois (25):
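The estimation steps described in this section can be sketched in code as follows. This is a minimal illustration that assumes the widely published constants (Du Bois: 0.007184 × weight^0.425 × height^0.725; CG with a 0.85 factor for women; simplified MDRD with a coefficient of 186 and a 0.742 factor for women) and creatinine expressed in mg/dl. All function and parameter names are hypothetical.

```python
def bsa_du_bois(weight_kg, height_cm):
    """Body surface area in m^2 (Du Bois formula, commonly published constants)."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def cg_clearance(age_yr, weight_kg, pcr_mg_dl, female):
    """Cockcroft-Gault creatinine clearance in ml/min (not BSA-normalized)."""
    value = (140 - age_yr) * weight_kg / (72 * pcr_mg_dl)
    return value * 0.85 if female else value


def cg_gfr(age_yr, weight_kg, height_cm, pcr_mg_dl, female):
    """CG estimate normalized to 1.73 m^2 of body surface area."""
    raw = cg_clearance(age_yr, weight_kg, pcr_mg_dl, female)
    return raw * 1.73 / bsa_du_bois(weight_kg, height_cm)


def mdrd_gfr(age_yr, pcr_mg_dl, female):
    """Simplified MDRD estimate in ml/min per 1.73 m^2 (nonblack subjects).

    Assumption: coefficient 186; some publications use 186.3.
    """
    value = 186.0 * pcr_mg_dl ** -1.154 * age_yr ** -0.203
    return value * 0.742 if female else value


if __name__ == "__main__":
    # Illustrative subject, not taken from the study data.
    print(round(cg_gfr(50, 72, 170, 1.0, female=False), 1))
    print(round(mdrd_gfr(50, 1.0, female=False), 1))
```

The MDRD estimate is already expressed per 1.73 m^{2}, so only the CG value needs the BSA correction.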

### Statistical Analyses

Demographic data were expressed as mean ± SD or median and interquartile range, as appropriate. Estimated and measured GFR are statistically dependent variables. To compare the creatinine-based estimations of GFR with the renal clearance of ^{51}Cr-EDTA, we used Bland and Altman recommendations for such evaluations (26). The mean difference between estimated and measured GFR values directly estimates the global bias. The width of the SD of the mean difference is an estimation of precision; a large width means a low precision.

The absolute value of the difference between estimated and measured GFR was used to estimate the accuracy of the creatinine-based formulas. It was expressed either in ml/min per 1.73 m^{2} or as a percentage of GFR values and was represented in percentiles (50th, 75th, and 90th), which allowed absolute and relative boundaries for the lack of accuracy to be drawn. The accuracy was also measured as the percentage of results that did not deviate >15, 30, and 50% from the measured GFR.
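The two accuracy measures described above can be sketched as follows. This is a minimal illustration, assuming that deviations are expressed relative to measured GFR and that percentiles follow the nearest-rank convention (the percentile method is an assumption). Function names are hypothetical.

```python
import math


def accuracy_within(estimated, measured, pct):
    """Percentage of estimates deviating by no more than pct% from measured GFR."""
    hits = sum(
        abs(e - m) <= pct / 100 * m for e, m in zip(estimated, measured)
    )
    return 100 * hits / len(measured)


def abs_diff_percentile(estimated, measured, q):
    """Nearest-rank q-th percentile of |estimated - measured|."""
    diffs = sorted(abs(e - m) for e, m in zip(estimated, measured))
    rank = max(1, math.ceil(q / 100 * len(diffs)))
    return diffs[rank - 1]


if __name__ == "__main__":
    # Illustrative values only, not study data.
    measured = [100, 50, 20, 80]
    estimated = [110, 70, 21, 39]
    for pct in (15, 30, 50):
        print(pct, accuracy_within(estimated, measured, pct))
    for q in (50, 75, 90):
        print(q, abs_diff_percentile(estimated, measured, q))
```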

The combined root mean square error (CRMSE) was examined. CRMSE is calculated as the square root of [(mean difference between estimated and measured GFR)^{2} + (SD of the difference)^{2}]. It measures both bias and precision (27). Statistical analyses were performed using Statview 5.0 software (SAS, Cary, NC).
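Bias, precision, and CRMSE as defined above can be computed from paired values as sketched below. This is a minimal illustration; the use of the sample SD (n − 1 denominator) is an assumption, and the function name is hypothetical.

```python
import math
import statistics


def bias_precision_crmse(estimated, measured):
    """Return (bias, SD of the differences, CRMSE) for paired GFR values.

    bias  = mean of (estimated - measured)
    sd    = sample standard deviation of the differences (assumed n - 1)
    crmse = sqrt(bias^2 + sd^2)
    """
    diffs = [e - m for e, m in zip(estimated, measured)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    crmse = math.sqrt(bias ** 2 + sd ** 2)
    return bias, sd, crmse


if __name__ == "__main__":
    # Illustrative values only, not study data.
    measured = [60, 40, 90, 30]
    estimated = [62, 37, 95, 30]
    print(bias_precision_crmse(estimated, measured))
```

When the bias is small relative to the SD, the CRMSE is only slightly larger than the SD, so it mainly reflects the lack of precision.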

## Results

### Demographics and GFR Distribution

The main characteristics of the study population are shown in Table 1. All 162 kidney donors were younger than 65 yr. Measured GFR values were equally distributed above (1044 subjects) and below (1051 subjects) 60 ml/min per 1.73 m^{2}. For subsequent analyses, the study population was divided into subgroups according to gender, age (18 to 64 yr *versus* 65 yr or older), and/or measured GFR (≥60 *versus* <60 ml/min per 1.73 m^{2}).

Two-way ANOVA test showed that measured GFR values differed with respect to gender and age. Women had higher measured GFR values than men (65.8 ± 33.8 *versus* 57.9 ± 31.5 ml/min per 1.73 m^{2}; *P* < 0.0001). Subjects who were ≥65 yr had lower GFR values than younger ones (45.2 ± 24.3 *versus* 67.4 ± 33.4 ml/min per 1.73 m^{2}; *P* < 0.0001). However, no significant interaction between gender and age was observed (*P* = 0.2880).

### Relationships between Creatinine-Based Estimations of GFR and Measured GFR

The relationships between measured GFR and MDRD GFR or CG GFR are depicted in Figures 2 and 3, respectively. As shown in Figures 2A and 3A, standard regression analyses of these relationships showed a good global agreement between the two variables (*r* = 0.910 and 0.894, respectively). However, as extensively studied by Bland and Altman, the measurement of agreement between two methods should be preferentially expressed using bias plots of the difference against the average (26,28,29). Such a plot showed a mean difference of −0.99 ml/min per 1.73 m^{2} between MDRD GFR and measured GFR (Figure 2B), which corresponds to a statistically significant (*P* = 0.001) but limited bias of the MDRD equation. Similarly, when applied to CG GFR, the Bland and Altman plot showed a mean difference of 1.94 ml/min per 1.73 m^{2} (Figure 3B), which is highly statistically significant (*P* < 0.0001) but has limited clinical implications. However, for both formulas, the biases were not uniform over the whole range of GFR values (Table 2).

The performance of an equation largely depends on its precision. The SD of the mean difference was used to characterize the precision of each equation. It was 13.7 and 15.4 ml/min per 1.73 m^{2} for the MDRD and CG formulas, respectively. However, as observed in Figures 2B and 3B, this lack of precision was not identical throughout the whole range of GFR values, and both formulas were much more precise for low GFR values. This led us to analyze the precision of each formula according to GFR levels (Table 2). For all categories of GFR, the MDRD formula was more precise than the CG one (Table 2).

Accuracy is a global indicator of the performance of a formula that takes into account its bias and its precision. We tested the accuracy of both formulas in subjects with measured GFR ≥ and <60 ml/min per 1.73 m^{2} by calculating CRMSE and by determining the percentage of subjects who did not deviate >15, 30, and 50% from measured GFR ("accuracy within" in Table 3). In all cases and with both measurements of accuracy, the MDRD formula performed better than the CG one (Table 3).

Because the performance of a regression-based equation depends on the population to which the equation is applied, we tested the performance of the equations in CKD patients and in kidney donors (Tables 4 and 5). We also assessed the sensitivity and the specificity of the two formulas for assigning CKD patients to the categories defined by the K/DOQI CKD classification (Table 4) (5). Performance of the MDRD equation was slightly but not significantly better in kidney donors (Table 5) than in stage 1 or 2 CKD patients (ANOVA, *P* = 0.49, NS). The CG formula was less biased in stage 1 or 2 CKD patients than in kidney donors (ANOVA, *P* = 0.001).

### Comparison of Bias and Precision of Estimated GFR Values According to Gender and Age

Besides plasma creatinine values, gender, age, and weight are the three parameters that are taken into account in the MDRD and/or CG formulas. We thus analyzed the performance of these two formulas according to age, gender, and body mass index (BMI). As a first step, we focused on gender and age, because these parameters are used in both formulas.

Biases of the MDRD and CG formulas with respect to gender and in two different age groups are shown in Figure 4. A cutoff age of 65 yr was chosen, because data from the United States Renal Data System show that the incident rates of ESRD are more than twofold higher in individuals who are ≥65 yr than in younger ones (1). The bias of the MDRD formula was very small in all subgroups, except for women who were younger than 65 yr (bias, −3.1 ± 17.2 ml/min per 1.73 m^{2}), whereas the biases of the CG formula were always significantly larger (*P* < 0.0001).

The precision and the accuracy of the two formulas according to gender and age are reported in Table 6. The MDRD formula was more precise and accurate than the CG one in all subgroups of patients; the only exception was the subgroup of women who were ≥65 yr and had a measured GFR <60 ml/min per 1.73 m^{2}.

Another approach to estimating the global accuracy of the formulas was to analyze the absolute value of the difference between estimated and measured GFR values (9,30). It was expressed both in ml/min per 1.73 m^{2} and as a percentage of GFR values and represented in percentiles (50th, 75th, and 90th) to allow the drawing of absolute and relative boundaries for the lack of accuracy (Figure 5). In all cases, the MDRD formula was at least as accurate as the CG one. The CG formula principally lacked accuracy in subjects who were younger than 65 yr and had GFR values <60 ml/min per 1.73 m^{2}, whereas the accuracy of the MDRD formula was much more uniform (Figure 5B).

### Comparison of Bias and Precision of Estimated GFR Values According to BMI

The cohort was divided into four standard subgroups according to BMI values: <18.5 kg/m^{2} (underweight, 94 subjects), between 18.5 and 24.9 kg/m^{2} (normal, 1010 subjects), between 25 and 29.9 kg/m^{2} (overweight, 712 subjects), and ≥30 kg/m^{2} (obese, 279 subjects). ANOVA showed that each BMI class was associated with statistically different GFR values (55.1 ± 32.0, 64.3 ± 32.9, 60.9 ± 32.2, and 52.2 ± 31.5 ml/min per 1.73 m^{2} from underweight to obese subjects, respectively; *P* < 0.0001). As shown in Figure 6, the MDRD formula largely overestimated kidney function in underweight subjects; the bias observed for this subgroup (12.2 ± 24.8 ml/min per 1.73 m^{2}) was significantly higher than the one observed for all other classes of BMI (*P* < 0.0001 by ANOVA test). In all other subgroups, the MDRD formula was less biased, more precise, and more accurate than the CG one (Figure 6).
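The BMI grouping above can be sketched as a simple classifier. This is a minimal illustration; treating the gaps between the stated bounds (for example, between 24.9 and 25 kg/m^{2}) as belonging to the lower class is an assumption, and the function name is hypothetical.

```python
def bmi_class(weight_kg, height_m):
    """Return the BMI class for a weight in kg and a height in m."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


if __name__ == "__main__":
    # Illustrative subjects, not study data.
    for weight in (50, 70, 85, 100):
        print(weight, bmi_class(weight, 1.75))
```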

### Consequences of the Limitations of the MDRD and CG Formulas on the K/DOQI CKD Classification

The K/DOQI guidelines recommend defining a clinical action plan for each patient with CKD on the basis of the stage of disease as defined by the K/DOQI CKD classification (5). Therefore, we evaluated the consequences of the limitations of the MDRD and CG formulas on the classification of CKD patients (Table 7). This analysis was based solely on results of GFR determinations, and all 2095 subjects were considered, regardless of whether they had kidney damage. For subjects with GFR ≥90 ml/min per 1.73 m^{2}, the CG formula was slightly more accurate than the MDRD one, but for all other GFR levels, more subjects were classified in the proper stage by the MDRD formula than by the CG one (Table 8). Overall, only 70.8 and 67.6% of subjects were classified in the correct stage by the MDRD and CG formulas, respectively. Using the average values of both formulas to estimate GFR did not improve the accuracy of the prediction (Table 8). The consequences of the limitations of the formulas can also be depicted by a figure plotting prediction intervals of measured GFR as a function of estimated GFR (Figure 7).
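The staging comparison described above can be sketched as a cross-tabulation of the stage derived from measured GFR against the stage derived from estimated GFR. This is a minimal illustration that stages on GFR alone, as in this analysis; the function names are hypothetical.

```python
from collections import Counter


def gfr_stage(gfr):
    """K/DOQI stage (1 to 5) from GFR in ml/min per 1.73 m^2, GFR thresholds only."""
    for stage, lower_bound in ((1, 90), (2, 60), (3, 30), (4, 15)):
        if gfr >= lower_bound:
            return stage
    return 5


def staging_agreement(estimated, measured):
    """Return (% of subjects staged correctly, Counter of (true, estimated) stages)."""
    pairs = Counter(
        (gfr_stage(m), gfr_stage(e)) for e, m in zip(estimated, measured)
    )
    correct = sum(n for (true, est), n in pairs.items() if true == est)
    return 100 * correct / len(measured), pairs


if __name__ == "__main__":
    # Illustrative values only, not study data.
    measured = [95, 70, 45, 20, 10]
    estimated = [85, 65, 62, 31, 12]
    percent_correct, table = staging_agreement(estimated, measured)
    print(percent_correct)
    print(sorted(table.items()))
```

The off-diagonal entries of the table show the direction of each error, for example a stage 4 subject placed in stage 3 by the estimate.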

## Discussion

In this study, we evaluated the performances of the CG and MDRD formulas for estimating GFR in a cohort of 2095 subjects. As recommended by the K/DOQI guidelines, these two formulas are increasingly used in daily clinical practice, and decisions regarding the care of CKD patients are based on estimated GFR, but their accuracy is still debated (5).

An important characteristic of our cohort is that it included subjects whose measured GFR ranged from 2.3 to 166.4 ml/min per 1.73 m^{2} (interquartile range, 33.6 to 87.3 ml/min per 1.73 m^{2}), with similar numbers of subjects having measured GFR values ≥ and <60 ml/min per 1.73 m^{2} (1044 and 1051 subjects, respectively). Thus, the performances of the CG and MDRD formulas could be assessed over a wide range of kidney function. Furthermore, because the vast majority of patients included in this study were European, the performances of the MDRD and CG formulas could be assessed in a group of subjects whose anthropometric characteristics are slightly different from those of Americans. For example, when compared with the MDRD cohort (9,11), the mean weight of our study population was 11.2% lower (70.7 ± 15.3 *versus* 79.6 ± 16.8 kg) and the mean BSA was 6.3% lower (1.79 ± 0.21 *versus* 1.91 ± 0.23 m^{2}), whereas, on average, our patients were only 2.2 yr older than those included in the MDRD cohort (52.8 ± 16.5 *versus* 50.6 ± 12.7 yr) and a similar percentage of subjects were male in both cohorts (59 *versus* 60%).

Recent studies have emphasized the importance of careful calibration of serum creatinine measurements for reliable estimation of GFR with creatinine-based equations in patients with normal or near-normal renal function (19,20). In the absence of an international standard, we used plasma samples supplemented with precise amounts of creatinine hydrochloride to calibrate our assay. Analysis of the relationship between expected and measured creatinine concentration strongly suggests that our assay reliably measures creatinine concentrations. The relationship between measured and expected creatinine concentrations was linear over a wide range of values and not different from the identity line. Furthermore, in our population, the ratio of MDRD GFR over measured GFR did not vary over time, which suggests that no calibration bias occurred over time. This careful calibration of plasma creatinine measurements may explain why, for subjects with normal or near-normal kidney function, we found much less difference between estimated and measured GFR than in other studies (14,16,18,31).

In this study, GFR was measured by renal clearance of ^{51}Cr-EDTA, whereas renal clearance of ^{125}I-iothalamate has been used by studies in North America. However, the performance of our method is similar to what has been reported for iothalamate clearance (32).

Analysis of bias, a measure of systematic error, in the entire study population showed a very good global agreement between estimated and measured GFR for each of the two formulas. On average, estimated GFR was only 1.0 ml/min per 1.73 m^{2} lower than measured GFR with the MDRD formula and 1.9 ml/min per 1.73 m^{2} higher with the CG formula. A similar bias was observed when the CG formula was compared with GFR measured by ^{125}I-iothalamate clearance in all patients who were screened for the African-American Study of Kidney Disease and Hypertension; the mean difference between estimated and measured GFR was −2.7 ml/min per 1.73 m^{2} (13). In contrast, in the MDRD cohort, the CG formula was shown largely to overestimate measured GFR (9). The reasons for this discrepancy are not clear, but it may be due to differences in patient characteristics.

When estimating the performance of a formula, precision is probably more important than bias. Our study showed that both the MDRD and the CG formulas largely lack precision. Previous studies that focused on patients with or without CKD have already highlighted the global lack of precision of these two formulas (13–16,31). However, in our analysis, their performances were different in various subgroups of subjects. The greatest lack of precision was observed for subjects who were younger than 65 yr and had measured GFR ≥60 ml/min per 1.73 m^{2}, for underweight subjects, and, in the case of the CG formula, for obese subjects.

The ability of a formula to classify patients into different subgroups depends on the characteristics of the population. In particular, it depends on the proportion of patients who happen to be near the boundaries of the subgroups. In our series, analysis of the performance of both formulas to classify patients according to the K/DOQI CKD classification showed that only 70.8% of subjects were classified in the proper category when using the MDRD formula and 67.6% when using the CG one, which clearly highlights the limitations of both formulas. For example, when using the CG and the MDRD formulas, 28.8 and 16.7% of stage 4 CKD patients were misclassified as stage 3 CKD patients, respectively, which could introduce undue delays in the preparation for renal replacement therapy. By contrast, approximately 20% of subjects with measured GFR ≥60 ml/min per 1.73 m^{2} were classified as having stage 3 CKD with both formulas, which could lead to unnecessary assessment of CKD-related complications. Use of the average of the two formulas did not decrease the misclassification rate, which addresses one of the K/DOQI research recommendations (5). So as not to be misled by the use of the formulas when taking care of individual CKD patients, it is probably important to keep in mind the width of the prediction interval for GFR associated with each value of estimated GFR (Figure 7).

In conclusion, in a study population of 2095 European subjects, the MDRD formula provided more reliable estimations of kidney function than the CG formula. However, both formulas lacked precision, and using either one of them for defining the stage of disease according to the K/DOQI CKD classification would have led to inappropriate staging of approximately 30% of subjects.

## Acknowledgments

Part of this work was presented during the 35th annual meeting of the American Society of Nephrology in Philadelphia, November 2002.

We gratefully thank Dr. Van Lente for measuring plasma creatinine samples at the Cleveland Clinic Foundation.

## Footnotes

Published online ahead of print. Publication date available at www.jasn.org.

- © 2005 American Society of Nephrology