Abstract
Chronic allograft nephropathy (CAN) is a major problem in posttransplant management. The lack of a reliable and early surrogate marker of CAN has hampered patient care and research. In this study, the Cortical Fractional Interstitial Fibrosis Volume (VIntFib), quantitated with computerized image analysis of Sirius Red–stained protocol biopsies, was examined as a potential surrogate for time to graft failure (TTGF) in 68 renal allograft recipients. At 6 mo posttransplant, VIntFib was highly correlated with TTGF (r = 0.64, P < 0.001). Both the Banff Chronic Sum and the Acute Sum Scores were also correlated with TTGF, but less strongly (r = 0.28, P < 0.02; r = 0.35, P < 0.003, respectively). As VIntFib was not correlated with the Banff Chronic Score, a multivariate model was created that incorporated VIntFib and both Acute and Chronic Banff pathology. This model was highly correlated with TTGF (r = 0.7, P < 0.0001). These findings suggest that VIntFib determined by computerized image analysis of Sirius Red–stained protocol biopsies at 6 mo posttransplant, with or without incorporation of Banff acute and chronic scoring, may provide an early surrogate for time to graft failure in renal allograft recipients.
E-mail: pgrimm@ucsd.edu
Failure of a renal allograft is becoming a more frequent indication for dialysis (1). In most cases, the kidney is lost to chronic allograft nephropathy (CAN), an incompletely understood entity characterized histologically by varying degrees of interstitial fibrosis, tubular atrophy, fibrointimal hyperplasia of the renal vessels, and glomerulosclerosis.
The histologic features of CAN are typically described in patients in whom a biopsy is performed to diagnose the cause of deteriorating graft function. However, protocol biopsies in well-functioning grafts have shown that the histologic changes of CAN may develop as early as three months posttransplant, while deterioration of renal function occurs much later (2,3). These observations raise the possibility that the histologic changes of CAN may be an early surrogate for late graft loss.
The histopathologic feature of CAN that correlates most strongly with subsequent deterioration of graft function is the extent of interstitial fibrosis (4). However, the precise quantitation of interstitial fibrosis may be difficult using current histopathologic systems such as the Banff schema and CADI (5,6), in which interstitial fibrosis is scored semi-quantitatively, i.e., not as a continuous variable, which precludes the use of many sensitive approaches to statistical analysis. Furthermore, recent studies have shown a wide inter-observer variation between pathologists in both North America (7) and Europe using the Banff schema (8), which renders the comparison of chronic histopathology scoring across centers unreliable.
The fractional interstitial volume has often been used as a quantitative indicator of chronicity and fibrosis in native kidney disease but may reflect acute edema and inflammation as well (9). Therefore, the fractional interstitial volume as determined by trichrome staining may reflect the intensity of interstitial inflammation and edema in early posttransplant biopsies but chronic damage in late biopsies (10). Furthermore, standard stains, such as the blue in the Mallory trichrome stain, are nonspecific and stain many matrix components, some of which (e.g., tenascin) may not correlate with long-term renal transplant function (11).
In this study, we have attempted to address the issues of lack of reproducibility and insensitivity of interstitial fibrosis scoring of the Banff and similar histopathologic schemata and the lack of specificity of the fractional interstitial volume. We hypothesized that computerized image analysis of allograft interstitial fibrosis (collagen staining) would be a useful predictor of long-term graft function. Specifically, we have performed protocol biopsies of renal allografts, stained the tissue with Sirius Red, which is specific for collagen types I and III when imaged under polarized light (12), and quantitated the extent of interstitial fibrosis, the Cortical Fractional Interstitial Fibrosis Volume (VIntFib), using computerized image analysis. We compared the quantitation of VIntFib using Sirius Red to that of chronic histopathology using the Banff schema as predictors of long-term renal graft outcome. Finally, we combined VIntFib, the Banff Chronic Score, and the Banff Acute Score in stepwise multiple linear regression to determine what aspects of the biopsy contribute to the optimal prediction of renal allograft outcome.
Materials and Methods
Patients and Biopsies
Protocol renal biopsies, after informed consent, have been performed in Winnipeg, Canada, since 1990. The clinical characteristics of the patient cohorts have been described previously (13,14), as have details of the biopsy scoring (15). Tissue blocks from these biopsies were obtained for Sirius Red staining; where these were no longer available, previously stained trichrome slides were destained and then stained with Sirius Red. Protocol biopsies available for Sirius Red examination, which included 34 at 1 mo posttransplant, 28 at 2 mo, 40 at 3 mo, 68 at 6 mo, and 20 at 12 mo, were from previously reported patient cohorts (14). The biopsies had been scored using the Banff schema, in which the severity of acute and chronic changes of the glomeruli, vessels, tubules, and interstitium of a renal allograft is assigned a value of 0 to 3 (6,16). By adding these individual acute or chronic scores, we generated a Banff Acute Sum Score and a Banff Chronic Sum Score, as has been previously reported (2). Six-month posttransplant biopsies were used in a multivariate analysis. The Acute Banff Sum Score of these biopsies was 2.1 (range, 0 to 6), and the Chronic Banff Sum Score was 0.81 (range, 0 to 8).
Staining
Unstained paraffin-embedded sections fixed in formalin, paraformaldehyde, Bouin’s, or B5 were used. Slides that had been previously stained were treated by soaking of the cover slips in xylene in a slowly stirring bath. Depending on the age of the slide, up to a week was necessary to remove the cover slip. The slide was then treated with five parts 18 M HCl in 95 parts 70% ethanol until color was no longer seen coming off of the slide. The slide was washed in double-distilled water for 5 min and then submitted for Sirius Red staining.
For Sirius Red staining, slides were baked at 60°C for 1 h and then taken through xylene and graded ethanols (100%, 95%, 85%, 75%, 60%, 50%) into distilled water. Slides were then stained overnight (minimum 14 h) in saturated picric acid with 0.1% Sirius Red F3BA (Aldrich Chemicals). The next morning slides were removed, washed in 0.01 N hydrochloric acid for 2 min, and rapidly dehydrated through graded alcohols starting at 70%, then to xylene, and finally cover slipped in Permount.
Reliability Analysis
Serial sections of normal and fibrotic kidneys (obtained at nephrectomy) were stained with Sirius Red by two different operators. We performed six separate assays on slides from each sample, each on a different day with a different set of reagents, alternating between the two technicians. We then performed an intra-experimental study by assaying five sections of each sample in one run on one day. We were then able to estimate the interobserver, intraobserver, and day-to-day variation using standard clinical lab statistics (17). In addition, we performed a reliability analysis by calculation of the coefficient of intraclass correlation using ANOVA (18,19).
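The two reliability statistics named above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the sample readings are hypothetical, and the intraclass correlation shown is the one-way random-effects form computed from ANOVA mean squares, which is assumed here because the text does not say which form was used.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

def icc_one_way(groups):
    """One-way random-effects intraclass correlation from ANOVA mean squares.

    groups: list of equal-length lists, one list of repeated
    measurements per specimen (assumed layout, for illustration).
    """
    n = len(groups)      # number of specimens
    k = len(groups[0])   # repeated measurements per specimen
    grand = mean([v for g in groups for v in g])
    ms_between = k * sum((mean(g) - grand) ** 2 for g in groups) / (n - 1)
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical VIntFib readings, not data from the study
normal = [0.020, 0.024, 0.018, 0.022, 0.021, 0.025]
fibrotic = [0.300, 0.310, 0.290, 0.305, 0.295, 0.300]
print(coefficient_of_variation(normal))
print(icc_one_way([normal, fibrotic]))
```

With readings like these, the coefficient of variation is larger for the low-valued normal sample than for the fibrotic one, while the intraclass correlation stays close to 1 because the two specimens are far apart relative to their within-specimen scatter.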
Image Analysis
Image analysis was performed by either a technician or a clinician blinded to the clinical source of the sample. The slides were examined with a Nikon E600 microscope, and a Hitachi analog 3 CCD camera was used to capture 8-bit (256 gray level) grayscale images that were stored as TIFF files. A background image of a blank area of the slide was initially obtained and background correction was performed in real time to adjust for subtle irregularities in the illumination of the microscope field. The images were acquired using the 40X objective. Images of the entire cortex of the biopsy were obtained in a serpentine fashion starting at one end of the tissue and working toward the other. Recognizable glomeruli or vessels larger than the size of adjacent tubules, the subcapsular cortex, and the medulla were avoided when acquiring the images. Images were stored in a sequential fashion and archived on CD-ROM.
Image analysis was performed using an automated macro (available from the first author) specially written for the software package NIH Image (20). Automated analysis of the images was performed with operator supervision. After the software was set to differentiate the positively stained from negatively stained areas on the first image, the software sequentially opened each image, did the analysis, stored the data, closed the image, and moved on to the next image until the entire biopsy was analyzed. The operator’s only function during the analysis phase was to watch the screen and stop the process if an error (such as an accidentally included glomerulus) was detected. The entire process typically required 15 min to acquire and store the images and 5 min to analyze an average biopsy. Separate data files with the Cortical Fractional Interstitial Fibrosis Volume (VIntFib) were obtained.
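A minimal sketch of the per-image measurement described above, assuming that a single fixed gray-level threshold separates positively stained pixels from background. The NIH Image macro itself is not reproduced here; the function name, the threshold value, and the tiny sample image are hypothetical.

```python
def stained_and_total_pixels(image, threshold):
    """Count pixels at or above `threshold`, and all pixels.

    image: 2-D list of gray levels (0-255).
    Returns (stained_pixels, total_pixels).
    """
    stained = sum(1 for row in image for px in row if px >= threshold)
    total = sum(len(row) for row in image)
    return stained, total

# Hypothetical 3 x 4 image; bright pixels stand in for stained collagen
image = [
    [10, 12, 200, 210],
    [11, 9, 15, 220],
    [8, 10, 12, 14],
]
stained, total = stained_and_total_pixels(image, threshold=128)
print(stained, total)  # 3 stained pixels out of 12
```

Returning the two counts separately, rather than a per-image fraction, lets the stained and scanned areas be summed over all images of a biopsy before the ratio is taken.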
Clinical Data
Clinical data, obtained prospectively for ongoing studies, were stored in Microsoft Access format. We analyzed long-term function using creatinine clearance (ClCr) derived from the formula of Cockcroft and Gault (21) at multiple time points during the first year and then yearly. In patients who maintained a functioning graft during the follow-up period, graphs of ClCr versus time were drawn using all available serum creatinines. A mean of 29 ± 10 creatinine values were used per patient. Contrary to changes that may occur with time in the 1/serum creatinine graph (22), Hunsicker and Bennett (23) have reported that the course of ClCr over time in more than 42,000 patients in the UNOS database is approximately linear. In patients who did not experience graft loss during the study period, the regression line was extended, extrapolating the deterioration of ClCr to 10 ml/min. Patients who had no deterioration of serum creatinine over the observation interval were arbitrarily allocated a graft survival of 10,000 d (approximately 27 yr). We included patients who had a minimum of 2 yr of follow-up to ensure reliability of the ClCr versus time graph estimation of time to graft failure. It has been previously shown that a minimum follow-up of 15 mo is necessary to have a reliable estimate with similar procedures (24). If a patient lost the allograft during the follow-up period, the day the graft was lost was designated the time to graft failure.
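The estimate of time to graft failure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names and sample values are hypothetical, the Cockcroft-Gault formula is written in its standard form (serum creatinine in mg/dl, factor of 0.85 for women), and a flat or rising fitted ClCr line is taken to stand for "no deterioration".

```python
def cockcroft_gault(age_yr, weight_kg, serum_cr_mg_dl, female=False):
    """Estimated creatinine clearance in ml/min (standard Cockcroft-Gault)."""
    clcr = (140 - age_yr) * weight_kg / (72 * serum_cr_mg_dl)
    return 0.85 * clcr if female else clcr

def time_to_graft_failure(days, clcr, failure_clcr=10.0, no_decline_days=10000.0):
    """Fit a least-squares line to (day, ClCr) and extrapolate to failure_clcr."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(clcr) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, clcr))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        # Assumption: a non-negative slope counts as "no deterioration"
        return no_decline_days
    # Day on which the fitted line reaches the failure threshold
    return (failure_clcr - intercept) / slope

# Hypothetical patient: ClCr starts at 60 ml/min and falls 0.02 ml/min per day
days = [0, 180, 365, 730]
clcr = [60.0 - 0.02 * d for d in days]
print(time_to_graft_failure(days, clcr))
```

For this hypothetical patient the fitted line reaches 10 ml/min at about day 2500, well beyond the last observation at day 730, which is the kind of extrapolation the method relies on.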
Statistical Analyses
Data files from image analysis were transferred to the statistical analysis package Statview 5.0 on a Macintosh G3 personal computer. The VIntFib was calculated from the raw data by summing the total stained area per biopsy and dividing by the total area scanned, expressing it as a fraction. The correlation of VIntFib or Banff Score and graft outcome was analyzed using the simple regression option of the software. The multiple regression analysis of the VIntFib, Banff Scores, and graft outcome was performed in a stepwise forward fashion using the multiple linear regression options.
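As a sketch of the two calculations just described, the code below pools per-image areas into a single VIntFib for a biopsy and computes a Pearson correlation coefficient. The function names and sample numbers are hypothetical; the study itself used Statview, not this code.

```python
from math import sqrt

def vintfib(per_image_areas):
    """Pooled stained fraction for one biopsy.

    per_image_areas: list of (stained_area, scanned_area), one pair per image.
    """
    total_stained = sum(s for s, _ in per_image_areas)
    total_scanned = sum(t for _, t in per_image_areas)
    return total_stained / total_scanned

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical biopsy of three images (areas in pixels)
print(vintfib([(120, 1000), (80, 1000), (100, 1000)]))  # 300 / 3000 = 0.1
```

Pooling the areas before dividing weights each image by the area it covers, so a partly empty field does not count as much as a full one.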
Results
Reliability Analysis of Sirius Red Staining
Table 1 shows the mean VIntFib, variance, and coefficient of variation (Cv) for inter-assay and intra-assay studies. Note that the normal biopsy had a very low VIntFib so that small random fluctuations between serial sections can lead to seemingly large Cv. The Cv obtained in this study of Sirius Red staining (Table 1) are well within the range of standard laboratory tests used in clinical practice (17).
The coefficient of intraclass correlation was analyzed to determine the reliability of these assays in differentiating normal from abnormal specimens (18). As shown in Table 1, the intraclass correlation of 0.99 indicates that the assay is highly reliable for differentiating fibrotic from normal tissue samples. For comparison, a group of pathologists reported a mean intraclass correlation coefficient for the chronicity index in biopsies of patients with systemic lupus erythematosus (inter-observer variation) of 0.58 (19).
These data are very similar to the reproducibility studies of Ellingsen et al. (25). These authors showed coefficients of variation between 0.06 and 0.22 in their studies of point counting of interstitial area in renal allograft biopsies. Their interpretation was that a single biopsy is representative of the entire allograft with regard to the interstitium. Nicholson (11) also presented reproducibility data using morphometric techniques to quantitate Collagen Type 3 in renal allograft biopsies and showed similarly good reliability.
Correlation of Sirius Red Staining in Biopsies from Various Time Points with Long-Term Graft Outcome
The time to graft failure was the primary outcome variable. This was correlated with the computerized analysis of fibrosis as determined by Sirius Red staining (VIntFib). The VIntFib measured in biopsies obtained at 1, 2, and 3 mo posttransplant was poorly correlated with the time to graft failure (P = 0.23, P = 0.09, and P = 0.14, respectively; Figure 2). The VIntFib in the 6-mo protocol biopsy, on the other hand, was strongly correlated with time to graft failure (r = 0.64, P < 0.001; Figure 2D). The 12-mo biopsy was limited by a small number of specimens but was still correlated with time to graft failure (r = 0.49, P = 0.03; data not shown). For the purpose of this study, the 6-mo biopsy was used for further analysis.
Figure 2. Time to allograft failure versus Sirius Red–derived VIntFib at (A) 1 mo, (B) 2 mo, (C) 3 mo, (D) 6 mo.
Correlation of the Banff (Acute and Chronic) Score with Long-Term Graft Outcome
The Banff Chronic Score is a commonly used example of a standardized pathology scoring system. It has been used to quantitate chronic allograft nephropathy and has been the subject of recent publications (8). To compare the performance of this clinically accepted score with the VIntFib reported above, we correlated time to graft failure with the various components of the Banff Chronic Score and with the Banff Sum Score (obtained by adding up the individual chronic lesions reported in the Banff schema, as has been previously reported [2]). Figure 3 shows that, in the 6-mo posttransplant biopsies, the interstitial fibrosis component (Ci) and the Chronic Sum Score were correlated with time to graft failure. The best correlation of the individual components was with Ci (r = 0.30, P = 0.014). When we analyzed combinations, the best overall correlation was with the Banff Chronic Sum Score (r = 0.28, P = 0.02), followed by the sum of the interstitial and tubular components (Ci+Ct, r = 0.25, P = 0.03). The Banff Acute Score was also correlated with outcome (r = 0.35, P = 0.003; Figure 3F). The summation of the Banff Acute Sum Score and the Banff Chronic Score showed some correlation with time to graft failure (r = 0.31, P = 0.02). However, none of the r values generated using the Banff scores was higher than that obtained with the use of Sirius Red (r = 0.64; Figure 2D).
Figure 3. Time to allograft failure versus Banff Chronic Scores at the 6-mo biopsy. (A) Interstitial fibrosis component; (B) tubular atrophy component; (C) chronic glomerular component; (D) chronic vascular component; (E) sum of all the individual components of the Banff Chronic Score; (F) sum of the individual components of the Banff Acute Score.
Correlation of Sirius Red Stain with Banff Chronic Score
Correlation of Sirius Red-derived VIntFib with the Banff Chronic Score was assessed. VIntFib was not correlated with the Banff Chronic Sum Score (r = 0.09, P = 0.47) or with the interstitial fibrosis (Ci, r = 0.05, P = 0.70), tubular atrophy (Ct, r = 0.06, P = 0.61), or glomerular and vascular (Cg+Cv, r = 0.13, P = 0.28) components of the chronic score (data not shown).
Correlation of Sirius Red and Donor Age
Donor age correlated with the Sirius Red-derived VIntFib in the 1-mo but not the 6-mo protocol biopsy (Figure 4, A and B). With increasing donor age, Sirius Red–derived VIntFib shows a gradual increase, as has been reported by studies of renal interstitial fibrosis in human aging (26). The Banff Chronic Sum Score of the 1-mo biopsy (Figure 4C) showed a different pattern: it remains at 0 until the threshold age of 40 yr, when the score starts becoming positive.
Figure 4. (A) Sirius Red–derived VIntFib at 1 mo versus donor age in years; (B) Sirius Red–derived VIntFib at 6 mo versus donor age in years; (C) Banff Chronic Sum Score at 1 mo versus donor age in years.
Multivariate Linear Regression
The lack of correlation between Sirius Red-derived VIntFib and Banff Chronic Pathology Scores allowed us to explore the use of a multivariate linear regression combining the two. We entered VIntFib, the Banff Acute Score, the individual components of the Banff Chronic Score (Ci, Ct, Cv, Cg), the Banff Chronic Sum Score, Banff Cg+Cv, and Banff Ci+Ct, all from the 6-mo biopsy, into a stepwise model. Indeed, multiple regression analysis indicated that Sirius Red-derived VIntFib, the Banff Chronic Score, and the Banff Acute Score each contributed independently to explain the variance in the model. This generated a model with the equation:
Time To Failure (days) = 11260 − 27166 (Sirius Red-derived VIntFib) − 537 (Banff Chronic Sum Score) − 240 (Acute Score)
This model was significantly correlated with Time to Graft Failure (r = 0.7, P < 0.0001).
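To make the fitted equation concrete, it can be written as a small function. The coefficients are those reported above; the function name and the example inputs are illustrative only, and the result is a model prediction rather than a validated clinical estimate.

```python
def predicted_days_to_failure(vintfib, banff_chronic_sum, banff_acute_sum):
    """Evaluate the reported regression equation (result in days)."""
    return (11260
            - 27166 * vintfib
            - 537 * banff_chronic_sum
            - 240 * banff_acute_sum)

# Illustrative inputs: VIntFib 0.10, Chronic Sum Score 1, Acute Sum Score 2
print(predicted_days_to_failure(0.10, 1, 2))
```

All three coefficients are negative, so more fibrosis or a higher Banff score at 6 mo shortens the predicted time to failure; with every input at 0 the prediction equals the intercept of 11260 d.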
Discussion
The purpose of this study was to compare the quantitation of interstitial fibrosis (VIntFib) using Sirius Red staining to the Banff scoring of chronic histopathologic lesions as potential early surrogates for time to graft failure in renal transplant patients. We demonstrated that a precise quantitation of VIntFib by computerized image analysis provides a better surrogate marker for time to graft failure than any combination of chronic lesion scoring using the Banff schema; however, when VIntFib, acute, and chronic Banff scores in the 6-mo protocol biopsy are combined, the correlation with time to graft failure is increased.
Sirius Red is a histochemical stain that has been used for nearly 30 yr (27). The dye molecule intercalates into the tertiary groove in the structure of collagen types I and III and imparts a pink stain to most tissues when observed under white light (Figure 1A). However, when observed under polarized light, collagen types I and III are strongly birefringent (Figure 1B). Types I and III collagen represent 80% and 15 to 20%, respectively, of the total collagen synthesized by fibroblasts, and they are therefore important molecules to quantitate in fibrotic renal disease. The reliability studies we present in Table 1 show this assay has very good reproducibility on different days and among different observers.
Figure 1. Photomicrograph of Sirius Red-stained renal allograft biopsy observed under (A) white light and (B) polarized light. Notice the structure of the birefringent tubulointerstitium in panel B. In addition, there are areas of pink stain under white light that are not birefringent (arrows); these correspond to areas of matrix without collagen types I or III. Scale bar, 100 μm.
In animal models, computerized image analysis of Sirius Red staining has been used to accurately quantitate both the age of fibrotic lesions (28,29) and their extent (28,29). Sirius Red has also been used to quantitate the reduction in interstitial fibrosis by pharmacologic therapy in the rat 5/6 nephrectomy model (33).
In human studies, Sirius Red staining has been used to quantitate fibrotic changes in pediatric liver transplant biopsies (30) and the response to interferon alpha 2B therapy in patients with chronic non-A, non-B hepatitis (31). In this latter study, Sirius Red staining was shown to be highly correlated with total liver collagen as determined by hydroxyproline assay (31). In cardiac transplantation, fibrosis, as determined by Sirius Red staining, was shown to correlate with total ischemic time (32). In renal biopsies, Sirius Red staining has been quantitated using computerized image analysis and correlates with the interstitial volume fraction of the cortex as measured by the more laborious point counting method (33) and with the GFR at the time of biopsy (34).
In the present study, we demonstrate that VIntFib determined by Sirius Red staining was a better correlate of time to graft failure than any combination of Banff chronic lesion scoring. However, it is clear that not all biopsies with high VIntFib scores were associated with accelerated deterioration of renal function. This is not surprising; in some biopsies, the fibrosis may be the result of processes that are no longer active. Therefore, in an attempt to detect patients in whom fibrosis was progressive, we determined the change in VIntFib between protocol biopsies obtained at 1 and 6 mo (ΔVIntFib) as a correlate of time to graft failure. Unfortunately, only a small number of patients (n = 30) had both biopsies. Nevertheless, there was a trend for a correlation between ΔVIntFib and time to graft failure (r = 0.3, P = 0.10) that may be worth exploring in prospective studies. Similarly, Serón et al. (35) suggest the use of protocol biopsies of the transplant kidney at implantation and “within 1 yr” to be analyzed quantitatively for the assessment of progression of chronic lesions.
We have previously reported that acute rejection (as defined by Banff criteria) in the 6-mo protocol biopsy is an independent predictor of long-term graft dysfunction (2,3). Not surprisingly, therefore, we found that by multivariate analysis the acute and chronic features of the Banff schema and VIntFib score in the 6-mo protocol biopsy, in combination, provided the best histologic predictor for time to graft failure. In this regard, Furness and Taub (8) have recently reported, albeit in a small number of patients, that deterioration of renal function in patients with chronic lesions is best correlated with the coexistence of acute pathology. Similarly, in an experimental model of chronic rejection, Shimizu et al. (36) showed that markers of ongoing inflammation and injury, such as apoptosis, increased proliferating cell nuclear antigen (PCNA), and α-smooth muscle actin were predictive of subsequent increase in fibrosis.
In contrast to the predictive power of the histology at 6 mo, we found that early biopsies are unhelpful for the prediction of long-term outcomes. In fact, many early protocol biopsies that showed active rejection demonstrated large areas with less than normal amounts of Sirius Red staining, perhaps due to edema or to collagen degradation by matrix metalloproteases secreted by immune reacting cells.
Two additional findings of this study are worthy of mention. First, the lack of correlation between VIntFib and interstitial fibrosis as determined by Banff scoring was unexpected. It is possible that interstitial expansion due to edema or non-collagen matrix components may be attributed to fibrosis by the trichrome stain used in the Banff schema, but not by the use of Sirius Red (10). An overestimation of fibrosis by the Banff schema may be the cause for its poor correlation with time to graft failure, as compared with Sirius Red. Second, the fact that the VIntFib at 1 mo but not that at 6 mo correlates with donor age suggests that the changes in interstitial fibrosis that occur between these time points are related to peritransplant and posttransplant events more than to the age of the transplanted organ.
The findings of this study may be relevant for the design of future comparative trials of renal allograft outcome. The mean Sirius Red-derived VIntFib in the 6-mo biopsies from our cohort was 0.105 ± 0.056. A study designed to have an 80% chance to detect a reduction in interstitial volume fraction of 15% (P = 0.05) would require a total of 300 patients (150 in each group) (37). This is a much smaller number of patients than those suggested for the study of long-term outcomes such as renal function or allograft survival (38). A 15% decrease of the VIntFib, from a mean of around 0.1 to 0.085, is a clinically meaningful end point, as it corresponds with an extension of time to graft failure of approximately 1 yr.
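The link between a 15% reduction in VIntFib and roughly 1 yr of additional graft survival follows from the VIntFib coefficient of the regression model reported in the Results (27166 d per unit of VIntFib). The short calculation below is a worked check under that assumption, with the Banff scores held fixed; the variable names are illustrative.

```python
VINTFIB_COEFFICIENT_DAYS = 27166   # from the multivariate model in the Results

baseline = 0.100                   # approximate mean VIntFib at 6 mo
reduced = baseline * (1 - 0.15)    # a 15% relative reduction
gain_days = VINTFIB_COEFFICIENT_DAYS * (baseline - reduced)
print(round(gain_days), round(gain_days / 365.25, 1))  # about 407 d, about 1.1 yr
```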
In summary, we have shown that the time to graft failure is correlated with the VIntFib in the 6-mo protocol biopsy. VIntFib is derived from Sirius Red–stained biopsies using a fast, reproducible computerized system. This study confirms and extends the findings of other investigators that quantitative analysis of late allograft biopsies may be a useful tool in prediction of long-term allograft outcome. VIntFib alone or in addition to Banff Acute and Chronic scoring has potential to be applied to both individual patient care decisions and multi-center clinical research.
Acknowledgments
This study was funded by grants from the Baxter Extramural Grant Program, NIH (NIDDK R21 DK53610–01 and NIAID RO1-AI43655–02) and The Kidney Foundation of Canada. Dr. Grimm is supported by the Department of Pediatrics, UCSD. Dr. Nickerson is supported by a Medical Research Council of Canada Scholarship. We would like to thank the patients, nurses, and secretaries of the Winnipeg Transplant Clinic for their cooperation and support, Adam Merry for excellent technical assistance, and Dr. Henry Krous for thoughtful criticism.
© 2003 American Society of Nephrology