Table 3.

Distance metrics for human annotators and computational technique

| Comparison                                     | Case Number | Fraction = GT | Average Distance < GT | Average Distance > GT |
|------------------------------------------------|-------------|---------------|-----------------------|-----------------------|
| RP1 versus GT, scheme: T, data: patients       | 48          | 0.56          | −1.19±0.48            | 2±0 (1 case)          |
| RP2 versus GT, scheme: T, data: patients       | 48          | 0.8           | −1.4±0.52             | 2±0 (2 cases)         |
| C versus GT, scheme: T, data: patients         | 54          | 0.5           | −0.61±0.57            | 0.57±0.60             |
| C versus GT, scheme: T, data: sections         | 121         | 0.52          | −0.65±0.63            | 0.66±0.71             |
| C versus GT, scheme: F, data: patients         | 33          | 0.65          | −0.36±0.4             | 0.40±0.47             |
| C versus GT, scheme: F, data: sections         | 85          | 0.98          | −0.08±0.2             | 0.01±0.04             |
| Baseline versus GT, scheme: T, data: patients  | 54          | 0.27          | −1.79±0.88            | 1.9±0.96              |
  • Distance is defined as the assigned label minus the ground truth label. Negative distances indicate undercalling; positive distances indicate overcalling. T refers to classifications according to the Tervaert scheme; F refers to classifications according to the Fogo scheme. Values are reported as mean±SD taken over all of the cases. The data field identifies whether the experiment treated separate patients or separate sections as the individual data points. GT, ground truth; RP, renal pathologist; C, computer.
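
The distance definition in the footnote can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `distance_summary`, the dictionary keys, and the example labels are hypothetical, and the use of the population standard deviation is an assumption (chosen here only so that a single over- or undercalled case yields an SD of 0 rather than being undefined).

```python
from statistics import mean, pstdev


def distance_summary(assigned, ground_truth):
    """Summarize signed distances (assigned label minus ground truth label).

    Hypothetical helper for illustration only; labels are assumed to be
    integer class indices on an ordinal scale.
    """
    if len(assigned) != len(ground_truth) or not assigned:
        raise ValueError("label lists must be non-empty and of equal length")

    distances = [a - g for a, g in zip(assigned, ground_truth)]
    under = [d for d in distances if d < 0]  # undercalled cases
    over = [d for d in distances if d > 0]   # overcalled cases

    def mean_sd(values):
        # Assumption: population SD, so a single case gives SD = 0.
        return (mean(values), pstdev(values)) if values else None

    return {
        "cases": len(distances),
        "fraction_equal_gt": sum(d == 0 for d in distances) / len(distances),
        "under_gt": mean_sd(under),
        "over_gt": mean_sd(over),
    }


# Made-up labels, not data from the study.
example = distance_summary([2, 3, 1, 4, 2], [2, 4, 3, 2, 2])
print(example)
```

Under these assumptions, undercalled and overcalled cases are averaged separately, which corresponds to the two distance columns of the table; exact matches contribute only to the fraction equal to GT.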