Predicting Risk for Acute Kidney Injury in the Outpatient Setting: a Continuous Risk Prediction Equation and Two Binary Tests for Identifying High-risk Patients

A THESIS SUBMITTED TO THE FACULTY OF THE UNIVERSITY OF MINNESOTA BY

Daniel Patrick Murphy, MD

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN CLINICAL RESEARCH

Advisor: Paul E. Drawz, MD MHS MS

February 2021

© 2021 Daniel Patrick Murphy

Acknowledgements

I would like to acknowledge and thank my thesis advisor, mentor, and principal investigator for this work, Dr. Paul Drawz, for his support in the conception and production of this research product. I would like to acknowledge and thank my thesis committee member and collaborator, Dr. Scott Reule, for his support in data acquisition and management. I would like to acknowledge and thank my thesis committee member and biostatistician, Dr. David Vock, for his support in study design and the interpretation of results. Finally, I would like to acknowledge and thank Dr. Maxwell Leither and Luke Bicknese for their roles in providing preliminary statistical code and data extraction, respectively.

The contents of this article do not necessarily represent the views of the U.S. Department of Veterans Affairs or U.S. Government.

Abstract

Background: Risk-factors for acute kidney injury (AKI) in the hospital have been well studied. Yet, risk-factors for AKI occurring and managed in the outpatient setting are unknown and may differ.

Methods: A development cohort for modelling risk of AKI without concurrent or subsequent hospitalization was defined by repeated primary care encounters in a single urban healthcare system using electronic health record data. An external validation cohort was similarly defined in the Veterans Health Administration nationally. Logistic regression with bootstrap sampling for backward stepwise covariate elimination was used to develop a model for outpatient AKI over an 18-month outcome period.
The model was then transformed into two binary tests: one identifying high-risk subjects for potential research and another identifying patients for additional clinical monitoring or intervention.

Results: Outpatient AKI was seen in 4,611 (3.0%) and 115,744 (2.4%) patients in the development and validation cohorts, respectively. The model produced C-statistics of 0.717 (95% confidence interval (CI): 0.710-0.725) and 0.722 (95% CI: 0.720-0.723) in the development and validation cohorts, respectively. The research-test, identifying the top 5.2% most at-risk patients in the validation cohort, had sensitivity of 0.210 (95% CI: 0.208-0.213) and specificity of 0.952 (95% CI: 0.951-0.952). The clinical-test, identifying the top 20% most at-risk, had sensitivity of 0.494 (95% CI: 0.491-0.497) and specificity of 0.806 (95% CI: 0.806-0.807).

Conclusions: The outpatient AKI-risk prediction model performed well in both the development and validation cohorts and was transformed into two binary tests, one for potential use in research and another in clinical care. Multiple novel risk-factors were identified.

Table of Contents

List of Tables
List of Figures
List of Abbreviations
Introduction
Methods
Results
Discussion
Conclusion
Bibliography
Appendix

List of Tables

Table 1. Baseline characteristics of model development cohort, overall and by outpatient acute kidney injury
Table 2. Baseline characteristics of model validation cohort, overall and by outpatient acute kidney injury
Table 3. Performance in the validation cohort (unless otherwise specified) of two binary tests produced from the risk prediction model for outpatient acute kidney injury in the 18 months after a primary care-baseline period
Supplemental Table 1. Accepted physiologic lab or vital sign values
Supplemental Table 2. Lab test serving as most recent to define proteinuria by cohort
Supplemental Table 3. Restricted cubic spline fitting of continuous covariates
Supplemental Table 4. Interaction terms considered for variable selection in model development
Supplemental Table 5. Risk prediction model for outpatient acute kidney injury in the 18 months following a primary care-defined baseline period

List of Figures

Figure 1. Patient study-eligibility flow diagram for the development cohort, derived from the M Health Fairview healthcare system electronic health record.
Figure 2. Patient study-eligibility flow diagram for the validation cohort, derived from the Veterans Affairs healthcare system electronic health record.
Figure 3. Observed risk vs. mean predicted risk of outpatient acute kidney injury (AKI) in the 18 months after a primary care-baseline period by decile of predicted risk in the validation cohort (N = 4,864,576). Observed and mean predicted risks by deciles are: 0.5% vs. 0.9% for 1st decile, 0.8% vs. 1.3% for 2nd decile, 1.0% vs. 1.6% for 3rd decile, 1.2% vs. 1.9% for 4th decile, 1.5% vs. 2.2% for 5th decile, 1.9% vs. 2.6% for 6th decile, 2.3% vs. 3.1% for 7th decile, 2.9% vs. 4.0% for 8th decile, 4.0% vs. 5.4% for 9th decile, and 7.7% vs. 12.1% for 10th decile, respectively. N = 486,458 for odd-numbered deciles and 486,457 for even-numbered deciles.
Figure 4. Kaplan-Meier survival curves for (A) the lowest decile of predicted outpatient acute kidney injury (AKI)-risk (N = 450,160) and (B) the highest decile of predicted outpatient AKI-risk (N = 481,058) in the validation cohort, beginning at the end of the 18-month AKI-period, among those not lost to follow-up prior to the end of the AKI-period, and censored on final encounter, vital sign, or lab measurement.

List of Abbreviations

AKI: Acute Kidney Injury
CKD: Chronic Kidney Disease
ESKD: End-stage Kidney Disease
EHR: Electronic Health Records
VA: Veterans Affairs
eGFR: Estimated Glomerular Filtration Rate
CI: Confidence Interval
PPV: Positive Predictive Value
NPV: Negative Predictive Value

Introduction

Acute kidney injury (AKI) occurs frequently among hospitalized patients,1 including as many as half of critically ill patients.2 AKI may also happen to patients in the ambulatory setting, with only a subset of patients subsequently admitted to a hospital.3-8 Regardless of the setting of AKI, patients with AKI are at increased risk for adverse clinical outcomes, including progression of chronic kidney disease (CKD), end-stage kidney disease (ESKD), cardiovascular disease, and death.8-14 Among hospitalized patients, the risk-factors for AKI have become increasingly well recognized, and prediction-models for patients in the hospital have been published.15-21 Yet, there has been no such examination of the predictability of AKI when it occurs in the ambulatory setting, nor has a support tool been developed to guide assessment of the risk for AKI in outpatients. As a substantial proportion of AKI cases may occur in the outpatient setting,6, 8 identifying patients at high risk for outpatient AKI may lead to improved care for an underrecognized population at risk for kidney disease.

We sought to develop and externally validate a risk prediction model for AKI occurring in patients in the outpatient setting using covariates widely available in electronic health records (EHR).
Such a predictive model could be used to risk-stratify patients based on the expected risks and benefits of subsequent clinical monitoring or to identify high-risk patients for recruitment for prospective clinical trials.

Methods

Study Population

A model-development cohort was derived from the M Health Fairview system (of the metropolitan Minneapolis-St. Paul, Minnesota area). Data were acquired dating back to the adoption of the current EHR, beginning November 29, 2000, through June 1, 2018. A baseline period, from which baseline labs, vital signs, and comorbidities were assessed, was defined by receipt of at least two primary care encounters at least 18 months apart. All baseline periods ended by December 1, 2016 to allow a subsequent 18-month outcome period. We have previously shown that varying this outcome period to 12 or 24 months does not meaningfully alter the increased risk for death or poor renal outcomes seen after outpatient AKI.8 Primary care specialties were: family medicine, internal medicine, medicine-pediatrics, obstetrics/gynecology, or geriatrics.

An external validation cohort was derived from the Veterans Affairs (VA) healthcare system. National-level data were accessed via the VA Informatics and Computing Infrastructure. Due to a longer history of EHR use by the VA, the validation cohort was limited to patients with a second qualifying primary care encounter after December 31, 2009. Data were accessed through January 1, 2019 (with all baseline periods ending by July 1, 2017).

Cohort participants were required to have at least one serum creatinine measurement from the outpatient setting before the second qualifying primary care encounter and at least one during the subsequent outcome period. Participants were limited to ages 18-90 years and to those without ESKD, kidney transplantation, or CKD stage 5 at baseline.
AKI in these settings is either not possible, beyond the scope of this study, or unlikely to be treatable before dialysis initiation, respectively. Application of inclusion/exclusion criteria for the development and validation cohorts is shown in flow diagrams (Figures 1 and 2).

Definitions of Baseline Characteristics and Study Outcome

Race was defined as ‘black’ if ever recorded as black or ‘non-black’ otherwise. Smoking status was defined as either: 1) ever documented as being a current or former smoker, or 2) neither. History of hospitalization was defined as any prior hospitalization during which serum creatinine was measured, in order to exclude hospitalizations for non-medical-surgical care.

Baseline estimated glomerular filtration rate (eGFR) and proteinuria have been identified as independent risk-factors for inpatient AKI.21, 22 For this study, eGFR was treated as a continuous variable and was independently calculated by the Modification of Diet in Renal Disease equation,23 which is utilized in the VA EHR.24, 25 The eGFR equation used in the M Health Fairview EHR was not consistent across the study timeframe. While the CKD-Epidemiology Collaboration equation offers increased accuracy at higher eGFR levels, greater AKI-risk was anticipated at reduced eGFR levels.26 Race was included in the calculation of eGFR.

To avoid widespread missingness or the use of old data, proteinuria was defined as the most recent of either spot albuminuria, spot total proteinuria, or semi-quantitative urinalysis-albumin concentration. In the event of multiple measurements from the same time or uninterpretable results, spot albuminuria was the first choice (due to precalculated ratios in the abstracted development cohort data) and urinalysis was last.
“Severe” proteinuria was defined as spot albuminuria or total proteinuria ≥ 300 mg/g or urinalysis-albumin concentration of “≥ 300” or equivalent.27 “Moderate” proteinuria was defined as spot albuminuria or total proteinuria measurement 30-300 mg/g or urinalysis-albumin concentration of “30-300” or equivalent. Proteinuria below these thresholds was considered “normal.”

Due to expected non-randomness in missing data collected in usual clinical care, binary missingness covariates were paired with every lab- and vital sign-covariate, as we and others have done previously.8, 20 This allowed modeling the risk of outpatient AKI on both lab or vital sign values and whether the lab or vital sign had ever been measured. When missing, the original covariate was entered as zero. Occasional, non-physiologic values were recorded as missing (defined in Supplemental Table 1). Labs and vital signs included were: serum sodium, potassium, calcium, albumin, hemoglobin A1c, hemoglobin, and systolic and diastolic blood pressures.

AKI was divided into two mutually exclusive categories due to potential differences between AKI occurring in the inpatient and outpatient setting. A baseline history of inpatient AKI was defined as an increase in inpatient-measured creatinine ≥ 50% above the most recent creatinine from the outpatient setting measured 7-365 days prior to the date of admission or, if none existed, the next most recent creatinine from any setting. This definition was chosen to avoid missing cases of inpatient AKI that may develop prior to admission (so-called community-acquired AKI).3, 5-7, 28

Outpatient AKI was similarly defined as a ≥ 50% increase in outpatient-measured creatinine above a moving baseline of the three (or fewer) most recent outpatient-measured creatinine values 25-365 days prior or, if none existed, the next most recent creatinine from any setting.
This is similar to prior definitions of baseline creatinine used for AKI originating in the outpatient setting.5, 8 An absolute creatinine rise of 0.3 mg/dL over 48 hours was not used as labs are rarely obtained that frequently in the outpatient setting.29, 30 Outpatient AKI prior to the end of the baseline period defined baseline outpatient AKI, while outpatient AKI occurring during the 18-month outcome period defined the study outcome.

Other comorbid diseases were defined by the presence of two or more International Classification of Diseases and/or Current Procedural Terminology billing codes consistent with the diagnosis. These diagnoses included: diabetes mellitus, hypertension, cardiovascular disease (defined as the composite of coronary artery disease, peripheral vascular disease, congestive heart failure, or stroke), cancer, and liver disease.

Statistical Analyses

Restricted cubic splines with four knots at the 5th, 35th, 65th, and 95th percentiles were initially applied to all continuous variables to assess non-linearity. Non-linear terms and a priori specified interaction terms were included in subsequent variable selection if they were both statistically significantly associated with outpatient AKI (P < 0.05) and showed meaningful non-linear fitting or effect modification of AKI-risk on visual inspection. The final model was developed using logistic regression with backward stepwise elimination of covariates, based on the Akaike Information Criterion, across N = 1,000 bootstrap samples. If either a continuous covariate or its paired missingness covariate was selected to remain in the model, then both members of the pair were forced into the model. This risk-equation was then applied to the validation cohort. The model’s performance was assessed by the C-statistic for discrimination, the Hosmer-Lemeshow test for goodness of fit, and visually assessed by decile of mean-predicted versus observed AKI-risk.
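The discrimination and calibration checks named above can be illustrated with a short sketch. This is not the thesis code (the analyses were reported as performed in R); it is a minimal pure-Python illustration, and the function names `c_statistic` and `calibration_by_decile` are hypothetical. The C-statistic is computed by the standard rank-sum identity, and calibration is summarized as observed versus mean predicted risk within equal-sized bins of predicted risk.

```python
def c_statistic(y_true, y_pred):
    """Rank-based C-statistic (area under the ROC curve).

    Ties in predicted risk receive the average rank, so a tied
    case/non-case pair counts as half-concordant.
    """
    order = sorted(range(len(y_pred)), key=lambda i: y_pred[i])
    ranks = [0.0] * len(y_pred)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied predictions starting at i.
        while j + 1 < len(order) and y_pred[order[j + 1]] == y_pred[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in y_true if y)
    n_neg = len(y_true) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, y_true) if y)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def calibration_by_decile(y_true, y_pred, n_bins=10):
    """Return (bin, n, observed risk, mean predicted risk) per risk bin."""
    pairs = sorted(zip(y_pred, y_true))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer subjects than bins
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((b + 1, len(chunk), observed, mean_pred))
    return rows
```

Plotting the third against the fourth element of each returned row gives a decile plot of observed versus mean predicted risk; the Hosmer-Lemeshow statistic is not computed in this sketch.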
To demonstrate clinical significance of predicting outpatient AKI, time-to-mortality after the outpatient AKI-period was determined in the subsets of patients in the highest and lowest deciles of predicted AKI-risk. Death was defined by the date of reported death within the VA EHR. Censoring was defined by the final encounter, vital sign, or lab measurement after the AKI-period. Crude analyses for mortality predicted by outpatient AKI were done with Kaplan-Meier curves and the log-rank test. Unadjusted and adjusted hazard ratios were calculated using Cox proportional hazards modeling. Covariates for adjustment were: age, sex, race, smoking history, eGFR category (eGFR ≥ 60, 45-59, 30-44, or 15-29 mL/minute per 1.73m2),31 baseline inpatient and outpatient AKI history, cardiovascular disease, diabetes mellitus, hypertension, liver disease, cancer, and prior hospitalization.

The original model was then transformed into two binary tests based on potential uses: 1) identification of high-risk research-subjects or 2) identification of patients warranting additional clinical monitoring or interventions. Each cut-point in the continuous AKI-risk equation was selected by weighting the utility or cost of true and false positive and negative cases based on perceived risks and benefits of each.32 The maximum value of the resulting ‘total utility’ created by a positive-negative cut-point was calculated in the development cohort, and the optimal cut-point was then applied to the validation cohort. The cut-point for the research-test was selected by setting the utility of true positives at 100 times the cost of false positives to promote an adequate number of AKI-outcomes for well-powered research. This was balanced by weighting the utility of true negatives at ten times the cost of false negatives due to the risk and financial cost of study inclusion incurred from those not truly at high risk.
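A utility-weighted cut-point search of the kind described above might look like the following sketch. The form of the total-utility function is an assumption: the text gives only the ratio within each pair of weights (true positives to false positives, true negatives to false negatives), not how the two pairs are scaled against each other, so the weights `u_tp`, `c_fp`, `u_tn`, and `c_fn` and all function names here are hypothetical.

```python
def confusion_counts(y_true, y_pred, cut_point):
    """Count TP, FP, TN, FN when predicted risk >= cut_point is 'positive'."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_pred):
        positive = p >= cut_point
        if positive and y:
            tp += 1
        elif positive:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def total_utility(counts, u_tp, c_fp, u_tn, c_fn):
    """Assumed linear form: utilities add, costs subtract."""
    tp, fp, tn, fn = counts
    return u_tp * tp - c_fp * fp + u_tn * tn - c_fn * fn


def best_cut_point(y_true, y_pred, u_tp, c_fp, u_tn, c_fn):
    """Search every observed predicted risk as a candidate cut-point.

    Returns the lowest cut-point achieving the maximum total utility.
    """
    candidates = sorted(set(y_pred))
    return max(
        candidates,
        key=lambda c: total_utility(
            confusion_counts(y_true, y_pred, c), u_tp, c_fp, u_tn, c_fn
        ),
    )
```

Under this assumed form, the cut-point would be chosen in the development data with, for example, `best_cut_point(y, risk, u_tp=100, c_fp=1, u_tn=10, c_fn=1)` and then applied unchanged to the validation data.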
The cut-point for the clinical-test was similarly selected by setting the utility of true positives at 20 times the cost of false positives to promote diagnosing cases of AKI, but without the strict concern for statistical power, and the utility of true negatives at one twentieth the cost of false negatives due to an assumed lower risk and cost incurred from unnecessary clinical monitoring compared to a research study.

The Institutional Review Boards of the University of Minnesota and the Minneapolis VA approved this study. Cohorts were defined in SQL (Microsoft, Redmond, Washington). Statistical analyses were performed in R (R Foundation for Statistical Computing, Vienna, Austria).

Results

Cohort Characteristics

There were 152,371 patients in the development cohort, of whom 4,611 experienced outpatient AKI during the 18-month outcome period. In the validation cohort, 115,744 of 4,864,576 patients experienced outpatient AKI. The median ages in the development and validation cohorts were 55 and 63 years, respectively. Median baseline serum creatinine values were similar between the two cohorts: 0.9 and 1.0 mg/dL for the development and validation cohorts, respectively. Baseline characteristics and their linear associations with outpatient AKI for the development and validation cohorts are shown in Tables 1 and 2. The distribution of proteinuria labs defining the most recent measurement is shown in Supplemental Table 2.

Model Development

Linear fitting and restricted cubic spline fitting were compared for each continuous covariate (Supplemental Table 3). The majority of interaction terms were not included in the variable selection step (Supplemental Table 4).
The final model included: age, sex, history of smoking, history of outpatient AKI, eGFR, proteinuria, cardiovascular disease, diabetes mellitus, hypertension, liver disease, serum sodium, potassium, hemoglobin A1c, albumin, hemoglobin, systolic blood pressure, diastolic blood pressure, and history of hospitalization. Included interaction terms were: age and history of hospitalization, sex and systolic blood pressure, and sex and diastolic blood pressure. The model is shown in Supplemental Table 5. Covariates not selected were: race, history of cancer, history of inpatient AKI, serum calcium, and missing calcium measurement. The resulting model performed well in the development cohort with a C-statistic of 0.717 (95% confidence interval (CI): 0.710 to 0.725).

Model Validation

The model performed well in the validation cohort, with a C-statistic of 0.722 (95% CI: 0.720-0.723). The Hosmer-Lemeshow test suggested appreciable differences between the predicted and observed risk for outpatient AKI (P < 0.001). The performance of the model in the validation cohort is shown by decile of predicted AKI-risk in Figure 3 and did not suggest over-fitting of the model.

Clinical Significance of the Model

In the lowest decile of predicted AKI-risk in the validation cohort, there were 5,149 deaths among 450,160 patients with documented follow-up after the AKI-period. Median follow-up after the AKI-period was 3.7 years. Of these deaths, 73 occurred in the 2,321 patients who experienced outpatient AKI. The Kaplan-Meier survival curves are shown in Figure 4. Greater mortality in the lowest AKI-risk decile of patients was seen among those who experienced outpatient AKI compared to those who did not (P < 0.001). The unadjusted hazard ratio for death after outpatient AKI was 2.81 (95% CI: 2.23-3.54). The adjusted hazard ratio was 2.54 (95% CI: 2.02-3.20).

In the highest decile of predicted AKI-risk, there were 191,892 deaths among 481,058 patients not lost to follow-up.
Median follow-up was 4.5 years. Of these deaths, 17,452 occurred in the 36,973 patients with outpatient AKI. The Kaplan-Meier survival curves are shown in Figure 4. Again, greater mortality was observed in the highest AKI-risk decile of patients for those who experienced outpatient AKI compared to those who did not (P < 0.001). The unadjusted hazard ratio for death after outpatient AKI was 1.36 (95% CI: 1.33-1.38). The adjusted hazard ratio was 1.59 (95% CI: 1.56-1.61).

Model Performance as Two Binary Tests

The performance of the model when transformed into research-oriented and clinically-oriented tests and applied to the validation cohort is shown in Table 3. A “positive” result for the research-test, derived from the development cohort, was ≥ 9.5% predicted risk, resulting in a positive predictive value (PPV) of 0.096 (95% CI: 0.094-0.097) and negative predictive value (NPV) of 0.980 (95% CI: 0.980-0.980) in the validation cohort. A “positive” result for the clinical-test was ≥ 4.5% risk, resulting in a PPV of 0.058 (95% CI: 0.058-0.059) and NPV of 0.985 (95% CI: 0.985-0.985).

Table 1.
Baseline characteristics of model development cohort, overall and by outpatient acute kidney injury

Baseline Characteristic | All Patients (N = 152,371) | With outpatient AKI[a] (N = 4,611) | Without outpatient AKI[a] (N = 147,760) | P-value[b]
Age in years, median (IQR) | 55.1 (43.1-66.7) | 60.7 (48.3-71.8) | 55.0 (43.0-66.6) | <0.001
Male, N (%) | 69,799 (46%) | 1,789 (39%) | 68,010 (46%) | <0.001
Black, N (%) | 9,185 (6.0%) | 365 (7.9%) | 8,820 (6.0%) | <0.001
Ever smoker, N (%) | 66,388 (44%) | 2,254 (49%) | 64,134 (43%) | <0.001
Baseline creatinine in mg/dL, median (IQR) | 0.9 (0.8-1) | 0.9 (0.7-1.1) | 0.9 (0.8-1) | 0.031
eGFR[c] stage, N (%) | | | | <0.001
  eGFR ≥ 60 | 127,976 (84%) | 3,269 (71%) | 124,707 (84%) |
  Stage 3A, eGFR 45-59 | 17,732 (12%) | 779 (17%) | 16,953 (12%) |
  Stage 3B, eGFR 30-44 | 5,518 (3.6%) | 401 (8.7%) | 5,117 (3.5%) |
  Stage 4, eGFR 15-29 | 1,145 (0.8%) | 162 (3.5%) | 983 (0.7%) |
Most recent proteinuria elevated[d], N (%) | 12,245 (8.0%) | 715 (16%) | 11,530 (7.8%) | <0.001
Most recent proteinuria not elevated[d], N (%) | 78,138 (51%) | 2,402 (52%) | 75,736 (51%) | 0.26
  Missing, N (%) | 61,988 (41%) | 1,494 (32%) | 60,494 (41%) | <0.001
Baseline proteinuria is “moderate”[d], N (%) | 10,514 (6.9%) | 550 (12%) | 9,964 (6.7%) | <0.001
Baseline proteinuria is “severe”[d], N (%) | 1,731 (1.1%) | 165 (3.6%) | 1,566 (1.1%) | <0.001
Baseline history of outpatient AKI[e], N (%) | 4,336 (2.8%) | 492 (11%) | 3,844 (2.6%) | <0.001
Baseline history of inpatient AKI[f], N (%) | 766 (0.5%) | 98 (2.1%) | 668 (0.5%) | <0.001
Number creatinine values at baseline, median (IQR) | 2 (1-4) | 3 (1-6) | 2 (1-4) | <0.001
Diabetes mellitus, N (%) | 25,696 (17%) | 1,353 (29%) | 24,343 (17%) | <0.001
Hypertension, N (%) | 65,787 (43%) | 2,487 (54%) | 63,300 (43%) | <0.001
Cardiovascular disease[g], N (%) | 19,511 (13%) | 1,113 (24%) | 18,398 (13%) | <0.001
  Coronary artery disease[g], N (%) | 12,833 (8.4%) | 700 (15%) | 12,133 (8.2%) | <0.001
  Congestive heart failure[g], N (%) | 3,951 (2.6%) | 382 (8.3%) | 3,569 (2.4%) | <0.001
  Stroke[g], N (%) | 4,221 (2.8%) | 224 (4.9%) | 3,997 (2.7%) | <0.001
  Peripheral vascular disease[g], N (%) | 3,789 (2.5%) | 272 (5.9%) | 3,517 (2.4%) | <0.001
Liver disease, N (%) | 4,758 (3.1%) | 269 (5.8%) | 4,489 (3.0%) | <0.001
Cancer, N (%) | 13,443 (8.8%) | 559 (12%) | 12,884 (8.7%) | <0.001
Systolic blood pressure in mmHg, median (IQR) | 124 (114-135) | 126 (114-138) | 124 (114-135) | <0.001
  Missing, N (%) | 6,166 (4.0%) | 279 (6.1%) | 5,887 (4.0%) | <0.001
Diastolic blood pressure in mmHg, median (IQR) | 76 (69-82) | 74 (67-82) | 76 (69-82) | <0.001
  Missing, N (%) | 6,244 (4.1%) | 281 (6.1%) | 5,963 (4.0%) | <0.001
Sodium in mmol/L, median (IQR) | 140 (138-142) | 140 (138-142) | 140 (138-142) | <0.001
  Missing, N (%) | 5,901 (3.9%) | 142 (3.1%) | 5,759 (3.9%) | 0.005
Potassium in mmol/L, median (IQR) | 4.1 (3.9-4.4) | 4.1 (3.9-4.5) | 4.1 (3.9-4.4) | 0.29
  Missing, N (%) | 3,205 (2.1%) | 95 (2.1%) | 3,110 (2.1%) | 0.84
Calcium in mg/dL, median (IQR) | 9.2 (9.0-9.5) | 9.2 (8.9-9.5) | 9.2 (9.0-9.5) | <0.001
  Missing, N (%) | 7,050 (4.6%) | 191 (4.1%) | 6,859 (4.6%) | 0.11
Hemoglobin in g/dL, median (IQR) | 14.0 (12.9-15.0) | 13.3 (11.9-14.4) | 14.0 (13.0-15.0) | <0.001
  Missing, N (%) | 27,023 (18%) | 650 (14%) | 26,373 (18%) | <0.001
Albumin in g/dL, median (IQR) | 4.1 (3.9-4.4) | 4 (3.7-4.3) | 4.1 (3.9-4.4) | <0.001
  Missing, N (%) | 43,216 (28%) | 1,153 (25%) | 42,063 (29%) | <0.001
Hemoglobin A1c in percentage, median (IQR) | 6.1 (5.6-7) | 6.5 (5.8-7.6) | 6.1 (5.6-7) | <0.001
  Missing, N (%) | 108,953 (72%) | 2,770 (60%) | 106,183 (72%) | <0.001
Baseline history of hospitalization, N (%) | 16,222 (11%) | 1,004 (22%) | 15,218 (10%) | <0.001

IQR: interquartile range.
[a] Outcome of acute kidney injury (AKI) occurring and managed in the outpatient setting over an 18-month period.
[b] Two sample t-test or Chi-square test, as applicable, applied to those with outpatient AKI vs. those without.
[c] Estimated glomerular filtration rate (eGFR), by the Modification of Diet in Renal Disease equation, in mL/minute per 1.73m2.
[d] Proteinuria defined as the most recent of spot albuminuria, spot total proteinuria, or semi-quantitative albuminuria by urinalysis, derived from Table 7 of the Kidney Disease Improving Global Outcomes 2012 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease (reference 27). “Severe” = spot albuminuria or total proteinuria ≥ 300 mg/g or albuminuria by urinalysis of ‘300’ or equivalent. “Moderate” = spot albuminuria or total proteinuria ≥ 30 mg/g or albuminuria by urinalysis of ‘30’ or equivalent. “Normal” or not elevated if otherwise measured.
[e] Baseline history of AKI occurring and managed in the outpatient setting.
[f] Baseline history of AKI managed in the inpatient setting regardless of setting of onset.
[g] Cardiovascular disease defined as the composite of coronary artery disease, congestive heart failure, stroke, or peripheral vascular disease as non-mutually exclusive categories.

Table 2. Baseline characteristics of model validation cohort, overall and by outpatient acute kidney injury

Baseline Characteristic | All Patients (N = 4,864,576) | With outpatient AKI[a] (N = 115,744) | Without outpatient AKI[a] (N = 4,748,832) | P-value[b]
Age in years, median (IQR) | 63.3 (52.4-70.2) | 63.9 (56.8-70.6) | 63.2 (52.3-70.2) | <0.001
Male, N (%) | 4,492,178 (92%) | 106,449 (92%) | 4,385,729 (92%) | <0.001
Black, N (%) | 815,167 (17%) | 23,457 (20%) | 791,710 (17%) | <0.001
Ever smoker, N (%) | 3,466,441 (71%) | 90,231 (78%) | 3,376,210 (71%) | <0.001
Baseline creatinine in mg/dL, median (IQR) | 1.0 (0.9-1.1) | 1.0 (0.8-1.2) | 1.0 (0.9-1.1) | <0.001
eGFR[c] stage, N (%) | | | | <0.001
  eGFR ≥ 60 | 4,080,191 (84%) | 87,880 (76%) | 3,992,311 (84%) |
  Stage 3A, eGFR 45-59 | 572,480 (12%) | 15,938 (14%) | 556,542 (12%) |
  Stage 3B, eGFR 30-44 | 177,056 (3.6%) | 8,129 (7.0%) | 168,927 (3.6%) |
  Stage 4, eGFR 15-29 | 34,849 (0.7%) | 3,797 (3.3%) | 31,052 (0.7%) |
Most recent proteinuria elevated[d], N (%) | 623,912 (14%) | 29,135 (27%) | 594,777 (14%) | <0.001
Most recent proteinuria not elevated[d], N (%) | 3,797,304 (86%) | 77,568 (73%) | 3,719,736 (86%) | <0.001
  Missing, N (%) | 443,360 (9.1%) | 9,041 (7.8%) | 434,319 (9.1%) | <0.001
Baseline proteinuria is “moderate”[d], N (%) | 569,567 (12%) | 23,851 (21%) | 545,716 (12%) | <0.001
Baseline proteinuria is “severe”[d], N (%) | 54,345 (1.1%) | 5,284 (4.6%) | 49,061 (1.0%) | <0.001
Baseline history of outpatient AKI[e], N (%) | 322,564 (6.6%) | 20,484 (18%) | 302,080 (6.4%) | <0.001
Baseline history of inpatient AKI[f], N (%) | 119,253 (2.5%) | 9,184 (7.9%) | 110,069 (2.3%) | <0.001
Number creatinine values at baseline, median (IQR) | 8 (4-17) | 12 (5-24) | 8 (4-17) | <0.001
Diabetes mellitus, N (%) | 1,311,503 (27%) | 51,945 (45%) | 1,259,558 (27%) | <0.001
Hypertension, N (%) | 3,024,529 (62%) | 90,739 (78%) | 2,933,790 (62%) | <0.001
Cardiovascular disease[g], N (%) | 1,259,851 (26%) | 45,748 (40%) | 1,214,103 (26%) | <0.001
  Coronary artery disease[g], N (%) | 1,006,004 (21%) | 34,841 (30%) | 971,163 (21%) | <0.001
  Congestive heart failure[g], N (%) | 221,907 (4.6%) | 14,361 (12%) | 207,546 (4.4%) | <0.001
  Stroke[g], N (%) | 148,727 (3.1%) | 5,739 (5.0%) | 142,988 (3.0%) | <0.001
  Peripheral vascular disease[g], N (%) | 271,887 (5.6%) | 12,792 (11%) | 259,095 (5.5%) | <0.001
Liver disease, N (%) | 142,127 (2.9%) | 7,114 (6.1%) | 135,013 (2.8%) | <0.001
Cancer, N (%) | 816,634 (17%) | 26,079 (23%) | 790,555 (17%) | <0.001
Systolic blood pressure in mmHg, median (IQR) | 129 (120-138) | 131 (120-142) | 129 (120-138) | <0.001
  Missing, N (%) | 18,296 (0.4%) | 517 (0.4%) | 17,779 (0.4%) | <0.001
Diastolic blood pressure in mmHg, median (IQR) | 77 (70-84) | 76 (68-84) | 77 (70-84) | <0.001
  Missing, N (%) | 21,091 (0.4%) | 593 (0.5%) | 20,498 (0.4%) | <0.001
Sodium in mmol/L, median (IQR) | 139 (138-141) | 139 (137-141) | 139 (138-141) | <0.001
  Missing, N (%) | 11,946 (0.2%) | 325 (0.3%) | 11,621 (0.2%) | 0.014
Potassium in mmol/L, median (IQR) | 4.2 (4.0-4.5) | 4.2 (3.9-4.5) | 4.2 (4.0-4.5) | <0.001
  Missing, N (%) | 33,930 (0.7%) | 816 (0.7%) | 33,114 (0.7%) | 0.76
Calcium in mg/dL, median (IQR) | 9.3 (9.0-9.6) | 9.3 (9.0-9.6) | 9.3 (9.0-9.6) | <0.001
  Missing, N (%) | 259,971 (5.3%) | 4,733 (4.1%) | 255,238 (5.4%) | <0.001
Hemoglobin in g/dL, median (IQR) | 14.5 (13.5-15.4) | 13.8 (12.5-14.9) | 14.5 (13.5-15.4) | <0.001
  Missing, N (%) | 77,238 (1.6%) | 1,614 (1.4%) | 75,624 (1.6%) | <0.001
Albumin in g/dL, median (IQR) | 4.1 (3.8-4.3) | 4 (3.6-4.2) | 4.1 (3.8-4.3) | <0.001
  Missing, N (%) | 330,596 (6.8%) | 5,750 (5.0%) | 324,846 (6.8%) | <0.001
Hemoglobin A1c in percentage, median (IQR) | 5.9 (5.5-6.5) | 6.1 (5.6-7.2) | 5.9 (5.5-6.5) | <0.001
  Missing, N (%) | 1,212,012 (25%) | 20,102 (17%) | 1,191,910 (25%) | <0.001
Baseline history of hospitalization, N (%) | 1,069,398 (22%) | 44,850 (39%) | 1,024,548 (22%) | <0.001

IQR: interquartile range.
[a] Outcome of acute kidney injury (AKI) occurring and managed in the outpatient setting over an 18-month period.
[b] Two sample t-test or Chi-square test, as applicable, applied to those with outpatient AKI vs. those without.
[c] Estimated glomerular filtration rate (eGFR), by the Modification of Diet in Renal Disease equation, in mL/minute per 1.73m2.
[d] Proteinuria defined as the most recent of spot albuminuria, spot total proteinuria, or semi-quantitative albuminuria by urinalysis, derived from Table 7 of the Kidney Disease Improving Global Outcomes 2012 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease (reference 27). “Severe” = spot albuminuria or total proteinuria ≥ 300 mg/g or albuminuria by urinalysis of ‘300’ or equivalent. “Moderate” = spot albuminuria or total proteinuria ≥ 30 mg/g or albuminuria by urinalysis of ‘30’ or equivalent. “Normal” or not elevated if otherwise measured.
[e] Baseline history of AKI occurring and managed in the outpatient setting.
[f] Baseline history of AKI managed in the inpatient setting regardless of setting of onset.
[g] Cardiovascular disease defined as the composite of coronary artery disease, congestive heart failure, stroke, or peripheral vascular disease as non-mutually exclusive categories.

Table 3.
Performance in the validation cohort (unless otherwise specified) of two binary tests produced from the risk prediction model for outpatient acute kidney injury in the 18 months after a primary care-baseline period Statistic Research-test Clinical monitoring-test Predicted risk cut-pointa ≥ 9.5% ≥ 4.5% Sensitivity 0.210 (95% CI: 0.208-0.213) 0.494 (95% CI: 0.491-0.497) Specificity 0.952 (95% CI: 0.951-0.952) 0.806 (95% CI: 0.806-0.807) Positive predictive value 0.096 (95% CI: 0.094-0.097) 0.058 (95% CI: 0.058-0.059) Negative predictive value 0.980 (95% CI: 0.980-0.980) 0.985 (95% CI: 0.985-0.985) Positive likelihood ratio 4.34 (95% CI: 4.29-4.39) 2.55 (95% CI:2.53-2.57) Negative likelihood ratio 0.83 (95% CI: 0.83-0.83) 0.63 (95% CI: 0.62-0.63) Percent of development cohort “negative” by the test (percentile of cut-point) 96.7% 85.5% Percent of validation cohort “negative” by the test (percentile of cut-point) 94.8% 80.0% P-value for predicting outpatient AKIb <0.001 <0.001 CI: confidence interval. aThreshold defining a “positive” test, as defined based on model performance in the development cohort. bBy Chi-squared test, for the association between outpatient acute kidney injury (AKI) and test- positivity. 14 Figure 1. Patient study-eligibility flow diagram for the development cohort, derived from the M Health Fairview healthcare system electronic health record. 15 Figure 2. Patient study-eligibility flow diagram for the validation cohort, derived from the Veterans Affairs healthcare system electronic health record. 16 Figure 3. Observed risk vs. mean predicted risk of outpatient acute kidney injury (AKI) in the 18 months after a primary care-baseline period by decile of predicted risk in the validation cohort (N = 4,864,576). Observed and mean predicted risks by deciles are: 0.5% vs. 0.9% for 1st decile, 0.8% vs. 1.3% for 2nd decile, 1.0% vs. 1.6% for 3rd decile, 1.2% vs. 1.9% for 4th decile, 1.5% vs. 2.2% for 5th decile, 1.9% vs. 2.6% for 6th decile, 2.3% vs. 
3.1% for 7th decile, 2.9% vs. 4.0% for 8th decile, 4.0% vs. 5.4% for 9th decile, and 7.7% vs. 12.1% for 10th decile, respectively. N = 486,458 for odd-numbered deciles and 486,457 for even-numbered deciles.

Figure 4. Kaplan-Meier survival curves for (A) the lowest decile of predicted outpatient acute kidney injury (AKI)-risk (N = 450,160) and (B) the highest decile of predicted outpatient AKI-risk (N = 481,058) in the validation cohort, beginning at the end of the 18-month AKI-period, among those not lost to follow-up prior to the end of the AKI-period, and censored on final encounter, vital sign, or lab measurement.

Discussion

In this study, we developed and externally validated a risk prediction model for AKI occurring and managed in the outpatient setting. Our model performed similarly well in both the development and external validation cohorts at discriminating between cases and non-cases of outpatient AKI (C-statistic ~0.72). This performance was similar to that seen in inpatient AKI-risk models, which have generally produced C-statistics ranging from 0.7 to 0.8 [18, 19, 21, 33].

Our model differs in setting from prior risk prediction models, which have focused on AKI among hospitalized patients. Several risk-factors known to associate with inpatient AKI overlapped with the risk-factors identified in this study of outpatient AKI, while other novel risk-factors were identified. As in our study, Malhotra et al. identified CKD, heart disease, liver disease, hypertension, and anemia as predictive of AKI [19]. Park et al. identified albuminuria, diabetes mellitus, hypoalbuminemia, and hyponatremia as post-surgical AKI risk-factors [21]. Other known predictors of AKI that are typically limited to the inpatient setting, including documented acidemia, mechanical ventilation or hypoxia, and sepsis [15-17, 19, 34], were not considered for variable selection in our model.
Novel predictors of AKI identified in the outpatient setting included history of hospitalization, smoking history, and disturbances in serum sodium or potassium, whether elevated or depressed. Surprisingly, hypotension has not been frequently identified as a risk-factor for inpatient AKI [19]. Over the physiologic range observed, diastolic blood pressure had a greater impact on AKI-risk than systolic blood pressure, though both were selected for model inclusion. We also identified interactions between age and prior hospitalization, and between sex and each of systolic and diastolic blood pressure, in their association with outpatient AKI-risk.

We also identified the predicted risk of outpatient AKI associated with lacking a baseline lab or vital sign measurement, which to our knowledge has not previously been reported for AKI in any setting. The study by Tomasev et al. did not provide risk-estimates for individual covariates [20]. While lacking a prior lab or vital sign measurement may be a marker of infrequent or incomplete care by which outpatient AKI may be missed, not all missingness-covariates predicted a reduced risk of observing outpatient AKI. Rather, lacking a prior proteinuria assessment, hemoglobin A1c measurement, and/or serum albumin measurement increased the risk for outpatient AKI. Lacking a prior serum sodium and/or potassium measurement, which are frequently included on metabolic panels with creatinine, reduced the risk for outpatient AKI in this observational study, as did lacking a systolic and/or diastolic blood pressure measurement at baseline. Missing a baseline hemoglobin measurement also appeared protective.
A history of cancer was not identified in our study as a risk-factor for outpatient AKI, despite being identified as a risk-factor for AKI in the critical care setting [17]. Neither race nor serum calcium level was selected for model inclusion, despite both being identified as risk-factors for inpatient AKI by Matheny et al. [18]. While a history of prior outpatient AKI was identified as a risk-factor for subsequent outpatient AKI, a history of inpatient AKI was unexpectedly not identified as an outpatient-AKI risk-factor in our study.

Our model did display statistically significant differences between the observed and predicted risk in the validation cohort. This was not an unexpected result given the cohort’s size, which provided power to detect small differences. While validation cohorts are typically recommended to contain at least 100 cases and 100 non-cases [35, 36], ours contained 115,744 cases and over 4.7 million non-cases. Despite these statistically significant differences, reasonable visual agreement was seen between the observed risk for outpatient AKI in the validation cohort and the mean predicted risk, as shown by each decile of predicted risk. Our model did typically overestimate the risk for outpatient AKI observed in the validation cohort, which may be due to important differences between the two cohorts not captured by the covariates included in our study, suggesting a higher-risk population in the development cohort. This difference does not appear easily attributable to less frequent lab measurements in the VA system, as a smaller proportion of patients were excluded in that population for lacking a creatinine measurement during the outcome period.
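The calibration pattern described above can be checked directly from the decile values reported in the caption of Figure 3. The following is a minimal Python sketch, not part of the original analysis; the variable and function names are illustrative assumptions, and the inputs are the rounded percentages from that caption, so the ratios are approximate.

```python
# Observed and mean predicted risk (%) of outpatient AKI by decile of
# predicted risk in the validation cohort, as listed in the Figure 3 caption.
OBSERVED = [0.5, 0.8, 1.0, 1.2, 1.5, 1.9, 2.3, 2.9, 4.0, 7.7]
PREDICTED = [0.9, 1.3, 1.6, 1.9, 2.2, 2.6, 3.1, 4.0, 5.4, 12.1]


def observed_to_predicted_ratios(observed, predicted):
    """Return the observed/predicted risk ratio for each decile."""
    return [o / p for o, p in zip(observed, predicted)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Ratios below 1 indicate that the model overestimated risk in that decile.
ratios = observed_to_predicted_ratios(OBSERVED, PREDICTED)

# The deciles are of nearly equal size, so the mean of the decile values
# approximates the cohort-wide observed and predicted risk.
overall_ratio = mean(OBSERVED) / mean(PREDICTED)

if __name__ == "__main__":
    for decile, ratio in enumerate(ratios, start=1):
        print(f"decile {decile}: observed/predicted = {ratio:.2f}")
    print(f"overall observed/predicted = {overall_ratio:.2f}")
```

A ratio below 1 in every decile corresponds to the overestimation noted above, and the mean of the observed decile risks (about 2.4%) agrees with the overall outpatient AKI incidence reported for the validation cohort.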
While not the focus of this study, we observed that a baseline history of outpatient AKI was two to three times as prevalent as a baseline history of inpatient AKI, consistent with our prior work on incident outpatient AKI [8]. In this study, we showed that outpatient AKI predicts future risk for mortality even among groups with uniform risk for outpatient AKI. Put differently, despite overlapping risk-factors for death and outpatient AKI, mortality-risk is not fully captured by the presence of AKI risk-factors and is heightened by the occurrence of outpatient AKI. While a larger hazard ratio was observed for the lowest AKI-risk group, on an absolute scale, the occurrence of outpatient AKI predicts a greater mortality-burden in those patients at the greatest AKI-risk. It is unknown what pre- or post-AKI interventions, after early identification of patients at high risk for outpatient AKI, may reduce this mortality-risk.

In addition to its continuous form, our outpatient AKI-risk prediction model was transformed into two separate binary tests with different applications: one for a hypothetical research recruitment-scenario and one for identifying patients for closer clinical monitoring or intervention. These examples illustrated how our model could be used to define certain patients as “high-risk” for outpatient AKI, depending on the risks and benefits of being so labeled. As expected, lowering the “positive” threshold yielded greater sensitivity with a corresponding decrease in specificity.

Additional strengths of this study include the avoidance of selection bias from excluding patients with missing data and of the need to impute missing data for inclusion. The large size of our development and validation cohorts was also a notable strength. Finally, our model uses objective variables in the EHR that could be used by both clinicians and researchers to identify high-risk patients.
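The trade-off between the two binary tests can be made concrete with the standard definitions of likelihood ratios and predictive values. The sketch below is illustrative Python rather than the original analysis code; the function names are assumptions, and the inputs are the rounded sensitivity, specificity, and case counts reported for the validation cohort, so the results agree with Table 3 only to within rounding.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of test-positive patients who have the outcome."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sensitivity, specificity, prevalence):
    """Share of test-negative patients who do not have the outcome."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Outpatient AKI cases divided by validation cohort size, as reported.
PREVALENCE = 115_744 / 4_864_576

# (sensitivity, specificity) point estimates as reported in Table 3.
TESTS = {
    "research": (0.210, 0.952),
    "clinical monitoring": (0.494, 0.806),
}

if __name__ == "__main__":
    for name, (sens, spec) in TESTS.items():
        print(
            f"{name}: "
            f"LR+ = {positive_likelihood_ratio(sens, spec):.2f}, "
            f"LR- = {negative_likelihood_ratio(sens, spec):.2f}, "
            f"PPV = {positive_predictive_value(sens, spec, PREVALENCE):.3f}, "
            f"NPV = {negative_predictive_value(sens, spec, PREVALENCE):.3f}"
        )
```

Because the outcome is uncommon (about 2.4% in the validation cohort), the positive predictive value stays below 10% for both tests even though the research-test has a specificity above 0.95.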
The limitations of this study include its retrospective, observational design. Additional cases of outpatient AKI not reflected in the EHR may have been missed. Our study design also restricted the cohorts to those patients surviving the 18-month AKI-period. While some cases of outpatient AKI soon followed by death may have been missed, this design was implemented to predict outpatient AKI within the subsequent 18 months among those patients surviving this period, upon whom future interventions may have the greatest likelihood of improving outcomes. This design also avoided bias from systematically classifying patients who died before outpatient AKI into the ‘no AKI’ group. Finally, medications were not considered in this study, in part due to the limited accuracy of outpatient medication lists outside of managed care organizations. While the VA system may qualify as such, our development cohort was not derived from such a system.

Conclusion

Our outpatient AKI-risk prediction model performed well in both our development and validation cohorts. The risk predicted by our model is calculable from readily available vital signs, lab tests, and diagnosed comorbidities, and it incorporates the predictive value of lacking these measurements. Such a tool can be adapted into a binary test to categorize patients as high-risk or not based on weighing risks and benefits and could be used to guide clinical monitoring or identify subjects for future research. Future studies will be needed to prospectively validate our model and to determine if interventions in patients with elevated risk for outpatient AKI can reduce the incidence of AKI and lead to a reduction in the mortality associated with AKI in the outpatient setting.

Bibliography

1. Hsu, CY, McCulloch, CE, Fan, D, Ordonez, JD, Chertow, GM, Go, AS: Community-based incidence of acute renal failure. Kidney Int, 72: 208-212, 2007.
2. Sileanu, FE, Murugan, R, Lucko, N, Clermont, G, Kane-Gill, SL, Handler, SM, Kellum, JA: AKI in low-risk versus high-risk patients in intensive care. Clin J Am Soc Nephrol, 10: 187-196, 2015.
3. Kaufman, J, Dhakal, M, Patel, B, Hamburger, R: Community-acquired acute renal failure. Am J Kidney Dis, 17: 191-198, 1991.
4. Wonnacott, A, Meran, S, Amphlett, B, Talabani, B, Phillips, A: Epidemiology and outcomes in community-acquired versus hospital-acquired AKI. Clin J Am Soc Nephrol, 9: 1007-1014, 2014.
5. Soto, K, Campos, P, Pinto, I, Rodrigues, B, Frade, F, Papoila, AL, Devarajan, P: The risk of chronic kidney disease and mortality are increased after community-acquired acute kidney injury. Kidney Int, 90: 1090-1099, 2016.
6. Holmes, J, Rainer, T, Geen, J, Roberts, G, May, K, Wilson, N, Williams, JD, Phillips, AO, Welsh, AKISG: Acute Kidney Injury in the Era of the AKI E-Alert. Clin J Am Soc Nephrol, 11: 2123-2131, 2016.
7. Sawhney, S, Fluck, N, Fraser, SD, Marks, A, Prescott, GJ, Roderick, PJ, Black, C: KDIGO-based acute kidney injury criteria operate differently in hospitals and the community-findings from a large population cohort. Nephrol Dial Transplant, 31: 922-929, 2016.
8. Leither, MD, Murphy, DP, Bicknese, L, Reule, S, Vock, DM, Ishani, A, Foley, RN, Drawz, PE: The impact of outpatient acute kidney injury on mortality and chronic kidney disease: a retrospective cohort study. Nephrol Dial Transplant, 34: 493-501, 2019.
9. Jones, J, Holmen, J, De Graauw, J, Jovanovich, A, Thornton, S, Chonchol, M: Association of complete recovery from acute kidney injury with incident CKD stage 3 and all-cause mortality. Am J Kidney Dis, 60: 402-408, 2012.
10. Bucaloiu, ID, Kirchner, HL, Norfolk, ER, Hartle, JE, 2nd, Perkins, RM: Increased risk of death and de novo chronic kidney disease following reversible acute kidney injury. Kidney Int, 81: 477-485, 2012.
11. Wu, VC, Wu, CH, Huang, TM, Wang, CY, Lai, CF, Shiao, CC, Chang, CH, Lin, SL, Chen, YY, Chen, YM, Chu, TS, Chiang, WC, Wu, KD, Tsai, PR, Chen, L, Ko, WJ, Group, N: Long-term risk of coronary events after AKI. J Am Soc Nephrol, 25: 595-605, 2014.
12. Chawla, LS, Amdur, RL, Shaw, AD, Faselis, C, Palant, CE, Kimmel, PL: Association between AKI and long-term renal and cardiovascular outcomes in United States veterans. Clin J Am Soc Nephrol, 9: 448-456, 2014.
13. Wald, R, Quinn, RR, Luo, J, Li, P, Scales, DC, Mamdani, MM, Ray, JG, University of Toronto Acute Kidney Injury Research, G: Chronic dialysis and death among survivors of acute kidney injury requiring dialysis. JAMA, 302: 1179-1185, 2009.
14. Hobbs, H, Bassett, P, Wheeler, T, Bedford, M, Irving, J, Stevens, PE, Farmer, CK: Do acute elevations of serum creatinine in primary care engender an increased mortality risk? BMC Nephrol, 15: 206, 2014.
15. Coritsidis, GN, Guru, K, Ward, L, Bashir, R, Feinfeld, DA, Carvounis, CP: Prediction of acute renal failure by "bedside formula" in medical and surgical intensive care patients. Ren Fail, 22: 235-244, 2000.
16. Hoste, EA, Lameire, NH, Vanholder, RC, Benoit, DD, Decruyenaere, JM, Colardyn, FA: Acute renal failure in patients with sepsis in a surgical ICU: predictive factors, incidence, comorbidity, and outcome. J Am Soc Nephrol, 14: 1022-1030, 2003.
17. Chawla, LS, Abell, L, Mazhari, R, Egan, M, Kadambi, N, Burke, HB, Junker, C, Seneff, MG, Kimmel, PL: Identifying critically ill patients at high risk for developing acute renal failure: a pilot study. Kidney Int, 68: 2274-2280, 2005.
18. Matheny, ME, Miller, RA, Ikizler, TA, Waitman, LR, Denny, JC, Schildcrout, JS, Dittus, RS, Peterson, JF: Development of inpatient risk stratification models of acute kidney injury for use in electronic health records. Med Decis Making, 30: 639-650, 2010.
19. Malhotra, R, Kashani, KB, Macedo, E, Kim, J, Bouchard, J, Wynn, S, Li, G, Ohno-Machado, L, Mehta, R: A risk prediction score for acute kidney injury in the intensive care unit. Nephrol Dial Transplant, 32: 814-822, 2017.
20. Tomasev, N, Glorot, X, Rae, JW, Zielinski, M, Askham, H, Saraiva, A, Mottram, A, Meyer, C, Ravuri, S, Protsyuk, I, Connell, A, Hughes, CO, Karthikesalingam, A, Cornebise, J, Montgomery, H, Rees, G, Laing, C, Baker, CR, Peterson, K, Reeves, R, Hassabis, D, King, D, Suleyman, M, Back, T, Nielson, C, Ledsam, JR, Mohamed, S: A clinically applicable approach to continuous prediction of future acute kidney injury. Nature, 572: 116-119, 2019.
21. Park, S, Cho, H, Park, S, Lee, S, Kim, K, Yoon, HJ, Park, J, Choi, Y, Lee, S, Kim, JH, Kim, S, Chin, HJ, Kim, DK, Joo, KW, Kim, YS, Lee, H: Simple Postoperative AKI Risk (SPARK) Classification before Noncardiac Surgery: A Prediction Index Development Study with External Validation. J Am Soc Nephrol, 30: 170-181, 2019.
22. Grams, ME, Sang, Y, Ballew, SH, Gansevoort, RT, Kimm, H, Kovesdy, CP, Naimark, D, Oien, C, Smith, DH, Coresh, J, Sarnak, MJ, Stengel, B, Tonelli, M, Consortium, CKDP: A Meta-analysis of the Association of Estimated GFR, Albuminuria, Age, Race, and Sex With Acute Kidney Injury. Am J Kidney Dis, 66: 591-601, 2015.
23. Levey, AS, Bosch, JP, Lewis, JB, Greene, T, Rogers, N, Roth, D: A more accurate method to estimate glomerular filtration rate from serum creatinine: a new prediction equation. Modification of Diet in Renal Disease Study Group. Ann Intern Med, 130: 461-470, 1999.
24. Hall, RK, Wang, V, Jackson, GL, Hammill, BG, Maciejewski, ML, Yano, EM, Svetkey, LP, Patel, UD: Implementation of automated reporting of estimated glomerular filtration rate among Veterans Affairs laboratories: a retrospective study. BMC Med Inform Decis Mak, 12: 69, 2012.
25. Wang, V, Maciejewski, ML, Hammill, BG, Hall, RK, Van Scoyoc, L, Garg, AX, Jain, AK, Patel, UD: Recognition of CKD after the introduction of automated reporting of estimated GFR in the Veterans Health Administration. Clin J Am Soc Nephrol, 9: 29-36, 2014.
26. Levey, AS, Stevens, LA, Schmid, CH, Zhang, YL, Castro, AF, 3rd, Feldman, HI, Kusek, JW, Eggers, P, Van Lente, F, Greene, T, Coresh, J, Ckd, EPI: A new equation to estimate glomerular filtration rate. Ann Intern Med, 150: 604-612, 2009.
27. Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group. KDIGO 2012 clinical practice guideline for the evaluation and management of chronic kidney disease. Kidney Int Suppl, 3: 1-150, 2013.
28. Wang, Y, Wang, J, Su, T, Qu, Z, Zhao, M, Yang, L, Consortium, IAbC: Community-Acquired Acute Kidney Injury: A Nationwide Survey in China. Am J Kidney Dis, 69: 647-657, 2017.
29. Bellomo, R, Ronco, C, Kellum, JA, Mehta, RL, Palevsky, P, Acute Dialysis Quality Initiative, w: Acute renal failure - definition, outcome measures, animal models, fluid therapy and information technology needs: the Second International Consensus Conference of the Acute Dialysis Quality Initiative (ADQI) Group. Crit Care, 8: R204-212, 2004.
30. Mehta, RL, Kellum, JA, Shah, SV, Molitoris, BA, Ronco, C, Warnock, DG, Levin, A, Acute Kidney Injury, N: Acute Kidney Injury Network: report of an initiative to improve outcomes in acute kidney injury. Crit Care, 11: R31, 2007.
31. Go, AS, Chertow, GM, Fan, D, McCulloch, CE, Hsu, CY: Chronic kidney disease and the risks of death, cardiovascular events, and hospitalization. N Engl J Med, 351: 1296-1305, 2004.
32. Liao, HF, Cheng, LY, Hsieh, WS, Yang, MC: Selecting a cutoff point for a developmental screening test based on overall diagnostic indices and total expected utilities of professional preferences. J Formos Med Assoc, 109: 209-218, 2010.
33. Thakar, CV, Arrigain, S, Worley, S, Yared, JP, Paganini, EP: A clinical score to predict acute renal failure after cardiac surgery. J Am Soc Nephrol, 16: 162-168, 2005.
34. Peres, LA, Wandeur, V, Matsuo, T: Predictors of acute kidney injury and mortality in an Intensive Care Unit. J Bras Nefrol, 37: 38-46, 2015.
35. Vergouwe, Y, Steyerberg, EW, Eijkemans, MJ, Habbema, JD: Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol, 58: 475-483, 2005.
36. Collins, GS, Ogundimu, EO, Altman, DG: Sample size considerations for the external validation of a multivariable prognostic model: a resampling study. Stat Med, 35: 214-226, 2016.

Appendix

Supplemental Table 1. Accepted physiologic lab or vital sign values
Lab (blood or serum unless specified) or vital sign | Accepted physiologic values
Creatinine | 0.3-15.0 mg/dL
Sodium | 110-155 mmol/L
Potassium | 2-7 mmol/L
Calcium | 7-12 mg/dL
Hemoglobin A1c | 3.0-15.5 %
Albumin | 1.0-5.5 g/dL
Hemoglobin | 6-18 g/dL
Spot urine albumin to creatinine ratio | 0-20 g/g
Spot urine total protein to creatinine ratio | 0-20 g/g
Systolic blood pressure | 70-210 mmHg
Diastolic blood pressure | 30-120 mmHg

Supplemental Table 2. Lab test serving as most recent to define proteinuria by cohort
Cohort | Spot total proteinuria | Spot albuminuria | Urinalysis-albumin concentration | Missing
M Health Fairview-Development Cohort (N = 152,371) | 1,307 (0.9%) | 18,585 (12%) | 70,491 (46%) | 61,988 (41%)
Veterans Administration-Validation Cohort (N = 4,864,576) | 23,548 (0.5%) | 70,933 (1.5%) | 4,326,735 (89%) | 443,360 (9.1%)

Supplemental Table 3.
Restricted cubic spline fitting of continuous covariates
Covariate | P-value for non-linear term^a | Passed visual inspection of non-linearity for inclusion in variable selection^b
Age | < 0.001 | Yes
Sodium | < 0.001 | Yes
Potassium | < 0.001 | Yes
Calcium | 0.013 | Yes
Hemoglobin A1c | 0.40 |
Albumin | < 0.001 | Yes
Hemoglobin | 0.081 |
Systolic blood pressure | < 0.001 | Yes
Diastolic blood pressure | < 0.001 | Yes
Baseline eGFR^c | < 0.001 | Yes
^a By Chi-squared test.
^b Not considered if P ≥ 0.05.
^c Estimated glomerular filtration rate (eGFR) by the Modification of Diet in Renal Disease equation in mL/minute per 1.73 m^2.

Supplemental Table 4. Interaction terms considered for variable selection in model development
Covariate 1^a | Covariate 2^a | P-value for interaction term^b | Passed visual inspection of covariate-interaction for inclusion in variable selection step^c
Systolic blood pressure | Diastolic blood pressure | 0.28 |
Baseline history of hypertension | Systolic blood pressure | 0.46 |
Baseline history of hypertension | Diastolic blood pressure | 0.79 |
Baseline proteinuria^d | Systolic blood pressure | 0.16 |
Baseline proteinuria | Diastolic blood pressure | 0.40 |
Missing proteinuria | Systolic blood pressure | 0.31 |
Missing proteinuria | Diastolic blood pressure | 0.78 |
Baseline history of outpatient AKI^e | Systolic blood pressure | 0.63 |
Baseline history of outpatient AKI | Diastolic blood pressure | 0.94 |
Baseline history of outpatient AKI | Baseline proteinuria | 0.11 |
Baseline history of outpatient AKI | Missing proteinuria | 0.049 | N/A
Baseline history of outpatient AKI | Baseline history of inpatient AKI | 0.095 |
Baseline history of outpatient AKI | Baseline eGFR | 0.064 |
Baseline history of outpatient AKI | Baseline history of hospitalization | 0.020 | No
Baseline history of inpatient AKI^f | Systolic blood pressure | 0.63 |
Baseline history of inpatient AKI | Diastolic blood pressure | 0.43 |
Baseline history of inpatient AKI | Baseline proteinuria | 0.53 |
Baseline history of inpatient AKI | Missing proteinuria | 0.33 |
Baseline history of inpatient AKI | Baseline eGFR | 0.033 | No
Baseline eGFR^g | Systolic blood pressure | 0.78 |
Baseline eGFR | Diastolic blood pressure | 0.57 |
Baseline eGFR | Baseline proteinuria | 0.63 |
Baseline eGFR | Missing proteinuria | 0.46 |
Baseline eGFR | Baseline history of hospitalization | < 0.001 | No
Age | Systolic blood pressure | 0.007 | No
Age | Diastolic blood pressure | 0.26 |
Age | Baseline proteinuria | 0.16 |
Age | Missing proteinuria | 0.20 |
Age | Baseline history of outpatient AKI | 0.097 |
Age | Baseline history of inpatient AKI | 0.44 |
Age | Baseline eGFR | < 0.001 | No
Age | Baseline history of hospitalization | 0.014 | Yes
Age | Sex | 0.050 | No
Age | Race | 0.60 |
Sex | Systolic blood pressure | 0.017 | Yes
Sex | Diastolic blood pressure | 0.007 | Yes
Sex | Baseline proteinuria | 0.58 |
Sex | Missing proteinuria | 0.85 |
Sex | Baseline history of outpatient AKI | 0.23 |
Sex | Baseline history of inpatient AKI | 0.026 | No
Sex | Baseline eGFR | 0.11 |
Sex | Baseline history of hospitalization | 0.35 |
Sex | Race | 0.74 |
Race | Systolic blood pressure | 0.75 |
Race | Diastolic blood pressure | 0.37 |
Race | Baseline proteinuria | 0.086 |
Race | Missing proteinuria | 0.041 | N/A
Race | Baseline history of outpatient AKI | 0.55 |
Race | Baseline history of inpatient AKI | 0.12 |
Race | Baseline eGFR | < 0.001 | No
Race | Baseline history of hospitalization | 0.13 |
^a If a continuous covariate, modeled with restricted cubic splines with 4 knots.
^b By Chi-squared test.
^c Not considered if interaction P ≥ 0.05 and, for Missing proteinuria, a priori decided to be not applicable (N/A) if Baseline proteinuria at interaction P ≥ 0.05.
^d Baseline proteinuria categorically defined as “normal,” “moderate,” or “severe.”
^e Acute kidney injury (AKI) occurring and managed in the outpatient setting.
^f Acute kidney injury (AKI) managed in the inpatient setting regardless of setting of onset.
^g eGFR: Estimated glomerular filtration rate by the Modification of Diet in Renal Disease equation in mL/minute per 1.73 m^2.

Supplemental Table 5.
Risk prediction model for outpatient acute kidney injury in the 18 months following a primary care-defined baseline period
Covariates or interaction term | Coefficient on logit-scale | Term forced back into model^a
Intercept | +7.3838946 |
Age in years^b | -2.4745e-3 |
(Maximum of Age - 26.936345 or 0)^3 | +1.52e-5 |
(Maximum of Age - 48.510609 or 0)^3 | -6.61e-5 |
(Maximum of Age - 61.798768 or 0)^3 | +6.81e-5 |
(Maximum of Age - 82.069815 or 0)^3 | -1.71e-5 |
Sex (male = 1, female = 0) | -0.0496985 | Yes – selected in interaction term
History of smoking (yes = 1, no = 0) | +0.2034074 |
Baseline history of outpatient AKI^c (yes = 1, no = 0) | +0.7728651 |
Baseline eGFR^d in mL/minute per 1.73 m^2 | -0.0224306 |
(Maximum of eGFR - 46.677272 or 0)^3 | +9.7e-6 |
(Maximum of eGFR - 70.912146 or 0)^3 | -2.12e-5 |
(Maximum of eGFR - 85.877661 or 0)^3 | +9.6e-6 |
(Maximum of eGFR - 121.64961 or 0)^3 | +1.8e-6 |
Baseline proteinuria^e = “moderate” | +0.1237063 |
Baseline proteinuria^e = “severe” | +0.4220050 |
Missing proteinuria (missing = 1, measured = 0) | +0.0446272 | Yes – paired variable selected
Cardiovascular disease | +0.2543603 |
Diabetes mellitus | +0.2951630 |
Hypertension | +0.0953643 |
Liver disease | +0.2248400 |
Sodium in mmol/L | -0.0408204 |
(Maximum of Sodium - 132 or 0)^3 | +1.347e-4 |
(Maximum of Sodium - 139 or 0)^3 | +2.0678e-3 |
(Maximum of Sodium - 141 or 0)^3 | -3.9851e-3 |
(Maximum of Sodium - 144 or 0)^3 | +1.7826e-3 |
Missing sodium (missing = 1, measured = 0) | -5.8451221 |
Potassium in mmol/L | -0.4424903 |
(Maximum of Potassium - 3.4 or 0)^3 | +0.2412870 |
(Maximum of Potassium - 4.0 or 0)^3 | -0.1970429 |
(Maximum of Potassium - 4.3 or 0)^3 | -0.3603348 |
(Maximum of Potassium - 4.8 or 0)^3 | +0.3160908 |
Missing potassium (missing = 1, measured = 0) | -1.3293186 | Yes – paired variable selected
Hemoglobin A1c in percentage | +0.0789011 |
Missing hemoglobin A1c (missing = 1, measured = 0) | +0.4989531 |
Albumin in g/dL | +0.4728501 |
(Maximum of Albumin - 0 or 0)^3 | -0.0278710 |
(Maximum of Albumin - 3.6 or 0)^3 | +0.9334718 |
(Maximum of Albumin - 4.1 or 0)^3 | -1.4930423 |
(Maximum of Albumin - 4.7 or 0)^3 | +0.5874415 |
Missing albumin (missing = 1, measured = 0) | +0.2825063 | Yes – paired variable selected
Hemoglobin in g/dL | -0.1229505 |
Missing hemoglobin (missing = 1, measured = 0) | -1.7772110 |
Systolic blood pressure in mmHg | -1.9426e-3 |
(Maximum of Systolic blood pressure - 92 or 0)^3 | +2e-7 |
(Maximum of Systolic blood pressure - 118 or 0)^3 | +6.4e-6 |
(Maximum of Systolic blood pressure - 130 or 0)^3 | -1.01e-5 |
(Maximum of Systolic blood pressure - 154 or 0)^3 | +3.5e-6 |
Missing systolic blood pressure (missing = 1, measured = 0) | -0.6393532 | Yes – paired variable selected
Diastolic blood pressure in mmHg | -0.0287675 |
(Maximum of Diastolic blood pressure - 52 or 0)^3 | +2.89e-5 |
(Maximum of Diastolic blood pressure - 70 or 0)^3 | -9.50e-5 |
(Maximum of Diastolic blood pressure - 80 or 0)^3 | +7.78e-5 |
(Maximum of Diastolic blood pressure - 92 or 0)^3 | -1.17e-5 |
Missing diastolic blood pressure (missing = 1, measured = 0) | -1.0947418 |
History of hospitalization (yes = 1, no = 0) | -0.4081559 | Yes – selected in interaction term
History of hospitalization * Age (hospitalization: yes = 1, no = 0; age in years) | +0.0207889 |
History of hospitalization * (Maximum of Age - 26.936345 or 0)^3 | -1.60e-5 |
History of hospitalization * (Maximum of Age - 48.510609 or 0)^3 | +3.77e-5 |
History of hospitalization * (Maximum of Age - 61.798768 or 0)^3 | -1.90e-5 |
History of hospitalization * (Maximum of Age - 82.069815 or 0)^3 | +2.7e-6 |
Sex * Systolic blood pressure (sex: male = 1, female = 0; systolic blood pressure in mmHg) | -0.0146417 |
Sex * (Maximum of Systolic blood pressure - 92 or 0)^3 | +6.3e-6 |
Sex * (Maximum of Systolic blood pressure - 118 or 0)^3 | -1.78e-5 |
Sex * (Maximum of Systolic blood pressure - 130 or 0)^3 | +1.04e-5 |
Sex * (Maximum of Systolic blood pressure - 154 or 0)^3 | +1.1e-6 |
Sex * Diastolic blood pressure (sex: male = 1, female = 0; diastolic blood pressure in mmHg) | +0.0250623 |
Sex * (Maximum of Diastolic blood pressure - 52 or 0)^3 | -3.87e-5 |
Sex * (Maximum of Diastolic blood pressure - 70 or 0)^3 | +1.725e-4 |
Sex * (Maximum of Diastolic blood pressure - 80 or 0)^3 | -1.873e-4 |
Sex * (Maximum of Diastolic blood pressure - 92 or 0)^3 | +5.35e-5 |
^a All other terms selected by backward stepwise selection based on the Akaike Information Criterion with N = 1,000 bootstrap sampling.
^b Age in years, not rounded down to the nearest integer.
^c Acute kidney injury (AKI) occurring and managed in the outpatient setting.
^d Baseline estimated glomerular filtration rate (eGFR), which by cohort inclusion/exclusion criteria was never missing, calculated by the Modification of Diet in Renal Disease equation.
^e Baseline proteinuria categorically defined as “severe,” “moderate,” or, if otherwise measured, “normal” by default and included in the y-intercept.
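Supplemental Table 5 expresses each restricted cubic spline as a linear term plus truncated cubic terms of the form (Maximum of x - knot or 0)^3, and a summed logit converts to a predicted probability through the inverse logit. The following Python sketch illustrates that arithmetic for the age main-effect terms only, using the coefficients and knots as printed in the table. It is a hedged illustration, not the author's code: the helper names are assumptions, and the intercept, all other covariates, the missingness indicators, and the interaction terms are omitted, so the output is not a usable risk estimate.

```python
import math

# Age main-effect terms as printed in Supplemental Table 5.
AGE_LINEAR_COEF = -2.4745e-3
AGE_KNOT_COEFS = [
    (26.936345, 1.52e-5),
    (48.510609, -6.61e-5),
    (61.798768, 6.81e-5),
    (82.069815, -1.71e-5),
]


def truncated_cubic(x, knot):
    """(Maximum of x - knot or 0)^3, the table's spline building block."""
    return max(x - knot, 0.0) ** 3


def spline_contribution(x, linear_coef, knot_coefs):
    """Logit-scale contribution of one spline-modeled covariate."""
    total = linear_coef * x
    for knot, coef in knot_coefs:
        total += coef * truncated_cubic(x, knot)
    return total


def inverse_logit(logit):
    """Convert a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Hypothetical ages chosen only to show the shape of the age effect.
    for age in (20.0, 40.0, 60.0, 80.0):
        contribution = spline_contribution(age, AGE_LINEAR_COEF, AGE_KNOT_COEFS)
        print(f"age {age:.0f}: logit contribution = {contribution:+.4f}")
```

Below the first knot every truncated cubic term is zero, so the contribution reduces to the linear coefficient times age. A full prediction would sum the corresponding contributions of every row in the table, plus the intercept, before applying `inverse_logit`.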