Ascertaining the validity of suicide data to quantify the impacts and identify predictors of suicide misclassification






Thesis or Dissertation


Background: Misclassification plagues suicide data, but few evaluations of these data have been done, and even fewer studies have estimated the impacts of misclassification. Of particular concern is the misclassification of true suicides, because non-suicide deaths are rarely, if ever, certified as suicides. Misclassification may hamper public health surveillance by producing inaccurate suicide counts. Furthermore, misclassification biases estimates of effect when identifying risk and protective factors for suicide. In parallel, as data sources have become richer, data-driven machine learning methods have begun to be used to predict misclassification and to identify factors or decedent characteristics associated with an increased likelihood of misclassification. Thus, there is a critical need to examine the validity of suicide data and the ways misclassification is understood in order to support the public health interventions and policy that are directed by these data.

The overall objective of this proposal was to examine misclassification in suicides from death certificates and estimate the impact and determinants of this misclassification. The central hypothesis was that suicide data were misclassified and the validity of such data was poor. The central hypothesis was tested by pursuing three specific aims.

Aim 1: Calculate estimates of sensitivity, specificity, and positive and negative predictive values for the misclassification of suicides identified from death certificates in Minnesota overall and by industry group. Methods: A classic validation study was conducted that compared suicides reported from death certificates with suicides classified using a proxy gold standard, the Self-Directed Violence Classification System (SDVCS). Results: Contrary to our hypothesis, minimal misclassification of suicides was identified.
One exception was observed in the Armed Forces industry, where relatively poor sensitivity estimates suggested potential underreporting of suicide. The data abstraction process, however, revealed common circumstances and risk factors shared between suicides and non-suicides, including mental and physical health diagnoses, substance use, and treatment for mental and substance use conditions.

Aim 2a: Demonstrate the impact of misclassification on suicide incidence by applying estimates of sensitivity and specificity to suicide counts from death certificates. Aim 2b: Determine the impact of misclassification on the association between opioid use and suicide through misclassification bias analysis. For Aim 2a, suicide counts along with sensitivity and specificity estimates were used to calculate and compare corrected suicide incidence rates. For Aim 2b, a probabilistic misclassification bias analysis was conducted in which estimates of sensitivity and specificity were used to produce a record-level bias-adjusted data set. The bias-adjusted data were then used to calculate the measure of association between opioid use and suicide, which was compared with the results using the original data. Results: In each industry sector, the corrected incidence of suicide was higher than the uncorrected incidence once misclassification was accounted for. This was consistent across the various validity scenarios. For the misclassification bias analysis, the odds ratio showed that opioid-involved deaths were 0.25 (95% CI: 0.20, 0.32) times as likely to be classified as suicide compared with non-opioid-involved deaths. After correcting for misclassification, the association estimate was essentially unchanged from the original estimate (0.22, 95% simulation interval: 0.07, 0.32).

Aim 3: Identify factors indicative and predictive of suicide misclassification through machine learning.
Methods: Classification algorithms were developed to predict suicide misclassification and identify the risk factors associated with it under various suicide comparison scenarios (i.e., medical examiner/coroner certified suicides, probable suicides, and possible suicides). Results: Accurate models were developed across the suicide comparison scenarios that consistently performed well and offered valuable insights into suicide misclassification. The top variables influencing the classification of overdose suicides included previous suicidal behaviors, the presence of a suicide note, substance use history, and evidence of mental health treatment. Treatment for pain, recent release from an institution, and prior overdose were also important factors that had not been previously identified as predictors of suicide classification.

Conclusion: The proposed research was innovative because it represented a substantive departure from acknowledging suicide misclassification to attempting to understand and correct measures of suicide incidence and association, along with identifying factors indicative of suicide misclassification. However, minimal evidence of misclassification was found, and the misclassification bias analysis showed no effect on the association estimate. Data limitations, such as missing or uncollected circumstance factors, along with a control group that may not meet the exchangeability assumption, likely impacted the results. Novel factors associated with suicide misclassification were identified, providing a foundation for future research to build upon. The need remains, though, for accurate and valid suicide data, both for public health surveillance and for producing unbiased association estimates to identify risk and predictive factors of suicide. The extent to which suicide statistics are used by researchers and policy makers mandates that efforts be made to understand and improve the validity of suicide data.
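
The validity measures named in Aim 1 and the count correction described for Aim 2a can be illustrated with a minimal Python sketch. It assumes a simple 2x2 validation table (death certificate classification against a reference standard) and the standard back-calculation of a true count from sensitivity and specificity. The function names and all numbers are hypothetical illustrations; they are not taken from the dissertation and may differ from the author's actual procedure.

```python
def validity_measures(tp, fp, fn, tn):
    """Validity measures from a 2x2 table (hypothetical helper).

    tp: certified suicide, reference standard suicide
    fp: certified suicide, reference standard non-suicide
    fn: certified non-suicide, reference standard suicide
    tn: certified non-suicide, reference standard non-suicide
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def corrected_count(observed, total, sensitivity, specificity):
    """Back-calculate the true number of suicides among `total` deaths.

    Assumes observed = Se * T + (1 - Sp) * (total - T), solved for T.
    """
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed - (1 - specificity) * total) / denom


if __name__ == "__main__":
    # Made-up validation counts, for illustration only.
    m = validity_measures(tp=90, fp=5, fn=10, tn=895)
    observed = 90 + 5  # deaths certified as suicide
    print(m)
    print(corrected_count(observed, 1000, m["sensitivity"], m["specificity"]))
```

With sensitivity below 1 and specificity close to 1, the corrected count exceeds the observed count, which matches the direction of the Aim 2a result reported above.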


University of Minnesota Ph.D. dissertation. November 2023. Major: Epidemiology. Advisor: Marizen Ramirez. 1 computer file (PDF); viii, 66 pages.


Suggested citation

Wright, Nate. (2023). Ascertaining the validity of suicide data to quantify the impacts and identify predictors of suicide misclassification. Retrieved from the University Digital Conservancy,
