Browsing by Subject "validity"
Now showing 1 - 5 of 5
Item: Correlates of Annual Testing for Sexually Transmitted Infections (STIs) in an Online Sample of Men Who Have Sex with Men (MSM): Study Sample Validity, Measure Reliability, and Behavioral Typologies (2013-07). Grey, Jeremy

Objective: Testing for STIs has been prioritized as part of a comprehensive HIV/AIDS prevention plan. Internet-based studies of STI testing among men who have sex with men (MSM) are an efficient way to recruit non-clinic samples from diverse geographic areas. However, online survey methods raise unique concerns about threats to the validity of study samples and about unknown measurement properties. This dissertation therefore had two aims. The first was to examine methods for online survey research by evaluating a protocol to detect invalid survey entries and by determining the test-retest reliability of online measures of sexual behavior and STI testing. The second was to use the validated sample and reliable measures to examine correlates of STI testing in the year prior to the survey.

Methods: In Manuscript 1, survey submissions were classified as valid or invalid according to a de-duplication and cross-validation protocol, and logistic regression models were used to determine associations between invalidity and key demographic and behavioral variables. In Manuscript 2, test-retest reliability over one week was evaluated using intraclass correlation coefficients (ICCs) and kappa statistics for measures of sexual behavior, HIV status, HIV testing, and STI diagnoses. Finally, in Manuscript 3, the valid sample from Manuscript 1 and the measures evaluated in Manuscript 2 were used to examine the clustering and correlates of STI testing behaviors.

Results: In Manuscript 1, three components of the protocol identified most of the invalid submissions: duplicate IP address, changed eligibility responses, and duplicate payment name. A total of 146 submissions (11.6%) were identified as invalid. Invalid submissions had lower odds of reporting HIV testing in the past year; Hispanic/Latino identity, age, and HIV status were also significantly associated with invalidity. In Manuscript 2, counts of sexual partners (three months), HIV status, HIV testing, and STI diagnoses showed substantial (0.61-0.80) to almost perfect (0.81-1.00) seven-day test-retest reliability according to commonly used cutpoints; partner-specific data, however, were only fairly to moderately reliable (0.21-0.60). Finally, in Manuscript 3, a latent class analysis indicated five STI testing classes: no STIs, all STIs, bacterial STIs and hepatitis, bacterial STIs only, and hepatitis only. The largest class was no STIs, indicating that 45.8% of the validated sample had not been tested for any STI in the past year. Predictors of membership in a testing class, versus no STI testing, included age, education, outness about having sex with men, HIV status, and having had a sexual partner in the last three months.

Conclusions: This dissertation served two primary aims: to evaluate sample validity and measure reliability in an online study of MSM, and to apply the information from those analyses to examine the presence and correlates of a latent variable of STI testing. Across all three manuscripts, online survey research appears to be a viable method of studying STI testing in Internet-based samples of MSM.
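The reliability analyses described in Manuscript 2 rest on two standard statistics: kappa for categorical test-retest items and the intraclass correlation coefficient (ICC) for counts, both read against the cutpoints quoted above. The short sketch below illustrates that workflow on simulated data; the variable names, the simulated response patterns, and the use of the pingouin package for the ICC are assumptions made for illustration, not details taken from the dissertation.

```python
# Hedged sketch of a one-week test-retest reliability check: Cohen's kappa for a
# categorical item and an ICC for a count measure. All data are simulated and the
# variable names are hypothetical; pingouin is assumed to be available.
import numpy as np
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(42)
n = 200

# Binary item, e.g. "tested for HIV in the past year", answered at time 1 and
# again one week later; roughly 10% of respondents answer inconsistently.
t1_hiv = rng.integers(0, 2, n)
t2_hiv = np.where(rng.random(n) < 0.10, 1 - t1_hiv, t1_hiv)
kappa = cohen_kappa_score(t1_hiv, t2_hiv)

# Count item, e.g. number of sexual partners in the past three months, with a
# small amount of retest noise.
t1_partners = rng.poisson(3, n)
t2_partners = np.clip(t1_partners + rng.integers(-1, 2, n), 0, None)

long = pd.DataFrame({
    "subject": np.tile(np.arange(n), 2),
    "occasion": np.repeat(["t1", "t2"], n),
    "partners": np.concatenate([t1_partners, t2_partners]),
})
icc = pg.intraclass_corr(data=long, targets="subject", raters="occasion",
                         ratings="partners")
icc2 = icc.loc[icc["Type"] == "ICC2", "ICC"].item()

def band(value):
    """Map a reliability coefficient onto the commonly used cutpoints."""
    if value <= 0.20:
        return "slight"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"

print(f"kappa (HIV testing item)      = {kappa:.2f} ({band(kappa)})")
print(f"ICC(2,1) (partner count item) = {icc2:.2f} ({band(icc2)})")
```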
Item: The Development and Validation of an Evaluation Use Scale for Multi-site Evaluations (2016-07). Johnson, Kelli

Evaluation researchers and practitioners share a commitment to evaluation use, and the research community has studied it extensively because it is an essential component of evaluation practice. Yet while evaluation use is among the most frequently studied topics in the field, a scale for measuring the use of evaluations in multi-site settings has yet to be validated. This study describes the development and validation of the Evaluation Use Scale for assessing program evaluation use, and the factors associated with evaluation use, in multi-site science, technology, engineering, and mathematics (STEM) education programs. The data were collected as part of the NSF-funded Beyond Evaluation Use study (Lawrenz & King, 2009) and included the development and administration of the NSF Program Evaluation Follow-up Survey, a web-based survey of project leaders and evaluators in four multi-site STEM education programs. The study used Messick's unitary concept of validity as a framework for assembling empirical evidence and theoretical rationales to assess the adequacy and appropriateness of inferences and actions based on the Evaluation Use Scale. The overall evidence in support of the validity of the Evaluation Use Scale as a measure of evaluation use in multi-site evaluations is mixed and varies by aspect of validity: in four of the six aspects the evidence is adequate to strong, but in the remaining two it is sharply split and therefore inconclusive.

Item: Faking and the Validity of Personality Tests: Using New Faking-Resistant Measures to Study Some Old Questions (2017-02). Huber, Christopher

Despite strong evidence supporting the validity of personality measures for personnel selection, their susceptibility to faking has been a persistent concern. Research has found that many job applicants exaggerate their possession of desirable traits, and there are reasons to believe that this distortion reduces criterion-related validity. However, the lack of studies that combine experimental control with real-world generalizability makes it difficult to isolate the effects of applicant faking. Experimental studies have typically induced faking with explicit instructions to fake, which elicit unusually extreme faking compared to typical applicant settings, while the non-experimental approaches that have been employed are largely inadequate for establishing cause-and-effect relationships. Researchers therefore continue to debate whether applicant faking substantially attenuates the validity of personality tests.

The present study used a new experimental framework to study this question and related methodological issues in the faking literature. First, it included a subtle incentive to fake in addition to explicit instructions to respond honestly or to fake good. Second, it compared faking on standard Likert scales with faking on multidimensional forced choice (MFC) scales designed to resist deception. Third, it compared more and less fakable versions of the same MFC inventory to eliminate confounding differences between MFC and Likert scales. The result was a 3 x 3 design that simultaneously manipulated the motivation and the ability to fake, allowing a more rigorous examination of the faking–validity relationship.

Results indicated complex relationships between faking and the validity of personality scores. Directed fakers were much better at raising their scores on Likert scales than on MFC measures of the same traits. However, MFC scales failed to retain more validity than Likert scales when participants faked. Supplemental analyses suggested that extreme faking decimated the construct validity of all scales regardless of their fakability. Faking also added new common method variance to the Likert scales, which in turn contributed to the scales' criterion-related validity.

In addition to the effects of faking, the study investigated two recurring methodological issues in the faking literature. First, I examined the claim that directed faking is fundamentally different from typical faking by comparing results from directed and incentivized fakers: directed-faking results generally replicated under a subtle incentive to fake, but the effects were much smaller and less consistent. Second, some have argued that traditional criterion-related validity coefficients fail to capture the negative effects of faking on actual selection decisions. I investigated this possibility by creating simulated selection pools in which fakers and honest responders competed for limited positions. The simulation results generally indicated reasonable correspondence between validity estimates and selected-group performance, suggesting that validity coefficients adequately reflected the effects of faking. Results are interpreted using existing theories of faking, and new methodologies are proposed to advance the study of typical faking behavior.
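The simulated selection pools mentioned above can be illustrated with a small Monte Carlo sketch: honest and faking applicants compete for a fixed number of openings on the basis of their observed scores, and the criterion-related validity of those scores is compared with the mean performance of the selected group. Every distribution, effect size, and proportion below is an illustrative assumption rather than a value from the study.

```python
# Hedged sketch of a simulated selection pool with honest and faking applicants.
import numpy as np

rng = np.random.default_rng(0)
n_applicants, n_openings = 1000, 100
prop_fakers = 0.30                       # assumed share of applicants who fake

true_trait = rng.normal(0, 1, n_applicants)                       # e.g., conscientiousness
performance = 0.5 * true_trait + rng.normal(0, 1, n_applicants)   # job performance criterion

# Fakers inflate their observed scores by a random boost; honest applicants do not.
faker = rng.random(n_applicants) < prop_fakers
faking_boost = np.where(faker, rng.normal(1.0, 0.5, n_applicants), 0.0)
observed_score = true_trait + faking_boost + rng.normal(0, 0.3, n_applicants)

# Criterion-related validity of the observed (partly faked) scores.
validity = np.corrcoef(observed_score, performance)[0, 1]

# Top-down selection on observed scores, then compare selected-group performance
# with what random selection would yield on average.
selected = np.argsort(observed_score)[-n_openings:]
selected_mean_perf = performance[selected].mean()
random_mean_perf = performance.mean()

print(f"observed-score validity: r = {validity:.2f}")
print(f"fakers among selected: {faker[selected].mean():.0%}")
print(f"selected-group mean performance: {selected_mean_perf:.2f} "
      f"(vs {random_mean_perf:.2f} under random selection)")
```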
Item: Using psychometric models to measure social and emotional learning constructs (2020-08). Smith, Mireya

In the Testing Standards (2014), a construct is a concept or characteristic that an assessment is intended to measure. From a quantitative lens, a construct is a trait or domain that may include attitudes, skills, abilities, dispositions, and some aspects of knowledge (e.g., competencies). Research suggests that social and emotional learning (SEL) constructs may be useful in narrowing the achievement gap; however, there is no agreed-upon definition of SEL, as SEL constructs are multifaceted and defined differently by different researchers. Currently, some SEL constructs are measured qualitatively, but this ignores the quantitative structure of the construct. In the quantitative field, SEL constructs are measured by applying a less complex model before a more complex one; however, this disregards the qualitative definition of the construct. Furthermore, a construct cannot be measured directly the way a person's height can; instead, SEL constructs must be observed indirectly through item responses (e.g., polytomous items). The problem is that there is a lack of clarity in how SEL constructs are defined and measured, and there is very little research on an approach that allows SEL constructs to accumulate evidence supporting score interpretation and use.

This study proposes a paradigm for SEL assessment intended to lead to meaningful, useful, appropriate, and fair score interpretation and use. The paradigm consists of three components. The first, the structural components of SEL, distinguishes the units of SEL assessment (framework, construct(s), measure(s), and item responses), with the construct as the centerpiece. The second is where the construct definition and the measurement model work together to put forward plausible competing models for the internal structure (e.g., bifactor) of selected SEL constructs. The final component is forms of validity evidence (e.g., measurement invariance), where the focus is on evaluating claims (e.g., that scores can be compared across groups) about what the scores represent and how they should be used. The paradigm for SEL assessment encourages researchers from the qualitative and quantitative fields to work together to define SEL constructs properly in both a qualitative (e.g., theory) and a quantitative (e.g., confirmatory factor analysis and item response theory models) manner.
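The second component, pitting plausible competing internal-structure models against one another, is essentially a confirmatory factor analysis model-comparison exercise. The sketch below fits a one-factor and a correlated two-factor model to simulated item responses and compares their fit statistics; the item names, the two hypothetical SEL facets, and the choice of the semopy package (with its lavaan-style model syntax) are assumptions for illustration only, not details from the study.

```python
# Hedged sketch: compare two candidate internal structures for a set of SEL items,
# assuming the semopy package is installed. Data are simulated from a two-factor
# structure with hypothetical "self-management" (sm) and "social awareness" (sa) items.
import numpy as np
import pandas as pd
import semopy

rng = np.random.default_rng(7)
n = 500

# Two correlated latent facets, each measured by three continuous item scores.
f1 = rng.normal(size=n)
f2 = 0.5 * f1 + rng.normal(scale=0.87, size=n)
df = pd.DataFrame({
    **{f"sm{i}": 0.7 * f1 + rng.normal(scale=0.7, size=n) for i in (1, 2, 3)},
    **{f"sa{i}": 0.7 * f2 + rng.normal(scale=0.7, size=n) for i in (1, 2, 3)},
})

one_factor = """
SEL =~ sm1 + sm2 + sm3 + sa1 + sa2 + sa3
"""
two_factor = """
SelfMgmt =~ sm1 + sm2 + sm3
SocAware =~ sa1 + sa2 + sa3
SelfMgmt ~~ SocAware
"""

for label, desc in [("one-factor", one_factor), ("two-factor", two_factor)]:
    model = semopy.Model(desc)
    model.fit(df)
    print(f"--- {label} model ---")
    print(semopy.calc_stats(model).T)   # chi-square, CFI, RMSEA, AIC, etc.
```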
Item: Validity of Electronic Prescription Claims Records: A Comparison of Commercial Insurance Claims with Pharmacy Provider Derived Records (University of Minnesota, College of Pharmacy, 2014). Martin, Bradley C.; Shewale, Anand

Objectives: To determine whether, and to what extent, records obtained from pharmacy benefit manager (PBM) claims differ from source documents obtained directly from pharmacy providers. This study also sought to explore possible associations between patient, pharmacy benefit, and pharmacy provider characteristics and the likelihood that a patient would have missing prescription claims.

Methods: This study used a cross-sectional design with a sample of 1,484 patients residing in a single state and sharing a common pharmacy benefit. Profiles describing all prescriptions these patients filled in a pharmacy between January 1, 2002 and June 30, 2002 were requested directly from their pharmacy providers. Logistic regression was used to explore the factors associated with a person having a prescription that did not appear in the PBM claims.

Results: Of the 1,484 eligible recipients sampled, profiles were obtained for 323 (22%) persons, and analyzable profiles were available for 315 (21%). A total of 2,977 prescriptions were filled for these 315 subjects. Of those, 207 (7.0%) were missing from the claims files, indicating that 93% were captured. Only prescription volume consistently influenced the likelihood that a patient would have a prescription missing from the PBM claims (OR = 1.08; 95% CI: 1.05-1.12).

Conclusion: Claims obtained from pharmacy benefit companies capture approximately 93% of prescription records when verified against records obtained from pharmacy providers. The rate of missing records in PBM claims does not appear to be meaningfully influenced by most finance-based pharmacy benefit design features. However, certain classes of drugs, such as iron products, digoxin, diuretics, sulfonylureas, and antigout agents, may have less complete claims records than other classes. Patients with higher prescription volumes are more likely to have filled prescriptions that are not captured by PBM claims. These conclusions should be interpreted in light of the modest 22% usable response rate from pharmacy providers and the unknown generalizability of this sample of patients, who used one particular PBM in the state of Georgia in 2002.
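The logistic regression described in the Methods, modeling whether a patient has any prescription missing from the PBM claims and reporting the result as an odds ratio with a 95% confidence interval, can be sketched as follows. The data frame, covariate names, and simulated effect size (loosely echoing the reported OR of about 1.08 per additional prescription) are hypothetical placeholders, assuming statsmodels is available.

```python
# Hedged sketch: logistic regression of "any prescription missing from PBM claims"
# on patient and benefit characteristics, reported as odds ratios with 95% CIs.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 1500

df = pd.DataFrame({
    "rx_volume": rng.poisson(10, n),        # prescriptions filled in the study period
    "age": rng.integers(18, 90, n),
    "mail_order": rng.integers(0, 2, n),    # hypothetical benefit-design flag
})
# Simulate the outcome so that only prescription volume has a real effect,
# loosely mirroring an OR of about 1.08 per additional prescription.
logit_p = -2.5 + np.log(1.08) * df["rx_volume"]
df["any_missing"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

fit = smf.logit("any_missing ~ rx_volume + age + mail_order", data=df).fit(disp=False)
odds_ratios = pd.DataFrame({
    "OR": np.exp(fit.params),
    "CI 2.5%": np.exp(fit.conf_int()[0]),
    "CI 97.5%": np.exp(fit.conf_int()[1]),
})
print(odds_ratios.round(3))
```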