The use and interpretation of test scores that rely on the accurate estimation of ability via an item response theory (IRT) model depend on the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, who have prior knowledge of test content, or who do not approach a test with the intent of answering questions to the best of their ability exhibit aberrant response behaviors, and the accuracy and validity of the resulting test scores are called into question. The test administrator is then left with the problem of determining whether test scores are a true representation of examinee ability (Reise, 1990; Karabatsos, 2003).

Model fit is typically assessed through item-fit indices. An equally important aspect of assessing model fit is determining how well an IRT model fits the response patterns of individual examinees, commonly referred to as person fit (Meijer & Sijtsma, 2001). The purpose of this research was to explore the application of person-fit analysis to the identification of cheating behavior. Specifically, issues that may affect the effectiveness of person-fit indices, also called person-fit measures, were evaluated. A primary focus of this research was the value of using multiple types of measures (scalar, response time, and graphical), both individually and in combination, to determine whether a response pattern is indicative of cheating behavior.

A review of the literature on person-fit research is presented, followed by a discussion of considerations for designing a person-fit simulation study. A study was then conducted to determine the effectiveness of three person-fit measures in identifying simulated cheating behavior under various conditions. The person-fit measures used in the study were lz (Drasgow, Levine, & Williams, 1985), Effective Response Time (Meijer & Sotaridona, 2006), and the Person Response Curve (Trabin & Weiss, 1983).
The effectiveness of the measures was evaluated both individually and in combination. Study factors included the IRT model, exam length, examinee ability level, the amount of aberrance within an exam, and the amount of aberrance within a sample or population. A real-parameter simulation study (Seo & Weiss, 2013) was conducted using Rasch and two-parameter logistic (2PL) item parameters estimated from a large dataset obtained from a language skills assessment.
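To illustrate the scalar measure used here, lz standardizes the log-likelihood of an examinee's response pattern given the estimated ability and item parameters, so that large negative values flag patterns inconsistent with the model (e.g., missing easy items while answering hard ones correctly). The following is a minimal sketch of the standard lz computation under a 2PL model; the item parameters, response vectors, and function name are hypothetical examples, not data from this study.

```python
import numpy as np

def lz_statistic(responses, theta, a, b):
    """Standardized log-likelihood person-fit statistic (lz) under a 2PL model.

    responses : array of 0/1 item scores
    theta     : ability estimate for the examinee
    a, b      : 2PL discrimination and difficulty parameters
    """
    # 2PL probability of a correct response to each item
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Observed log-likelihood of the response pattern
    l0 = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    # Expectation and variance of the log-likelihood under the model
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (l0 - expected) / np.sqrt(variance)

# Hypothetical 5-item example: an examinee of average ability (theta = 0)
# who misses the easy items but answers the hard items correctly
# produces a large negative lz, signaling possible aberrance.
a = np.array([1.0, 1.2, 0.8, 1.5, 1.0])
b = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # easy -> hard
aberrant = np.array([0, 0, 1, 1, 1])
print(lz_statistic(aberrant, theta=0.0, a=a, b=b))
```

A model-consistent pattern (correct on easy items, incorrect on hard items) yields an lz near zero, whereas the reversed pattern above yields a strongly negative value; in practice a cutoff such as lz < -2 is often used to flag misfit.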