An Empirical Investigation of Methods for Assessing Item Fit for Mixed-Format Tests

Author: Kyong Hee Chon

Publisher: Routledge Ltd

ISSN: 0895-7347

Source: Applied Measurement in Education, Vol. 26, Iss. 1, January 2013, pp. 1–15

Abstract

Empirical information regarding the performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples consisting of both dichotomously and polytomously scored items. The item fit statistics used in this study included PARSCALE's G², Orlando and Thissen's (2000) S-X² and S-G², and Stone's (2000) χ²* and G²*. The results of this study indicated that the fit of an individual item was affected by the choice of model-fit analysis. The performance of the fit indices appeared to vary depending on the item response theory (IRT) model mixtures used for calibration, sample size, and test length. In terms of consistency among the fit indices, statistics based on the same approach (e.g., S-X² and S-G²) showed considerably higher agreement in detecting misfitting items than statistics based on different approaches (e.g., S-X² and G²). Consistent and inconsistent findings relative to previous research are discussed along with practical implications.
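
To make the abstract's terminology concrete, the sketch below (in Python, assuming a 2PL model for the dichotomous items) illustrates how Orlando and Thissen's (2000) S-X² can be computed: examinees are grouped by summed score, and the observed proportion correct in each group is compared with the model-implied proportion obtained through the Lord-Wingersky recursion. This is an illustrative sketch, not the study's actual code; the item parameters and score-group counts are hypothetical placeholders, and refinements used in practice (such as collapsing sparse score groups before computing the statistic) are omitted.

    # Minimal sketch of Orlando and Thissen's (2000) S-X^2 for one
    # dichotomous 2PL item. All data below are hypothetical placeholders.
    import numpy as np
    from scipy.stats import norm

    def p_2pl(theta, a, b):
        """2PL item response function: P(correct | theta)."""
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    def lord_wingersky(probs):
        """Summed-score likelihoods via the Lord-Wingersky recursion.

        probs: (n_items, n_quad) array of P(correct) at each quadrature point.
        Returns (n_items + 1, n_quad): row k is P(summed score = k | theta).
        """
        n_quad = probs.shape[1]
        lik = np.vstack([1.0 - probs[0], probs[0]])  # scores 0..1 after item 0
        for p in probs[1:]:
            new = np.zeros((lik.shape[0] + 1, n_quad))
            new[:-1] += lik * (1.0 - p)              # item answered incorrectly
            new[1:] += lik * p                       # item answered correctly
            lik = new
        return lik

    def s_x2(item, a, b, counts, correct, n_quad=61):
        """S-X^2 for one item.

        counts[k]  : number of examinees with summed score k (k = 0..n_items)
        correct[k] : number of those who answered `item` correctly
        """
        theta = np.linspace(-4.0, 4.0, n_quad)
        w = norm.pdf(theta)
        w /= w.sum()                                 # N(0,1) quadrature weights
        probs = p_2pl(theta[None, :],
                      np.asarray(a)[:, None], np.asarray(b)[:, None])

        n_items = probs.shape[0]
        lik_all = lord_wingersky(probs)              # scores 0..n_items
        lik_rest = lord_wingersky(np.delete(probs, item, axis=0))

        chi2, groups = 0.0, 0
        for k in range(1, n_items):                  # interior scores only
            n_k = counts[k]
            if n_k == 0:
                continue
            # Model-implied proportion correct among examinees at score k.
            e = ((probs[item] * lik_rest[k - 1] * w).sum()
                 / (lik_all[k] * w).sum())
            o = correct[k] / n_k                     # observed proportion
            chi2 += n_k * (o - e) ** 2 / (e * (1.0 - e))
            groups += 1
        return chi2, groups - 2                      # df = groups - 2PL params

    # Hypothetical usage: 10 items, testing fit of the item at index 3.
    rng = np.random.default_rng(0)
    a = rng.uniform(0.8, 2.0, 10)
    b = rng.normal(0.0, 1.0, 10)
    counts = np.full(11, 100)                        # placeholder group sizes
    correct = (counts * np.linspace(0.1, 0.9, 11)).astype(int)
    chi2, df = s_x2(3, a, b, counts, correct)
    print(f"S-X^2 = {chi2:.2f}, df = {df}")

The key design point the abstract alludes to is that S-X² conditions on observed summed scores rather than on estimated ability, which is what distinguishes it from likelihood-based indices such as PARSCALE's G²; extending the recursion to polytomous items (as a mixed-format application requires) follows the same logic with more than two score categories per item.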