Abstract:
Evidence suggests that variability in the ratings of students’ essays results not only from differences in their writing ability, but also from certain extraneous sources. In other words, the outcome of essay rating can be biased by factors relating to the rater, task, and situation, or an interaction of any or all of these factors, which makes the inferences and decisions made about students’ writing ability undependable. The purpose of this study, therefore, was to examine variability in rater judgments as a source of measurement error in the assessment of EFL learners’ essay writing. Thirty-two Iranian sophomore students majoring in English language participated in this study. The learners’ narrative essays were rated by six different raters, and the results were analyzed using many-facet Rasch measurement as implemented in the computer program FACETS. The findings suggest that there are significant differences among raters in their harshness, as well as several cases of bias due to rater-examinee interaction. This study provides a valuable understanding of how effective and reliable rating can be realized, and how fairly and accurately subjective performance can be assessed.
Machine-generated summary:
Iranian Journal of Applied Linguistics (IJAL), Vol. 16, No. 1, March 2013, 145-175
Rater Bias in Assessing Iranian EFL Learners’ Writing Performance
Mahnaz Saeidi, Associate Professor of Applied Linguistics, Tabriz Branch, Islamic Azad University, Tabriz, Iran (m_saeidi@iaut.ac.ir)
Mandana Yousefi, PhD Graduate of TEFL, Tabriz Branch, Islamic Azad University, Tabriz, Iran (m_yusefi@yahoo.com)
Purya Baghayei, Assistant Professor of Applied Linguistics, Mashhad Branch, Islamic Azad University, Mashhad, Iran (puryabaghaei@gmail.com)
Received 23 September 2012; revised 13 January 2013; accepted 7 February 2013
Introduction: It is common practice to describe learners’ achievements on the basis of test scores.
The early approaches to investigating bias, classical test theory indices and ANOVA, are no longer considered appropriate for studying items, because "mean differences in performance are confounded with item difficulty" (Camilli & Shepard, 1994, p.
According to Lumley and McNamara (1995, as cited in Sudweeks, Reeve, & Bradshaw, 2005), MFRM, which is implemented through the computer program FACETS, allows assessing the effects of different sources of systematic error in the ratings, such as inconsistencies between raters, differences in ratings between rating occasions, and differences in the relative difficulty of various writing tasks (prompts).
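Concretely, the many-facet Rasch model that FACETS estimates can be written, in its standard rating-scale form (this follows Linacre's conventional notation; the symbols are not given in this summary itself), as:

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - C_j - D_i - F_k
```

where P_nijk is the probability that examinee n receives category k rather than k-1 from rater j on dimension i; B_n is the examinee's ability, C_j the rater's severity (harshness), D_i the difficulty of the rated dimension, and F_k the threshold of rating category k. Separating C_j from B_n on a common logit scale is what lets MFRM report rater harshness differences and flag rater-examinee bias interactions.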
Figure 1: The examinee ability, rater harshness and dimension difficulty measures. According to Figure 1, vocabulary is the most harshly scored and register the most leniently scored dimension.
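The rater-harshness effect described above can be illustrated with a small simulation (the data, function names, and severity values here are purely hypothetical, not taken from the study): a rater who is systematically one point harsher depresses every examinee's raw score, which is exactly the kind of construct-irrelevant variance MFRM separates from true ability.

```python
# Hypothetical illustration of rater severity biasing raw essay scores.
# All numbers are invented for demonstration; this is not the study's data.
import random

random.seed(0)

def rate(ability, severity):
    """Return a 1-5 rating: true ability minus rater severity, plus noise."""
    raw = ability - severity + random.gauss(0, 0.5)
    return max(1, min(5, round(raw)))

abilities = [random.uniform(2, 5) for _ in range(32)]   # 32 examinees
lenient = [rate(a, severity=0.0) for a in abilities]
harsh = [rate(a, severity=1.0) for a in abilities]      # ~1 logit harsher

mean_lenient = sum(lenient) / len(lenient)
mean_harsh = sum(harsh) / len(harsh)
print(f"mean rating, lenient rater: {mean_lenient:.2f}")
print(f"mean rating, harsh rater:   {mean_harsh:.2f}")
```

With identical examinees, the harsh rater's mean rating comes out noticeably lower, so raw-score comparisons across raters would be unfair unless severity is modeled, as MFRM does.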