Abstract:
The current popularity of second/foreign language oral performance assessment has
led to a growing interest in tasks as a tool for assessing language learners’ oral
abilities. However, most oral assessment studies so far have investigated tasks
separately; therefore, any possible relationship among them has remained
unexplored. Twenty English as a foreign language (EFL) teachers rated the oral
performances produced by 200 EFL learners before and after a rater training
program using description, narration, summarizing, role-play, and exposition tasks.
The findings demonstrated the usefulness of multifaceted Rasch measurement
(MFRM) in detecting rater effects and in showing the consistency and variability
of rater behavior as a means of evaluating rating quality. The outcomes indicated
that identifying test difficulty is complex and multidimensional, and that test
takers’ ability is a more decisive factor in their score variation than other
intervening variables. The outcomes displayed no relationship between task
difficulty and raters’ interrater reliability measures. The
findings suggest that tasks and, most importantly, performance conditions have
various effects on the estimation of test takers’ oral ability in oral performance
assessment tests. Different groups of raters showed biases toward different tasks,
and the findings indicated that training programs can reduce these biases and
increase raters’ consistency measures. The findings imply that decision makers need
not be overly concerned about raters’ expertise in oral assessment; they should
instead establish better rater training programs to increase assessment reliability.
Machine summary:
The Speaking Test
Test takers’ oral proficiency was elicited through five different tasks: description, narration, summarizing, role-play, and exposition {see the attached table file}.
The outcome displayed a significant mean difference among all pairs of tasks with respect to their scorings of test takers’ oral performance ability at the pre-training phase except for narration-role play (p=0.
The table shows a significant mean difference among all pairs of tasks with respect to their scorings of test takers’ oral performance ability at the post-training phase except for the following pairs: description-summarizing (p = 0.
52, p Similar to the pre-training phase, an independent t-test was run to determine whether there is a significant difference between NEW and OLD raters with regard to the rating difficulty of each particular task.
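As a rough illustration of the comparison described above, the following minimal sketch computes the statistic for an independent-samples t-test. It assumes the classical pooled-variance (equal variances) form, which the source does not specify, and the function name `independent_t` and the two lists of values are hypothetical placeholders, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent-samples t-test.

    Assumption: both groups share a common variance (classical Student's t).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two group means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical task-difficulty estimates per rater (illustrative values only).
new_raters = [1.0, 2.0, 3.0, 4.0]
old_raters = [2.0, 3.0, 4.0, 5.0]
t_stat, df = independent_t(new_raters, old_raters)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value would then be compared against the t distribution with `df` degrees of freedom to obtain a p value; that step is left out here because it needs a statistics library beyond the standard one.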
The results for the first and second research questions, which dealt with raters’ biases toward tasks of various levels of difficulty, indicated significant differences between NEW and OLD raters in their biases toward the oral tasks at the pre-training phase.
Unlike at the pre-training phase, however, the data analysis showed no significant difference between NEW and OLD raters’ biases in scoring tasks of various difficulty measures.
The large residual obtained, compared with the effects of raters and tasks, demonstrates that test takers’ ability plays a more significant role in their score variation than the other factors involved.
Journal of Modern Research in English Language Studies 5(4), 27-53 (2018)