Why increasing the number of raters only helps sometimes: Reliability and validity of peer assessment across tasks of different complexity
Tong, YM (corresponding author), Dalian Univ Technol, Sch Foreign Languages, 2 Linggong Rd, Dalian 116024, Peoples R China.
The number of raters is theoretically central to the reliability and validity of peer assessment, yet it has rarely been studied. Further, requiring each student to assess more peers' documents increases not only the number of evaluations per document but also assessor workload, which can degrade assessment quality. Moreover, task complexity is a likely moderating factor, influencing both workload and validity. This study examined whether changing the number of peer assessments required per student (and hence the number of raters per document) affected peer assessment reliability and validity across tasks at different levels of complexity. A total of 181 students completed, and provided peer assessments of, tasks at three levels of complexity: low complexity (dictation), medium complexity (oral imitation), and high complexity (writing). Adequate validity of peer assessments was observed at all three levels of task complexity when reviewing loads were low. However, the effects of increasing reviewing load differed between reliability and validity outcomes and across levels of task complexity.
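One common framing of why rater number is theoretically central, sketched here under the assumptions of classical test theory (the abstract itself does not state the formula), is the Spearman-Brown prophecy: aggregating k independent raters whose single-rater reliability is \rho_1 yields a composite reliability of

$$ \rho_k = \frac{k\,\rho_1}{1 + (k-1)\,\rho_1} $$

On this account, reliability rises monotonically with k, but only if the additional reviewing load leaves \rho_1 itself unchanged; if heavier workloads depress each rater's accuracy, as the study hypothesizes, gains from adding raters can be offset or reversed.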