
DORAS | DCU Research Repository


Assessor cognition and inter-rater reliability in nursing objective structured clinical examinations

Scully, Conor (2023) Assessor cognition and inter-rater reliability in nursing objective structured clinical examinations. Doctor of Science thesis, Dublin City University.

The consistency of judgements made by examiners of performance assessments is an important issue when high stakes are attached to the outcomes of such assessments for examinees. In order to minimize variance between assessors, it is imperative that designers and users of assessments understand and account for variations that may arise when different assessors observe the same performance. Objective Structured Clinical Examinations (OSCEs) are high-fidelity performance assessments common in the health sciences, which require that students be judged by a range of different assessors. Despite the current prominence of OSCEs within undergraduate nursing programs, two problematic issues are highlighted in the research literature: relatively little is known about the specific cognitive processes that assessors employ when reaching judgements about the students they observe, and inter-rater reliability can be low. This mixed-methods study sought to address both issues using a combination of semi-structured interviews and a think-aloud protocol, in which assessors (n=12) shared their thought processes with the researcher as they reviewed four videos of students completing two OSCEs: blood pressure measurement and naso-gastric tube insertion. Participants also completed the associated marking guides for each OSCE, and the resulting data were used to determine the percent agreement (inter-rater reliability) between assessors for the viewed student performances. The results of the study indicated idiosyncrasy in the cognitive processes that assessors employed while judging the recorded performances. The data suggested that although each assessor watched the same four videos, they had different methods of determining how well or badly the students performed.
Perhaps unsurprisingly, the completed marking guides revealed substantial variance in the scores the assessors awarded: the harshest assessor awarded 29 of 52 checklist items across the videos, compared with 45 of 52 for the most lenient assessor. Notably, assessors disagreed on the pass/fail decision for three of the four performances.
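The reliability statistic reported in the abstract, percent agreement, is straightforward to compute: the proportion of checklist items on which two assessors record the same judgement. A minimal sketch in Python, using hypothetical binary checklist data (1 = item awarded, 0 = not awarded) rather than the thesis's actual scores:

```python
from itertools import combinations

def percent_agreement(scores_a, scores_b):
    """Proportion of checklist items on which two assessors agree."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must cover the same checklist items")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)

# Hypothetical checklist data for a six-item OSCE station.
ratings = {
    "assessor_1": [1, 1, 0, 1, 0, 1],
    "assessor_2": [1, 1, 1, 1, 0, 1],
    "assessor_3": [1, 0, 0, 1, 0, 0],
}

# Pairwise agreement across all assessors.
for (name_a, a), (name_b, b) in combinations(ratings.items(), 2):
    print(f"{name_a} vs {name_b}: {percent_agreement(a, b):.0%}")
```

Percent agreement is the simplest inter-rater index; it does not correct for chance agreement, which is why chance-corrected statistics such as Cohen's kappa are often reported alongside it.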
Item Type: Thesis (Doctor of Science)
Date of Award: March 2023
Supervisor(s): O'Leary, Michael; Kelly, Mary; Lysaght, Zita
Subjects: Medical Sciences > Nursing
Social Sciences > Education
DCU Faculties and Centres: DCU Faculties and Schools > Institute of Education > School of Policy & Practice
Funders: Prometric Inc.
ID Code: 27986
Deposited On: 03 Apr 2023 14:26 by Michael O'Leary. Last Modified: 08 Feb 2024 04:30

Full text available as:

PDF (CS 20210001 thesis final.pdf)
Creative Commons: Attribution-Noncommercial-No Derivative Works 4.0

