Alignment Coefficients, Assessment Example

Alignment coefficients for dimensionality ratings were low (.16 for faculty, .13 for teachers), showing that raters barely agreed above chance about whether an hour should be assigned only a primary topic or both primary and secondary topics; that is, whether an hour should be considered uni- or multidimensional. Applying the 65% agreement rule to determine the number of hours for which raters agreed on hour dimensionality yielded similar results. On fewer than half of the hours (45%) did a minimum of 13 of 20 raters agree on hour dimensionality. Of the 18 hours on which the raters agreed, they classified 9 as multidimensional (addressing multiple topics) and 9 as unidimensional (addressing only one topic).
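The 65% agreement rule described above can be illustrated with a minimal sketch. It assumes each rater's judgment is stored as a boolean (True for multidimensional); the function name `classify_hours` and the data layout are hypothetical and not taken from the study.

```python
def classify_hours(ratings, min_agree=13):
    """Apply an agreement rule to dimensionality ratings.

    ratings[h][r] is True if rater r judged hour h to be multidimensional
    and False if unidimensional (hypothetical layout, assumed here).
    min_agree is the smallest number of raters who must concur; 13 of 20
    corresponds to the 65% rule in the text.
    Returns one label per hour, or None where agreement was not reached.
    """
    labels = []
    for votes in ratings:
        multi = sum(1 for v in votes if v)   # raters choosing multidimensional
        uni = len(votes) - multi             # raters choosing unidimensional
        if multi >= min_agree:
            labels.append("multidimensional")
        elif uni >= min_agree:
            labels.append("unidimensional")
        else:
            labels.append(None)              # no agreement under the rule
    return labels


def agreement_rate(labels):
    """Share of hours on which the raters reached agreement."""
    return sum(1 for lab in labels if lab is not None) / len(labels)
```

Under this sketch, 18 agreed hours out of 40 would give an agreement rate of .45, which matches the percentage reported above if the total was 40 hours (an inference from the reported figures, not a number stated in the text).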

The decisions made on these 18 hours serve as the benchmark for hour dimensionality, to be compared with decisions produced by 6-rater subsets of the 20 raters, considered in a later section. Dimensionality was one of the few dimensions on which differences between faculty and high school teacher raters emerged. Teachers rated more hours as multidimensional than faculty did. Faculty agreed on hour dimensionality for 27 hours and classified 41% of these 27 hours as multidimensional. Teachers also agreed on hour dimensionality for 27 hours (although not all of the same hours as faculty), and classified 70% of them as multidimensional. The difference between these proportions is statistically significant.
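As a rough check on the comparison of proportions, the following sketch runs a pooled two-proportion z-test. Several things here are assumptions rather than facts from the study: the counts 11 of 27 and 19 of 27 are back-calculated from the rounded percentages (41% and 70%), the two sets of hours are treated as independent samples even though they partly overlap, and the text does not say which test the authors actually used.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed counts: 11/27 is about 41% (faculty), 19/27 is about 70% (teachers)
z, p = two_proportion_z(11, 27, 19, 27)
```

With these assumed counts the p-value falls below .05, which is consistent with the reported significance, but the result should be read only as a plausibility check under the stated assumptions.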

The very large estimated variance component for the residual (63% of the total variance) relative to the estimated variance component for hours suggests a large hour × rater interaction (raters rank-ordered hours differently on depth of knowledge), other sources of error variability not captured with this design, or both.
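To show how a residual share of this kind can arise, here is a minimal sketch of variance component estimation for a fully crossed hours × raters design with one rating per cell, using the usual expected-mean-squares equations. The text does not state which estimation method or software was used, so the approach, the function name `variance_components`, and the data layout are all assumptions for illustration.

```python
def variance_components(scores):
    """Estimate variance components for a crossed hours x raters design.

    scores[h][r] is the rating given to hour h by rater r (one per cell).
    With a single observation per cell, the hour x rater interaction is
    confounded with other error, so both end up in the residual term.
    Returns (components, shares of total variance).
    """
    n_h, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_h * n_r)
    hour_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(scores[h][r] for h in range(n_h)) / n_h
                   for r in range(n_r)]

    ss_hour = n_r * sum((m - grand) ** 2 for m in hour_means)
    ss_rater = n_h * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_resid = ss_total - ss_hour - ss_rater

    ms_hour = ss_hour / (n_h - 1)
    ms_rater = ss_rater / (n_r - 1)
    ms_resid = ss_resid / ((n_h - 1) * (n_r - 1))

    # Solve the expected-mean-squares equations; negative estimates are
    # set to zero, a common convention.
    comps = {
        "hour": max((ms_hour - ms_resid) / n_r, 0.0),
        "rater": max((ms_rater - ms_resid) / n_h, 0.0),
        "residual": max(ms_resid, 0.0),
    }
    total = sum(comps.values())
    shares = {k: (v / total if total > 0 else 0.0) for k, v in comps.items()}
    return comps, shares
```

When raters order the hours differently, the hour component shrinks and the residual share grows, which is the pattern the paragraph above describes.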

In summary, our study has identified and demonstrated standard techniques that we believe should be used to assure the measurement quality of alignment measures. Findings from our case study, while clearly limited in generalizability, raise important questions about the reliability of the alignment process and its implications for practice. A challenge for future research and development is the further exploration and solution of these knotty questions.

