Clarifying Calibration

Clarifying Observer Calibration

Calibration is a term used often when speaking about the need to ensure accuracy among observers, yet the multiple ways in which calibration is defined have resulted in various interpretations of what calibration is, and is not.

Calibration is used frequently in reference to measurement instruments, such as scales, and specifies a process used to maintain “instrument accuracy”. Most of us can recall seeing inspection stickers on gas pumps or deli market scales indicating that the scale or pump being used has been “calibrated”. The term “calibration” indicates a process used to measure and ensure accuracy against a standard: for a scale, standard weights in ounces, pounds, or grams; for a pump, liquid volume in gallons or liters.

When used to describe teacher observer accuracy, the term calibration refers to a process that measures the accuracy of an observer, or group of observers, against a specific standard of accuracy. That standard is usually a video that has been master-scored using a strict and rigorous process that ensures the validity of the master scores.

Calibration for observers, similar to calibration of measuring instruments, requires the use of a standard; the master-scored video is that standard, and provides the measure of accuracy against which the observer ratings of the teacher will be assessed. In calibrations, the observers’ scores of a teacher’s practice are compared to the master scores, and statistical measures are used to determine the degree of accuracy. The most rigorous calibrations utilize several different quantifiable measures; each measure provides a different analysis of the observer’s accuracy. [1]
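To make the comparison concrete, here is a minimal sketch of how one observer's item ratings might be compared against master scores. The function name, the rating values, and the three measures shown (exact agreement, mean difference, and mean absolute difference) are illustrative assumptions; they are generic statistics and are not the measures used by any particular calibration platform.

```python
from statistics import mean


def compare_to_master(observer_scores, master_scores):
    """Compare one observer's item ratings with the master scores for the same video.

    Hypothetical helper for illustration only; the measures are generic.
    """
    if len(observer_scores) != len(master_scores):
        raise ValueError("observer and master scores must cover the same items")
    if not master_scores:
        raise ValueError("at least one scored item is required")

    # Signed difference per item: positive means the observer rated higher
    # than the master score, negative means lower.
    diffs = [o - m for o, m in zip(observer_scores, master_scores)]

    return {
        # Share of items where the observer matched the master score exactly.
        "exact_agreement": sum(d == 0 for d in diffs) / len(diffs),
        # Average signed difference: shows a tendency to rate high or low.
        "mean_difference": mean(diffs),
        # Average size of the difference, ignoring direction.
        "mean_absolute_difference": mean(abs(d) for d in diffs),
    }


# Made-up ratings on five rubric items, for illustration only.
master = [3, 2, 4, 3, 1]
observer = [3, 3, 4, 2, 1]

result = compare_to_master(observer, master)
print(result)
```

In this made-up example the observer's high and low ratings cancel out, so the mean difference alone would suggest perfect accuracy even though two of the five items were scored incorrectly. This is one way to see why several measures, each analyzing accuracy differently, give a fuller picture than any single one.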

Calibration is not norming, that is, averaging observers’ ratings of a teacher (or video) and using the group average to determine the teacher’s level of performance. Similarly, some school districts engage in processes such as “walk-throughs”, in which a group of observers visits classes together in order to “calibrate” with one another. This is a norming process, and while it has value in providing observers an opportunity to observe and discuss practice together, it does not “calibrate” their scoring of the teacher against a master-scored standard. In this and other similar norming processes, observers frequently norm their ratings of a teacher’s practice against their faculty’s practice as a whole – “that’s one of our best teachers, therefore s/he must be highly effective” – rather than calibrate their interpretations of effective teaching against a defined standard.

Finally, calibration should not be confused with “certification”. Many districts adopted a pass/fail assessment of an observer’s accuracy that observers were required to complete prior to observing teachers; they were “certified” after passing the assessment. Calibration is not a one-time assessment. It is an ongoing process used by districts to regularly assess the degree to which observers are accurately interpreting teachers’ effectiveness using the district-selected criteria. It builds inter-rater reliability over time, while also providing data against which inter-rater agreement can be assessed.

Calibration matters. Teachers need to be confident that the administrators charged with evaluating their effectiveness are meeting a consistent level of accuracy when interpreting the levels of performance articulated in the standards. Calibration establishes consistency among observers, and in the long run establishes trust in the observation and evaluation processes.

[1] Teaching Learning Solutions (TLS) utilizes three quantitative measures: item accuracy, score differential, and volatility index. The TLS measures are incorporated into the Performance Matters calibration platform.

Let's Work Together! Contact Us.

Albert Miller (Duffy)

duffy@teachinglearningsolutions.com

Bernadette Cleland (Bernie)

bernie@teachinglearningsolutions.com

802-613-3169