Buros-Nebraska Series on Measurement and Testing


From: Assessment of Teaching: Purposes, Practices, and Implications for the Profession, edited by James V. Mitchell, Jr., Steven L. Wise, and Barbara S. Plake; Series Editor Jane Close Conoley (Hillsdale, New Jersey, Hove & London: Lawrence Erlbaum Associates, 1990)


Copyright © 1990 by Lawrence Erlbaum Associates, Inc. Digital Edition Copyright © 2012 Buros Center for Testing. This book may be downloaded, saved, and printed by an individual for their own use. No part of this book may be re-published, re-posted, or redistributed without written permission of the holder of copyright.


School administrators, especially principals, are under great pressure to ensure high levels of teacher competence. Because the school effectiveness research has demonstrated convincingly that effective schools begin with effective principals, Peterson and Finn (1985) drew a less than surprising conclusion by stating that "Practically never does one encounter a good school with a bad principal" (p. 42). A less pedantic East Texas superintendent put it this way: "Bad principals are like fish; you either can 'em or smell 'em for a long time." It is in the complex area of teaching assessment or teacher evaluation that principals draw the most criticism from classroom teachers and particularly from university pundits. As McLaughlin (1986), a longtime student of teacher evaluation, put it: "Teachers seldom respect principals as experts on classroom practice or as skilled classroom observers, and in the absence of principal credibility, teachers consider the evaluation an illegitimate comment on their performance and ignore the findings" (p. 163). "Teacher evaluation, in short, is an activity that most principals have little interest in or capacity to carry out" (p. 170). Epstein (1985) said that "Critics of current evaluation schemes complain that most are based on the principal's ratings on teachers that result from infrequent (sometimes just one) observations in teachers' classrooms; on cronyism, patronage, or other prejudicial decisions; or on seniority, credentials, and accumulated credits that do not involve the evaluation of teaching skills" (p. 3).

Principals and teachers vary greatly in how they perceive the principal's performance as an evaluator, according to a survey of teachers and principals in Massachusetts (Tirrell, 1986). The respondents were asked to rate the role of the principal in evaluation according to their current perceptions and ideal expectations. Principals and teachers disagreed on 28 of 37 statements concerning current perceptions. They disagreed whether or not the principal

clearly communicates the philosophy of the evaluation program to the staff; clearly states the purpose of the evaluation in writing to the teachers; ensures that the teachers know and understand the caliber of their work; ensures that teachers are not threatened by evaluation practices; and encourages teachers to experiment with new behaviors designed to address weaknesses indicated in previous evaluations. (pp. 31, 32)

Other studies raise questions about the accuracy of measurement instruments and their criteria to distinguish the truly outstanding teacher from the average or even minimally competent one. Young (1986) identified five major faults in most observation instruments. They are as follows: (a) high inference items, (b) too many items, (c) judgments based on teacher actions, (d) low interrater reliability, and (e) lack of research support. Other research suggests that various groups disagree on the criteria they use to judge teachers. Epstein (1985) found that parents judge teachers on the basis of the degree to which the teacher communicates with the child's family, whereas principals give much less weight to this factor.

In attempting to determine whether people evaluate teaching excellence with the same criteria as they use to evaluate incompetence in teaching, Carey (1986) found that,

Unlike minimal competence ratings, it might be more difficult to achieve consensus in judgments of excellence in teaching. If this contention is supported in further research it may be that merit pay and mentor teacher plans suffer an Achilles heel that will be difficult to remediate. (p. 10)

The use of student scores on standardized achievement tests has become the major criterion used by some evaluators to judge teacher competence. St. Louis, Missouri, teachers were told by the superintendent in 1985 that they would be rated unsatisfactory and lose their jobs unless their students reached specific levels of achievement or improvement on standardized achievement tests (Shanker, 1986, p. 3c). Other authorities, while urging evaluators to have multiple data sources for more accurate teacher evaluations, are calling for more testing to determine teacher effectiveness. According to Manatt (1986), evaluators are going to have to go,

deeper than inferences based on research on teaching. We want to look at student test data broken out by classrooms ... . That way and only that way, can you really narrow it down to a teacher rather than saying in general that the school got these achievements for these boys and girls. (p. 12)

Most researchers and practicing administrators agree that the better teacher evaluation systems can discriminate good teachers from dreadful teachers, and adequate teachers from bad teachers. However, few knowledgeable educators believe that they can segregate the master or clearly outstanding teacher from the really good teacher. This fine line appears to be the source of much of the heat and criticism generated by teacher groups and researchers about the state of the art in teacher evaluation.