From: Assessment of Teaching: Purposes, Practices, and Implications for the Profession, edited by James V. Mitchell, Jr., Steven L. Wise, and Barbara S. Plake; Series Editor Jane Close Conoley (Hillsdale, New Jersey, Hove & London: Lawrence Erlbaum Associates, 1990)
The validation of teacher-certification tests has become a snark hunt. The snark pursued by test contractors, however, is not nearly as elusive or fanciful as Carroll's shadowy beast. Their snark is a "legally defensible" test. Test contractors are, merely from nervousness (not from goodwill), marching along shoulder to shoulder with jurists. I will argue that hunting the shadowy snark through the dark wood of the courtroom cheapens the conceptualization of validity and delimits the conduct of validation studies to minimalist exercises designed to obtain a positive result.
I start from the basic premise that the most important feature of any test is the accuracy of the inferences or decisions made from the score. Countless generations of measurement students have been taught the old saw: A test can be reliable but not valid, but if it's valid, it has to be reliable. Here I argue that a test can be legally defensible but not valid. However, if the results from a well-designed validation process, on balance, support the inferences or decisions made on the basis of the test, legal defensibility should be satisfied. Legal defensibility is important. Tests must be able to stand court scrutiny. However, I contend, we have the cart before the horse: whatever courts will accept drives current validation practices. Legal precedent is circumscribing inquiry about the essential question underlying all testing: "How correct are the inferences made from a person's score?"
The vulnerability of testing companies engaged in this legal snark hunt consists in their readiness to view their task no more broadly than that of satisfying the desire of state agencies for legally defensible certification tests. Some contractors are in danger of becoming ambidextrous yes men for states. Testing companies, of course, insist on test validity. However, faced with the exigencies of the applied world (legal precedent, contract obligations, tenacious timelines, and budget limitations), they offer excuses to absolve themselves from gathering construct and criterion-related evidence about their products. They have built a tunnel through a mountain of precepts about validity that are regularly taught to students in test and measurement classes. They take the law of validity literally, but argue that the letter is elastic, pleading for leniency from basic precepts and strictures. The companies deny that their product purports to measure a complex construct: a subset of competence in the classroom.
This behavior of test contractors is a reaction to the following factors:
• the mandates of policy makers for a quick, visible, quantifiable, and administratively convenient fix to a perceived problem with the quality of the teaching corps
• a naive confidence on the part of policy makers and the public that multiple-choice tests can improve the quality of the teacher corps
• a lack of understanding on the part of policy makers and the public concerning the kinds of evidence needed to support the inferences they wish to make about teacher candidates
• the bureaucratic nature of the state agencies charged with implementing the mandate of policy makers
• the funding level and lead time available to implement a systematic validation effort
• the applied, commercial nature of the testing industry
• the selection of contractors through the competitive, low-bid RFP process
• the allegiance of testing companies to the state agency that awards the contract, rather than to examinees who later pay to sit the exam, or to the general public, who often base their perceptions of educational quality on test scores
• the evolution of a set of tried and true procedures for gathering content validation evidence that redounds to confirmation rather than disconfirmation
• the vested interest of policy makers in vigorously defending their programs when legally challenged
• the competitive, adversarial nature of litigation that colors arguments on both sides about scientific-technical issues
• the existence of legal precedent about what is sufficient to justify the use of employment tests
I was asked to address the professional and legal implications of teacher-certification testing. This chapter, therefore, consists of two distinct sections. In the first section, I examine the legal challenges to teacher-certification tests and offer examples of how testing experts operate within the legal arena. In this first section, validity issues are interpreted from the legal rather than the psychometric tradition. I begin by describing two types of legal challenges to teacher-certification tests and illustrate the part experts from the psychometric community play in these legal challenges. I conclude the first section with a description of the formula for a legally defensible teacher-certification test that has emerged from court decisions.
In the second section, I leave behind the legal tradition and examine the validity issues surrounding teacher-certification testing from the professional point of view. I begin by offering examples of the inferences and decisions made by various publics from scores on teacher-certification tests. I then develop the psychometric implications for validation that flow from these inferences, and the problems associated with them in the applied situation. Next, I discuss issues related to the mix of content, construct, and criterion-related evidence that I feel is necessary if we are to understand what we are measuring and basing important certification decisions on. Finally, because the cut-score is the trigger for any inference or decision about teachers, I raise issues related to its validity.
Before I begin, two caveats. First, my treatment of litigation is strictly that of a layman. My credentials consist of a single week in law school before leaving for the army, a decision I've never regretted. Second, I have a bias about high-stakes tests. If the use of a test has the potential to harm, or has serious consequences for individuals, the test should be subjected to a thoroughgoing validation before it is used operationally. Teacher-certification tests certainly trigger this bias. Large numbers of examinees are affected, particularly large numbers of minorities. From my perspective, the protection of the individual takes precedence over the state's interests until the state can demonstrate that the test, although not error free, does a reasonably good job of identifying individuals who in all likelihood lack the necessary skills or knowledge to be minimally successful in the classroom. The last clause anticipates my later argument that teacher-certification tests are really about a candidate's competence in the classroom.