Buros-Nebraska Series on Measurement and Testing



From: The Computer and the Decision-Making Process, edited by Terry B. Gutkin and Steven L. Wise (Hillsdale, New Jersey, Hove & London: Lawrence Erlbaum Associates, 1991).


Copyright © 1991 by Lawrence Erlbaum Associates, Inc. Digital Edition Copyright © 2012 Buros Center for Testing. This book may be downloaded, saved, and printed by an individual for their own use. No part of this book may be re-published, re-posted, or redistributed without written permission of the holder of copyright.


From preschool to graduate school, computer-based instruction (CBI) has become increasingly common in today's education and training community. The interactive characteristics of CBI and its ability to simulate advanced concepts and operations, such as patient management simulations for medical students (Whiteside & Whiteside, 1987/88) or the maneuvering of a jet airplane (Conkright, 1982), make CBI an attractive new instructional delivery system for educators working in many different fields.

Because of these qualities, the computer has tremendous potential in educational and psychological measurement. For example, Millman and Arter (1984) describe how the computer aids in maintaining test-item banks. Test specialists can use item forms to generate items by computer from a set of well-defined item characteristics (Hambleton, 1984), saving valuable time in item construction. Millman and Outlaw (1978) suggest an additional advantage of item forms: far more items can be produced than could practically be stored on a computer. Computers can also be used to administer tests. The advantages of computer-administered tests range from the ability to individualize testing to increased efficiency and economy in analyzing testing information (Ward, 1984). Finally, computers can be used to score tests, report results, and conduct statistical analyses on the scores (Noonan & Dugliss, 1985).
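To make the item-form idea concrete, the following is a minimal sketch, in modern Python, of how a single item form can be expanded into many items. The particular form, its variable slots, and the answer rule are all invented for illustration; they are not taken from Hambleton (1984) or Millman and Outlaw (1978).

```python
import itertools

# A hypothetical item form: a stem template plus well-defined ranges for
# each variable slot. Every combination of slot values yields a distinct item.
ITEM_FORM = {
    "stem": "What is {a} + {b}?",
    "slots": {"a": range(2, 5), "b": range(10, 13)},
}

def generate_items(form):
    """Expand an item form into concrete items, each with a keyed answer."""
    names = list(form["slots"])
    for values in itertools.product(*(form["slots"][n] for n in names)):
        bindings = dict(zip(names, values))
        yield {
            "stem": form["stem"].format(**bindings),
            "key": bindings["a"] + bindings["b"],  # answer rule for this form
        }

items = list(generate_items(ITEM_FORM))
print(len(items))        # 9 items from one stored form (3 x 3 combinations)
print(items[0]["stem"])  # "What is 2 + 10?"
```

The storage advantage noted above follows directly: only the compact form is kept on the computer, while the full set of items exists only when generated.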

Although the computer has a wide variety of instructional applications, computer technology is not a panacea for all educational problems. For instance, although there are a number of ways in which the computer could improve the quality of instruction in our schools, there is currently a paucity of high-quality courseware available for educational purposes. Some educational software evaluation specialists suggest that up to 90% of the educational software available today is not worth purchasing (Olds, 1983). Measurement and evaluation specialists face similar problems. The costs associated with designing and developing good computer-based testing (CBT) programs are often prohibitive. For this reason, when the computer is chosen as the testing delivery system, implementation questions and issues must be analyzed carefully.

The purpose of this chapter is to identify a number of practical implementation decisions that must be made when designing and developing criterion-referenced tests (CRTs) as part of a larger system of computer-based instruction. Many of the concepts discussed generalize beyond large-scale courseware development efforts and apply to areas such as CBT in professional certification or licensing examinations, minimal competency testing at the local or state level, and norm-referenced testing. This chapter extends earlier guidelines that addressed microcomputer-based testing (Mizokawa & Hamlin, 1984) and computer use at various stages of the testing process (Noonan & Dugliss, 1985).

We have clustered CBT development decision areas into four categories: test construction, test security, item presentation, and response capturing and scoring. Many of the decisions are interrelated, because the actions resulting from one decision limit the choices available at another decision point. For example, a decision to allow a student to preview items at the start of a test generally precludes the option of adaptive testing when deciding item sequencing, because item presentation strategies in adaptive testing depend on the student's history of responses to previous items. The chapter concludes by introducing a checklist (Appendix A) designed to aid courseware developers and measurement specialists in making appropriate CBT implementation decisions.
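The dependence of adaptive sequencing on response history can be sketched in a few lines of modern Python. The tiny item bank, the five-point difficulty scale, and the one-step up/down rule are all invented for illustration; real adaptive tests use more sophisticated selection methods (e.g., based on item response theory), but the point here is only that the next item cannot be known until the previous responses are.

```python
# Hypothetical bank: one item at each of five difficulty levels.
BANK = {d: f"item at difficulty {d}" for d in range(1, 6)}

def next_difficulty(current, was_correct):
    """Step difficulty up after a correct response, down after an error."""
    step = 1 if was_correct else -1
    return min(5, max(1, current + step))

def administer(responses, start=3):
    """Replay a response history and return the item sequence presented."""
    d = start
    sequence = [BANK[d]]
    for was_correct in responses:
        d = next_difficulty(d, was_correct)
        sequence.append(BANK[d])
    return sequence

# Two correct answers followed by an error: difficulty climbs 3 -> 4 -> 5,
# then drops back to 4.
print(administer([True, True, False]))
```

Because each item depends on the answers before it, there is no fixed item order to preview, which is why item preview and adaptive sequencing are mutually exclusive design choices.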