Sociology, Department of

 

Date of this Version

2-26-2019

Document Type

Article

Citation

Presented at “Interviewers and Their Effects from a Total Survey Error Perspective Workshop,” University of Nebraska-Lincoln, February 26-28, 2019.

Comments

Copyright 2019 by the authors.

Abstract

Deviations from reading survey questions exactly as worded may change the validity of the questions, thus increasing measurement error. Hence, organizations train their interviewers to read questions verbatim. To ensure interviewers are reading questions verbatim, organizations rely on interview recordings. However, reviewing recordings requires significant resources. Therefore, some organizations are using paradata generated by the survey software, specifically timestamps, to try to detect when interviewers deviate from reading the question verbatim.

To monitor interviewers’ question-reading behavior using timestamps, some organizations estimate the expected question administration time to establish minimum and maximum question administration time thresholds (QATTs). They then compare each question’s administration time, as recorded in the timestamps, with its QATTs to identify violations. Violations of minimum QATTs may indicate that interviewers omitted words from the question text. Conversely, violations of maximum QATTs may indicate that interviewers added words to the question text. The questions that violate the QATTs are then flagged for further investigation. Investigations may include listening to the recording for the flagged question or aggregating the flagged questions up to the interviewer level to identify interviewers who repeatedly engage in question-reading deviations. Organizations can then make decisions about training needs or disciplinary actions based on empirical data.
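The flagging and aggregation steps described above can be sketched as follows. This is a minimal illustration, not the procedure of any particular organization: the record layout (interviewer, question, duration in seconds), the function names, and the threshold values are all assumptions made for the example.

```python
from collections import Counter

def flag_violation(duration_s, min_qatt_s, max_qatt_s):
    """Return the kind of QATT violation, or None if the time is within bounds."""
    if duration_s < min_qatt_s:
        return "below_min"   # may indicate omitted words
    if duration_s > max_qatt_s:
        return "above_max"   # may indicate added words
    return None

def flags_per_interviewer(records, thresholds):
    """Count flagged questions per interviewer.

    records:    iterable of (interviewer_id, question_id, duration_s) tuples
                (hypothetical layout for timestamp paradata)
    thresholds: dict mapping question_id to (min_qatt_s, max_qatt_s)
    """
    counts = Counter()
    for interviewer_id, question_id, duration_s in records:
        min_qatt_s, max_qatt_s = thresholds[question_id]
        if flag_violation(duration_s, min_qatt_s, max_qatt_s) is not None:
            counts[interviewer_id] += 1
    return counts

# Illustrative data only; the thresholds here are made up.
example_thresholds = {"q1": (2.0, 8.0), "q2": (4.0, 15.0)}
example_records = [
    ("A", "q1", 1.0),   # faster than the minimum QATT
    ("A", "q1", 5.0),   # within bounds
    ("B", "q1", 3.0),   # within bounds
    ("A", "q2", 20.0),  # slower than the maximum QATT
]
example_counts = flags_per_interviewer(example_records, example_thresholds)
```

Interviewers with high counts would then be candidates for the follow-up investigation described above, such as listening to the recordings of their flagged questions.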

However, there is no established method to calculate QATTs. Some organizations calculate QATTs by dividing the number of words in the question by an assumed reading pace of x words per second (Sun & Meng, 2014), while others use an a priori cutoff, such as one second (Mneimneh, Pennell, Lin, & Kelley, 2014). Further, little is known about how accurately the methods currently in use detect question-reading deviations, or whether a more accurate method is needed. Which QATT method is more accurate for detecting question-reading deviations? Should one construct QATTs using words per second (WPS) or use standard deviations from the mean reading time? What WPS rate or number of standard deviations should be used? Is one detection method better for detecting certain types of deviations (e.g., skipping words or questions, adding words to the question, etc.)?
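The two families of QATT construction raised in these questions can be sketched side by side. This is a hedged illustration only: the function names are hypothetical, and the reading paces and the multiplier k are parameters the analyst must choose; the abstract does not report which values were examined.

```python
from statistics import mean, stdev

def qatt_from_wps(n_words, fast_wps, slow_wps):
    """Thresholds from word count and assumed reading paces (words per second).

    The fastest plausible pace gives the minimum time and the slowest
    plausible pace gives the maximum time. Both paces are assumptions.
    """
    return n_words / fast_wps, n_words / slow_wps

def qatt_from_sd(durations_s, k):
    """Thresholds at k sample standard deviations around the mean observed time.

    durations_s: observed administration times for one question
                 (needs at least two values)
    k:           number of standard deviations, chosen by the analyst
    """
    m = mean(durations_s)
    s = stdev(durations_s)
    return max(0.0, m - k * s), m + k * s

# Illustrative values only.
wps_min, wps_max = qatt_from_wps(n_words=12, fast_wps=4.0, slow_wps=2.0)
sd_min, sd_max = qatt_from_sd([4.0, 5.0, 6.0], k=1.0)
```

One design difference is visible even in this sketch: the WPS approach depends only on the question text, whereas the standard deviation approach depends on the observed timings and so shifts with how interviewers actually behave.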

This study attempts to answer the above questions using interview recordings and paradata from Wave 3 of the Understanding Society Innovation Panel in the United Kingdom. Using interview recordings allows us to compare the different detection methods directly with how the interviewers actually administered the questions and to measure the accuracy of each detection method. In addition, the interview recordings are coded for the extent (i.e., minor or major) and type of deviation. This analysis gives better insight into the scope and types of deviations interviewers engage in and offers practical guidance on how best to detect deviations using paradata.
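One way to quantify how well a detection method agrees with the coded recordings is to compute sensitivity and specificity, treating the recording codes as the reference. The abstract does not say which accuracy measures the study uses, so the sketch below is an assumption, as are the function name and the input layout.

```python
def detection_accuracy(flagged, deviated):
    """Compare paradata flags with deviations coded from recordings.

    flagged:  list of bools, True if the QATT method flagged the question
    deviated: list of bools, True if the recording was coded as a deviation
    Returns sensitivity and specificity; a value is None when undefined.
    """
    pairs = list(zip(flagged, deviated))
    tp = sum(1 for f, d in pairs if f and d)          # flagged, real deviation
    fp = sum(1 for f, d in pairs if f and not d)      # flagged, read verbatim
    fn = sum(1 for f, d in pairs if not f and d)      # missed deviation
    tn = sum(1 for f, d in pairs if not f and not d)  # correctly not flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return {"sensitivity": sensitivity, "specificity": specificity}

# Illustrative data only.
example_accuracy = detection_accuracy(
    flagged=[True, True, False, False, True],
    deviated=[True, False, False, True, True],
)
```

Running the same comparison separately for each deviation type or extent (minor versus major) would show whether a given method detects some kinds of deviation better than others.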
