Chapter 3, Part 1
Where and how should evidence be obtained? Erika might review crash statistics and police reports, which could reveal that smartphone use is less prevalent in crashes than expected, even though self-reported surveys suggest that talking, texting, and dialing while driving are common. But how reliable and accurate is this evidence? Not every crash report has a place for the officer to note whether a smartphone was in use, and the drivers completing the survey may not have been entirely truthful about how often they use their phones while driving. Erika’s firm might also perform its own research in a costly driving simulator study, comparing the driving performance of people while a smartphone was and was not in use. But do the conditions in the simulator match those on the highway? On the highway, people choose when they want to talk on the phone; in the simulator, people are asked to talk at specific times. Erika might also review previously conducted research, such as controlled laboratory studies. For example, a laboratory study might show how talking interferes with a computer-based “tracking task,” used to represent steering a car, and with a “choice reaction task,” used to represent responding to red lights [59]. But are these tracking and choice reaction tasks really like driving?

No one evaluation method provides a complete answer. These approaches represent a sample of the methods that human factors engineers can employ to discover “the truth” (or something close to it) about the behavior of people interacting with systems. Human factors engineers use standard methods that have been developed over the years in the traditional physical and social sciences. These methods range from the true experiment, conducted in highly controlled laboratory environments, to less controlled but more representative quasi-experiments and descriptive studies in the world.
These methods are relevant to both the consulting firm trying to assemble evidence regarding a ban on mobile devices and to designers evaluating whether a system will meet the needs of its intended users. In Chapter 2 we saw that the human factors specialist performs a great deal of informal evaluation during the system design phases. This chapter describes more formal evaluations to assess the match of the system to human capabilities.
A human factors specialist must be familiar with the range of methods that are available and know which methods are best for specific types of design questions. It is equally important for researchers to understand how practitioners ultimately use their findings. Ideally, this enables a human factors specialist to work in a way that will be useful to design, thus making the results applicable. Selecting an evaluation method that will provide useful information requires that the method be matched to its intended purpose.
3.1 Purpose of Evaluation

In Chapter 2 we saw how human factors design occurs in the understand-create-evaluate cycle. Chapter 2 focused on understanding people’s needs and characteristics and on using that understanding to create prototypes that are refined into the final system through iteration. Central to this iterative process is evaluation. Evaluation identifies opportunities to improve a design so that it serves the needs of people more effectively. Evaluation is both the final step in assessing a design and the first step of the next iteration of the design, where it provides a deeper understanding of what people need and want.

Evaluation methods that serve as the first step of the next iteration of the design are termed formative evaluations. Formative evaluations help designers understand how people use a system and how the system might be improved. Consequently, formative evaluations tend to rely on qualitative measures: general aspects of the interaction that need improvement. Evaluation methods that serve as the final step in assessing a design are termed summative evaluations. Summative evaluations are used to assess whether system performance meets design requirements and benchmarks. Consequently, summative evaluations tend to rely on quantitative measures: numeric indicators of performance.

The distinctions between summative and formative evaluations can be described in terms of three main purposes of evaluation:

• Understand how to improve (formative evaluation): Does the existing product address the real needs of people? Is it used as expected?
• Diagnose problems with prototypes (formative evaluation): How can it be improved? Why did it fail? Why isn’t it good enough?
• Verify (summative evaluation): Does the expected performance meet design requirements? Which system is better? How good is it?

Each of these questions might be asked in terms of safety, performance, and satisfaction.
For Erika’s analysis, predicting the effect of smartphones on driving safety is most important: how dangerous is talking on a phone while driving?
Table 3.1 Example evaluation techniques for three evaluation purposes.
Table 3.1 shows example evaluation techniques for three evaluation purposes. The first rows of this table show methods for understanding and diagnosing problems with qualitative data. Qualitative data are not numerical and include responses to open-ended questions, such as “What features on the device would you like to see?” or “What were the main problems in operating the device?” Qualitative data also include observations and interviews. These data are particularly useful for diagnosing problems and identifying opportunities for improvement, which makes qualitative data particularly important in the iterative design process, where the results of a usability test might guide the next iteration of the design.

The third row of the table shows methods associated with verifying the performance of the system with quantitative data. Quantitative data include any data that can be represented numerically, such as measures of response time, frequency of use, and subjective assessments of workload. The table shows that quantitative data are essential for assessing whether a system has met its objectives and is ready to be deployed, because they offer a numeric prediction of whether the system will succeed. In evaluating whether smartphones should be banned while driving, quantitative data might include a prediction of the number of lives saved if a ban were adopted.

The last two rows show how both quantitative and qualitative data can support understanding people’s needs and characteristics relative to the design. Although methods for understanding (Chapter 2) and methods for evaluation (Chapter 3) are presented in separate chapters, there is substantial overlap between them. In this chapter, we focus on diagnosing design problems and verifying system performance, but evaluations often produce data that can also enhance understanding and guide future designs.
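To make the idea of quantitative, summative measures concrete, the sketch below compares reaction times from a hypothetical driving simulator study of the kind Erika’s firm might run. All numbers are invented for illustration, and the comparison uses Welch’s t statistic, one common way (not the only one) to compare two groups’ mean performance:

```python
from statistics import mean, stdev

# Hypothetical brake reaction times (seconds) from a simulator study.
# These values are illustrative only, not data from any real study.
baseline = [0.82, 0.91, 0.78, 0.88, 0.95, 0.84]  # driving only
phone = [1.10, 1.02, 1.21, 0.98, 1.15, 1.08]     # driving while talking

def welch_t(a, b):
    """Welch's t statistic for two independent samples with
    possibly unequal variances."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / (var_a + var_b) ** 0.5

print(f"baseline mean reaction time: {mean(baseline):.2f} s")
print(f"phone mean reaction time:    {mean(phone):.2f} s")
print(f"Welch t statistic:           {welch_t(phone, baseline):.2f}")
```

A large positive t statistic here would indicate that the slower reaction times in the phone condition are unlikely to be due to chance alone, the kind of numeric benchmark a summative evaluation reports; a formative evaluation would instead ask the drivers what made the phone condition difficult.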
Beyond evaluating specific systems or products, human factors specialists also evaluate more general design concepts and develop design principles. Such concept evaluations include assessing the relative strengths of keyboard, mouse, and touchscreen input, or of rotating versus fixed maps. Concept evaluation reflects the basic science that supports the design principles and heuristics that make it possible to guide design without conducting a study for every design decision.
P3.1 How is evaluation related to understanding in the human factors design cycle?
P3.2 What are the three general purposes of evaluation?
P3.3 Would qualitative or quantitative data be more useful in diagnosing why a design is not performing as expected?
P3.4 Would qualitative or quantitative data be more useful in assessing whether a design meets safety and performance requirements?
P3.5 What is the role of quantitative and qualitative data in system design?
P3.6 Why is qualitative data an important part of usability testing?
P3.7 Give examples of qualitative data in evaluating a vehicle entertainment system.
P3.8 Give examples of quantitative data in evaluating a vehicle entertainment system.
P3.9 Describe the role of formative and summative evaluations in design.
P3.10 Identify a method suited to formative evaluation and another more suited to summative evaluation.