"[A] personalized and readable jaunt through the history of charting."--The Economist
"During a dairyman's strike in 19th century New England, when there was suspicion of milk being watered down, Henry David Thoreau wrote, 'Sometimes circumstantial evidence can be quite convincing; like when you find a trout in the milk.' Howard Wainer uses this as a metaphor in his entertaining, informative, and persuasive book on graphs, or the visual communication of information. Sometimes a well-designed graph tells a very convincing story."--Raymond N. Greenwell, MAA Online
"Graphic Discovery is a welcome addition to the literature on investigation and effective communication through graphic display. It contains a wealth of information and opinions, which are motivated and illustrated through a plethora of real life examples which can be easily incorporated into any educational setting: classroom, seminar, self-enhancement. . . . This book will be useful to and it can be mastered by a diverse readership."--Thomas E. Bradstreet, Computational Statistics
"The use of charts and graphs to make numbers both intelligible and memorable is a surprisingly modern idea. How this idea grew from a curiosity into a basic tool of modern science is a story of remarkable men and curious paradoxes, a story that Howard Wainer tells with zest and sympathetic understanding. Informative, readable, profoundly engaging."--George A. Miller, Princeton University, author of The Magical Number Seven, Plus or Minus Two
"I liked this book very much indeed. It will be very useful to the many who are interested in the interplay of forces that have yielded modern science."--Eric T. Bradlow, Wharton School of the University of Pennsylvania
"Fascinating. This book . . . the first to explore the chronological development of graphical data display . . . should be required reading for statisticians, applied researchers, scientists, and certainly for all journalists."--I. Elaine Allen, Babson College
"A delightful and thought-provoking book on statistical graphics. Wainer provides compact case studies of how graphical presentations such as bar charts, plots, and scattergrams can lead to important discoveries. The most compelling examples show how a published graphic could be dramatically improved to avoid misleading interpretations or make new discoveries. The most entertaining parts are his vignettes of historical figures, such as his twin heroes of William Playfair and John Tukey. I enjoyed Wainer's sardonic wit, personal anecdotes, and popular culture references, but the real gift was the clarity of thinking and the wise guidance about deep issues in statistics, data mining, and information visualization."--Ben Shneiderman, College Park, MD
One of Choice's Outstanding Academic Titles for 2005
"Wainer's wit and broad intellect make this a very entertaining book."--Linda Pickle, ,American Statistician
"This book may be seen as a chronology of graphic date presentation beginning with Playfair to the present and pointing toward the future. . . . It is a remarkable value that every practitioner of statistics can afford."--Malcolm James Ree, Personnel Psychology
"Well written and innovative. . . . The book is fascinating with its wide view, including introductions to historical personalities, analyses of statistical paradoxes, and well-documented discussions of actual uses of visual data to mislead viewers."--Choice Part I: William Playfair and the Origins of Graphical Display Chapter 1: Why Playfair? 9 Chapter 2: Who Was Playfair? 20 Chapter 3: William Playfair: A Daring Worthless Fellow 24 Chapter 4: Scaling the Heights (and Widths) 28 Chapter 5: A Priestley View of International Currency Exchanges 39 Chapter 6: Tom's Veggies and the American Way 44 Chapter 7: The Graphical Inventions of Dubourg and Ferguson: Two Precursors to William Playfair 47 Chapter 8: Winds across Europe: Francis Galton and the Graphic Discovery of Weather Patterns 52 Part II: Using Graphical Displays to Understand the Modern World Chapter 9: A Graphical Investigation of the Scourge of Vietnam 59 Chapter 10: Two Mind-Bending Statistical Paradoxes 63 Chapter 11: Order in the Court 72 Chapter 12: No Order in the Court 78 Chapter 13: Like a Trout in the Milk 81 Chapter 14: Scaling the Market 86 Chapter 15: Sex, Smoking, and Life Insurance: A Graphical View 90 Chapter 16: There They Go Again! 97 Chapter 17: Sex and Sports: How Quickly Are Women Gaining? 103 Chapter 18: Clear Thinking Made Visible: Redesigning Score Reports for Students 109 Part III: Graphical Displays in the Twenty-first Century Chapter 19: John Wilder Tukey: The Father of Twenty-first-Century Graphical Display 117 Chapter 22: Epilogue: A Selection of Selection Anomalies 142 Conclusion 150 Notes 173
Introduction 1
In the sixteenth century, the bubonic plague provided the motivation for the English to begin gathering data on births, marriages, and deaths. These data, the Bills of Mortality, were the grist that Dr. John Arbuthnot used to prove the existence of God. Unwittingly, he also provided strong evidence that data graphs were not yet part of a scientist's tools.
All of the pieces were in place for the invention of statistical graphics long before Playfair was born. Why didn't anyone else invent them? Why did Playfair?
by Ian Spence and Howard Wainer
William Playfair (1759-1823) was an inventor and ardent advocate of statistical graphics. Here we tell a bit about his life.
by Ian Spence and Howard Wainer
Audacity was an important personality trait for the invention of graphics because the inventor had to move counter to the Cartesian approach to science. We illustrate this quality in Playfair by describing his failed attempt to blackmail one of the richest lords of Great Britain.
The message conveyed by a statistical graphic can be distorted by manipulating the aspect ratio, the ratio of a graph's width to its height. Playfair deployed this ability in a masterly way, providing a guide to future display technology.
A recent plot of the operating hours of international currency exchanges confuses matters terribly. Why? We find that when we use a different graphical form, developed by Joseph Priestley in 1765, the structure becomes clear. We also learn how Priestley discovered the latent graphicacy in his (and our) audiences.
European intellectuals were not the only ones graphing data. During a visit to Paris (and prompted by letters from Benjamin Franklin), Thomas Jefferson learned of this invention and he later put it to a more practical use than the depiction of the life spans of heroes from classical antiquity.
Although he developed the line chart independently, Priestley was not the first to do so. The earliest seems to be the Parisian physician Jacques Barbeau-Dubourg (1709-1779), who created a wonderful graphical scroll in 1753. Graphical representation must have been in the air, for the Scottish philosopher Adam Ferguson (1723-1816) added his version of time lines to the mix in 1780.
In 1861, Francis Galton organized weather observatories throughout Western Europe to gather data in a standardized way. He organized these data and presented them as a series of ninety-three maps and charts, from which he confirmed the existence of the anticyclonic movement of winds around a low-pressure zone.
During the Vietnam War, average SAT scores went down for those students who were not in the military. In addition, the average ASVAB scores (the test used by the military to classify all members of the military) also declined. This Lake Wobegon-like puzzle is solved graphically.
The odd phenomenon observed with test scores during the Vietnam War is not unusual. We illustrate this seeming paradox with other instances, show how to avoid them, and then discuss an even subtler statistical pitfall that has entrapped many illustrious would-be data analysts.
How one orders the elements of a graph is critical to its comprehensibility. We look at a New York Times graphic depicting the voting records of U.S. Supreme Court justices and show that reordering the graphic provides remarkable insight into the operation of the court.
We examine one piece of the evidence presented in the 1998 murder trial of State v. Gibbs and show how the defense attorneys, by misordering the data in the graph shown to the judge, miscommunicated a critical issue in their argument.
Thoreau pointed out that sometimes circumstantial evidence can be quite convincing, as when you find a trout in the milk. We examine a fascinating graph that provides compelling evidence of industrial malfeasance.
We examine the stock market and show that different kinds of scalings provide the answers to different levels of questions. One long view suggests a fascinating conjecture about the trade-offs between investing in stocks and investing in real estate.
We examine two risk factors for life insurance--sex and smoking--and uncover the implicit structure that underlies insurance premiums.
The New York Times is better than most media sources for statistical graphics, but even the Times has occasional relapses to an earlier time in which confusing displays ran rampant over its pages. We discuss some recent slips and compare them with prior practice.
A simple graph of winning times in the Boston Marathon augmented by a fitted line provides compelling, but incorrect, evidence for the relative gains that women athletes have made over the past few decades. A more careful analysis provides a better notion of the changing size of the sex differences in athletic performances.
Too often communications focus on what the transmitter thinks is important rather than on what the receiver is most critically interested in. The standard SAT score report that is sent to more than one million high school students annually is one such example. Here we revise this report using principles abstracted from another missive sent to selected high school students.
The three chapters of this section grew out of a continuing conversation with John W. Tukey, the renowned Princeton polymath, on the graphical tools that were likely to be helpful when data were displayed on a computer screen rather than a piece of paper. These conversations began shortly after Tukey's eighty-fourth birthday and continued for more than a year, ending the night before he died.
Chapter 20: Graphical Tools for the Twenty-first Century: I. Spinning and Slicing 125
Chapter 21: Graphical Tools for the Twenty-first Century: II. Nearness and Smoothing Engines 134
Graphical displays are only as good as the data from which they are composed. In this final chapter we examine an all too frequent data flaw. The effects of nonsampling errors deserve greater attention, especially when randomization is absent. Formal statistical analysis treats only some of the uncertainties. In this chapter we describe three examples of how flawed inferences can be made from nonrandomly obtained samples and suggest a strategy to guard against flawed inferences.
Dramatis Personae 151
This graphical epic has more than one hundred characters. Some play major roles, but most are cameos. To help keep straight who is who, this section contains thumbnail biographies of all the players.
References 177
Index 185