A website's ranking on Google can spell the difference between success and failure for a new business. NCAA football ratings determine which schools get to play for the big money in postseason bowl games. Product ratings influence everything from the clothes we wear to the movies we select on Netflix. Ratings and rankings are everywhere, but how exactly do they work? Who's #1? offers an engaging and accessible account of how scientific rating and ranking methods are created and applied to a variety of uses.
Amy Langville and Carl Meyer provide the first comprehensive overview of the mathematical algorithms and methods used to rate and rank sports teams, political candidates, products, Web pages, and more. In a series of interesting asides, Langville and Meyer provide fascinating insights into the ingenious contributions of many of the field's pioneers. They survey and compare the different methods employed today, showing why their strengths and weaknesses depend on the underlying goal, and explaining why and when a given method should be considered. Langville and Meyer also describe what can and can't be expected from the most widely used systems.
The science of rating and ranking touches virtually every facet of our lives, and now you don't need to be an expert to understand how it really works. Who's #1? is the definitive introduction to the subject. It features easy-to-understand examples and interesting trivia and historical facts, and much of the required mathematics is included.
"[T]his book is a call to consciousness on the relevance of rating and ranking as well as an enjoyable start-up guide from the point of view of algebraic methods."--Francisco Grimaldo Moreno, JASSS
"The book could be used to supplement a course on linear algebra and/or numerical linear algebra. . . . The book could also be used as the basis for a short topics course or undergraduate research project on ranking, or it could be used in a modeling class as an example of how mathematical modeling is done. In addition to describing the mathematics of ranking, the book is full of interesting tidbits that add to the pleasure of its reading."--James Keener, SIAM Review
"Readers will find many interesting ideas as they grapple with the complexities of the science of rating and ranking."--Bob Horton, Mathematics Teacher
"[T]he book . . . provide[s] an excellent, accessible, and stimulating discussion of the material it does cover. Overall, the book makes a valuable addition to the canon of rating and ranking."--David J. Hand, Journal of Applied Statistics
"Who's #1? is an excellent survey of the fundamental ideas behind mathematical rating systems. Once a realm of sports enthusiasts, ranking things is becoming a vital tool in many information-age applications. Langville and Meyer compare and contrast a variety of models, explaining the mathematical foundations and motivation. Readers of this book will be inspired to further explore this exciting field."--Kenneth Massey, Massey Ratings
"Langville and Meyer provide a rigorous yet lighthearted tour through the landscape of ratings methodologies. This is an enjoyable read that looks at ratings through the lens of sports, but also touches on how ratings affect our everyday lives through movies, Web search, online shopping, and other applications."--Chris Volinsky, member of the winning Netflix Prize team
"Who's #1? provides a much-needed synthesis of the methods used for ranking and rating things like sports teams, movies, politicians, and more. There is a ton of interest in this topic, and readers now have one place to look for a comprehensive treatment of the different approaches."--Wayne L. Winston, author of Mathletics: How Gamblers, Managers, and Sports Enthusiasts Use Mathematics in Baseball, Basketball, and Football
"This highly accessible book gives readers a comprehensive account of the different mathematical ranking techniques across many different disciplines, and will appeal to everyone from researchers to sports statistics junkies."--Sep Kamvar, author of Numerical Algorithms for Personalized Search in Self-organizing Information Networks
Amy N. Langville is associate professor of mathematics at the College of Charleston. Carl D. Meyer is professor of mathematics at North Carolina State University. They are the authors of Google's PageRank and Beyond: The Science of Search Engine Rankings (Princeton).
"This book is an excellent read for everyone; readers might be sports enthusiasts, social choice theorists, mathematicians, computer scientists, engineers, and college and high school teachers. Teachers will find quite an easy way to extract material for a short module."--Valentina Dagiene, Zentralblatt MATH
"This book is a great introduction to the field (including its constituent parts in linear algebra and data mining) and contains enough depth to be used as a supplemental book in a data mining course or as a jumping off point for an interested researcher. . . . Overall this is a very nice, well written book that could be use in multiple ways by a wide variety of audiences."--Nicholas Mattei, SigAct News
"When I started this book I knew very little about American football. I was little the wiser after finishing it, but I had an excellent understanding of various methods used in the obtaining of the ranking of teams and their interrelationships. Langville and Meyer are to be commended for this collection, and anyone who is more conversant with North American sports than I am will most certainly be stimulated by reading Who's #1?"--Andrew I. Dale, Notices of the AMS
"This book provides an interesting overview of ranking various sports teams, chess players, politicians, and the like in real-life circumstances, which typically involve serious constraints on the time available to find the optimal ranking."--Choice
"Who's #1 provides a fascinating tour through the world of rankings and is highly recommended."--Richard J. Wilders, MAA Reviews
"The profit the scientometrics community can gain from this book is an indirect one: an attitude how to compile a systematic collection of potential methods, how to select carefully using theoretical tests and empirical examples and how to combine methods to get a comprehensive, multidimensional rating and ranking system. In this sense, it is a highly recommended reading for all readers of the journal Scientometrics."--Andras Schubert, Scientometrics
"[A] thorough exploration of the methods and applications of ranking for an audience ranging from computer scientists and engineers to high-school teachers to 'people interested in wagering on just about anything'."--Nature Physics Audience xiii Chapter 1. Introduction to Ranking 1 Chapter 2. Massey?s Method 9 Chapter 3. Colley?s Method 21 Chapter 4. Keener?s Method 29 Chapter 5. Elo?s System 53 Chapter 6. The Markov Method 67 Chapter 7. The Offense-Defense Rating Method 79 Chapter 8. Ranking by Reordering Methods 97 Chapter 9. Point Spreads 113 Chapter 10. User Preference Ratings 127 Chapter 11. Handling Ties 135 Chapter 12. Incorporating Weights 147 Chapter 13. "What If . . ." Scenarios and Sensitivity 155 Chapter 14. Rank Aggregation-Part 1 159 Chapter 15. Rank Aggregation-Part 2 183 Chapter 16. Methods of Comparison 201 Chapter 17. Data 217 Chapter 18. Epilogue 223 Glossary 231
Purpose xiii
Prerequisites xiii
Teaching from This Book xiv
Acknowledgments xiv
Social Choice and Arrow's Impossibility Theorem 3
Arrow's Impossibility Theorem 4
Small Running Example 4
Initial Massey Rating Method 9
Massey's Main Idea 9
The Running Example Using the Massey Rating Method 11
Advanced Features of the Massey Rating Method 11
The Running Example: Advanced Massey Rating Method 12
Summary of the Massey Rating Method 13
The Running Example 23
Summary of the Colley Rating Method 24
Connection between Massey and Colley Methods 24
Strength and Rating Stipulations 29
Selecting Strength Attributes 29
Laplace's Rule of Succession 30
To Skew or Not to Skew? 31
Normalization 32
Chicken or Egg? 33
Ratings 33
Strength 33
The Keystone Equation 34
Constraints 35
Perron-Frobenius 36
Important Properties 37
Computing the Ratings Vector 37
Forcing Irreducibility and Primitivity 39
Summary 40
The 2009-2010 NFL Season 42
Jim Keener vs. Bill James 45
Back to the Future 48
Can Keener Make You Rich? 49
Conclusion 50
Elegant Wisdom 55
The K-Factor 55
The Logistic Parameter ξ 56
Constant Sums 56
Elo in the NFL 57
Hindsight Accuracy 58
Foresight Accuracy 59
Incorporating Game Scores 59
Hindsight and Foresight with ξ = 1000, K = 32, H = 15 60
Using Variable K-Factors with NFL Scores 60
Hindsight and Foresight Using Scores and Variable K-Factors 62
Game-by-Game Analysis 62
Conclusion 64
The Markov Method 67
Voting with Losses 68
Losers Vote with Point Differentials 69
Winners and Losers Vote with Points 70
Beyond Game Scores 71
Handling Undefeated Teams 73
Summary of the Markov Rating Method 75
Connection between the Markov and Massey Methods 76
OD Objective 79
OD Premise 79
But Which Comes First? 80
Alternating Refinement Process 81
The Divorce 81
Combining the OD Ratings 82
Our Recurring Example 82
Scoring vs. Yardage 83
The 2009-2010 NFL OD Ratings 84
Mathematical Analysis of the OD Method 87
Diagonals 88
Sinkhorn-Knopp 89
OD Matrices 89
The OD Ratings and Sinkhorn-Knopp 90
Cheating a Bit 91
Rank Differentials 98
The Running Example 99
Solving the Optimization Problem 101
The Relaxed Problem 103
An Evolutionary Approach 103
Advanced Rank-Differential Models 105
Summary of the Rank-Differential Method 106
Properties of the Rank-Differential Method 106
Rating Differentials 107
The Running Example 109
Solving the Reordering Problem 110
Summary of the Rating-Differential Method 111
What It Is (and Isn't) 113
The Vig (or Juice) 114
Why Not Just Offer Odds? 114
How Spread Betting Works 114
Beating the Spread 115
Over/Under Betting 115
Why Is It Difficult for Ratings to Predict Spreads? 116
Using Spreads to Build Ratings (to Predict Spreads?) 117
NFL 2009-2010 Spread Ratings 120
Some Shootouts 121
Other Pair-wise Comparisons 124
Conclusion 125
Direct Comparisons 129
Direct Comparisons, Preference Graphs, and Markov Chains 130
Centroids vs. Markov Chains 132
Conclusion 133
Input Ties vs. Output Ties 136
Incorporating Ties 136
The Colley Method 136
The Massey Method 137
The Markov Method 137
The OD, Keener, and Elo Methods 138
Theoretical Results from Perturbation Analysis 139
Results from Real Datasets 140
Ranking Movies 140
Ranking NHL Hockey Teams 141
Induced Ties 142
Summary 144
Four Basic Weighting Schemes 147
Weighted Massey 149
Weighted Colley 150
Weighted Keener 150
Weighted Elo 150
Weighted Markov 150
Weighted OD 151
Weighted Differential Methods 151
The Impact of a Rank-One Update 155
Sensitivity 156
Arrow's Criteria Revisited 160
Rank-Aggregation Methods 163
Borda Count 165
Average Rank 166
Simulated Game Data 167
Graph Theory Method of Rank Aggregation 172
A Refinement Step after Rank Aggregation 175
Rating Aggregation 176
Producing Rating Vectors from Rating Aggregation-Matrices 178
Summary of Aggregation Methods 181
The Running Example 185
Solving the BILP 186
Multiple Optimal Solutions for the BILP 187
The LP Relaxation of the BILP 188
Constraint Relaxation 190
Sensitivity Analysis 191
Bounding 191
Summary of the Rank-Aggregation (by Optimization) Method 193
Revisiting the Rating-Differential Method 194
Rating Differential vs. Rank Aggregation 194
The Running Example 196
Qualitative Deviation between Two Ranked Lists 201
Kendall's Tau 203
Kendall's Tau on Full Lists 204
Kendall's Tau on Partial Lists 205
Spearman's Weighted Footrule on Full Lists 206
Spearman's Weighted Footrule on Partial Lists 207
Partial Lists of Varying Length 210
Yardsticks: Comparing to a Known Standard 211
Yardsticks: Comparing to an Aggregated List 211
Retroactive Scoring 212
Future Predictions 212
Learning Curve 214
Distance to Hillside Form 214
Massey's Sports Data Server 217
Pomeroy's College Basketball Data 218
Scraping Your Own Data 218
Creating Pair-wise Comparison Matrices 220
Analytic Hierarchy Process (AHP) 223
The Redmond Method 223
The Park-Newman Method 224
Logistic Regression/Markov Chain Method (LRMC) 224
Hochbaum Methods 224
Monte Carlo Simulations 224
Hard Core Statistical Analysis 225
And So Many Others 225
Bibliography 235
Index 241