Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart by Ian Ayres
Interesting topic for all number crunchers: accountants, engineers, or statisticians. Brings up some new ideas on neural networks.
Worthwhile reading for anyone in the data analysis, data mining, and predictive analytics fields, or anyone who wants to understand something about those fields. Most valuable for me were the discussions of data scraping (gathering information from online sources), the comparisons of human expertise versus simple models, and the uses of testing. Areas where conventional wisdom and human expertise rule the day will continue to come under attack. It's interesting to contrast this book with "The Black Swan", by N. Taleb. In my opinion, this author (Ayres) is a "true believer" in the value of statistics, regression models, and the like. Taleb, on the other hand, rails at those who put their faith in statistical models. I think there is some middle ground, in that Taleb may agree that statistical models are useful, but users need to be wary of unlikely events and the *impact* of those events, and need to know where the models apply and where they don't.
Thoroughly enjoyed this book -- it was a quick read at just over 200 pages (before the end notes kicked in). Some basic statistical concepts are provided for the statistically challenged, but that doesn't detract to any large degree from the sharing of the various Super Cruncher projects that have taken place. Nearly overlooked in the book is how the data is obtained and validated. A regression model is only as good as the quality of the data and the statistician's understanding of the raw data. Anyone who is involved with model development will tell you that the real work is in the data prep phase.
Ian Ayres is a surprisingly engaging writer, taking what many would consider a very dry topic, statistics, and turning it into a thought-provoking but flawed book.
From the opening pages, Ayres pits the "super crunchers" (read: people applying statistics to large data sets) against experts in an area, be it viticulture, baseball, or marketing. With barely suppressed glee he describes how number crunching out-predicts the experts time and time again. The point is that as collecting, storing and analysing large amounts of data becomes cheaper and cheaper, more and more decision-making will take the results of "super crunching" into account, with experts either having to step aside or learn some statistical chops. To back his arguments for the rise of "super crunching", Ayres draws on a large number of examples from a variety of areas and even experiments with the technique himself, describing how he used it to help choose the title of his book.
Although I am more or less convinced by Ayres' arguments, I found myself questioning his credibility in several places during the book. I think the main reason was the tone of the book, which occasionally crosses the fine line separating "enthusiastic, popular account" from "overly simplistic, gushing rave".
The constant use of "super crunching" throughout the book got on my nerves after a while. It began to overemphasise the newness of what could as easily be called "statistical analysis". After a while I mentally replaced "super crunching" with the less sensational "statistical analysis" wherever I encountered it. Conversely, Ayres constantly refers to "regression" when talking about the techniques analysts use to make predictions. At first, I thought this was a convenient shorthand for a range of techniques that he didn't want to spend time distinguishing between. It was only when neural networks were described as "a newfangled competitor to the tried-and-true regression formula" and "an important contributor to the Super Crunching revolution" that I realised that Ayres may not know as much about the nuts and bolts of computational statistics as I first thought.
This impression was confirmed when Ayres later confuses "summary statistics" with "sufficient statistics" and talks tautologically of "binary bytes". Stylistically, there is too much foreshadowing and repetition of topics throughout the book for my liking. This feels a little condescending at times, as does his directly asking the reader to stop and think about a concept or problem at various points.
Overall, I wanted to like this book more than I did. It was a light, enjoyable read and I wholeheartedly agree with Ayres' belief in the continuing importance of statistics in decision-making and his call to improve the average person's intuition for statistics. Unfortunately, I found much of "Super Crunchers" substituting enthusiasm for coherence, and impressions and anecdote for any kind of meaningful argument.
The presentation of facts was very simple & easy to understand - I would say that, barring the very young, this book would also be suitable for kids (read: 'young adults') who are mathematical enough to like statistics (as a kid I knew a few people like these!). Whilst interesting & easy to read, I found the book lacked direction. What I liked: the many examples of applications in current-day situations (both for & against the consumer), the current psychological barrier to adopting this methodology in certain industries, highlighting how absolute judgement by human perception is a myth & encouraging verification of data crunchers. What could have been better: more tips on how to apply it in daily life (perhaps the very reason why I find it inadequate is because it promises to do this), clearer segmentation, better summaries, & I also found the author's way of describing his fellow associates slightly excessive... Overall quite a good read, but I was slightly let down because of the promises at the start of the book.
This is a terrific book with practical implications on almost every page.
It jibes with my naturalist sympathies---it says to hell with intuition and armchair analysis: put it to the empirical test. He's surprisingly unself-conscious, though, about the sophisticated objections to reliance on randomized controlled trials, especially in medicine. But perhaps that's a matter for another book. He's a lucid and lively writer who betrays true commitment to the argument of the book (i.e. that we ought to place more importance on empirical results than on expertise).
Last year the law professor and Balkinization blogger Ian Ayres published a popular book on statistical reasoning that focused mainly on regressions. There's necessarily a tension between being popular and being about statistics, and so I should say up front that about a third of the book is not going to be comprehensible if you don't have a handle on what a probability distribution like a bell curve is all about. But even in that case, I think it might be worth a read just to get swept up in the excitement about the future and get in on the expert-bashing.
And expert-bashing is a big and convincing part of the book. Wine, baseball, and medical professionals resist the idea that their intuition and discretion are in many cases outmoded. Ayres gently relates some of their lame and self-serving arguments for intuitive expertise, and that's fun to hear. In other cases the experts are simply inept at using the new techniques, and I think many readers will be scandalized by the statistical illiteracy and nearly superstitious traditionalism of some physicians. Readers of Robin Hanson's or Bryan Caplan's blogs will have heard this before, but it bears repeating since it's literally a life-or-death matter.
The book is especially impressive for the large number of applications and implications Ayres discusses. I think it would have been made better by two additions, however.
First, it's apparent that Ayres is economically literate, and so I think he should have anticipated and rebutted the readers' natural fears about jobs being lost. He suggests that humans will still have a role to play in the future of the revolutionized professions but only as intuitive guessers of variables to program the computers with. That sort of half-way answer stokes rather than extinguishes the make-work bias that most people will bring to the book.
The second addition I'd make to the book is an explicit consideration of the moral assumptions that lie behind some of the medical and legal equations. Ayres briefly discusses how a doctor cannot just do the math to advise a pregnant mother to have or skip an intrusive test for Down syndrome when the test also increases the risk of miscarriage. To do that he would have to know how much she fears a miscarriage relative to rearing a mentally disabled child. As crude as it may be to consider, the simple fact is that the doctor cannot advise, and his probabilistic equations will tell him nothing, until the patient gives some indication that she fears one outcome, for example, three times more than the other. (And of course the equation is not calculable if you think the child's (unknowable) preferences ought to figure in.)
Ayres does even worse in his discussions of gun control, capital punishment, and violent criminal recidivism. At one point he says that a panel's decision to release a violent offender against the advice of a probability equation of recidivism was an "error." Only in a narrow, technical sense was it an "error," and to leave it at that is to sweep many moral judgments under the rug: Does his freedom count for anything? Is it the role of the law to minimize violence before it is committed? How many false positives are too many? And so on.
Similarly, his discussion of capital punishment's effects on the murder rate equates lives lost to state-sanctioned killing with lives lost to acts of murder: it's contentious to suggest these things are even commensurable, let alone comparable at a one-to-one ratio. The economist Deirdre McCloskey has fussed about these hidden moral judgments at length, so I am surprised Ayres couldn't manage to find room for even one sentence on the matter.
For all its moral blindness, it's still an entertaining and informative book. Highly recommended.
I enjoyed this book because it talked about the coming world where folks use data to make decisions, which allows them to refine their ability to make better decisions later. I don't think the author gave enough attention to the issue of measuring all that's important (if you can't do that then you'll be doing yourself a disservice by deciding based on what you can measure). The influence this book had on me resulted in two interesting gains:
1. The insight into his use of the "2SD" rule to calibrate the range of one's confidence.
2. The reference to Michael Lewis's book Moneyball.
A fascinating book which examines how manipulation of large datasets informs ever-increasing facets of our daily life. Lots of examples, but few details until the author discusses the impact on teaching and medicine. The book is a very positive and optimistic view of how statistics and polling are influencing our lives.
Very interesting book about the ever-increasing role of statistics in a wide variety of fields. Ayres demonstrates how corporations are moving towards statistical analysis for everything from pricing their products to predicting how much money a movie might make by evaluating the script alone. I found it very compelling and actually learned a few things about statistics.
3.0 out of 5 stars An easy read on data-driven decision making, October 17, 2007
Ian Ayres's book is another book extolling the virtues of data-driven decision making. In that regard it is very similar to Competing on Analytics: The New Science of Winning. The book focuses on the power of data mining and other analytic techniques, especially when combined with randomized or double-blind studies and the kind of testing often called Adaptive Control (discussed, for instance, in my book Smart Enough Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions). It asserts, and demonstrates with many studies and studies of studies, that this kind of data-driven decision making outperforms traditional experts essentially all the time.
While Ian is a little in love with the subject, and while he has created an unnecessary and irritating label (Super Crunchers) when he could have called these people Data Miners like everyone else, the book is well written and an easy read.
He has some fun examples - everything from the mathematical prediction of wine vintages to established stories like Harrah's and CapOne.
I liked the way in which he talks about the changing role of experts in this world: not interpreting results but providing the subjective or face-to-face input that algorithms need to make better decisions. I think many organizations will go through a similar progression. First they might adopt a purely rules-driven or expert-centric approach. Gradually, as their data and their understanding of it improve, they might tune these rules with analytic models. Ultimately they may well find that the rules are definitively subordinate to the models, with most or even all of the decision-making power coming from the models. Unlike the experts in Ian's stories, one hopes the rules will not be upset by this!
One section also made a great point, highlighting in passing a potential advantage of adopting decision automation over more traditional forms of decision support. While people using decision support systems do better than people alone, they still don't do as well as the analytic model would on its own. Decision automation, with its reliance on the model, would obviate this problem.
He does not spend enough time discussing the difference between causation and correlation nor does he talk much about the constraints that can be imposed through regulation or explicit company policy. His focus is often on one-off insight that changes how organizations do something rather than on the use of this kind of decision making in high-volume, transactional systems.
Finally I agree with him that the rise of automation in decision making will force consumers to retaliate by getting access to data, and the implications of that data, to resist the ability of companies to use data to their advantage.
Overall a good book, though not perhaps as good as Competing on Analytics: The New Science of Winning.