About the Author

Kenneth Cukier is the deputy executive editor of The Economist and hosts its weekly technology podcast. Viktor Mayer-Schönberger is professor of internet governance and regulation at the Oxford Internet Institute at the University of Oxford. He is also a faculty affiliate of the Belfer Center for Science and International Affairs at Harvard University. Francis de Véricourt is professor of management science and director of the Center for Decisions, Models, and Data at the European School of Management and Technology.

Includes the name: Kenneth N. Cukier

Associated Works

Megatech: Technology in 2050 (2017) — Contributor — 85 copies, 1 review

Common Knowledge

Gender: male
Occupations: journalist
Nationality: USA
Associated Place (for map): USA

Reviews

34 reviews
[B]ig data is about three major shifts of mindset that are interlinked and hence reinforce one another. The first is the ability to analyze vast amounts of data about a topic rather than be forced to settle for smaller sets. The second is a willingness to embrace data’s real-world messiness rather than privilege exactitude. The third is a growing respect for correlations rather than a continuing quest for elusive causality.

That’s “big data” the concept, to which my reactions were, respectively, bogglement, disagreement, and suspicion. And then there’s Big Data the book, wherein the authors unpacked their ideas and transformed mine.

First, about the mind-numbing amount of data, coming from everywhere -- Google and Facebook and public surveillance cameras for sure, but suffice it to say that everything electronic is gathering data, and everything that connects to the Internet is uploading the data to someone. And about the format of data, which has morphed from 75% analog in 2000 to 93% digital in 2007 (estimated to be >98% in 2013). Second, that the tidy, structured data of relational databases is now minuscule (estimated at 5%) compared with the as-yet untapped, error-ridden stuff of real life, like blogs and video. And third, that conceiving hypotheses, gathering perfect, representative data, and reaching causal conclusions is nowhere near as valuable or timely as finding correlations (the “what, not why”) in a gigantic mess of data. The authors characterize big data as, “the equivalent of impressionist painting, wherein each stroke is messy when examined up close, but by stepping back one can see a majestic picture.” Fascinating!

Then they address the problems of big data and, unlike most “alarmist” books I’ve read, they propose solutions. They advise that the ship has sailed on individuals being in control of their private information and online footprints (e.g. via opting out or being anonymous), especially with the secondary and tertiary (and quaternary, and...) markets that re-analyze data long after it’s been collected. So they suggest that the data users be held accountable through law/regulation similar to what’s in place for other industries that hold potential for public harm. They suggest a new professional -- a “data scientist” or “algorithmist” -- who isn’t the do-er who queries big data but rather the outside-the-lines thinker with a big-data mindset who “peers into databases to make a discovery” that creates new value. And they caution against “what’s-past-is-prologue” thinking -- where personal history and the statistics of correlation drive everything from basing your credit score upon the credit scores of your Facebook friends, to Minority Report-like “predictive policing” -- arguing instead for safeguards that recognize free will and actual behaviors.

Here is a book with the awe I’ve been seeking! I turned every page with excitement about what would be on the next page. There’s some repetition, but it’s usually with a twist that enhances internalization and recollection, and there are dozens of fascinating business examples along the way. It’s optimistic not alarmist; rather than running to find a doomsday hidey-hole, I came away transformed. It’s the best book I’ve read so far this year.

(Review based on an advance reading copy provided by the publisher.)
½
Big data, one of recent years' new buzzwords, has now gotten itself a book with said title. Mayer-Schönberger and Cukier's "Big data: a revolution that will transform how we live, work and think" focuses mostly on what businesses can do with big data, and you won't find much material in it if you are a technically oriented data scientist. The book is from 2013 and already seems dated in the light of the Snowden revelations. The authors' critique of personal big data collection does not mention the dragnet operations of signals intelligence agencies, apart from an 8-line paragraph on William Binney.

The authors claim three features of big data ("three major shifts of mindset"): "more", messy, and correlation rather than causality. I am not entirely convinced that these features distinguish big data. Interventional A/B testing seems, at least to some degree, to probe causality rather than just correlation. Such tests are continuously run by major Internet companies on unsuspecting users on a large scale. Thus I would say big data processing is indeed probing causality. Nor do I agree that big data is messier than old-time small data. Anyone working seriously with small data will readily find that handling such data can be a considerable headache and requires some processing and 'understanding'. Indeed, big data technologies have brought us means for handling messy data in a more structured way (JSON, NoSQL, Semantic Web, Wikidata). The reason why small data may feel less messy could be that the clean-up of small data can be done manually in a spreadsheet by a non-programmer, while for big data you need automatic tools and probably a programmer.

The authors also claim that we will see a rise in the profession called 'the algorithmist', whose job it will be to review algorithms. I do not think this is likely. The closest we will probably get is the Google advisory board on the 'right to be forgotten'.

The authors also fail to give us a proper critique of big data hype. Their initial example of Google Flu Trends is dated: a publication from March 2014 shows a wrong flu prevalence estimation from Google Flu Trends (see 'The Parable of Google Flu: Traps in Big Data Analysis'). Zeo, the EEG sleep-tracking company mentioned in the book and hailed back in 2013 as one of the "8 Best Sleep Tracking Apps and Devices", has run out of money and is 'out of business', and you won't get a response from www.myzeo.com.

While the authors tell us that companies collect vast amounts of data and that "Companies may be powerful", they assure us on page 156 that the companies "don't have the state's powers to coerce". Well, yes. But states have the ability to coerce a company to hand over any personal data. Indeed, U.S. companies are coerced to hand over overseas data. Loretta A. Preska of the United States District Court told Microsoft as much. And within the U.S. PRISM program the handover is determined in secret FISA courts.

But in general there is a good all-round discussion of the issues of Big Data, e.g., the notion of "collect everything" without necessarily having a predefined purpose. It would be interesting to hear the authors' opinions in light of the Snowden revelations.
The book gave me a new appreciation as to how we can go about making decisions in the future. It also helped me understand a bit more about the debate whirling around the NSA and its desire to get all the information that it can. On the other hand, it reminded me that making decisions based solely on data can be dangerous, as in the case of Robert McNamara and the Vietnam War.
These authors argue that in the era of Big Data, the idea of privacy is obsolete. We click, we search, we call, we charge it, and computers can process all that data faster than we imagine. While it takes the CDC doctors weeks to verify the outbreak of a flu epidemic, Google can detect an increase in the number of flu-related Internet searches almost immediately! The ability to categorize and even identify people from the immensity of anonymous data will jolt the reader. Their enthusiasm comes through in the writing, whether you are a businessman hoping to use Big Data or a citizen concerned about the changing digital landscape.

Statistics

Works: 11
Also by: 1
Members: 882
Popularity: #29,045
Rating: 3.6
Reviews: 31
ISBNs: 42
Languages: 11
Favorited: 1