Book Review: Data-ism

New York Times reporter Steve Lohr borrowed the term “Data-ism” from fellow columnist David Brooks to capture the growing realization that big data has had a huge impact on both the practice of business and our daily lives. In this book, Mr. Lohr explores the fascinating human interest stories that lie behind this extraordinary wave of societal change.

Early on, the author states that much of the narrative will revolve around a leading data scientist named Jeffrey Hammerbacher, and a very familiar company named IBM. Though this isn’t quite true – there are lots of other individuals and companies discussed at length throughout the book – this tactic of looking at both individuals and entire companies provides a balanced picture of how organizations are being transformed by the promise of big data.

The tale told of Jeffrey Hammerbacher’s life and work is typical of the technology-driven craving for data that motivates the new breed of data scientists. His career has been varied, intense and even a bit reckless. After working at Bear Stearns, and then at Facebook when it was still a small startup, he founded Cloudera, a leading big data firm that capitalizes on Hadoop, and is now involved with genotype research at Mount Sinai Hospital in New York City. The book also relates numerous big data research projects that IBM has undertaken, from the first appearance of Watson on the game show Jeopardy! to its collaboration with the Gallo Winery to apply the principles of big data and “precision agriculture” to dramatically increase vine yields.

There’s not a lot of discussion of technical specifics, but one broad issue that’s mentioned relates to the distinction between correlation and association. It’s noted that a primary tool of big data is correlation, but that correlation isn’t always enough, particularly in more complex situations that require a deeper understanding of the issues. In those cases, more powerful “associations” need to be formulated that take context into account. An example of such a system is Carnegie Mellon’s Never-Ending Language Learning system (NELL), which has scanned millions of web pages to discover textual patterns that can be used to add semantic meaning to mere facts. The author makes the case that the algorithms used in big data ultimately need to be explained to users so they understand how conclusions were reached.

Data-ism is a mostly optimistic view of the potential for big data, but in a later chapter titled “The Prying Eyes of Big Data,” the author offers a perfunctory caution regarding privacy concerns. As noted, there’s certainly a risk that data-driven mistakes can cause harm to individuals, particularly in cases where discrimination occurs due to incorrect assumptions, for example when an individual is categorized as a likely criminal due to their last name. However, the author observes that privacy concerns have long been a societal issue, even before the advent of the internet. In the early 1900s, many were upset when owners of cheap Kodak Brownie cameras used them to take photos of unsuspecting women on the beach. Mr. Lohr goes on to make the point that, in reality, one’s credit card purchases are a lot more revealing than one’s web-surfing habits.

As a newspaper reporter, Mr. Lohr has adopted an objective tone in the book. However, in the final chapter, he allows himself to offer a few cogent thoughts about the future of big data. Mainly, he wonders if twentieth-century managerial capitalism will give way to a kind of data capitalism. As synthesized by Alfred D. Chandler Jr. in The Visible Hand, managerial capitalism relies on financial metrics to measure the success of an organization. In a sense, this is a focus on the past, albeit the recent past. In contrast, a greater emphasis on data might allow managers to focus their attention on the present as they attempt to predict the future. Still, the author voices a degree of skepticism, noting that the boasts of big data may turn out to be reminiscent of the mania for Frederick Taylor’s now-discredited scientific management credos that were so popular in the early decades of the twentieth century. As such, the ultimate success of big data is by no means a certainty.

Data-ism: The Revolution Transforming Decision Making, Consumer Behavior, and Almost Everything Else
by Steve Lohr
Harper Business, March 2015
240 pages, $29.99
