description: an example of how Walmart used big data analytics to discover that Pop-Tarts sales increased before hurricanes
17 results
by Viktor Mayer-Schönberger and Kenneth Cukier · 5 Mar 2013 · 304pp · 82,395 words
as its first undergrad to major in computer science. From his perch at the University of Washington, he started a slew of big-data companies before the term “big data” became known. He helped build one of the Web’s first search engines, MetaCrawler, which was launched in 1994 and snapped
…
in the past they were only available to spy agencies, research labs, and the world’s biggest companies. After all, Walmart and Capital One pioneered the use of big data in retailing and banking and in so doing changed their industries. Now many of these tools have been democratized (although the
…
pairs. But the absolute number of data points alone, the size of the dataset, is not what makes these examples of big data. What classifies them as big data is that instead of using the shortcut of a random sample, both Flu Trends and Steve Jobs’s doctors used as much
…
noticed that prior to a hurricane, not only did sales of flashlights increase, but so did sales of Pop-Tarts, a sugary American breakfast snack. So as storms approached, Walmart stocked boxes of Pop-Tarts at the front of stores next to the hurricane supplies, to make life easier for customers dashing in
…
need to have an inkling of how airlines price their tickets. We don’t need to care about the culinary tastes of Walmart shoppers. Instead we can subject big data to correlation analysis and let it tell us what search queries are the best proxies for the flu, whether an airfare
…
much information will be worth to us in the era of big data. We’ve seen some of this potential realized already, as when Walmart searched its database of old sales receipts and spotted the lucrative correlation between hurricanes and Pop-Tarts sales. All this suggests that data’s full value is much
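The excerpt above describes spotting a correlation in old sales records. A minimal sketch of that kind of check follows, assuming a tiny made-up dataset: the weekly figures, the product names, and the `pearson` helper are all hypothetical illustrations, not Walmart's or Teradata's actual data or method.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly records: 1 if a hurricane warning was in effect, else 0,
# paired with invented unit sales for two products.
hurricane_warning = [0, 0, 1, 0, 1, 0, 0, 1]
unit_sales = {
    "pop_tarts": [100, 110, 700, 95, 650, 105, 98, 720],
    "bananas": [300, 310, 305, 295, 290, 315, 300, 298],
}

# Rank products by how strongly their sales move with hurricane warnings.
ranked = sorted(
    ((pearson(hurricane_warning, sales), product)
     for product, sales in unit_sales.items()),
    reverse=True,
)
for r, product in ranked:
    print(f"{product}: r = {r:.2f}")
```

The ranking only says which products move with the warnings; it says nothing about why, which is the trade-off the excerpts describe.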
…
up with the most innovative uses for it. In the case of Walmart and Pop-Tarts, for example, the retailer turned to the specialists at Teradata, a data-analytics firm, to help tease out the insights. Third is the big-data mindset. For certain firms, the data and the know-how are not
…
are likely to change over time. To start, let’s examine each category—data holder, data specialist, and big-data mindset—in turn. The big-data value chain The primary substance of big data is the information itself. So it makes sense to look first at the data holders. They may not have
…
additional value. The challenge for the victors of a small-data world and for offline champions—companies like Walmart, Procter & Gamble, GE, Nestlé, and Boeing—is to appreciate the power of big data and collect and use data more strategically. The aircraft engine-maker Rolls-Royce completely transformed its business over
…
find they no longer need to attain a threshold in size. Instead, they can remain small and still flourish (or be acquired by a big-data giant). Big data squeezes the middle of an industry, pushing firms to be very large, or small and quick, or dead. Many traditional sectors will eventually
…
in all sectors, but it will certainly place pressure on companies in industries that are vulnerable to being shaken up by the power of big data. Big data is poised to disrupt the competitive advantages of states as well. At a time when manufacturing has been largely lost to developing countries and
…
the office and going online to learn how to protect themselves, not because the searchers are ill themselves. The dark side of big data As we have seen, big data allows for more surveillance of our lives while it makes some of the legal means for protecting privacy largely obsolete. It also
…
and one fraught with risk for the rest of us. It is obviously impossible to foretell how a technology will develop; even big data can’t predict how big data will evolve. Regulators will need to strike a balance between acting cautiously and boldly—and the history of antitrust law points to
…
cost of reminding the sick to take their medication. Languages are translated and cars drive themselves on the basis of predictions made through big-data correlations. Walmart can learn which flavor Pop-Tarts to stock at the front of the store before a hurricane. (Answer: strawberry.) Of course, causality is nice when you
…
fallible. This doesn’t mean they’re wrong, only that they are always incomplete. It doesn’t negate the insights that big data offers, but it puts big data in its place—as a tool that doesn’t offer ultimate answers, just good-enough ones to help us now until better
…
Random House, 2008); for more, see Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable (2nd ed., Random House, 2010). Walmart and Pop-Tarts: Constance L. Hays, “What Wal-Mart Knows About Customers’ Habits,” New York Times, November 14, 2004 (http://www.nytimes.com/2004/11/14/business/yourmoney
…
Media, June 26, 2012 (http://strata.oreilly.com/2012/06/predictive-data-analytics-big-data-nyc.html). Walmart and Pop-Tarts: Hays, “What Wal-Mart Knows About Customers’ Habits.” Big data’s use in slums and in modeling refugee movements: Nathan Eagle, “Big Data, Global Development, and Complex Systems,” http://www.youtube.com/watch?v=yaivtqlu7iM. Perception
by Seth Stephens-Davidowitz · 8 May 2017 · 337pp · 86,320 words
Data Stories 6. All the World’s a Lab The ABCs of A/B Testing Nature’s Cruel—but Enlightening—Experiments PART III: BIG DATA: HANDLE WITH CARE 7. Big Data, Big Schmata? What It Cannot Do The Curse of Dimensionality The Overemphasis on What Is Measurable 8. Mo Data, Mo Problems? What
…
, such as by offering beer money to the sophomores in our courses. This book is about a whole new way of studying the mind. Big Data from internet searches and other online responses are not a cerebroscope, but Seth Stephens-Davidowitz shows that they offer an unprecedented peek into people’s
…
searches and video views of anonymous people around the world. In other words, I have taken a very deep dive into what is now called Big Data. Further, I have interviewed dozens of others—academics, data journalists, and entrepreneurs—who are also exploring these new realms. Many of their studies will
…
among other things, their sexless marriages, their mental health issues, their insecurities, and their animosity toward black people. Most important, to squeeze insights out of Big Data, you have to ask the right questions. Just as you can’t point a telescope randomly at the night sky and have it discover Pluto
…
make for successful relationships. At that Thanksgiving table, for that question, my grandmother has access to the largest number of data points. My grandmother is Big Data. In this book, I want to demystify data science. Like it or not, data is playing an increasingly important role in all of our
…
If the methodology of the best data science is frequently natural and intuitive, as I claim, this raises a fundamental question about the value of Big Data. If humans are naturally data scientists, if data science is intuitive, why do we need computers and statistical software? Why do we need the
…
’s accomplishments, in other words, are even more exceptional than they appear to be at first. Data proves that, too. PART II THE POWERS OF BIG DATA 2 WAS FREUD RIGHT? I recently saw a person walking down a street described as a “penistrian.” You caught that, right? A “penistrian” instead
…
will eventually write “penistrian.” Freud’s theory that errors reveal our subconscious wants is indeed falsifiable—and, according to my analysis of the data, false. Big Data tells us a banana is always just a banana and a “penistrian” just a misspelled “pedestrian.” So was Freud totally off-target in all his
…
who dream of cucumbers versus those who dream of tomatoes. Allowing us to zoom in on small subsets of people is the third power of Big Data. Big Data has one more impressive power—one that was not utilized in my quick study of Freud but could be in a future one: it allows
…
Wide Web was invented. But his strategy was very much based on data science. And the lessons from his story are applicable to anybody using Big Data. For years, Seder’s pursuit produced nothing but frustration. He measured the size of horses’ nostrils, creating the world’s first and largest dataset
…
than normal in the days leading up to a hurricane. Based on their analysis, Walmart had trucks loaded with strawberry Pop-Tarts heading down Interstate 95 toward stores in the path of the hurricane. And indeed, these Pop-Tarts sold well. Why Pop-Tarts? Probably because they don’t require refrigeration or cooking. Why strawberry? No clue
…
into data wasn’t feasible. Now, with computers and digitization, tabulating words across massive sets of documents is easy. Language has thus become subject to Big Data analysis. The links that Google utilized were composed of words. So are the Google searches that I study. Words feature frequently in this book. But
…
in the methods of data collection. The truth may be different—and, sometimes, far darker. THE TRUTH ABOUT YOUR FACEBOOK FRIENDS This book is about Big Data, in general. But this chapter has mostly emphasized Google searches, which I have argued reveal a hidden world very different from the one we think
…
adult databases and correlates them with key childhood events. It can help us tackle this and related questions. We might call this increasing use of Big Data to answer psychological questions Big Psych. To see how this works, let’s consider a study I conducted on how childhood experiences influence which baseball
…
A small survey of a couple of thousand people won’t have a large enough sample of such men. This is the third power of Big Data: Big Data allows us to meaningfully zoom in on small segments of a dataset to gain new insights on who we are. And we can zoom in
…
tax policy; a survey of ten thousand people was plenty. Chetty and his team were understandably discouraged. And then, finally, the researchers realized their mistake. “Big Data is not just about doing the same thing you would have done with surveys except with more data,” Chetty explains. They were asking little data
…
questions of the massive collection of data they had been handed. “Big Data really should allow you to use completely different designs than what you would have with a survey,” Chetty adds. “You can, for example, zoom
…
we talking about one or two murders every decade or hundreds of murders every year? Anecdotes and experiments can’t answer this. To see if Big Data could, two economists, Gordon Dahl and Stefano DellaVigna, merged together three Big Datasets for the years 1995 to 2004: FBI hourly crime data, box
…
successfully predict which ways stocks are headed? The short answer is no. In the previous chapters we discussed the four powers of Big Data. This chapter is all about Big Data’s limitations—both what we cannot do with it and, on occasion, what we ought not do with it. And one place
…
behavior of the S&P Index for twenty years—and coins will struggle to keep up. The curse of dimensionality is a major issue with Big Data, since newer datasets frequently give us exponentially more variables than traditional data sources—every search term, every category of tweet, etc. Many people who
…
claim to predict the market utilizing some Big Data source have merely been entrapped by the curse. All they’ve really done is find the equivalent of Coin 391. Take, for example, a
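The coin analogy in this excerpt can be sketched as a small simulation. This is an illustrative sketch only: the 1,000 coins, the 250 trading days, the 50/50 market, and the function name `luckiest_coin` are assumptions chosen for the example, not figures from the book.

```python
import random

def luckiest_coin(n_coins=1000, n_days=250, seed=0):
    """Return (index, hit_rate) of the coin whose flips best match the market.

    Every coin is fair and independent of the market, so any apparent
    predictive power is pure chance.
    """
    rng = random.Random(seed)
    # Hypothetical market: each day is up (True) or down (False), 50/50.
    market = [rng.random() < 0.5 for _ in range(n_days)]
    best_index, best_rate = -1, -1.0
    for index in range(n_coins):
        flips = [rng.random() < 0.5 for _ in range(n_days)]
        hit_rate = sum(f == m for f, m in zip(flips, market)) / n_days
        if hit_rate > best_rate:
            best_index, best_rate = index, hit_rate
    return best_index, best_rate

index, rate = luckiest_coin()
print(f"Coin {index} 'predicted' the market on {rate:.0%} of days")
```

With enough candidate variables, the best one looks well above 50 percent in hindsight even though none carries any signal. On fresh days the same coin would be expected to fall back to about 50 percent.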
…
are supplemented, in other words, by smaller data (“Do you want to see this post in your News Feed?” “Why?”). Yes, even a spectacularly successful Big Data organization like Facebook sometimes makes use of the source of information much disparaged in this book: a small survey. Indeed, because of this need for
…
Facebook employs social psychologists, anthropologists, and sociologists precisely to find what the numbers miss. Some educators, too, are becoming more alert to blind spots in Big Data. There is a growing national effort to supplement mass testing with small data. Student surveys have proliferated. So have parent surveys and teacher observations, where
…
over the millennia to understand the world. They complement each other. 8 MO DATA, MO PROBLEMS? WHAT WE SHOULDN’T DO Sometimes, the power of Big Data is so impressive it’s scary. It raises ethical questions. THE DANGER OF EMPOWERED CORPORATIONS Recently, three economists—Oded Netzer and Alain Lemaire, both of
…
the height of their earning power—than to students or senior citizens and why airlines often charge more to last-minute purchasers. They price discriminate. Big Data may allow businesses to get substantially better at learning what customers are willing to pay—and thus gouging certain groups of people. Optimal Decisions Group
…
how much pain they can withstand. It’s probably the same amount as Helen. Indeed, this is what the casino Harrah’s does, utilizing a Big Data warehouse firm, Teradata, to assist them. Scott Gnau, general manager of Teradata, explains, in the excellent book Super Crunchers, what casino managers do when
…
and better use of online data will give casinos, insurance companies, lenders, and other corporate entities too much power over us. On the other hand, Big Data has also been enabling consumers to score some blows against businesses that overcharge them or deliver shoddy products. One important weapon is sites, such as
…
not have been stabbed to death that evening. In the movie Minority Report, psychics collaborate with police departments to stop crimes before they happen. Should Big Data be made available to police departments to stop crimes before they happen? Should Donato have at least been warned about her ex-boyfriend’s foreboding
…
economists, sociologists, and psychologists are soft scientists who throw around meaningless jargon so they can get tenure. To the extent this was ever true, the Big Data revolution has changed that. If Karl Popper were alive today and attended a presentation by Raj Chetty, Jesse Shapiro, Esther Duflo, or (humor me)
…
one particular water pump. This suggested the disease spread through germ-infested water, disproving the then-conventional idea that it spread through bad air. Big Data—and the zooming in that it allows—makes this type of study easy. For any disease, we can explore Google search data or other digital
…
human sexuality. It took time for the natural sciences to begin changing our lives—to create penicillin, satellites, and computers. It may take time before Big Data leads the social and behavioral sciences to important advances in the way we love, learn, and live. But I believe such advances are coming.
…
mathematician at the University of Wisconsin, was curious about how many people actually finish books. He thought of an ingenious way to test it using Big Data. Amazon reports how many people quote various lines in books. Ellenberg realized he could compare how frequently quotes were highlighted at the beginning of the
…
time. His book Information Rules, written with Carl Shapiro, basically predicted the future. And his paper “Predicting the Present,” with Hyunyoung Choi, largely started the Big Data revolution in the social sciences that is described in this book. He is also an amazing and kind mentor, as so many who have worked
…
Pregnant Women Want?” 21 Friedman says: I interviewed Jerry Friedman by phone on October 27, 2015. 21 sampling of all their data: Hal R. Varian, “Big Data: New Tricks for Econometrics,” Journal of Economic Perspectives 28, no. 2 (2014). CHAPTER 1: YOUR FAULTY GUT 26 The best data science, in fact,
…
). The flaws in the original model were discussed in David Lazer, Ryan Kennedy, Gary King, and Alessandro Vespignani, “The Parable of Google Flu: Traps in Big Data Analysis,” Science 343, no. 6176 (2014). A corrected model is presented in Shihao Yang, Mauricio Santillana, and S. C. Kou, “Accurate Estimation of Influenza
…
Economics 117, no. 4 (2002). 239 Warren Buffett: Alice Schroeder, The Snowball: Warren Buffett and the Business of Life (New York: Bantam, 2008). CHAPTER 7: BIG DATA, BIG SCHMATA? WHAT IT CANNOT DO 247 claimed they could predict which way: Johan Bollen, Huina Mao, and Xiaojun Zeng, “Twitter Mood Predicts the Stock
…
92 Bezos, Jeff, 203 bias implicit, 134 language as key to understanding, 74–76 omitted-variable, 208 subconscious, 132 See also hate; prejudice; race/racism Big Data and amount of information, 15, 21, 59, 171 and asking the right questions, 21–22 and causality experiments, 54, 240 definition of, 14, 15 and
…
Capital in the 21st Century (Piketty), 283 casinos, and price discrimination, 263–65 causality A/B testing and, 209–21 and advertising, 221–25 and Big Data experiments, 54, 240 college and, 237–39 correlation distinguished from, 221–25 and ethics, 226 and monetary windfalls, 229 natural experiments and, 226–28 and
…
power of Big Data, 54, 211 and randomized controlled experiments, 208–9 reverse, 208 and Stuyvesant High School study, 231–37, 240 Centers for Disease Control and Prevention, 57
…
Hillary. See elections, 2016 A Clockwork Orange (movie), 190–91 cnn.com, 143, 145 Cohen, Leonard, 82n college and causality, 237–39 and examples of Big Data searches, 22 college towns, and origins of notable Americans, 182–83, 184, 186 Colors (movie), 191 Columbia University, Microsoft pancreatic cancer study and, 28–29
…
103 sources of, 14, 15 speed for transmitting, 55–59 and understanding the world, 280 what counts as, 74 words as, 74–97 See also Big Data; data science; small data; specific data data science as changing view of world, 34 and counterintuitive results, 37–38 economists role in development of,
…
life expectancy, 177 EPCOR utility company, 193, 194 EQB, 63–64 equality of opportunity, zooming in on, 173–75 Error Bot, 48–49 ethics and Big Data, 257–65 and danger of empowered government, 267 doppelganger searches and, 262–63 empowered corporations and, 257–65 and experiments, 226 hiring practices and, 261
…
sex/porn searches, 19 Indiana University, and dimensionality study, 247–48 individuals, predicting the actions of, 266–70 influenza, data about, 57, 71 information. See Big Data; data; small data; specific source or search Instagram, 99, 151–52, 261 Internal Revenue Service (IRS), 172, 178–80. See also taxes internet as addiction
…
, 28–29 Pandora, 203 Pantheon project (Massachusetts Institute of Technology), 184–85 parents/parenting and child abuse, 145–47, 149–50, 161 and examples of Big Data searches, 22 and prejudice against children, 134–36, 135n Parks, Rosa, 93, 94 Parr, Ben, 153–54 Pathak, Parag, 235–36 PatientsLikeMe.com, 205
…
notable Americans from, 186, 187 Runaway Bride (movie), 192, 195 sabermetricians, 198–99 San Bernardino, California, shooting in, 129–30 Sands, Emily, 202 science and Big Data, 273 and experiments, 272–73 real, 272–73 at scale, 276 soft, 273 search engines differentiation of Google from other, 60–62 for pornography, 61n
…
, 282n professional background of, 14 and writing conclusions, 271–72, 279, 280–84 Stern, Howard, 157 stock market data for, 55–56 and examples of Big Data searches, 22 Summers-Stephens-Davidowitz attempt to predict the, 245–48, 251–52 Stone, Oliver, 185 Stoneham, James, 266, 269 Storegard, Adam, 99–101
…
a BA in philosophy from Stanford, where he graduated Phi Beta Kappa, and a PhD in economics from Harvard. His research—which uses new, big data sources to uncover hidden behaviors and attitudes—has appeared in the Journal of Public Economics and other prestigious publications. He lives in New York City
…
that a man who searches for information about Judy Garland is three times more likely to search for gay porn than straight porn. Some stereotypes, Big Data tells us, are true. * I think this data also has implications for one’s optimal dating strategy. Clearly, one should put oneself out there,
by Michael P. Lynch · 21 Mar 2016 · 230pp · 61,702 words
The Internet of Us Knowing More and Understanding Less in the Age of Big Data Michael Patrick Lynch Liveright Publishing Corporation A Division of W. W. Norton & Company Independent Publishers Since 1923 New York London Copyright © 2016 by Michael Patrick
…
edition as follows: Names: Lynch, Michael P. (Michael Patrick), 1966– author. Title: The Internet of us : knowing more and understanding less in the age of big data / Michael Patrick Lynch. Description: First Edition. | New York : Liveright Publishing Corporation, 2016. | Includes bibliographical references and index. Identifiers: LCCN 2015051171 | ISBN 9780871406613 (hardcover) Subjects: LCSH
…
beings around the globe interact with one another, economically and otherwise.5 The Internet of Things is made possible by—and is also producing—big data. The term “big data” has no fixed definition, but rather three connected uses. First, it names the ever-expanding volume of data that surrounds us. You’ve heard
…
it is in the zettabytes. That’s hard to get your mind around. As Viktor Mayer-Schönberger and Kenneth Cukier estimate in their recent book, Big Data: A Revolution That Will Transform How We Live, Work, and Think, if you placed that much information on CD-ROMs (remember them?) it would stretch
…
ancient library of Alexandria.6 And by the time you are reading this, the numbers will be even bigger. So, one use of the term “big data” refers to the massive amount of data making up our digital form of life. In a second sense, it can be used to talk about
…
sort of information to further target the types of products they market. As a consequence of the increasing importance of data analytics, we might employ “big data” in a third sense—to refer to firms like Google or Amazon that utilize data analytics as an essential part of their business model, and
…
government agencies like the NSA that use these techniques as an essential part of, well, their business model. In this third sense, Big Data is like Big Oil. Large oil conglomerates are powerful because they control how the world’s major energy resource is not only distributed but how
…
it is extracted. The tech giants are similar. Energy is not information, but both are resources, and resources by which the world runs. And Big Data, like Big Oil, is big precisely because it can control access to data as well as the extraction of information and knowledge from that data
…
. Big Data refines data for information and knowledge, and we need to pay attention to that fact because knowledge, like energy, is not just a passive, inert
…
. And the power that comes with control over that fuel is therefore formidable. Knowledge, as Sir Francis Bacon said, is power. The big numbers behind big data, and the power inherent in those numbers, are impressive. Not long ago, it was said we were living in a time of information “glut”; we
…
become transparent. We look out the window of the Internet even as the Internet looks back in. Most of the data being collected in the big data revolution is about us. “Cookies”—those insidious (and insidiously named) little Internet genies—have allowed websites to track our clicking for decades. Now much more
…
sophisticated forms of data analysis allow the lords of big data, like Google and Amazon, to form detailed profiles of our preferences. That’s what makes the now ubiquitous targeted ad possible. Searching for new shoes
…
events at millions of locations across the globe. And, of course, data mining isn’t done just for business purposes. Arguably, the United States’ largest big data enterprise is run by the NSA, which was intercepting and storing an estimated 1.7 billion emails, phone calls and other types of communications every
…
removed from the NSA’s databases. This has not yet been done.9 Privacy and the Concept of a Person The potential dangers of abusing big data are one reason the storage of incidentally collected information is wrong. But there is another: the more insidious harm is not “instrumental” but “in principle
…
seems like memory—then what are we testing for when giving exams? What, in general, is the point of higher education in the age of big data? These questions come at a time when the idea of the university itself is often said to be in crisis—especially in the United States
…
, known to any user of the Internet, is called Google Complete. Search as I just did for “Web 3.0 and …” and Google will suggest “big data” and “education”; search for “knowledge and …” and you might get “power” and “information systems.” Complete is a familiar, if rather gentle, form of
…
big data analysis. It works because Google knows not only what much of the world is searching for on the Web, but also what you’ve been
…
has done more than perhaps any other single high-profile company or entity to usher in the brave new world of big data. As I noted in the first chapter, the term “big data” can refer to three different things. The first is the ever-expanding volume of data being collected by our digital
…
extracting information from that data. And the third is the firms like Google that employ them. One of the lessons of previous chapters is that big data and our digital form of life, while sometimes making it easier to be a responsible and reasonable believer, often makes it harder as well—while
…
and widely cited editorial called “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete.” Anderson claimed that what we are now calling big data analytics was overthrowing traditional ways of doing science: This is a world where massive amounts of data and applied mathematics replace every other tool that
…
occurred as it did. Anderson’s point was that the traditional view assumes that the data is always limited. That, he says, is the assumption big data is overthrowing. In 2013, the data analytics expert Christian Rudder (and cofounder of the dating website OkCupid) echoed Anderson’s point. In talking about the
…
all. Similarly, Google Flu Trends doesn’t care why people are searching as they do; it just correlates the data. And Walmart doesn’t care why people buy more Pop-Tarts before a hurricane, nor do insurance companies care why certain credit scores correlate with certain medication adherences; they care only that they
…
do. As Viktor Mayer-Schönberger and Kenneth Cukier put it, “predictions based on correlations lie at the heart of big data. Correlation analyses are now
…
used so frequently that we sometimes fail to appreciate the inroads they have made. And the uses will only increase.” 4 Does the use of big data in this way, however, really signal the end of theory, as Anderson alleged? The answer is no. And, as we’ll see, that is a
…
very good thing. Start with Rudder and Anderson’s remarks. As Rudder puts it, big data seems to allow us to investigate by direct inspection. We don’t have to look through the lens of a model or theory; we can
…
let the numbers speak for themselves. Big data brings us to the real-life correlations that exist, and because those correlations are so perfectly … well, correlated, we can predict what happens without having
…
productive years? Descartes, for example, died in Sweden, but he spent most of his productive life in France. Once again, theoretical assumptions drive work in big data as much as they do in any other field. Kuhn would not be surprised. None of this diminishes the importance of network analyses as tools
…
method, which they demonstrate has routinely overestimated the number of flu cases by as much as 30 percent. They ascribe this to what they call “big data hubris,” or the assumption that sheer data size alone will always result in more predictive power. The researchers’ point is not that
…
big data techniques aren’t helpful, but that the Google algorithm is not likely to be a good stand-alone method for predicting the spread of the
…
flu. Given our argument above, this is not surprising. Big data techniques are going to assist our models and explanations, not supplant them. The creativity of understanding helps to explain our intuitive sense that understanding is
…
manipulate the threads.5 The threads are strings of information. They are the ties that bind us to one another, and society to us. What big data and the hyperconnectivity of knowledge are doing is making these connections brighter, more numerous, stronger and fundamentally easier to pluck. And so our respect—if
…
-the-biggest-business-in-the-history-of-electronics. Accessed September 4, 2015. 5. Rifkin, The Zero Marginal Cost Society, 11. 6. Mayer-Schönberger and Cukier, Big Data, 9. 7. Floridi, The Fourth Revolution, 25–58. 8. Cavell, Must We Mean What We Say?, 52. 9. Wittgenstein, Philosophical Investigations, 226. 10. Recent books
…
. 4. For an in-depth discussion of some of the complexities here, I recommend Nissenbaum, Privacy in Context, 67ff. See also Lane et al., Privacy, Big Data and the Public Good. 5. Barton Gellman, Julie Tate, and Ashkan Soltani, “In NSA-intercepted Data, Those Not Targeted Far Outweigh Those Who Are,” Washington
…
: June 23, 2008. 2. Rudder, Dataclysm, 10–11. 3. Ginsberg et al., “Detecting Influenza Epidemics Using Search Engine Query Data.” 4. Mayer-Schönberger and Cukier, Big Data, 55–56. The examples just above also come from this interesting and informative book. 5. Kuhn, The Structure of Scientific Revolutions, 59. 6. Bruner and
…
Value of Knowledge and the Pursuit of Understanding. Cambridge, UK: Cambridge University Press, 2003. Lane, Julia, Victoria Stodden, Stefan Bender, and Helen Nissenbaum, eds. Privacy, Big Data and the Public Good: Frameworks for Engagement. Cambridge, UK: Cambridge University Press, 2014. Lazer, David, R. Kennedy, G. King, and A. Vespignani. “The Parable of
…
Google Flu: Traps in Big Data Analysis.” Science 343 (March 14, 2014): 1203–05. Levy, Steven. In the Plex: How Google Thinks, Works and Shapes Our Lives. New York: Simon and
…
Matters for Democracy. Cambridge, MA: MIT Press, 2012. _________. True to Life: Why Truth Matters. Cambridge, MA: MIT Press, 2004. Mayer-Schönberger, Viktor, and Kenneth Cukier. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Boston: Houghton Mifflin, 2013. Mercier, Hugo, and Dan Sperber. “Why Do Humans Reason? Arguments
…
also confirmation bias socially embedded, 116 undesired, 198 Bentham, Jeremy, 91, 92, 97 Berkeley, George, 68–69 Berlin Monthly, 58 Bible, 48, 49, 61, 66 big data: analysis in, see data analysis, data analytics definitions of, 8–9, 156 in digital form of life, 155–78 hyperconnectivity of, 184–88 limitations of
…
in, 111–32 political economy of knowledge in, 133–54 privacy and autonomy issues of, 89–109 Big Data: A Revolution That Will Transform How We Live, Work, and Think (Mayer-Schönberger and Cukier), 8 “big data hubris,” 183 big knowledge, 155–63 “big man” theory, 162 Big Oil, 9 Bing, 30 blogs, blogosphere
…
so” stories, 27–28 Kahneman, Daniel, 29, 51 Kant, Immanuel, 34, 58–60, 62, 85 Kitcher, Philip, 182 knowing-which, as term, 171 knowledge: in big data revolution, 87–190 changing structure of, 125–32 common, 117–19 defined and explained, xvii, 12–17 democratization of, 133–38 digital, see digital knowledge
…
–49 in skeptical argument, 59–60 value judgment and, 57, 62 Science, 161 science fiction, 75 scientific method, in reasoning, 59–62 scientific theory, in big data analysis, 156–63 secondary qualities, defined, 68, 70, 74 Second Life (SIM game), 20 security cameras, 91, 97 self, narrative construction of, 73–74 self
by Cathy O'Neil · 5 Sep 2016 · 252pp · 72,473 words
growing tyranny of an arrogant establishment.” —Ralph Nader, author of Unsafe at Any Speed “Next time you hear someone gushing uncritically about the wonders of Big Data, show them Weapons of Math Destruction. It’ll be salutary.” —Felix Salmon, Fusion “From getting a job to finding a spouse, predictive algorithms are silently
…
a trademark of Penguin Random House LLC. Library of Congress Cataloging-in-Publication Data Name: O’Neil, Cathy, author. Title: Weapons of math destruction: how big data increases inequality and threatens democracy / Cathy O’Neil Description: First edition. | New York: Crown Publishers [2016] Identifiers: LCCN 2016003900 (print) | LCCN 2016016487 (ebook) | ISBN 9780553418811
…
(hardcover) | ISBN 9780553418835 (pbk.) | ISBN 9780553418828 (ebook) Subjects: LCSH: Big data—Social aspects—United States. | Big data—Political aspects—United States. | Social indicators—Mathematical models—Moral and ethical aspects. | Democracy—United States. | United States—Social conditions—21st century. Classification: LCC
…
Journey of Disillusionment CHAPTER 3 ARMS RACE: Going to College CHAPTER 4 PROPAGANDA MACHINE: Online Advertising CHAPTER 5 CIVILIAN CASUALTIES: Justice in the Age of Big Data CHAPTER 6 INELIGIBLE TO SERVE: Getting a Job CHAPTER 7 SWEATING BULLETS: On the Job CHAPTER 8 COLLATERAL DAMAGE: Landing Credit CHAPTER 9 NO SAFE
…
were studying our desires, movements, and spending power. They were predicting our trustworthiness and calculating our potential as students, workers, lovers, criminals. This was the Big Data economy, and it promised spectacular gains. A computer program could speed through thousands of résumés or loan applications in a second or two and sort
…
of that gap is due to her teacher? It’s hard to know, and Mathematica’s models have only a few numbers to compare. At Big Data companies like Google, by contrast, researchers run constant tests and monitor thousands of variables. They can change the font on a single advertisement from blue
…
Pandora, the ideal job on LinkedIn, or perhaps the love of their life on Match.com. Think of the astounding scale, and ignore the imperfections. Big Data has plenty of evangelists, but I’m not one of them. This book will focus sharply in the other direction, on the damage inflicted by
…
finding and holding a job. All of these life domains are increasingly controlled by secret models wielding arbitrary punishments. Welcome to the dark side of Big Data. It was a hot August afternoon in 1946. Lou Boudreau, the player-manager of the Cleveland Indians, was having a miserable day. In the first
…
would also include parameters, or constraints. I might limit the fruits and vegetables to what’s in season and dole out a certain amount of Pop-Tarts, but only enough to forestall an open rebellion. I also would add a number of rules. This one likes meat, this one likes bread and
…
misreading some of them. Here we see that models, despite their reputation for impartiality, reflect goals and ideology. When I removed the possibility of eating Pop-Tarts at every meal, I was imposing my ideology on the meals model. It’s something we do without a second thought. Our own values and
…
that, rather than the movement of markets, I was now predicting people’s clicks. In fact, I saw all kinds of parallels between finance and Big Data. Both industries gobble up the same pool of talent, much of it from elite universities like MIT, Princeton, or Stanford. These new hires are ravenous
…
success, and growing feedback loops. Those who objected were regarded as nostalgic Luddites. I wondered what the analogue to the credit crisis might be in Big Data. Instead of a bust, I saw a growing dystopia, with inequality rising. The algorithms would make sure that those deemed losers would remain that way
…
economy, raking in outrageous fortunes and convincing themselves all the while that they deserved it. After a couple of years working and learning in the Big Data space, my journey to disillusionment was more or less complete, and the misuse of mathematics was accelerating. In spite of blogging almost daily, I could
…
to get the same or better policing out of a smaller force. So in 2013 he invested in crime prediction software made by PredPol, a Big Data start-up based in Santa Cruz, California. The program processed historical crime data and calculated, hour by hour, where crimes were most likely to occur
…
includes risk terrain analysis, which incorporates certain features, such as ATMs or convenience stores, that might attract crimes. Like those in the rest of the Big Data industry, the developers of crime prediction software are hurrying to incorporate any information that can boost the accuracy of their models. If you think about
…
crime. That’s where it is, they say, pointing to the highlighted ghetto on the map. And now they have cutting-edge technology (powered by Big Data) reinforcing their position there, while adding precision and “science” to the process. The result is that we criminalize poverty, believing all the while that our
…
’t incorporate information about how the candidate would actually perform at the company. That’s in the future, and therefore unknown. So like many other Big Data programs, they settle for proxies. And as we’ve seen, proxies are bound to be inexact and often unfair. In fact, the Supreme Court ruled
…
on the hunt for employees who think creatively and work well in teams. So the modelers’ challenge is to pinpoint, in the vast world of Big Data, the bits of information that correlate with originality and social skills. Résumés alone certainly don’t cut it. Most of the items listed there—the
…
faith in the science of phrenology. Phrenology was a model that relied on pseudoscientific nonsense to make authoritative pronouncements, and for decades it went untested. Big Data can fall into the same trap. Models like the ones that red-lighted Kyle Behm and blackballed foreign medical students at St. George’s can
…
hire a part-timer to help with the Saturday crush. These changes add a bit of intelligence to the dumb and inflexible status quo. With Big Data, that college freshman is replaced by legions of PhDs with powerful computers in tow. Businesses can now analyze customer traffic to calculate exactly how many
…
Microsoft use in-house programs to do just this. It’s very similar to a dating algorithm (and often, no doubt, has similarly spotty results). Big Data has also been used to study the productivity of call center workers. A few years ago, MIT researchers analyzed the behavior of call center employees
…
. Investors double down on scientific systems that can place thousands of people into what appear to be the correct buckets. It’s the triumph of Big Data. And what about the person who is misunderstood and placed in the wrong bucket? That happens. And there’s no feedback to set the system
…
still make a mountain of money. That was Douglas Merrill’s idea. A former chief operating officer at Google, Merrill believed that he could use Big Data to calculate risk and offer payday loans at a discount. In 2009, he founded a start-up called ZestFinance. On the company web page, Merrill
…
whether they’ve been in an accident—and not by their consumer patterns or those of their friends or neighbors. Yet in the age of Big Data, urging insurers to judge us by how we drive means something entirely new. Insurance companies now have manifold ways to study drivers’ behavior in exquisite
…
the same politician, one vowing to protect wilderness and the other stressing law and order. Direct mail was microtargeting on training wheels. The convergence of Big Data and consumer marketing now provides politicians with far more powerful tools. They can target microgroups of citizens for both votes and money and appeal to
…
be heading up the data team for Obama’s campaign. In his previous position, at Accenture Labs in Chicago, Ghani had developed consumer applications for Big Data, and he trusted that he could apply his skills to politics. The goal for the Obama campaign was to create tribes of like-minded voters
…
adapt, we change, and so do our processes. Automated systems, by contrast, stay stuck in time until engineers dive in to change them. If a Big Data college application model had established itself in the early 1960s, we still wouldn’t have many women going to college, because it would have been
…
men, the people paid by rich patrons to create art. The University of Alabama’s football team, needless to say, would still be lily white. Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that’s something only humans can provide. We have
…
to explicitly embed better values into our algorithms, creating Big Data models that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit. In a sense, our society is struggling with a new
…
important decisions about the people we trust to teach our children. That’s a job that requires subtlety and context. Even in the age of Big Data, it remains a problem for humans to solve. Of course, the human analysts, whether the principal or administrators, should consider lots of data, including the
…
needs an update. The bill currently prohibits medical exams as part of an employment screening. But we need to update it to take into account Big Data personality tests, health scores, and reputation scores. They all sneak around the law, and they shouldn’t be able to. One possibility already under discussion
…
and Accountability Act (HIPAA), which protects our medical information, in order to cover the medical data currently being collected by employers, health apps, and other Big Data companies. Any health-related data collected by brokers, such as Google searches for medical treatments, must also be protected. If we want to bring out
…
Section 8 was put up, an extremely awkward and brief conversation took place. Someone demanded the slide be taken down. The party line prevailed. While Big Data, when managed wisely, can provide important insights, many of them will be disruptive. After all, it aims to find patterns that are invisible to human
…
/data-mining-moves-to-human-resources. MIT researchers analyzed the behavior of call center employees: Joshua Rothman, “Big Data Comes to the Office,” New Yorker, June 3, 2014, www.newyorker.com/books/joshua-rothman/big-data-comes-to-the-office. A Nation at Risk: National Commission on Excellence in Education, A Nation at Risk
…
/01/31/your-money/credit-and-debit-cards/31money.html. Douglas Merrill’s idea: Steve Lohr, “Big Data Underwriting for Payday Loans,” New York Times, January 19, 2015, http://bits.blogs.nytimes.com/2015/01/19/big-data-underwriting-for-payday-loans/. On the company web page: Website ZestFinance.com, accessed January 9, 2016
…
, www.zestfinance.com/. A typical $500 loan: Lohr, “Big Data Underwriting.” ten thousand data points: Michael Carney, “Flush with $20M from Peter Thiel
…
, ZestFinance Is Measuring Credit Risk Through Non-traditional Big Data,” Pando, July 31, 2013, https://pando.com/2013/07/31/flush-with-20m-from-peter
…
-thiel-zestfinance-is-measuring-credit-risk-through-non-traditional-big-data/. one of the first peer-to-peer exchanges, Lending Club: Richard MacManus, “Facebook App, Lending Club, Passes Half a Million Dollars in Loans,” Readwrite, July
…
.1014098. each campaign developed profiles of American voters: Sasha Issenberg, “How President Obama’s Campaign Used Big Data to Rally Individual Voters,” Technology Review, December 19, 2012, www.technologyreview.com/featuredstory/509026/how-obamas-team-used-big-data-to-rally-voters/. Four years later: Adam Pasick and Tim Fernholz, “The Stealthy, Eric Schmidt-Backed
by Steve Lohr · 10 Mar 2015 · 239pp · 70,206 words
it was the mysteries of life down to the cellular level. Just as modern telescopes transformed astronomy and modern microscopes did the same for biology, big data holds a similar promise, but more broadly, in every field and every discipline. Far-reaching advances in technology are engines of economic change. The Internet
…
communication. Then other technologies, like the Web, were built on top of the Internet, which has become a platform for innovation and new businesses. Similarly big data, though still a young technology, is transforming the economics of discovery—becoming a platform, if you will, for human decision making. Decisions of all kinds
…
has held for so many years because of human ingenuity, endeavor, and investment. Scientists, companies, and investors made it happen. The same is true of big data. It has become technically possible thanks to a bounty of improvements in computing, sensing, and communications. But the steady advance in software and hardware, and
…
computing. More recently, software programs for using data to make better-informed decisions were given the label “business intelligence.” It is a predecessor term to big data, and one still in use. Business intelligence tends to focus on collection, reporting, and basic analysis but not on the predictive or experimental features of
…
imperative. A program of fundamental research, as in the materials science of computer hardware, will continue. Along with cloud computing, research will be concentrated on big-data projects in specific industries and the underlying machine-learning technologies used to find answers and insights in data, as Watson does. IBM refers to these
…
data points, in their way. But there was plenty of uncertainty, and this was the kind of decision that points to the limits of the big-data approach. Big data is good at interpolation—figuring out what happens next when the outcome is, most likely, a continuation of the current trend. It is far
…
threat that Kelly describes. And a sizable swath of IBM’s services business involves engineers writing applications, using traditional software, for corporate customers. Today’s big-data applications typically use cloud-style computing in which processing and software are delivered remotely, from distant data centers, over the Internet. Under Rometty, IBM is
…
making huge investments in the future—big-data technology and cloud computing. But the dilemma facing the company is whether the new business will grow faster than the old business erodes. In early
…
clarity. The McKesson drug distribution case—$1 billion less inventory, an efficiency gain of roughly 13 percent—is a dollars-and-cents success story in big data. The idea of applying modern data science to a complex product distribution network originated with Kaan Katircioglu, an IBM research scientist at the time, who
…
, digesting other viewpoints and data, to recognize and overcome their biases. Tetlock calls them “super forecasters.” A clever helper. That is the benevolent portrayal of big data, as a tireless digital assistant to see what its overworked human boss missed. Dr. Herbert Chase, a professor of clinical medicine at Columbia University’s
…
of such data-animated nudges to sharpen decision making, repeated countless times, up and down corporations, throughout the economy, is why Erik Brynjolfsson believes big data will bring a “management revolution.” Brynjolfsson is an economist at the Massachusetts Institute of Technology’s Sloan School of Management, director of the MIT Center
…
, an amalgam of factors affects performance, including business cycles, financial crises, and demographic trends, not just technology. But Brynjolfsson sees a pattern playing out with big data that is comparable to past technologies. Innovations that have been percolating for years in research labs are making their way into products. An industry or
…
their early software for handling Web data was good at only one task. What was needed was a more general-purpose tool, tailored for handling big data and that could become the software foundation on which other programs run. In computing, such tools are called “platforms.” The complementary programs add new
…
the proprietary offerings from Microsoft and Apple, for example. Programmers can tweak and modify open-source programs within certain rules. In late 2005, a potential big-data platform emerged, called Hadoop, an open-source project, begun by two engineers, Mike Cafarella and Doug Cutting. The quirky name was what Cutting’s toddler
…
former executive at Oracle, the largest supplier of corporate database software. Hammerbacher, Zeyliger recalls, portrayed the start-up as a force for digital democracy, bringing big-data power tools to the rest of the economy. He was convincing, and Zeyliger was excited. But Zeyliger had a reservation. He was recently married and
…
is correlation—typically, some data pattern is linked to some action or behavior in the real world. Exploiting correlation is the first wave of the big-data phenomenon, and it can be extremely powerful. Indeed, useful and profitable observations increasingly do come from “listening to the data” to find correlations. A
…
large corporations have been at this for years, using their own data. A canonical example of this kind of data discovery is the Pop-Tarts-and-beer case at Walmart from a decade ago. The giant retailer, mining the historical purchasing data from its stores, found that consumers in the path of a
…
the social benefit of lower costs for borrowers in the subprime consumer market. The missing ingredient, Merrill concluded, was Google-style data analytics. “Underwriting, Meet Big Data” is the ZestFinance corporate motto. A typical payday loan, Merrill explains, is for a few hundred dollars for two weeks, and rolls over ten times
…
” rather than a stand-alone forecasting tool. Still, respected authors and academics often pointed to Google Flu Trends as proof of the triumph of the big-data approach. Tracking forty-five flu-related search terms over billions of searches, monitoring trends, and making correlations would win out. Google, it was said,
…
. But there is a lively debate among data enthusiasts as to whether the pursuit of causes is even necessary. In their timely and authoritative book Big Data, Viktor Mayer-Schönberger and Kenneth Cukier forcefully state the case for correlation supremacy. “The ideal of identifying causal mechanisms,” they write, “is a self-congratulatory
…
illusion; big data overturns this.” Not everyone agrees. One of them is Richard Berner, former chief economist at Morgan Stanley. In 2013, Berner became the first director of
…
financial microscope. The goal is to see the inner workings of markets in illuminating detail to inform understanding and guide action. So Berner is a big-data proponent, but not without qualification. He is skeptical of the uncompromising data-ists who celebrate correlation as plenty good enough without theory, without a model
…
build databases of knowledge with little human help, almost autonomously, their algorithms scanning vast stores of digital data at lightning speed, changes the game. “With big data and machine learning, the knowledge bottleneck is no longer the problem it once was,” Campbell says. In fact, he predicts, “The knowledge bottleneck will reverse
…
account for a sizable portion of the farm food basket of fruits and vegetables including apples, oranges, peaches, almonds, avocados, strawberries, broccoli, and garlic. Harnessing big data to increase food production could be vital to cope with a world of increasingly limited supplies of land and water, but more mouths to feed
…
Using satellite data to predict local weather and crop yields had been done before. Climate Corporation, a start-up founded by former Google engineers, applied big-data weather prediction to crop insurance so impressively that in 2013 the agribusiness giant Monsanto bought the young company for nearly $1 billion. Yet the Gallo
…
Annunziata and Peter Evans, then its director of global strategy and analytics (Evans left GE later that year) concludes that the combination of intelligent machines, big data, and changed work practices should bring “enormous economic benefits” over the long term. The effect, they say, will be to boost the global economy
…
.” As we talked, Felten raised several concerns about navigating through what he called “this increasingly observed and classified world.” One of his themes was how big-data technology is outrunning public understanding and policy, as in the failure of the old “notice and choice” approach. Another example, he notes, is the definition
…
get beyond this concept of personally identifying information,” Felten says, “because the rest is deemed by default to be harmless.” Through the inferential engines of big data, companies can often accurately predict if a person has a chronic disease or is financially strapped. Felten compares the current state of affairs to the
…
the good discrimination of seeing differences and allocating time, energy, and money accordingly. Modern marketing is about clustering consumers into smaller and smaller groups. And big-data techniques, as we’ve seen, only accelerate that trend by knowing more and inferring more about ever-smaller groups, even individuals. Yet the line that
…
systematically exclude those people from your marketing? Discrimination, as a legal concept, focuses on the treatment of people in groups, by ethnicity, gender, or age. Big-data methods make it possible to assemble people by interests and characteristics that are far more detailed than traditional demographics. The technology also affords the opportunity
…
. Otherwise, the company, he says, has “violated a personal relationship” by using sensitive information “that’s between an individual and his or her doctor.” As big-data technology advances, corporate executives are going to have to make judgments about what kinds of discrimination they will and will not allow. For companies, privacy
…
answers without having direct access to the data. At Princeton, Arvind Narayanan is leading a project that seeks to reverse engineer what marketers do with big data to eventually create a “census” of corporate online privacy and discrimination practices. Narayanan calls it a “web transparency project,” inspired by the recognition that companies
…
blinkered, distorting behavior and incentives shortsightedly? The replies I heard were much the same. A fair point, they said, but the answer is no, because big-data measurement and analysis is qualitatively different from the measurement of finance. They used different words and phrases to try to capture the difference. Erik Brynjolfsson
…
have also increased personal mobility and freedom, and stimulated the development of regional and national markets for goods. The outlook for the technology we call big data is not fundamentally different. Its advance is probably inevitable, the risks seem manageable, and the benefits, by adding a layer of data-driven intelligence to
…
85,000 watts. Still, the virtuous cycle of more and more varied data and smarter and smarter algorithms, written by human programmers, is delivering a big-data-fueled renaissance in artificial intelligence. But the more machine learning can do, the more humanity may learn about itself. “What is actually intrinsically human?” Smarr
…
asks. “In the next couple of decades, this technology will increasingly force us to confront that issue.” NOTES 1: How Big Is Big Data? Just outside Memphis: Information for the paragraphs on McKesson come from several sources. An interview on Oct. 25, 2013, with Donald Walker. An interview on
…
and erroneous diagnoses. “Watson or something similar”: An interview on Oct. 28, 2013, with Dr. Martin Kohn. an intellectual champion for the transformative power of big data: I’ve interviewed Erik Brynjolfsson several times over the years, but most of the quotes and descriptions in this section come from an interview on
…
wgbh/nova/tech/yoky-matsuoka.html. Mark Malhotra, a young Stanford-educated engineer: His descriptions and quotes come from an interview on Nov. 21, 2013. “Big data is the next stage”: An interview on May 10, 2012, with Randy Komisar. In another industry, Michael Haydock: His descriptions and quotes come from interviews
…
Tim O’Donnell was doing broadly similar research: His descriptions and quotes come from an interview on Dec. 24, 2013. 10: The Prying Eyes of Big Data the Kodak-wielding “camera fiend”: Information on the Kodak section comes largely from the online essays, written by David Lindsay, that accompanied the PBS American
…
Bear Stearns, Hammerbacher as quant at, 8, 14, 31–33 behavior. See human behavior research; social networks Benjamin, Joel, 118–19 Berner, Richard, 112–14 Big Data (Mayer-Schönberger and Cukier), 112 Bisciglia, Christophe, 101 bit, origin of term, 96 Blue Gene machine, of IBM, 45 Botts, Thomas, 81 Brahe, Tycho,
…
data science and, 157, 158–62 Malhotra, Mark, 149–50 management trends, of the past, 209–11 “Man-Computer Symbiosis” (Licklider), 119 marketing, uses of big data, 195–97 Maslow, Abraham, 155 Matsuoka, Yoky background of, 143, 145–47 Nest learning thermostat and, 143, 145, 147–49 Mayer-Schönberger, Viktor, 112 McAfee
…
,” in data collection of personal information, 186, 187–88 Noyes, Eliot, 49 “numerical imagination,” of Hammerbacher, 13–14 Oak Ridge National Laboratory, 176 Obama administration, big data and, 203–4 O’Donnell, Tim, 180–81 OfficeMax, 188–89 Olmo, Harold, 126 Olson, Mike, 101 online advertising, 84–85 as “socio-technical construct
…
code, IBM and, 9 operations research, 154 optimization, at IBM, 46 Packard, Vance, 184 Palmisano, Samuel, 49–51, 53 “Parable of Google Flu: Traps in Big Data Analysis, The” (Science), 108 Pattern Recognition (Gibson), 154 Paul, Sharoda, 135 payday lending market, 104–7 Pennebaker, James, 199 Pentland, Alex, 15, 203–4,
…
, 136 PricewaterhouseCoopers, 44 Principles of Scientific Management, The (Taylor), 208 Privacy Act (1974), 185 privacy concerns, 183–206 balancing privacy and data collection, 202–6 big data and personally identifying information, 187–92 cameras and, 183–86 data correlation and, 113 discrimination by statistical inference, 192–95 early computers and, 185–87
…
University, 211–12 Starbucks, 157 Stockholm, rush-hour pricing in, 47 storytelling, computer algorithms and, 120–21, 149, 165–66, 205, 214 structural racism, in big data racial profiling, 194–95 Structure of Scientific Revolutions, The (Kuhn), 175 Sweeney, Latanya, 193–95 System S, at IBM, 40 Tarbell, Ida, 208 Taylor, Frederick
by Foster Provost and Tom Fawcett · 30 Jun 2013 · 660pp · 141,595 words
Foster Provost Tom Fawcett Beijing • Cambridge • Farnham • Köln • Sebastopol • Tokyo Praise “A must-read resource for anyone who is serious about embracing the opportunity of big data.” — Craig Vaughan Global Vice President at SAP “This timely book says out loud what has finally become apparent: in the modern world, Data is Business
…
Foundation Innovation Award Grand Winner (2013) “A foundational piece in the fast developing world of Data Science. A must read for anyone interested in the Big Data revolution.” —Justin Gapper Business Unit Analytics Manager at Teledyne Scientific and Imaging “The authors, both renowned experts in data science before it had a name
…
to material for learning these additional skills and concepts (for example, scripting in Python, Unix command-line processing, datafiles, common data formats, databases and querying, big data architectures and systems like MapReduce and Hadoop, data visualization, and other related topics). Sections and Notation In addition to occasional footnotes, the book contains boxed
…
. Thanks to Chris Volinsky for providing data from his work on the Netflix Challenge. Thanks to Sonny Tambe for early access to his results on big data technologies and productivity. Thanks to Patrick Perry for pointing us to the bank call center example used in Chapter 12. Thanks to Geoff Webb for
…
and found that the stores would indeed need certain products—and not just the usual flashlights. ‘We didn’t know in the past that strawberry Pop-Tarts increase in sales, like seven times their normal sales rate, ahead of a hurricane,’ Ms. Dillman said in a recent interview. ‘And the pre-hurricane
…
huge increase in the amount of time consumers are spending online, and the ability online to make (literally) split-second advertising decisions. Data Processing and “Big Data” It is important to digress here to address another point. There is a lot to data processing that is not data science—despite the impression
…
data-driven decision-making, such as efficient transaction processing, modern web system processing, and online advertising campaign management. “Big data” technologies (such as Hadoop, HBase, and MongoDB) have received considerable media attention recently. Big data essentially means datasets that are too large for traditional data processing systems, and therefore require new processing technologies. As
…
seem to help firms (Tambe, 2012). He finds that, after controlling for various possible confounding factors, using big data technologies is associated with significant additional productivity growth. Specifically, one standard deviation higher utilization of big data technologies is associated with 1%–3% higher productivity than the average firm; one standard deviation lower in terms
…
%–3% lower productivity. This leads to potentially very large productivity differences between the firms at the extremes. From Big Data 1.0 to Big Data 2.0 One way to think about the state of big data technologies is to draw an analogy with the business adoption of Internet technologies. In Web 1.0, businesses busied
…
a web presence, build electronic commerce capability, and improve the efficiency of their operations. We can think of ourselves as being in the era of Big Data 1.0. Firms are busying themselves with building the capabilities to process large data, largely in support of their current operations—for example, to improve
…
the incorporation of social-networking components, and the rise of the “voice” of the individual consumer (and citizen). We should expect a Big Data 2.0 phase to follow Big Data 1.0. Once firms have become capable of processing massive data in a flexible fashion, they should begin asking: “What can I now
…
” early on, in the rating of products, in product reviews (and deeper, in the rating of product reviews). Similarly, we see some companies already applying Big Data 2.0. Amazon again is a company at the forefront, providing data-driven recommendations from massive data. There are other examples as well. Online advertisers
…
high throughput (real-time bidding systems make decisions in tens of milliseconds). We should look to these and similar industries for hints at advances in big data and data science that subsequently will be adopted by other industries. Data and Data Science Capability as a Strategic Asset The prior sections suggest one
…
data-analytic skills The consulting firm McKinsey and Company estimates that “there will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1
…
.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.” (Manyika, 2011). Why 10 times as many managers and analysts than those with deep analytical skills? Surely data scientists aren’t
…
mentioning data mining techniques (e.g., random forests, support vector machines), specific application areas (recommendation systems, ad placement optimization), alongside popular software tools for processing big data (Hadoop, MongoDB). There is often little distinction between the science and the technology for dealing with large datasets. We must point out that data science
…
thinking, and to make it more systematic and therefore less prone to errors and omissions. There is convincing evidence that data-driven decision-making and big data technologies substantially improve business performance. Data science supports data-driven decision-making—and sometimes conducts such decision-making automatically—and depends upon technologies for
…
“big data” storage and engineering, but its principles are separate. The data science principles we discuss in this book also differ from, and are complementary to, other
…
have their own books and classes). The next chapter describes some of these differences in more detail. * * * [2] Of course! What goes better with strawberry Pop-Tarts than a nice cold beer? [3] Target was successful enough that this case raised ethical questions on the deployment of such techniques. Concerns of ethics
…
from their data assets. [6] OK: Hadoop is a widely used open source architecture for doing highly parallelizable computations. It is one of the current “big data” technologies for processing massive datasets that exceed the capacity of relational database systems. Hadoop is based on the MapReduce parallel processing framework introduced by Google
…
=22.92; Leverage=0.0104 Selena Gomez -> Miley Cyrus Support=0.011; Strength=0.443; Lift=22.54; Leverage=0.0105 Reese's & Starburst -> Kelloggs Pop-Tarts Support=0.011; Strength=0.493; Lift=22.52; Leverage=0.0102 Skittles & SpongeBob SquarePants -> Patrick Star Support=0.012; Strength=0.590; Lift=22
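The four figures attached to each rule in this listing are related to one another, and a short calculation makes the relationship concrete. The following is a minimal sketch, not the book's own code. It assumes the usual textbook definitions for a rule A -> B: support is P(A,B), "Strength" is the rule's confidence P(B|A), lift is confidence divided by P(B), and leverage is P(A,B) minus P(A)·P(B). The function name `rule_metrics` is hypothetical. Because the listed values are rounded, the recomputed leverage should land close to the printed one but need not match it digit for digit.

```python
def rule_metrics(support_ab, confidence, lift):
    """Recover P(A), P(B), and leverage for a rule A -> B.

    Assumed definitions (standard, but not confirmed by the excerpt):
      support_ab = P(A, B)
      confidence = P(B | A) = P(A, B) / P(A)   (listed as "Strength")
      lift       = confidence / P(B)
      leverage   = P(A, B) - P(A) * P(B)
    """
    p_a = support_ab / confidence       # rearranged from the confidence definition
    p_b = confidence / lift             # rearranged from the lift definition
    leverage = support_ab - p_a * p_b   # how far co-occurrence exceeds independence
    return p_a, p_b, leverage


# Values taken from the "Reese's & Starburst -> Kelloggs Pop-Tarts" line.
p_a, p_b, leverage = rule_metrics(support_ab=0.011, confidence=0.493, lift=22.52)
print(f"P(A)={p_a:.4f}  P(B)={p_b:.4f}  leverage={leverage:.4f}")
```

Under these assumptions, a lift far above 1 says the two sides of the rule co-occur much more often than independence would predict, while the small leverage reflects how rare both sides are in absolute terms.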
…
of data science underlying web search, as well as Amazon’s product recommendations and other offerings). Both of these companies eventually built subsequent products offering “big data” and data-science related services to other firms. Many, possibly most, data-science oriented startups use Amazon’s cloud storage and processing services for some
…
ask. However, this book as a whole really can be seen as a proposal review guide. Here are some of the most egregious flaws in Big Data’s proposal: Business Understanding The target variable definition is imprecise. For example, over what time period must the migration occur? (Chapter 3) The formulation of
…
Erlbaum Associates, Mahwah, NJ. Swets, J. A., Dawes, R. M., & Monahan, J. (2000). Better decisions through science. Scientific American, 283, 82–87. Tambe, P. (2013). Big Data Investment, Skills, and Firm Value. Working Paper, NYU Stern. Available: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2294077. WEKA (2001). Weka machine learning
…
Centroids modeling, A General Method for Avoiding Overfitting Amazon, The Ubiquity of Data Opportunities, Data Science, Engineering, and Data-Driven Decision Making, From Big Data 1.0 to Big Data 2.0, Data and Data Science Capability as a Strategic Asset, Similarity, Neighbors, and Clusters Borders vs., Achieving Competitive Advantage with Data Science cloud
…
Ensemble Methods Big Data data science and, Data Processing and “Big Data”–Data Processing and “Big Data” evolution of, From Big Data 1.0 to Big Data 2.0–From Big Data 1.0 to Big Data 2.0 on Amazon and Google, Thinking Data-Analytically, Redux big data technologies, Data Processing and “Big Data” state of, From Big Data 1.0 to Big Data 2.0 utilizing, Data Processing and “Big Data” Big
…
, Example: Jazz Musicians Bruichladdich single malt scotch, Understanding the Results of Clustering Brynjolfsson, Erik, Data Science, Engineering, and Data-Driven Decision Making, Data Processing and “Big Data” budget, Ranking Instead of Classifying budget constraints, Profit Curves building modeling labs, From Holdout Evaluation to Cross-Validation building models, Data Mining and Its Results
…
Curves constraints budget, Profit Curves workforce, Profit Curves consumer movie-viewing preferences example, Data Reduction, Latent Information, and Movie Recommendation consumer voice, From Big Data 1.0 to Big Data 2.0 consumers, describing, Example: Targeting Online Consumers With Advertisements–Example: Targeting Online Consumers With Advertisements content pieces, online consumer targeting based on, Example
…
knowledge and, Dimensionality and domain knowledge early stages, Supervised Versus Unsupervised Methods fundamental ideas, Supervised Segmentation with Tree-Structured Models implementing techniques, Data Processing and “Big Data” important distinctions, Data Mining and Its Results matching analytic techniques to problems, Other Analytics Techniques and Technologies–Answering Business Questions with These Techniques process of
…
preparation, Data Preparation, Representing and Mining Text data preprocessing, Data Preprocessing–Data Preprocessing data processing technologies, Data Processing and “Big Data” data processing, data science vs., Data Processing and “Big Data”–Data Processing and “Big Data” data reduction, From Business Problems to Data Mining Tasks–From Business Problems to Data Mining Tasks, Data Reduction, Latent Information
…
–Data and Data Science Capability as a Strategic Asset baseline methods of, Summary behavior predictions based on past actions, Example: Hurricane Frances Big Data and, Data Processing and “Big Data”–Data Processing and “Big Data” case studies, examining, Examine Data Science Case Studies classification modeling for issues in, Generalizing Beyond Classification cloud labor and, Final Example
…
mining and, The Ubiquity of Data Opportunities, Data Mining and Data Science, Revisited–Data Mining and Data Science, Revisited data processing vs., Data Processing and “Big Data”–Data Processing and “Big Data” data science engineers, Deployment data-analytic thinking in, Data-Analytic Thinking–Data-Analytic Thinking data-driven business vs., Data Processing and
…
and, Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist evolving uses for, From Big Data 1.0 to Big Data 2.0–From Big Data 1.0 to Big Data 2.0 fitting problem to available data, Changing the Way We Think about Solutions to Business Problems–Changing the Way We
…
Scientist–Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist understanding, The Ubiquity of Data Opportunities, Data Processing and “Big Data” data science maturity, of firms, A Firm’s Data Science Maturity–A Firm’s Data Science Maturity data scientists academic, Attracting and Nurturing Data Scientists
…
with Unbalanced Classes for business strategies, Thinking Data-Analytically, Redux–Thinking Data-Analytically, Redux data-driven business data science vs., Data Processing and “Big Data” understanding, Data Processing and “Big Data” data-driven causal explanations, Data-Driven Causal Explanation and a Viral Marketing Example–Data-Driven Causal Explanation and a Viral Marketing Example data
…
, and Neural Networks using, Nonlinear Functions, Support Vector Machines, and Neural Networks New York Stock Exchange, The Data New York University (NYU), Data Processing and “Big Data” Nissenbaum, Helen, Privacy, Ethics, and Mining Data About Individuals non-linear support vector machines, Support Vector Machines, Briefly, Nonlinear Functions, Support Vector Machines, and Neural
…
, semantic vs., The news story clusters T table models, Generalization, Holdout Data and Fitting Graphs tables, Models, Induction, and Prediction Tambe, Prasanna, Data Processing and “Big Data” Tamdhu single malt scotch, * Using Supervised Learning to Generate Cluster Descriptions Target, Data Science, Engineering, and Data-Driven Decision Making target variables, Models, Induction, and
…
-Driven Causal Explanation and a Viral Marketing Example Tatum, Art, Example: Jazz Musicians technology analytic, Data Preparation applying, Other Analytics Techniques and Technologies big-data, Data Processing and “Big Data” theory in data science vs., Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist–Chemistry Is Not About
by Franklin Foer · 31 Aug 2017 · 281pp · 71,242 words
uncovering the patterns that undergird the construction of sentences. They can find coincidences that humans might never even think to seek. Walmart’s algorithms found that people desperately buy strawberry Pop-Tarts as they prepare for massive storms. Still, even as an algorithm mindlessly implements its procedures—and even as it learns to
…
data without hypotheses”: Chris Anderson, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete,” Wired, June 23, 2008. Walmart’s algorithms found that people desperately buy strawberry Pop-Tarts: Constance L. Hays, “What Wal-Mart Knows About Customers’ Habits,” New York Times, November 14, 2004. Sweeney conducted a study that
…
Out,” New Republic, June 1, 2014. “On election night he was in our boiler room”: Joshua Green, “Google’s Eric Schmidt Invests in Obama’s Big Data Brains,” BloombergBusinessweek, May 31, 2013. “Early on, [the Obama campaign] turned to Google Analytics”: “Obama for America uses Google Analytics to democratize rapid, data-driven
by Hal Niedzviecki · 15 Mar 2015 · 343pp · 102,846 words
?” The same approach is increasingly being used in the rise of what has come to be called predictive policing. Predictive policing is essentially the ESRI big-data-meets-mapping idea adjusted to meet the needs of law and order. Chris Ovens, a leader in ESRI’s Location Analytics Department, joins me to
…
the prior year. We attribute this success rate, in part, to moving officers around based on the calculated probability of shooting incidents.”6 The book Big Data cites another Richmond PD discovery: “Richmond police long sensed that there was a jump in violent crime following gun shows; the
…
into cash. These companies rely on using our now almost endless computing power to near-instantaneously store and process what we have come to call big data. Big data underpins almost everything that happens at ESRI and in the fields of analytics, BI, and, increasingly, IT in general. It is emerging as the essential
…
technological framework. Both a philosophy and an application, big data is the fuel that powers the ESRIs and other engines of our future-first era. Consider the definition found in the book
…
That Will Transform How We Live, Work, and Think. “At its core,” write the authors, Professor Viktor Mayer-Schönberger and Economist magazine editor Kenneth Cukier, “big data is about predictions.”9 It’s a frank, useful admission. As they write: “The possession of knowledge, which once meant an understanding of the past
…
future first requires “huge quantities of data” and these quantities require systemic implementation of data collection on just about every facet of daily life. The Big Data authors write about “systems” as in, “systems perform well because they are fed with lots of data on which to base their predictions.”12 This
…
is the implementation of an overall resetting of priorities around, “the ability of society to harness information in novel ways.”13 The Big Data authors talk about data as the “raw material of business, a vital economic input, used to create a new form of economic value.” They talk
…
world made better by default deference to a system that knows what is going to happen before it happens. According to the new purveyors of big data, our systems are only as good as the information they can collect. As such, everyone can and should be getting into the game of turning
…
can only be useful insights or valuable new products, the better off our systems will be. The structure—the system—is recalibrating to embrace the big data approach in which the everyday is viewed as fodder for the future. Both because of what it is and how it works
…
, big data should be seen as the first technological application unique to the era of future. It’s the first truly twenty-first-century technological application and
…
it has only one purpose; wherever and however it is applied, the goal is to know the future. The more big data enters into our every interaction, the more its ubiquity helps to usher in the ideology of treating the present as nothing more than a portal
…
to tomorrow. As such, the big data gold rush is on. You’ve got to have the data, if you want to own the future. In 2006, Microsoft bought
…
big data commercial pioneer Oren Etzioni’s Farecast (which used data to predict airfare oscillations) for $110 million. “Two years later Google paid $700 million to acquire
…
selling that software; they will have access to a vast new repository of data about how we live and work.”18 Another article, about the big-data-enabled race—currently being led by Google—to map every road, river, and even footpath in the world makes the same point: “Tomorrow’s map
…
; it’s a contest over the future itself.”19 Let’s delve into this a bit more, since it’s crucial to understanding where the big-data-future-first model is taking us. In the almost-here future, everything will be mapped in real time to such an extent that our entire
…
“ultra-precise digitizations of the physical world, all the way down to tiny details like the position and height of every single curb.” All this big-data map tracking creates a prediction algorithm of immense speed and complexity. Google preloads “the data for the route into the car’s memory before it
…
the fact. And systematically controlling what is going to happen is most efficient of all. ° ° ° ° ° ° In the first phase of big data and information technology, Walmart deduced that before big storms, sales of Pop-Tarts skyrocket. Google analyzed millions of search terms and correlated searches for products and symptoms related to flu with actual flu
…
like “You’re Almost Here!”)22 Now we are moving into the next, even more granulose phase of big data future prediction. In this phase, the lessons Google, Walmart, and Target have applied to big data are being adopted by almost every conceivable field from medicine to insurance to traffic. Pacemakers and other medical implants
…
predictive power of harnessing ubiquitous daily activity it’s mind-boggling. Providing several different services to various stakeholders, often with competing interests, is what makes big data such a wonderful treasure trove of possibility. ° ° ° ° ° ° My final meeting with ESRI is the most fascinating and most revealing in terms of harnessing granular data
…
a few pennies from every household in America every single day, there would be a massive outcry, a congressional inquiry, heads would roll. But because big data is categorized as a benevolent offshoot of our will to arrive at the future, we are not just accepting of it, we are lining up
…
to contribute to the project. With only the weakest infrastructure as protection for the consumer, big data is expanding in all directions much faster than any person could ever keep up with. Even if you wanted to opt out of
…
big data, it would be impossible. Jaron Lanier makes this point at length in his book Who Owns the Future, arguing that what is stifling opportunity in
…
” for our lack of social certainty by “scientific means.”9 Our obsession with knowing and owning the future—reflected in everything from the rise of big data to the frenzy around technological entrepreneurship—finds its roots in our unconscious desire to return to a time when we didn’t have to know
…
that this whole process has more to do with belief than it has to do with science or engineering. Even in the age of big data, instead of actual, meaningful knowledge of the future, we have for the most part the promise of that knowledge. We have the sense that schemes
…
what was formerly an abstract unknowable into controllable data. IARPA and Philip Tetlock’s network of collaborators aren’t the only group trying to incorporate big data and psychology and crowdsourcing to get us to the point where we can actually turn future into information. In January 2014, George Mason University in
…
focus on the future? Well for starters, it’s a case of supply and demand. As we saw visiting ESRI and with the rise of big data, there’s huge demand. Philip Tetlock puts it to me this way: “There’s such an insatiable demand for forecasting and we have such a
…
with. Death, like so many other things, is the future. And the future is just chaos waiting to be shaped into the information. In the big-data age, the future, far more than the past and the present, is open to quantification, manipulation, alteration and disruption. This—not technological progress or the
…
be expressed in the prismed refraction of virtual totality—the information. “One of the defining features of modern times,” write the authors of the book Big Data, “is our sense of ourselves as masters of our fate; this attitude sets us apart from our ancestors, for whom determinism of some form was
…
the norm.”27 Big data, as the fuel source for the permanent future, is the information expanding beyond anything we could ever have hoped for or imagined. It epitomizes
…
from the information age?—a promise: that we can arrive back (not a return, a going forward!) at the irrefutable continuity of life. Only, the big-data permanent-future promise is even more expansive than what the magicians could ever offer. Once we were assured that life as we know it would
…
go on forever. Now we are assured that our specific lives will go on forever. Big data is another step on the journey, another incarnation of magic into the information. It’s something from nothing. It’s the holy grail. It’s
…
, they can feel the burn as it rockets past them. They are on the frontlines, relentlessly controlled and manipulated by the new IT systems of big data now being used to track how many boxes they lump from a truck per hour. They are at the front of the line to lose
…
power the increasingly more programmable robots and automated technologies being installed in cutting-edge factories to replace human workers; they’re behind every innovation in big data from predictive to persuasive. IT is what allows, as the Race Against the Machine authors put it, “digital technologies” to execute “mental tasks that had
…
with IT? Well Walmart is, in fact, a pioneer in the kind of predict-the-future technology we talked about in our examination of the increasingly successful ways different actors are trying to own the future by knowing it sooner than everyone else. In fact, as the Big Data authors tell us, it
…
the Walmart Supercenter where the real technological changes have affected billions of people—eliminating their jobs, supporting outsourcing to countries where it’s normal to have entire weeks when the smog is so thick you can’t see the sun, and all to give us shoppers slightly cheaper toasters and the Pop-Tarts
…
in poor countries get micro-loans to buy goats; or a newly designed mosquito net saving millions of kids; a home desalinization kit; doctors using big data to cure infections in premature babies before the fever even gets a chance to register on the thermometer; a tiny Northern European country using wizard
…
. 6. “Richmond, Virginia Police Department,” accessed April 16, 2015, http://www.cwhonors.org/Search/his_4.asp. 7. Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (Boston: Houghton Mifflin Harcourt, 2013), 152. 8. Ryan Prox, “How Vancouver Tapped
…
to Fight Crime,” A Smarter Planet Blog, accessed April 16, 2015, http://asmarterplanet.com/blog/2013/07/how-vancouver-tapped-big-data-analytics-to-fight-crime.html. 9. Mayer-Schönberger and Cukier, Big Data, 6. 10. Ibid., 183. 11. Ibid., 15. 12. Ibid. 13. Ibid., 6. 14. Ibid., 9. 15. Ariana Eunjung Cha
…
Social Media, Elsewhere Online Redefines Trend-Watching,” The Washington Post, June 7, 2012, sec. Business, http://www.washingtonpost.com/business/economy/big-data-from-social-media-elsewhere-online-take-trend-watching-to-new-level/2012/06/06/gJQArWWpJV_story_2.html. 16. Evgeny Morozov, To Save Everything, Click
…
Here: The Folly of Technological Solutionism (PublicAffairs, 2013), 7. 17. Mayer-Schönberger and Cukier, Big Data, 135. 18. Will Knight, “Why Is Google Buying So Many Robot Startups?,” MIT Technology Review, December 4, 2013, http://www.technologyreview.com/view/522251/why
…
/technology/archive/2014/05/all-the-world-a-track-the-trick-that-makes-googles-self-driving-cars-work/370871/. 21. Mayer-Schönberger and Cukier, Big Data, 57. 22. “The Pregnancy Is Gone, but the Promotions Keep Coming,” Motherlode Blog, accessed February 4, 2014, http://parenting.blogs.nytimes.com/2014/02/02
…
/the-pregnancy-is-gone-but-the-promotions-keep-coming/. 23. Michael Beyman, “Big Data’s Powerful Effect on Tiny Babies,” CNBC, accessed April 17, 2015, http://www.cnbc.com/id/101032950. 24. Mayer-Schönberger and Cukier
…
, Big Data, 57. 25. Ibid., 132. 26. Ibid., 57. 27. Ibid., 132. 28. Quentin Hardy, “How Urban Anonymity Disappears When All Data Is Tracked,” The New York
…
/19/how-urban-anonymity-disappears-when-all-data-is-tracked/. 29. Wolfgang Hall, Wolfgang Hall Interview, July 16, 2013. 30. Mayer-Schönberger and Cukier, Big Data, 89. 31. Peck, “They’re Watching You at Work.” 32. Ibid. 33. Ibid. 34. Ibid. 35. Ibid. 36. Ibid. 37. Ibid. 38. Ibid. 39. Ibid
…
New Republic, October 26, 2012, http://www.tnr.com/book/review/are-we-getting-smarter-rising-IQs-james-flynn. 27. Mayer-Schönberger and Cukier, Big Data, 183. 28. Ariel Garten, Ariel Garten Interview, June 14, 2013. 29. Mika Turim-Nygren, “Meet the Woman Making Brainwave Control Look More like Meditation and
…
, 2015, https://www.oxfam.org/en/research/working-few. 36. Ibid. 37. Brynjolfsson and McAfee, Race Against the Machine. 38. Mayer-Schönberger and Cukier, Big Data, 140. 39. Jeremy Rifkin, “The Rise of Anti-Capitalism,” The New York Times, March 15, 2014, http://www.nytimes.com/2014/03/16/opinion/sunday
…
://www.thestar.com/news/world/2013/11/24/black_friday_and_the_digital_rumble_of_americas_working_poor.html. 53. Mayer-Schönberger and Cukier, Big Data, 55. 54. Michael Maiello, “Walmart’s Wage Hike Still About Greed,” The Daily Beast, February 20, 2015, http://www.thedailybeast.com/articles/2015/02/20
…
and Future World: Nature as It Was, as It Is, as It Could Be. Toronto: Random House Canada, 2013. Mayer-Schönberger, Viktor, and Kenneth Cukier. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Boston: Houghton Mifflin Harcourt, 2013. McLuhan, Marshall. Understanding Media: The Extensions of Man. 1st
by Barry Libert and Megan Beck · 6 Jun 2016 · 285pp · 58,517 words
These assets can have significant economic value if provided with the proper platforms, some of which are already available and are integrating cloud data with big data analytics and abundant access. The authors provide a blueprint for business leaders as to how to transform their enterprises to operate successfully in the digital
…
Customers: From Customers to Contributors • Principle 6, Revenues: From Transaction to Subscription • Principle 7, Employees: From Employees to Partners • Principle 8, Measurement: From Accounting to Big Data • Principle 9, Boards: From Governance to Representation • Principle 10, Mindset: From Closed to Open Aspire to These Ten Principles Part Three THE PIVOT Five Steps
…
remarkable economic returns by capitalizing on network advantages, such as co-creation with their customers (Facebook); digital platforms (Amazon); shared assets (Uber and Airbnb); and big data insights (Netflix and Google). Leaders and investors who want to participate in the network revolution need to envision their future, and the future of their
…
assets Actively allocate your capital Lead through co-creation Invite your customers to co-create Focus on subscriptions, not transactions Embrace the freelance movement Integrate big data Choose leaders who represent your customers Open your mind to new possibilities Following each chapter, you can rate your organization on each of the ten
…
technology piecemeal into various parts of their organizations, few are creating business models that take advantage of digital technology such as social, mobile, cloud, big data analytics, and the internet of things. Digitally enabled business models offer many advantages to organizations and those they serve. Here are a few of them
…
with digitally enabled platforms—essentially network orchestrators. In the middle are companies beginning to make the transition to digital by building digital product lines, using big data analytics, and leveraging social media for marketing and communication. Not every firm needs to be on the far right side of the scale, but
…
ten spectra in one place.) Are our core products physical or digital? Are we innovators, average users, or laggards in terms of mobile, social, cloud, big data analytics, and the internet of things? Do we have the right capabilities (technology, vision, talent) in-house to develop or improve our digital presence?
…
remain a market leader, IBM must continue to shift away from its historical role in manufacturing physical goods and move toward newer digital technologies like big data and the cloud. When reviewing IBM’s evolution over the past few decades for Forbes, Bridget van Kralingen, general manager for IBM North America,
…
Threadless and Lego use online forums to gather data from their customers on the products they want to see next. Uber uses mobile technology and big data to locate drivers and price services. Nike uses the latest manufacturing technology to produce shoes cost-effectively in batches of one. Most of these
…
to meet worker needs will see their best and brightest head off in search of their next great role. PRINCIPLE 8 MEASUREMENT From Accounting to Big Data Not everything that counts can be counted, and not everything that can be counted counts. —William Bruce Cameron, sociologist YOU MIGHT NOT EXPECT a
…
team Roland Dickey (CEO) and Laura Dickey (CIO) run Dickey’s Barbecue Pit, with 514 restaurants across the United States. They wanted to bring big data to barbecue, so they partnered with an external business intelligence firm to provide and develop a custom solution they call Smoke Stack. Smoke Stack
…
exclusively. It’s also extremely difficult to measure intangible assets. For example, measuring customer sentiment was much harder in the days before social media and big data, and it’s still difficult to value definitively. However, if you’re not measuring your people, ideas, and networks, you’re thwarting yourself competitively.
…
that a downturn in a local economy leads to belt-tightening and retailers start seeing reduced foot traffic. A retailer with a real-time, integrated, big data dashboard might notice the decline in sales after a week and start adjusting staffing and product shipping. Another retailer doesn’t react to this trend
…
if Caesars presents that customer with a free meal coupon or some other token while he is still in the casino. A real-time, integrated, big data system allows Caesars to take advantage of these opportunities. These three factors—measuring all assets, looking outward, and using real-time data—create great competitive
…
close to real time, but also track their external, intangible assets and use this data to improve the speed and quality of decision making. Big data is one of the hardest principles to implement because implementing it requires infrastructure as well as specific technical skill sets. Consider the following habits of
…
companies that use big data well, and rank yourself from accounting (1) to big data (10). START WITH CLEAR GOALS. The essential first step is to understand exactly what data would be useful to you
…
to the task. Analyzing the data, however, is a significant task, requiring specialized data analysts and statisticians. You may have a great spreadsheet expert, but big data specialists can do language parsing, self-evolving algorithms, cluster analysis, and much more. There are many full-service options, particularly for simple, common requests such
…
reacting to the wealth of granular information yourself (or within your leadership team), you will most likely become a bottleneck, limiting the potential of big data in your organization. Make sure your organization is able to actually use the insights in a timely way. It’s Not Only for Facebook and
…
, physical assets. Farming is a great example, because it’s very physical, and not very brand- or customer-oriented. One of the most exciting big data start-ups of 2015 was Granular, which creates farm-management software that integrates all parts of the business, from hardware such as tractors, drones, and
…
each crop and each field and even each cloud in real time. Continuing in the agriculture theme, John Deere is another surprise player in the big data revolution. The company created a self-driving vehicle long before Google did, although navigating a field is admittedly easier than driving on city streets.
…
serve our customers and the world. It’s time to figure out what opportunities are most exciting for you and your organization, and start the big data adventure. PRINCIPLE 9 BOARDS From Governance to Representation We need diversity of thought in the world to face the new challenges. —Tim Berners-Lee,
…
Web IT’S NOT EASY TO MANAGE A COMPANY YOU DON’T UNDERSTAND. In 2000, Kellogg Company, known for popular brands such as Froot Loops, Pop-Tarts, Frosted Flakes, and Pringles, purchased a small food company called Kashi. Kashi was a start-up that played in a similar part of the food
…
diversification and balance, your organization should leverage a mix of new ideas and methods, including tangible and intangible assets, employees and freelancers, accounting and big data analytics, and so on. You can develop this degree of openness even within a single core business—by using different approaches to serving the same
…
sometimes command, but they also have the capability and the desire to co-create and to help their workforce interact and innovate. Accounting data and big data analytics are used together to drive insight and decision making. Open organizations may not create network capability in every dimension discussed in this book,
…
go much further, connecting individuals to education, jobs, health, and other foundational building blocks of opportunity. Enterprise has begun exploring how it could use big data analytics and technology-enabled social and mobile networks to serve its mission. Its leaders are asking new, what if questions that are changing both how
…
or platform What does top talent usually do at your firm? Plant, production, and operations Client or customer services Research and development Digital development (cloud, big data analytics, social, and mobile) What risks are of greatest concern to your organization? Damage to PPE, loss of inventory Loss of key employees Inability to
…
on all major social media platforms (Starbucks has more than a million followers on Instagram—pretty good for a coffee company), and they use big data analytics to learn about and better serve their customers. You might think that the whole world is moving online and to the digital network and
…
that make them difficult to replace. Pay careful attention to technological capability during the human capital assessment. Recall that several digital technologies—social, mobile, cloud, big data analytics, and the internet of things—are closely associated with the network orchestrator business model. Note carefully how you currently leverage each of these technologies
…
e-commerce platforms) that will suffice for the first year or more of a new network initiative. Determine which digital technologies—social, mobile, cloud, big data analytics, and the internet of things—will be important for success. With your specifications written, think critically about the current talents and capabilities within your
…
.com/insights/organization/ges_jeff_immelt_on_digitizing_in_the_industrial_space. INDEX Accenture, 133 accounting, 116 external data sources and, 97–98 move to big data from, 99–101 traditional approaches to, 41–42, 97, 194 Adidas, 109 Adobe, 76, 80 affinity customer contributors and, 68, 70, 71 networks and,
…
key technology, 32 leaders’ access to information and, 57–58 leaders’ openness to using, 115, 116 network orchestrators and, 148 shift from basic accounting to big data and, 99, 102 social media data and, 143 technology creators producing, 14, 133 Andreessen Horowitz, 101 Angie’s List, 8, 60, 80, 197 Apple,
…
65 Barna, Hayley, 76 Beauchamp, Katia, 75 Beck, Megan, 7 Bergdorf Goodman, 76 Bergemann, Rosalind, 90 Best Buy, 45–46, 110 Bezos, Jeff, 119, 177 big data analytics using (see analytics) Dickey’s Barbecue Pit example of use of, 95–96 examples of use of, 101–102 goals for using, 99–100
…
of assets requiring new approaches to, 96–97 subscription model and, 78, 81–82 timeliness of data in, 98–99 Davidson, Adam, 86 decision making big data use and, 100 co-creators and, 61 network management and, 173 Deere & Company, 101 DeHart, Jacob, 68 Deloitte, 23, 57, 91 Dickey, Laura, 95,
…
accounting principles (GAAP), 97 General Mills Worldwide Innovation Network (G-WIN), 73 General Motors (GM), 113–114, 197 Gerstner, Lou, 47 GlossyBox, 76 goals for big data collection, 99–100 for boards, 109 for capital allocation, 53 Google, 3, 43, 91, 101, 110, 114, 118, 119, 148, 167–168, 183, 190
…
86, 88, 190 ideas. See intellectual capital IMD, 24 Immelt, Jeff, 199–200 industry sectors, business model adoption comparison by, 22–23 information. See also big data; data collection; intellectual capital subscription model using, 80 Innocentive, 15, 73 innovation Google and, 167–168, 183, 190 new core beliefs needed for, 196–197
…
in open organizations, 116, 118 Instagram, 21, 42, 60, 78, 79, 143 intangible assets big data collection from, 100 categories of, 41–42 digital technology for producing, 42, 45 importance of shifting to, 46 inventory of, 144–145 management practices for
…
68 Kodak, 46, 49–50, 54 law of increasing returns, 12 leaders and leadership, 27, 55–63 accessibility of, 60 assessing business model with, 131 big data use and, 100 capital allocation strategy and, 49, 50, 51, 53 change leader in PIVOT process and, 132 decision making and, 61 employee loyalty and
…
, Terri, 128, 140, 141, 163, 164, 183, 184 Lundgren, Terry, 110 Lyft, 44, 113–114, 155, 197 Macy’s, 109–110, 144 management practices big data analysis and use and, 100–101 for intangible assets, 42–44 management team business model assessment and, 131 See also leaders and leadership Mankiw, Gregory
…
128, 131 change leader in, 132 Enterprise Community Partners example for, 127 five steps of, 126–127 introduction to, 125–128 Pixar, 68 plans for big data use, 99–100 for filling technology, talent, and capital gaps in platforms, 171–172 for growth, on OpenMatters website, 10 for network management, 174–175
…
in, 82 recurring revenue from, 76–77 surprising and delighting the customer in, 81 themes in implementing, 80–82 types of offerings in, 80 talent big data collection and, 100 customer contribution of, 69 for digital platform operation, 170–171 experience in digital technologies needed by, 35 innovation and, 168 in
…
innovation, and leadership. OPENMATTERS is a data science company. It focuses on analyzing business models and the underlying sources of value. The firm harnesses technology, big data and analytics to categorize and measure business model performance. OpenMatters uses proprietary research to build indices and ratings for investors and strategies and rankings for
by Christopher Caldwell · 21 Jan 2020 · 450pp · 113,173 words
also harvesting and correlating information on them. Around 2010, this process came to be called Big Data. It was, at first, an entertaining curiosity. Walmart discovered through its algorithms that, when storms are coming, people buy more strawberry Pop-Tarts. Target could identify pregnant women from their tendency to buy unscented lotion in the third
…
to reality, undermined all types of thinking aimed at understanding systems from the outside—not just religion but also science, political ideology, and deductive reasoning. Big Data worked by correlation, not by logic. As the Oxford technology expert Viktor Mayer-Schönberger put it, “Society will need to shed some of its obsession
…
for causality in exchange for simple correlations: not knowing why but only what.” Big Data was a reassertion by powerful corporations of a right that had been stripped from other Americans: the right to stereotype. If you’re the sort
…
Society (London: George Allen & Unwin, 1952), 95. In hundreds of cities: Jeffrey Weiss, “Lunch Rush,” Dallas Morning News, October 12, 1993. Walmart discovered: Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (New York: Houghton Mifflin Harcourt, 2013), 54. Target could identify: Ibid
…
., 58. “Society will need”: Mayer-Schönberger and Cukier, Big Data, 7. Google claimed to predict: Ibid., 15. SWIFT: Ibid. When pundits sought new ways: Fareed Zakaria, “Sanctions Russia Will Respect,” Washington Post, February 13, 2015
…
. The Twilight of the American Enlightenment: The 1950s and the Crisis of Liberal Belief. New York: Basic Books, 2014. Mayer-Schönberger, Viktor, and Kenneth Cukier. Big Data: A Revolution That Will Transform How We Live, Work, and Think. New York: Houghton Mifflin Harcourt, 2013. McDougall, Walter A. The Tragedy of U.S
…
Bell, Alexander Graham, 157 Bell, Derrick, 10 Bendix Corporation, 83–84 Berman, Paul A Tale of Two Utopias, 161 Bezos, Jeff, 224 Bible, 57, 100 Big Data, 191, 192 bilingual education, 171 Bird, Caroline, 44 birth control, 47–51, 56, 58 Black History Month, 16, 152–153 Black Lives Matter (political movement
by Brad Stone · 10 May 2021 · 569pp · 156,139 words
by Luke Dormehl · 4 Nov 2014 · 268pp · 75,850 words
by Sangeet Paul Choudary, Marshall W. van Alstyne and Geoffrey G. Parker · 27 Mar 2016 · 421pp · 110,406 words
by Sara Wachter-Boettcher · 9 Oct 2017 · 223pp · 60,909 words
by Richard Susskind and Daniel Susskind · 24 Aug 2015 · 742pp · 137,937 words
by Jonathan Taplin · 17 Apr 2017 · 222pp · 70,132 words
by Jeff Lawson · 12 Jan 2021 · 282pp · 85,658 words