Does the internet know us better than we know ourselves?

Aug 14, 2017

Knowledge@Wharton staff

Presented here for discussion is a synopsis of a current article published with permission from Knowledge@Wharton, the online research and business analysis journal of the Wharton School of the University of Pennsylvania.

In his book, “Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are,” Seth Stephens-Davidowitz explores in part how Big Data can help brands understand consumers better than consumers understand themselves.

“What people click on, what people purchase, what people search — that’s more valuable than many of the other sources that you might consider,” the former Google data scientist said on the Knowledge@Wharton Show on Sirius XM.

The problem with data gleaned from surveys is that it is unreliable. He stated, “You can’t trust what people tell you. With a lot of the traditional data sources, there are incentives for people giving you that data.”

Netflix, for example, initially asked subscribers which videos they wanted to watch in the weekend ahead, and documentaries or art films often came back as responses. However, when the weekend came, subscribers ignored those picks and went for the lowbrow comedies or romances they usually watched.

“Netflix just realized they should ignore what people tell them, and instead focus on what they actually do, and let the algorithm tell the story,” stated Mr. Stephens-Davidowitz. “We tend to make horrible predictions about what we’re going to do in the future. Almost all of us are way too over-optimistic. I think data can ground us much better.”

Mr. Stephens-Davidowitz believes more academic research needs to be done around understanding people from their internet behavior and around Big Data’s ultimate ramifications.

“The pessimistic scenario is that companies would use this to take advantage of people, to get them to spend more money that they don’t have, or spend more time on their websites even though they don’t need to be on those websites. The optimistic scenario is that we would have insights into really, really important areas — health, racism, sexuality — and really learn how to improve society.”

But he believes surveys will lose influence as a data source. He stated, “Surveys have been dramatically overvalued, and really are going to play a much smaller role in the future as some of these new internet data sources become more accessible.”

DISCUSSION QUESTIONS: What are the upsides and downsides of Big Data as a consumer research tool? Do you see Big Data replacing or supplementing consumer surveys? When are surveys more valuable than Big Data sources?

"The ability to link clicks to other consumer characteristics opens up a vista of consumer insight we could only imagine and strive for 20 years ago."
"Big Data from online behavior solves most of the bias but is more difficult to generalize."
"...this year alone humans will create eight zettabytes of information (equivalent to 133 billion iPads-worth of data)."

25 Comments on "Does the internet know us better than we know ourselves?"

Mark Ryski

As a consumer research tool, Big Data has changed the game and will continue to be increasingly critical. It’s not hard to understand that actual behavior is far more indicative and insightful of future actions than merely asking people about their intentions or opinions alone. That said, I believe there will always be a place for qualitative input like focus groups, for example, where additional color and context can be captured to supplement and augment data-driven findings.

Phil Masiello

Ever since I founded my first e-commerce site in the early days, I recognized the power of live consumer behavior data and have never looked back. I have never relied on focus groups or consumer surveys. These are indicators of what people want you to believe they will do, but not indicators of what they really do.

I do not see a downside to Big Data. I see a downside to how people use the data. With so much real behavioral information existing, I cannot see a reason anyone would ever use a survey again.

The simplest pieces of data that exist are keyword and semantic searches. These simple little words and phrases tell us the issues and problems consumers are trying to solve.

Dick Seesel

Consumer panels, focus groups and other traditional survey tools are losing favor because they are less predictive than actual transaction data. Clearly companies like Amazon and Netflix figured this out a long time ago and have continued to refine their data mining skills. But Big Data isn’t a perfect tool, especially in the case of a new product category lacking any history that can be used for predictive forecasting. Even Netflix often uses variations of consumer panels to test-market pilots of new TV series before making a multi-episode commitment.

Nir Manor

Learning from people’s behavior online can create important insights and understanding about consumer preferences and much more. However, this should be used as a complementary approach to consumer research via survey. Surveys can yield more statistically representative data that can be easily generalized; however, they can be biased. Big Data from online behavior solves most of the bias but is more difficult to generalize.

Ben Ball

The fundamental premise of the Wharton assertions is undeniable. Tracking actual consumer behavior — traditionally expressed in purchase data — has been considered more reliable than the reported behavior based on recollection or statement of intention captured in surveys and focus groups by most researchers for some time. The twist is that clicks now constitute actual consumer behavior, and clicks are much more readily tracked at the individual consumer level than in-store purchases. The ability to easily link those clicks to other consumer characteristics via online behavior opens up a vista of consumer insight we could only imagine and strive for (at great expense) 20 years ago.

Tom Erskine

The power of Big Data is the ability to find new predictors of customer preference and behavior that were previously invisible to the naked eye. These new predictors enable retailers to make better recommendations, improve product availability and significantly improve the customer experience. Instead of relying on survey responses, gut feel and heuristic approaches, these new systems explore massive amounts of data and help make connections that even consumers might not recognize. The hard part is doing it without being creepy. Just ask Target.

Lee Kent

Big Data can get us a lot further into actual insights about the consumer; however, we still have to be careful. Not just because of the creepy factor but also in remembering that much of what consumers do is buy for others. How does Big Data sort that out? As for surveys, I do believe that Big Data will replace many surveys, but short surveys that ask the consumer how their shopping experience went or how well the customer service person helped can’t really be done without asking a few questions. And that’s my 2 cents.

Lyle Bunn (Ph.D. Hon)

The absolutes of our behavior, even when seen in the smallest of rear view mirrors, are a truth in trending. Marketers do well to expect the expected. But marketing is not only about filling needs; it is also about helping consumers to create the new reality of their better world. Marketing is influencing, and accurate behavioral data is a primary input.

Sterling Hawkins

Data is extremely important. However, when locked into a personalization engine like Netflix, it can keep us on a cycle of continuing to do what we’ve always done. There’s a balance between catering to the past and offering consumers something new and different. Businesses have to balance those channels of feedback with being able to create something entirely new. True innovation. Had Henry Ford asked people what they wanted, they would have said faster horses.

Ian Percy

So exactly how well do we “know ourselves?”

My expectation is that many will reply in mechanistic terms about data, data collection, algorithms and so on. And yes, there is huge advantage in looking at what people actually DO to understand what their motivations and desires might be. We are all motivated to do exactly what we do. Our being is indeed revealed by our doing! At least in part.

We still need to look deeper. No one says of another “You are my statistically significant mate.” We all long for our “SOUL” mate. So how do we measure that? Many thought they’d found such a one, but it turned out differently, usually ruined by someone’s doing. The ancient prophet Jeremiah said that no one can know the heart. I don’t think algorithms and analysis would change his mind.

In the short time I have to respond this morning, I can only say that no consumer researcher will ever know me. Not ever. I know that because I will never truly and fully know myself. Been working on that for a long time.

Neil Saunders

Online behavior data is actual: it provides an accurate view of what people did rather than what they think they did, or what they would like you to think they did. If used appropriately, this data is invaluable in shaping and targeting products, offers and propositions in an effective way.

However, such data are not a panacea. They only provide a view of those within any given ecosystem: what about all the consumption activity that is not online, what about consumers who do not use a particular brand? Neither do such data always answer “why” people do certain things, something which can be useful in marketing.

There’s no doubt that Big Data, mostly from online, is changing, and will continue to change the retail and consumer industries — and that it will play an ever larger role. However, there is still plenty of room for other sources.

Stefan Weitz

As I wrote in my 2014 book, “Search: How the Data Explosion Makes Us Smarter”, the amount of data that is created every day by our latent and explicit actions on the web dwarfs all rational understanding; this year alone humans will create eight zettabytes of information (equivalent to 133 billion iPads-worth of data).

The beauty of machine learning and Big Data analytics is that it allows us to see connections and hidden figures in the data that humans otherwise would never be able to see on their own. For example, one of the researchers in the book was able to analyze tweets of expectant mothers and predict with exceedingly high levels of accuracy which ones would have post-partum depression — allowing doctors to preemptively offer care before the onset of the condition. Overall the mass of data and increasingly sophisticated analysis enables us to see patterns and meaning in an otherwise messy and seemingly random world.

Tom Dougherty

The danger is in believing that because we know what actions consumers have taken that we understand why they do what they do.

So online data gives you a detailed record of previous actions. How do you ascertain WHY they did what they did? Great research is dependent on two things. The methodology and the insights that the researcher brings to the questions asked.

Our expertise is in understanding human behavior so that we might align brand positioning with the underlying REASON why a choice was made. These fundamental questions remain regardless of the projection ability and confidence in the means by which the data was created.

What I see is an increasing dependence on analytics and statistical experts. As if knowing numbers will provide you with insights as to how to influence behaviors.

I love the data provided by online sources of consumer navigation. But I also know that there is a difference between information and knowledge.

Doug Garnett

There’s tremendous value to be gained by adding Big Data analysis into our mix of research opportunities. And it’s really not a new thing – direct marketers have relied on Big Data for decades. And direct marketing experience shows there are huge hidden risks that are being ignored.

First, Big Data is almost always secondary data — gathered for a different purpose. That means it will have holes (I recommend re-reading Deming about what it takes to have reliable data).

Second, with big data we never know what we DON’T know. And that error leads to horrendous failures. Here’s a blog post I wrote about what we don’t know — especially “why.”

Finally, in terms of applying Big Data, it’s best for micro-management — small tweaks that have small impact. The ability for Big Data to make a big difference is rare.

That’s why it needs to be clearly put in place: added to qualitative and quantitative research as another way to learn about a business. But retailers should take care — Big Data is not nearly as powerful as many will promise.

Ralph Jacobson

This reminds me of the fact we have known for a long time that shoppers tell you what they think they want, but their actions don’t exhibit what they told you in surveys. For instance, they tell you they want more product variety, yet they only purchase 20 percent of your assortment, typically. So yes, technology will take that human emotion component out of the equation and provide real-time insights to shopper journey behavior in ways that will not make surveys obsolete, but augment the intelligence gained from them with machine learning capabilities.

Kenneth Leung

I’d say for sure companies like Google remember more about me than I do. Whether they know me is a different discussion. I recently tried to find a restaurant I visited in Carmel in the past and started Googling and browsing on the map to look for it, and it actually showed me the restaurant with a note “You were here 4 hours ago” when I zoomed in. That tells you the power of Big Data and Big memory. Does it give better insight for research? That’s where the art of insight is different from the science of data collection.

Min-Jee Hwang

Big Data is able to provide insight into actual consumer behavior. On the other hand, surveys are reflective of what consumers THINK they will do, potentially containing biased answers. However, surveys, focus groups, and qualitative research data can provide feedback and insights on user interface, shopping experience, or product use that Big Data cannot uncover. While Big Data is growing and its uses are increasing, qualitative data still holds some valuable uses.

Cynthia Holcomb

Individuals purchase products based on individual sensory preferences. Unfortunately, words are subjective to the individual, resulting in consumer focus groups and surveys being skewed translations of the subjective preferences or views of the individual researchers involved. We make the decision to buy a product because we have an emotional connection to the sensory aspects of a product. Like a car, for instance. I recently bought a new car and knew immediately what car I wanted to buy after trying out 5 other cars. Another example: a woman shopping through a rounder of dresses and pulling out one or two dresses. She knows instinctively, without thinking, what she likes. Retailers are struggling with Big Data because an agnostic system, eliminating human subjectivity, does not exist to process disconnected data points into relevant “customer preference” intelligence. We retailers still use subjective, anecdotal evidence to make expensive business decisions. Yes, the retail industry does need to change its thinking. It is not about the technology, yet technology is the only vehicle that can process and humanize the massive amounts of data collected by most large-scale retailers. Which brings me to Big Data. To truly uncover valuable “customer research” I suggest retailers change their…

Ricardo Belmar

At times of great disruption in a market or segment under study, surveys can still be a valuable tool to understand future consumer intent. The trick is in how you apply the insight gained from the survey. Do you take it at face value as gospel, or with a grain of salt? Big data can be a fantastic predictor of intent based on actual history of activity, but take for example what is happening in deployment of technology in retail stores. If we fed past technology purchase history in the retail industry to a big data algorithm, how well would it predict the level of disruption that is happening today? Would it predict what is leading retailers to spend dramatically more than ever before on technology to improve their store experience? I wonder how accurate that prediction would be.

Kai Clarke

This is an easy quandary to fall prey to when examining data and its strengths and weaknesses. The assumption that everyone uses their cell phones and computers the same way introduces tremendous distortions. These do not truly show up when we examine ourselves through the Internet, because many segments are still very well informed through other communications channels (like TV and newspapers) instead of the Internet. This especially applies to our aging population, who are still not comfortable with mobile phones, let alone using them as anything other than a phone … Big Data beware….

Dave Bruno

Certainly Big Data-driven analytics are an essential and increasingly important part of modern decision-making processes. However, the fundamental flaw in relying only upon behavioral analytics exists when the behavior in question is not yet an option. If customers do not yet have the option to, say, buy online and pick up in store, then we can’t evaluate their behaviors. We can use analytics in an attempt to discern intent, of course, but in my opinion, focus groups are still an important part of the mix for situations like these.

Christopher P. Ramey

Actions will always be more valuable than feelings. And, if you’re a student of Dr. Clotaire Rapaille, you already know customers aren’t capable of telling you what they really think.

Shep Hyken

Big data gives retailers info to spot trends and predict what a consumer will buy, their interests, etc. When your profile matches thousands of other customers, it’s much easier to predict and anticipate customers’ next moves, purchases, problems, etc. Surveys can’t be confused with Big Data. They can contribute, but surveys should be looked at and responded to, if necessary, on an individual basis.

Hilie Bloch

Big Data can take a lot of the art out of marketing, which may be good or bad. On the plus side, Big Data quickly validates what’s working and what’s not, reducing the resources spent chasing failing ideas. On the negative side, it may inhibit new demand creation ideas that haven’t tested well in the past but may now be appropriate for the marketplace. Also on the downside, Big Data can cause retailers and brands to over-complicate their strategies, confusing or even scaring away would-be consumers with too much info on themselves and their purchases.

As for the last questions, surveys are only more valuable than Big Data sources when there are no Big Data sources, which is rarely the case now.

Scott Magids

Consumer surveys can be useful, so long as they gather the right type of information. The traditional “unsatisfied – very satisfied” continuum often used in consumer surveys provides very little insight into buying behavior. That said, Big Data isn’t going to replace surveys; rather, those surveys will need to incorporate more meaningful questions, and the results will need to be combined with transactional and other big data points. In our own survey of more than a million consumers, we were able to go beyond that satisfaction continuum to get to the heart of the emotional motivators behind customer behavior, which is of much more value.

Take Our Instant Poll

Do you agree that Big Data will largely replace survey taking in the years ahead?
