Polling Failures Don’t Mean Big Data Is Bunk
After Donald Trump scored electoral victories in Wisconsin and Michigan, recriminations against the massive infrastructure around political polling and aggregation began in earnest. Republican operative Mike Murphy tweeted, for example, that “data died tonight.”
And on the morning after the election, Jim Rutenberg of the New York Times wrote, “All the dazzling technology, the big data and the sophisticated modeling that American newsrooms bring to the fundamentally human endeavor of presidential politics could not save American journalism from yet again being behind the story, behind the rest of the country.”
All this criticism could give pause to insurers and other enterprises that are making big bets on big data and analytics to change their businesses. It’s fair to ask: If big data can’t work when the stakes are this high, when can it work?
But what’s lost in the sea of buzzwords and Monday-morning quarterbacking is that polls, whose sample sizes average about 1,000 according to the National Council on Public Polls, aren’t really big data. Mark Breading, a partner at Strategy Meets Action, says that when he talks to insurance companies about big data, he’s talking about considerably larger sample sizes.
“That’s the whole power of big data: You don’t have to try to gain insights about customers through sampling,” Breading explains. “In theory, if you were going to use a big data approach, you get at least 100,000 people.”
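To put rough numbers on the gap between a 1,000-person poll and the data sets Breading describes, here is a minimal Python sketch of the textbook margin-of-error formula for a proportion. It assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst-case 50/50 split; the function name and sample sizes are illustrative only, and the formula covers random sampling error alone, not problems such as nonresponse or a sample that doesn't represent the electorate.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from
    a simple random sample of size n (z = 1.96 for ~95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)


# Compare a typical poll with the larger data sets discussed in the article.
for n in (1_000, 100_000, 2_000_000):
    moe = margin_of_error(n) * 100
    print(f"n = {n:>9,}: about +/- {moe:.2f} percentage points")
```

Under these assumptions, a 1,000-person sample carries a margin of error of roughly 3 percentage points, while 100,000 records bring it down to about 0.3. When an insurer can examine every record it holds, sampling error in this sense no longer applies to that population at all.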
The kinds of data sources that qualify as big data in insurance include large data clearinghouses, like the prescription drug databases that life insurers leverage, and the growing number of smart and connected devices. For example, Progressive’s Snapshot program has more than 2 million users sending data with every trip they make.
“We’re moving to this connected, real-time world with all this data and all kinds of shifting needs and risks that are changing all the time,” Breading says. “Historically, insurers haven’t used it because the tech wasn’t there, but now it is. If you have a data set with 10 million records, you just look at the whole thing.”