Big Data, Big Storage

Storage has always been an under-appreciated field of endeavor, and it remains so, especially in the era of big data.

But without adequate storage, there is no big data. The question is, do you rely on onsite storage resources, as has been the case for the past four decades, or is the cloud ready to take on the big data storage challenge?

Jeff Vance recently tackled this question in an article in Datamation, noting that “by its very definition, big data's vast trove of data requires a storage capacity that grows ever larger all the time.”

There's plenty of discussion these days about the strategic and tactical advantages big data provides to organizations. For insurance companies, it's a no-brainer — big data analytics paves the way to fraud detection, telematics, greater risk analysis and enhanced customer retention. However, as Vance points out, “in order to gain these new insights and to challenge our misconceptions, we must find ways to access all of that data, hidden away in all of those proprietary applications and databases. That’s not just a big data problem. It’s also a management problem, and it’s most certainly a storage problem.”

So how will organizations manage and back up 100 petabytes of data? The time will soon come when that much data will be the norm, and a lot of it will be in unstructured formats.

What’s needed are smarter approaches to the big data storage conundrum. For onsite storage, which many organizations still prefer over the cloud because of security concerns, the solution has typically been brute force. In a survey I helped develop and analyze last year, half of the companies surveyed simply threw more disk at the challenge. With data stores growing at an average rate of 25 percent a year, that means substantially more disk space has to be acquired every year. Just as many organizations sought to ramp up their processor power to accommodate greater levels of data.
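
To put that growth rate in perspective, here is a minimal sketch in Python of how a data store compounds at 25 percent a year; the one-petabyte starting point and ten-year horizon are hypothetical, chosen only for illustration:

```python
# Compound a data store at 25 percent annual growth.
# The 1 PB starting capacity and 10-year horizon are illustrative assumptions.
GROWTH_RATE = 0.25
capacity_pb = 1.0  # hypothetical starting capacity, in petabytes

for year in range(1, 11):
    capacity_pb *= 1 + GROWTH_RATE
    print(f"Year {year:2d}: {capacity_pb:6.2f} PB")
```

At that rate, capacity roughly doubles every three years (1.25^3 ≈ 1.95), which is why simply buying more disk becomes a treadmill.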

More efficient approaches to the big data challenge include tiered storage strategies, in which fresh data is made immediately available to end users while older data is moved off to cheaper, less-accessible storage systems. Also needed are data lifecycle management methodologies, under which certain types of data can be disposed of rather than stored away. Of course, the business first needs to identify which data is most valuable for these approaches to work.
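
As a concrete illustration, here is a minimal sketch of an age-based tiering and lifecycle policy in Python; the tier names, age thresholds, and disposal rule are all hypothetical, and a production policy would weigh access patterns and business value rather than age alone:

```python
from datetime import datetime, timedelta

# Hypothetical age thresholds for each storage tier.
TIERS = [
    (timedelta(days=30), "hot"),    # fresh data: fast, immediately available
    (timedelta(days=365), "cold"),  # older data: cheaper, less-accessible storage
]
DISPOSE = "dispose"  # lifecycle management: some data is deleted, not archived

def classify(last_accessed: datetime, now: datetime | None = None) -> str:
    """Return the storage tier for a record, based on how old it is."""
    now = now or datetime.now()
    age = now - last_accessed
    for threshold, tier in TIERS:
        if age <= threshold:
            return tier
    return DISPOSE

# Example: a record untouched for 400 days falls past every tier.
print(classify(datetime.now() - timedelta(days=400)))  # -> dispose
```

In practice, the thresholds would fall out of exactly the data-valuation exercise described above.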

Vance suggests three primary approaches to tackling the challenge. First, it's essential to keep breaking down the silos within enterprises. Silos result in a lot of duplication and leave data hard for decision makers to access. Second, despite any security reservations, a lot of storage will be moving to the cloud.

“Big Data storage is quickly becoming a subset of cloud storage,” says Vance. “As data centers are virtualized, and as more data is moved into third-party data centers, big data and cloud storage challenges (and opportunities) will begin to merge.”

The move to data storage in the cloud is likely to start with using the cloud for backups, he adds. Third, new technologies enable faster and more comprehensive data storage, including flash memory, solid-state drives (SSDs) and in-memory storage.

The bottom line is that storage needs to be recognized as a key piece of the big data picture.

Joe McKendrick is an author, consultant, blogger and frequent INN contributor specializing in information technology.

Readers are encouraged to respond to Joe at joe@mckendrickresearch.com.

This blog was exclusively written for Insurance Networking News. It may not be reposted or reused without permission from Insurance Networking News.

The opinions of bloggers on www.insurancenetworking.com do not necessarily reflect those of Insurance Networking News.
