Defining Big Data

Published March 07, 2014, 9:18 a.m. EST

Last September, two computer science students from the University of St. Andrews in the U.K. attempted to pin down a definition of Big Data, publishing “Undefined by Data: A Survey of Big Data Definitions” on the open-access preprint repository arXiv (arxiv.org). Their round-up included:

Gartner Group: The “Four V’s” definition: volume, velocity, variety, and veracity.
Oracle: The derivation of value from traditional relational database-driven business decision-making, augmented with new sources of unstructured data such as blogs, social media, sensor networks, and image data.
Intel: Organizations generating a median of 300 terabytes of data weekly. The data includes business transactions stored in relational databases, documents, e-mail, sensor data, blogs, and social media.
Microsoft: The process of applying serious computing power, the latest in machine learning and artificial intelligence, to seriously massive and often highly complex sets of information.
The application definition (arrived at by analyzing the Google Trends results for “big data”): Large volumes of unstructured and/or highly variable data that require the use of several different analysis tools and methods, including text mining, natural language processing, statistical programming, machine learning, and information visualization.
The Method for an Integrated Knowledge Environment (MIKE2.0) definition: A high degree of permutation and interaction within a dataset, rather than the size of the dataset. “Big Data can be very small, and not all large datasets are Big.”
NIST: Data that exceeds the capacity or capability of current or conventional [analytic] methods and systems.

Doug Fridsma, M.D., chief science officer for the ONC, has a definition that will resonate with almost everyone: “More data than you're used to--some people deal with petabytes and it's easy, but if you're a small practice, just your own data is more data than you're used to,” he says.
This piece was originally published by Health Data Management.
