Big Data is the fuel to re-invent the enterprise. And that fuel is no longer the exclusive franchise of the enterprise.
Like other assets in America these days, it’s being re-distributed. Here, it’s to small/medium business, to the healthcare industry and to consumers.
The explosion of data and the need to index, discover, integrate and analyze it represents a tectonic shift the likes of which we’ve not seen since the rise of the Internet.
Roman Stanek, Founder and CEO of GoodData, calls this fuel and the predictive analytics opportunity “The Oil of this Century.”
We know it’s impossible to get through a day without hearing “Big Data” tossed around. But it’s as much about “Big Analytics”.
The Democratization of Data is not about collecting data. It’s about predictive analysis that enables real-time, actionable insights. Insights put into the hands of more business managers and fewer IT managers.
In the enterprise, this gets done across legacy silos where data has been held for years, and increasingly, outside enterprise walls with trading partners and customers. The goal is to generate insights which might improve performance and relationships. Or to generate benchmarks which compare performance to peers and competitors.
In the healthcare industry, this gets done across patient, practice management and health plan claims silos. The goal is to help provide accountable care for patients, improved health outcomes, and higher care quality.
For consumers, this gets done across the silos created by multiple social graphs. Recognizing location, preference, content, and even intent. The goal is to create better and better user experiences and granular, actionable data for monetization.
Data grows in size not just because of the internet, email, business applications, and enterprise legacy growth, but also because it’s being gathered every moment by iPhones, server logs, medical equipment — you name it.
Most of that data is unstructured. When we invested in Archivas in 2006, the view was that unstructured data would amount to a $6B addressable market by 2015. Today, most estimates are that it will be $20B by that time. The world’s technological capacity to store information has doubled every 40 months since the 1980s. The volume of data stored out there is so large it’s measured not in petabytes, but in zettabytes.
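To put the 40-month doubling figure in more familiar terms, here is a minimal sketch of the arithmetic. It assumes smooth exponential growth, which is a simplification, and the function names are made up for illustration. A 40-month doubling period works out to roughly 23% growth per year, or an eightfold increase per decade.

```python
def annual_growth_rate(doubling_months: float) -> float:
    """Yearly growth rate implied by a doubling period given in months.

    Assumes smooth exponential growth (an illustrative simplification).
    """
    return 2 ** (12 / doubling_months) - 1


def growth_factor(years: float, doubling_months: float = 40) -> float:
    """How many times larger capacity becomes after the given number of years."""
    return 2 ** (years * 12 / doubling_months)


if __name__ == "__main__":
    print(f"Implied annual growth: {annual_growth_rate(40):.1%}")
    print(f"Growth over 10 years: {growth_factor(10):.0f}x")
```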
The bottleneck to getting value from all of this volume is analytics. See Einstein: “Information is not knowledge.”
Big Data is not new. Enterprises have been working with massive data sets for decades. They’ve sifted through mountains of information on premises to gain insight into customers, purchasing behavior, and demographics. But they did so using rudimentary tools, where discovery + analysis = manually clicking through reams of files, then cataloging results into spreadsheet trending models.
The information that social interaction and opt-in behavior represent is a tsunami at work. The Social Web’s rise is staggering. There are over a billion people sharing personal experiences on Facebook. Social media is creating arguably the most valuable source of information in human history.
Each day, 300 million photos are uploaded to Facebook and 400 million tweets are shared. Every person, experience, place, event, topic and organizational affiliation is being documented and discussed. There are as many objects created in Facebook and Twitter each month as there are web pages in Google’s entire index. The social web has eclipsed Google’s index as the most dynamic source of information. As Metcalfe’s Law tells us, this will only increase as even more users come online and next generation user applications become even more intuitive.
Spindle developed its social discovery engine to aim squarely at that opportunity. It harvests the graph across social silos and leverages signals like time and location to deliver content sorted by preferences, themes and relationships.
- The Democratization of Data
In the Old School, “Big Data” was limited to big companies who had BI functions staffed through massive IT investments.
Now, it’s democratized to the point where one encounters it every day in every business.
SaaS is the Great Communicator. SaaS is the major trend that has made data available and more useful to more constituencies. Business processes have become homogeneous. Packaged SaaS applications that reach extreme mass are what make Big Data solutions available to all.
SaaS has enabled hundreds of millions of users of Salesforce.com, Marketo, Workday, WordPress and other ubiquitous apps to think about how to discover, curate, integrate and analyze their data to add value to daily business and personal processes. Upstarts like Nimble and Cloze collect troves of data at the nexus of personal, social and business connections. These applications throw off their own data mountains, which in turn can be integrated into other packaged arrays and processes.
Democratization means taking Big Data Analysis out of the exclusive hands of Fortune 1000 IT and placing it into the hands of the Fortune 5M.
- The Democratization of Infrastructure
The Great Enablers are the Cloud, Amazon Dynamo, Microsoft Azure, and Open Source.
Hadoop puts scalable, distributed computing for large data sets in the hands of the masses. Designed to scale up from single servers to thousands of machines, it’s a certified game-changer.
Disk-based relational database products built on SQL (think Oracle) can’t handle the tsunami. The era of Big Data demands new strategies and new approaches to indexing, search and interrogation.
NoSQL distributed database models handle the massive data quantities that traditional RDBMS can’t. MongoDB and CouchDB are two great examples. Both are document-oriented, open source solutions that make web and mobile developers’ lives dramatically easier, improving time to market and productivity. Both are available to developer teams large or small.
But platform technology is only part of the New Democracy. IT developers require access to tool sets that help them design for their data business requirements. To get entire communities to use them, these tools need to be incredibly easy to use and customize, easy to manage, and easy to connect with mobile platforms.
Cloudant is delivering a “data layer” which lets mobile developers tap into a back-end “database as a service”.
Logentries enables companies large and small to gain operational insight into machine data: the log data thrown off from web servers and application servers. They create actionable insights which enable better software development, better application performance, better website efficacy, better business decisions.
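As a rough illustration of what "operational insight into machine data" can mean at the smallest scale, the sketch below counts HTTP status codes in a few web server log lines and flags which paths are producing server errors. It is not how Logentries works. The sample lines are invented and assume the Common Log Format, and the `summarize` function is a hypothetical name.

```python
import re
from collections import Counter

# Hypothetical sample lines in Common Log Format, invented for illustration only.
SAMPLE_LOG = """\
203.0.113.5 - - [10/Oct/2012:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
203.0.113.9 - - [10/Oct/2012:13:55:40 +0000] "GET /checkout HTTP/1.1" 500 512
203.0.113.5 - - [10/Oct/2012:13:56:01 +0000] "POST /login HTTP/1.1" 200 128
203.0.113.7 - - [10/Oct/2012:13:56:12 +0000] "GET /checkout HTTP/1.1" 500 512
"""

# Pull the request method, path and status code out of each line.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def summarize(log_text):
    """Return (status code counts, server-error counts per path)."""
    status_counts = Counter()
    errors_by_path = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        status = int(match.group("status"))
        status_counts[status] += 1
        if status >= 500:
            errors_by_path[match.group("path")] += 1
    return status_counts, errors_by_path


if __name__ == "__main__":
    statuses, errors = summarize(SAMPLE_LOG)
    print("Requests by status:", dict(statuses))
    print("Server errors by path:", dict(errors))
```

Even this toy summary points at a concrete action: in the sample, every server error comes from one path, which is where a developer would look first.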
- Government Reform & The New World Order
The Great Stimulator. Government Reform. Risk management is on the minds of everyone in the Healthcare industry. In The Brave New World of Accountable Care, providers (physicians) are motivated to improve, measure and compare patient outcomes. Payers (health plans) are motivated to arrive at personalized health plans which lower costs.
Phytel makes its bones by enabling physicians, hospitals and payers to compare patient health data to relevant populations. They can cross-mine the clinical silos with insurance claims silos to predictively analyze treatment courses, minimize readmissions, and improve outcomes. That data also enables hospitals to compare care-giving performance against peers.
Big, game-changing stuff. Bringing actionable data to the people to improve lives.
Over the next 12 months, more unstructured data will be generated than in all previous years combined. Think about that.
The value of big data comes from the knowledge gained from it and what you do with it. The promise of big data lies within the ability to make predictions based on it. That’s what gets people excited.
There’s no dispute as to the magnitude of the tectonic shift that’s occurring. Driven by the cloud, SaaS models and mobile devices, it gives both the Old School and the nimble startup the potential to participate in the $1 trillion transfer of wealth predicted to come.