By Jay Jakosky, Senior Consultant
“With enough data, you can discover patterns and facts using simple counting that you can’t discover in small data using sophisticated statistical and machine learning approaches.” http://www.flickr.com/photos/revdancatt/5485645641/
I used to assume that big data, data mining, and statistics were inseparable. But the reality of how companies are making a killing transforming data into value is far from complex. In fact, it’s as simple as counting.
Big data is not hard. Statistics are not required. Neither are complex algorithms. Google’s Marissa Mayer attributed the company’s intelligence to the volume of data available for cross-referencing and not to clever algorithms. Google Translate leveraged massive volumes of cross-referenced text in multiple languages rather than a finely tuned understanding of grammar. Voice translation uses much the same technique, based on huge volumes of recorded and transcribed speech.
Right now our two best tools for naive analysis are visualization and data exploration (business discovery). Both are simple, easy to demonstrate and easy to grasp. The big data revolution’s message to the masses is that simple correlation will outstrip them both as long as enough data can be crunched. And much of this can be automated, pre-calculated, and even anticipated. Imagine the analysis system analyzing itself: these people tend to ask these questions at these times!
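To make the "analysis system analyzing itself" idea concrete, here is a minimal sketch that uses nothing but counting. The log format, the role names, and the function `most_common_questions` are all hypothetical, invented for illustration; a real system would read its own query history instead.

```python
from collections import Counter, defaultdict

# Hypothetical query log: (user_role, hour_of_day, question asked).
query_log = [
    ("sales", 9, "pipeline by region"),
    ("sales", 9, "pipeline by region"),
    ("sales", 9, "top accounts"),
    ("finance", 16, "open invoices"),
    ("finance", 16, "open invoices"),
    ("sales", 14, "top accounts"),
]

def most_common_questions(log):
    """For each (role, hour) pair, return the question asked most often."""
    counts = defaultdict(Counter)
    for role, hour, question in log:
        counts[(role, hour)][question] += 1
    # most_common(1) returns a one-item list of (question, count) pairs.
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}

likely = most_common_questions(query_log)
```

With a table like `likely`, the answer to the most probable question for each group and time slot could be calculated before anyone asks it.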
Data can be correlated post-hoc. Correlation does not equal causation, but simple correlation is ample evidence on which to take action. Correlation is immediately perceived visually. Correlation is relative and easy to compare. Correlation can look at 2, 3, 4 or more factors at once. Correlation is business friendly. It is easily understood. Correlation is gut-instinct compatible. Kids understand it: mom gets upset when I put peanut butter on the cat. If I do it right now, she’ll probably be mad.
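As a rough illustration of correlation by counting, the sketch below tallies how often two events show up together and divides by how often the first one shows up at all. The records and the function `cooccurrence_rates` are made up for this example, and the result is a simple co-occurrence rate, not a formal correlation coefficient.

```python
from collections import Counter
from itertools import combinations

# Hypothetical observations: each record is the set of things seen together.
records = [
    {"peanut_butter_on_cat", "mom_upset"},
    {"peanut_butter_on_cat", "mom_upset"},
    {"homework_done"},
    {"homework_done", "mom_upset"},
    {"peanut_butter_on_cat", "mom_upset"},
    {"homework_done"},
]

def cooccurrence_rates(records):
    """Return the share of records containing a that also contain b."""
    single = Counter()  # how often each event appears
    pair = Counter()    # how often each ordered pair appears together
    for rec in records:
        single.update(rec)
        for a, b in combinations(sorted(rec), 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    return {(a, b): n / single[a] for (a, b), n in pair.items()}

rates = cooccurrence_rates(records)
```

In this toy data, `rates[("peanut_butter_on_cat", "mom_upset")]` is 1.0, while `rates[("homework_done", "mom_upset")]` is about 0.33. The comparison is relative and needs no statistics background to read, which is the point of the paragraph above.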
The business opportunity is really that so much big data is simply thrown away. The ability to store all this data didn’t exist until recently, so we have an old habit of simply letting it vaporize. Every server message, every website click, every customer contact and interaction, every manufacturing activity, temperature, time clock action, phone call received, phone call placed, security video, email sent. Every bit of data can be analyzed, and from multiple perspectives: employee, employer, customer, vendor, shipper, receiver, and on and on.
We don’t know what we’ll find. It’s uncertain and that will stop some people. But as more and more stories of big data at little(er) companies emerge, the snowball will become an avalanche.
About the Author
Jay Jakosky, a Senior Consultant at Bardess, has been working with business intelligence, business software and databases for over 20 years. He is a passionate advocate for technology and business. For him, business intelligence is about seeing reality and driving action. He would love to talk your ear off about the coming leap forward in business software made possible by big-data technologies, social business intelligence systems and advances in human-computer interaction. “I live at the intersection of business and technology where a geek like me gets to transform companies and help people every day. I love understanding my customers (their goals, challenges, and long-shot hopes) and building tools that make new things possible. Any company could achieve this with focus and time. I do it lightning fast, which means more iterations, more exploration and faster payback. I have an outstanding set of tools. And I’m extraordinarily fortunate to have the passion for technology and the experience in so many companies.”