By Rich Hunsicker, Senior Consultant
As I was reviewing an article on Big Data, one point that caught my eye was that “companies were spending too much time preparing the data, when more time should be spent analyzing it.” That brought to mind a couple of meetings within the last few weeks where data quality had impacted several processes: sales and margin errors, and incomplete contract analysis and updates. Perhaps this prep time was being spent cleansing the data rather than organizing it. If we are going to use Big Data to provide insights and pattern analytics, we had better make sure that the information we are collecting meets a consistent quality standard. One possible use of Big Data is analyzing and forecasting contract and subscription attrition and erosion. If we cannot accurately identify the Customers, Regions, Segments, Expiration Dates, etc., how can we provide the breakdown of this information to the decision makers? If we are attempting to target a segment or region, are we sure that the analysis has accurately represented our objectives? Otherwise, we may just as well “throw a dart at a board” to decide which opportunities our business should pursue.
The old saying “Garbage In, Garbage Out” now has a louder and “Bigger” ring. No amount of information visualization and analytics can overcome the impact of underlying corrupt, incomplete, or erroneous data. Before we undertake any project of this scope, we need to ask several questions. How confident are we in our data and data sources? Are consistent standards for collecting and updating information applied across enterprise applications, especially where multiple applications provide the same resulting data? When critical elements cannot be provided or are not relevant, are default values being consistently applied? Given that there is going to be some pain bringing disparate sources together, transforming this data will be easier if standards are established and followed. Data governance needs to be a prime directive throughout the organization to provide consistent and complete data, or Big Data will be a Big Mess.
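The questions about completeness and consistent defaults can be checked mechanically before any analysis starts. What follows is only a minimal Python sketch of that idea, not a prescribed method. The field names are taken from the elements mentioned earlier (Customers, Regions, Segments, Expiration Dates), while the record layout, the `UNKNOWN` default, the function names, and the sample data are all assumptions made for illustration.

```python
# Critical elements named in the article; the record layout is hypothetical.
CRITICAL_FIELDS = ("customer", "region", "segment", "expiration_date")

# Hypothetical convention: one agreed default for every missing value.
DEFAULT_VALUE = "UNKNOWN"


def audit_record(record):
    """Return the critical fields that are missing or blank in a record."""
    problems = []
    for field in CRITICAL_FIELDS:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(field)
    return problems


def apply_defaults(record):
    """Return a copy of the record with the agreed default filled in."""
    cleaned = dict(record)
    for field in audit_record(record):
        cleaned[field] = DEFAULT_VALUE
    return cleaned


def completeness(records):
    """Return the fraction of records with every critical field populated."""
    if not records:
        return 0.0
    complete = sum(1 for record in records if not audit_record(record))
    return complete / len(records)


# Made-up sample data, for illustration only.
SAMPLE_RECORDS = [
    {"customer": "Acme", "region": "East", "segment": "Retail",
     "expiration_date": "2025-06-30"},
    {"customer": "Globex", "region": "", "segment": None,
     "expiration_date": "2025-09-30"},
]

if __name__ == "__main__":
    for record in SAMPLE_RECORDS:
        print(record["customer"], audit_record(record))
    print("completeness:", completeness(SAMPLE_RECORDS))
```

Running the audit against each source system before the sources are combined gives a simple measure of how far each one is from the agreed standard, and applying the same default everywhere keeps "missing" from being recorded as a blank in one application and a null in another.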
We will always live with some data quality issues (sorry, no nirvana): new processes or products that do not fit the current standard, siloed data that must be brought together into a usable structure, unstructured or unformatted data, and so on. Edicts and clear guidelines for delivering and gathering data have to start at the top, with the CEO and CIO, and be embraced throughout the business in order for Big Data and its promises to be successful.
About the Author
Rich is a Senior Consultant with over 30 years of experience in IT, working with both large and small businesses. More than ever, Business Intelligence is a critical component that allows companies of any size to understand their underlying data and direct their business plans and projects. Often these perspectives lead to finding overexpenditures, underutilization, duplication, etc., which impact a company’s bottom line. Working with our clients to find these “gems” is part of our focus.