Info Diversity is the future.
Big data has never been clearly defined in the marketplace. If you think it has been, you failed: few, if any, are translating the concept into action. The problem is one word. “BIG”, as described in many documents and websites, is the wrong word. We need to stop this conversation and change it into one about information and diversity. With enough diversity of small information sets, you can accumulate large data sets and develop ever more powerful insights. Large, not big.
I speak with technology leaders and creators every day, and almost none of them really understand the term or its underlying value. A few do “get it”, but then it’s treated like a trade secret or a skunk works inside large corporations. Others have a very clear idea and are making millions, and some are speaking at TED, but that’s not reaching the masses. Many projects and companies pitch me on what boils down to more, faster, better “big” data crunched to give some answer to a question. I say “some answer” to “a question” because it’s very often a question we create in order to source answers from “Big Data”.
Big data is not a Guru with self-determination. It can only answer the question you ask. A true Guru provides insights you were not expecting. A Guru has more diverse views and information than you do; that’s why they provide insights. Big data is predisposed to some answer by our question, and that’s bad. Worse, the answer can become a guess based on chosen algorithms that few truly understand. What do programmers do about this? Nothing; we want them to execute on orders. We need more business and data architects doing this work.
Amazon, Uber, and others are starting to provide services that anticipate your desires. This is fantastic, and it comes from a long history in manufacturing. SAP has spent decades sourcing and organizing data for optimized supply chains. Just-in-time strategies anticipate final assemblies, notifying suppliers in advance of the raw materials needed. We are just extending these tools to our end customer now. New data, new target, same old strategies.
The problem I have with “Big Data”:
- Big is being translated to mean large homogeneous data
- Input validation and organization are not treated as a business process
- Diversity is not being discussed in a meaningful way. If diversity happens to result from trying to build a large data set, great. That is passivity, not planning
- Discovery is assumed to be a result of the process, when predisposition is the process we actually follow
That is my short list. I have a longer one.
The newest trend is “small big data”, or something like that. The real problem is how we framed big or small data against the technology used to process it. Big data was not only a term for the size of information repositories, but also for the computing power required to manipulate them. That further confused our education, or lack of it, on what “big” means. Now we have new terms that imply “Big” data on “Small” computing systems is small big data. Seriously?
Here is what I would like the conversation to focus on.
- Well-defined input and acquisition models for the average user.
- Create new methods and tools allowing users to deploy the value. (Rise of the Data Architects, a sequel)
- Heterogeneous data sets that are defined to build big data sets. No more homogeneous zettabyte hunting, and no more disqualifying data that is not “Big” based on the technology or definition required
- Treat data as a corporate asset from building automation to sales and marketing automation
- Be ever more mindful of GIGO: garbage in, garbage out. If you don’t spend the time to determine what you want, anything will do. Frighteningly, you will believe it even more
- Set “discovery” as the goal for your data architect. Don’t spell out a conclusion and then discard “discovered” data results as irrelevant. I see this all the time, and it’s stunning how much “bad” data is tossed out. Baby with the bathwater?
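To make a few of these points concrete, here is a minimal Python sketch of what I mean. It is only an illustration under my own assumptions: the source names (`sales`, `building`, `marketing`), the field names, and the helpers `validate` and `combine` are all hypothetical, not part of any real product or library. It combines several small, differently shaped data sets into one larger set, checks the input on the way in, and sets questionable records aside for review instead of throwing them away.

```python
from dataclasses import dataclass, field


@dataclass
class Combined:
    """Result of merging several small, differently shaped data sets."""
    accepted: list = field(default_factory=list)
    set_aside: list = field(default_factory=list)  # kept for review, not deleted


def validate(record, required):
    """Return the names of required fields that are missing or empty."""
    return [name for name in required if record.get(name) in (None, "")]


def combine(sources, required=("customer_id",)):
    """Merge heterogeneous sources, tagging each record with its origin.

    `sources` maps a hypothetical source name to a list of dict records.
    Records that fail validation are set aside rather than discarded.
    """
    result = Combined()
    for source_name, records in sources.items():
        for record in records:
            tagged = {**record, "source": source_name}
            missing = validate(record, required)
            if missing:
                result.set_aside.append({"record": tagged, "missing": missing})
            else:
                result.accepted.append(tagged)
    return result


# Hypothetical small data sets with different shapes.
sources = {
    "sales": [{"customer_id": "c1", "order_total": 120.0}],
    "building": [{"customer_id": "c1", "badge_swipes": 4}, {"badge_swipes": 9}],
    "marketing": [{"customer_id": "c2", "campaign": "spring"}],
}

combined = combine(sources)
print(len(combined.accepted), "accepted;", len(combined.set_aside), "set aside")
```

The design choice that matters here is the `set_aside` list. The record that fails validation is not “bad” data to be tossed out; it stays available, along with the reason it failed, so a data architect can decide later whether it points to something worth discovering.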
Long live Info Diversity! Find more information to drive larger insights. Unlock discovery by allowing the associations to guide you to the customer and their needs. Model what you want BEFORE you build it, and leave options for discovery. And by all means, please stop talking about Big Data. No one knows what it really means anyway.