Why 'Big Data' is wearing thin
It is common knowledge that, if you want to get an IT budget approved, there are certain phrases you need to throw into the proposal that hit such a nerve that the Chief Financial Officer will immediately approve the budget. One such phrase is “Big Data”.
However, the phrase is wearing thin. Touted as the key to solving all of a large company’s ills, it has been at the peak of the Gartner Hype Cycle since 2013 and is now falling into the so-called “trough of disillusionment”.
In 2014, according to IDC, companies spent $125-billion on hardware, services and software to deal with Big Data. However, it “has not delivered on massive promises”, so these projects are starting to fall out of favour, says SurveyMonkey chief executive Dave Goldberg, speaking at CeBIT 2015 in Hannover this week.
It would be easy to dismiss Big Data as failing because too much data is being collected in too many silos that are still not connected to each other. However, there is another side to Big Data that Goldberg believes is missing from the discussion. He talks about “Implicit Data” and “Explicit Data”, arguing that both types of data are needed.
Implicit Data vs Explicit Data
Implicit Data is what we currently think of when we talk about Big Data. It is data that is routinely collected from actions such as clicking a mouse, listening to a song, or entering search queries. We get this type of data on a massive scale because we measure everything, and it is this type of data that makes people worry about their security and privacy.
However, Implicit Data gets it wrong.
A credit card company analyses a customer’s purchasing habits and suddenly discovers expenses like teeth-whitening, gym membership and hotel rooms. Based on this Implicit Data, it assumes that this person is getting divorced. The data does not take into account surrounding circumstances, such as upcoming business travel.
Shopping engines recommend products based on historical purchases, which are also Implicit Data. However, shopping for a gift for a friend throws off the personal recommendation engine.
“More data doesn’t lead to better information and this is how Implicit Data gets it wrong,” says Goldberg. “If you want to know how someone is feeling, what music they like – you need to ask them. Analysing their searches doesn’t lead you to the correct answer as much as asking.”
Asking is what one can call Explicit data.
Explicit data is data we get by asking questions and receiving answers.
The internet has allowed companies to get this type of data en masse by asking the right questions to the right groups of people.
At one stage, Google realised it was losing female staff, which concerned the company. Instead of relying on Implicit data of female search history, it surveyed the staff and realised it didn’t have a “female vs male” issue but a “new mother” issue. Females who were pregnant or thinking about becoming moms did not like Google’s maternity policy. As soon as Google discovered this, it fixed the issue and retained its female staff.
SurveyMonkey uses both Implicit and Explicit Data to make decisions. Mining the data and asking customers directly allows the company to thrive, and it is not the only one. In a single day, SurveyMonkey receives 3 million survey responses answering 29 million questions, which results in 80GB of new data.
The key message behind Goldberg’s talk was that relying on correlation without understanding causation does not paint a true picture. Rather, companies must have the conversation with their customers and listen to what they have to say.
“Encourage customers and employees to give you feedback. People feel powerful and loyal when their voices are heard.”
- Liron Segev is also known as The Techie Guy. You can read his blog here or follow him on Twitter on @Liron_Segev
- Follow Gadget on Twitter on @GadgetZA