Hadoop is Dead; Long Live Cloud Dataflow
When you consider the relentless advance of the Big Data Machine, it looks set to make Moore's Law for processors pale into insignificance. The volume of data that enterprises now manage and analyse is growing so fast that the technology built to cope with it must evolve at a similar rate to the data itself.
DSA were at the Dell Connected Enterprise event in Singapore last week, and one of the topics was big data. It transpired that many of the IT professionals attending were still looking for clarification on what Hadoop actually is, even as the major vendors build and promote Hadoop-based solutions. Hadoop is being touted as the technology of the future, but its time may pass before many people even understand what it is.
As a case in point, it is worth reading a Google blog post published on 25th June, in which they introduce Cloud Dataflow and inform us that "Cloud Dataflow is a successor to MapReduce, and is based on our internal technologies like Flume and MillWheel."
Our aim here is not to discuss and explain different big data technologies, but it is clear that Google, who arguably process as much data as any other organisation on the planet, have already moved away from MapReduce, the model that underpins technologies like Hadoop. They have done so based on what can be referred to as a "needs must" approach: Google do not accept the boundaries of the status quo; they constantly innovate to solve problems and break through limitations.
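For readers still seeking that clarification, the MapReduce model that Hadoop is built around can be sketched in a few lines. The following is a minimal, framework-free illustration of the map, shuffle, and reduce phases using a word count, the canonical example; it is not Hadoop's actual API, and the function names are our own:

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle: group the emitted pairs by key (the word).
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Reduce: sum each group's counts into a final result per key.
    return {key: sum(values) for key, values in grouped.items()}

docs = ["big data big problems", "data grows fast"]
counts = reduce_phase(map_phase(docs))
# counts == {'big': 2, 'data': 2, 'problems': 1, 'grows': 1, 'fast': 1}
```

Hadoop's contribution is running each phase in parallel across a cluster of machines with fault tolerance; the programming model itself is this simple, which is part of why successors like Cloud Dataflow focus on richer pipelines rather than the basic map/reduce pattern.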
Big Data is an ever-escalating problem, and one that will require constant product evolution. For IT professionals managing corporate IT, mastering or backing any one specific big data technology may actually be a futile task. Perhaps the trick will be to build lasting partnerships with technology innovators and accept that "Big Data as a service", amongst other things, will become part of a hybrid infrastructure approach to managing IT.