Gan Chun Yee, Fusionex’s Head, Big Data and Analytics
Hadoop, formally called Apache Hadoop, is an Apache Software Foundation project and open source software platform for scalable, distributed computing. Hadoop can provide fast and reliable analysis of both structured data and unstructured data. Given its capabilities to handle large data sets, it's often associated with the phrase big data.
In late 2013, Fusionex, an international provider of enterprise software specialising in Analytics and Big Data solutions, announced the official launch of its Big Data Analytics software (GIANT), considered to be the first Big Data Analytics software of its level of comprehensiveness in Asia.
Through GIANT, data sources are processed and integrated with high performance, speed and agility, and the results are displayed in a user-friendly, intuitive and interactive way.
The driving principle behind the creation of GIANT is to shield end-users from the complexities surrounding Big Data, low-level plumbing, and hard-core Apache Hadoop and MapReduce programming. This gives end-users the much-needed insight from Big Data to run their business without being burdened by in-depth programming and technological complexities.
To cement GIANT's ability to carry out the highly scalable, high-speed processing needed to handle vast amounts of structured, semi-structured and unstructured data in a user-friendly manner, Fusionex also established strategic partnerships with two of the most established Hadoop distribution (distro) platform providers in the world, namely Cloudera and Hortonworks.
These strong partnerships are complementary and synergistic in nature; both Cloudera and Hortonworks are reputed for their world class distros, while Fusionex GIANT provides the last-mile solution (business solution) to organisations via Big Data Analytics.
With this, not only are end-users and organisations provided with attractive platform options, but these platforms also extend GIANT's reach to a wider audience and increase its scalability and processing capabilities by building on the world's best Apache Hadoop-powered platforms.
In this week’s executive interview series, Gan Chun Yee, Fusionex’s Head, Big Data and Analytics, shares his company’s views on Hadoop.
DataStorageAsean: In simple terms, what's the difference between Hadoop and a normal database?
Gan Chun Yee: A relational (normal) database stores only structured data. It usually appears in the form of tables and columns and is mainly used as data storage for mission critical systems where data integrity is important. Unlike a relational database, Hadoop is a distributed system that can store any kind of data (structured, semi-structured and unstructured) and is usually used to process massive amounts of data.
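The contrast described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hadoop itself and not anything from Fusionex's product: the `orders` table, the sample records and the helper names `insert_fits_schema` and `read_amounts` are all hypothetical. The relational side uses SQLite, where the structure is fixed before data is written; the second half mimics the Hadoop-style approach of storing raw records as they arrive and applying structure only when they are read.

```python
import json
import sqlite3

# Relational side: the schema is declared before any data is written.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount REAL NOT NULL)"
)


def insert_fits_schema(row):
    """Return True if the (customer, amount) row is accepted by the fixed table."""
    try:
        conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        # The database rejects rows that violate its constraints.
        return False


# Hadoop-style side (hypothetical sample data): raw records are kept as-is,
# whatever their shape, and interpreted only when a job reads them.
raw_records = [
    '{"customer": "acme", "amount": 120.5}',       # structured
    "2015-03-01 10:22:01 WARN disk usage at 91%",  # semi-structured log line
    "Hi team, please find the report attached.",   # unstructured text
]


def read_amounts(lines):
    """Apply structure at read time: keep JSON objects that carry an 'amount'."""
    amounts = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON; a different job might still use this record
        if isinstance(record, dict) and "amount" in record:
            amounts.append(record["amount"])
    return amounts
```

Here `insert_fits_schema((None, 5.0))` is rejected by the table, while all three raw records are stored without complaint and `read_amounts(raw_records)` simply picks out the one it can interpret.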
DataStorageAsean: Can you tell us about the newest developments in Hadoop and where you think the technology is heading?
Gan Chun Yee: Hadoop started as a distributed batch-processing system. The introduction of Apache Spark, with its distributed in-memory architecture, has taken Hadoop to the next level, turning it into a real-time, interactive platform for big data analytics. As Apache Spark gains attention and popularity in both the open source and commercial communities, I expect many products to be created around it to solve big data and fast data problems.
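For readers unfamiliar with the batch model mentioned above, the following is a minimal single-process sketch of the map, shuffle and reduce steps, using a word count as the example. It is plain Python and does not use the Hadoop or Spark APIs; the function names are hypothetical. In a real cluster each phase runs in parallel across many machines, and classic MapReduce writes intermediate results to disk between phases, whereas Spark can keep them in memory.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = []
    for line in lines:
        pairs.extend(map_phase(line))
    return reduce_phase(shuffle_phase(pairs))
```

Calling `word_count(["big data", "fast data"])` counts "data" twice and the other two words once each.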
DataStorageAsean: Is Hadoop for all companies of any size and can anyone in an organisation get access to data held in a Hadoop cluster?
Gan Chun Yee: Hadoop is for all companies of any size because every company will have to deal with all forms of data – structured (ERP, CRM), semi-structured (logs, sensors, social media) and unstructured (images, videos, emails). There will always be data segregation concerns as nobody should have access to all data, so certain data policies need to be implemented.
DataStorageAsean: What are the specific challenges for Hadoop adoption in the ASEAN region?
Gan Chun Yee: Identifying business-use cases and finding highly-skilled Hadoop resources are the most common challenges for Hadoop adoption in the ASEAN region. Many organisations that try to adopt Hadoop find it difficult to justify the ROI (Return on Investment) because they cannot identify the true business benefits their organisation stands to gain from this distributed system.
Therefore, many organisations are leveraging partners such as Fusionex that can help them establish business-use cases by bringing our vertical big data solutions that run on our big data analytics product. This accelerates the entire big data implementation and at the same time eliminates the need to employ highly-skilled Hadoop resources.
DataStorageAsean: What's unique about your Hadoop offering?
Gan Chun Yee: Fusionex Insights (GIANT) simplifies the entire Hadoop implementation through a rich graphical user interface. This empowers the user to easily create a big data solution without deep knowledge of Hadoop. Fusionex GIANT provides end-to-end big data analytics solutions covering data management and processing, data visualisation and predictive analytics. The product’s ease of use allows users to perform analysis by asking the system questions in natural language, without worrying about rows, columns or the X and Y axes.