From Data Warehouse to Data Mesh: The Evolving Needs of Data and Data Architectures

What data fabric and data mesh are today, the data warehouse was in its heyday: the next big thing, a game-changer, the difference maker.

However, data warehouses are limited in today’s settings, mostly in two respects. First, they are designed mainly for querying and analysing large amounts of historical data. Second, and more importantly, the kind of data they can process, mainly structured data, might no longer be the most relevant in this day and age of big data and data analytics.

Then along came data lakes and data lakehouses, and now we’re looking towards data mesh and data fabric.

And just like that, the data warehouse seems to have lost its relevance—or is losing it, at least, to newer, more modern means of storing and scrutinising data.

From Warehouse to Lakehouse

The concept of the data warehouse emerged in the 1980s and has served businesses fairly well for several decades. Unfortunately, a data warehouse cannot store or process unstructured data efficiently enough to be an ideal repository for raw data, the data equivalent of crude oil.

This inadequacy soon led to the development of the data lake, a type of architecture in which raw data, including unstructured data, is stored and made available for whatever purpose the organisation using it determines. The problem is that this lake can get messy: flooded with all sorts of information, it can turn into a data swamp, an unmanaged data lake in which the data is either inaccessible or of little to no value.

And that brings us to the data lakehouse, in many ways a fusion of the data warehouse and the data lake. Data lakehouses enable structure and schema like those used in a data warehouse to be applied to the unstructured data that would typically be stored in a data lake. This means that data users can access the information more quickly and start putting it to work. Those data users might be data scientists or, increasingly, workers in any number of other roles who are seeing the benefits of augmenting their work with analytics capabilities.
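As a rough illustration of the idea, the sketch below applies a warehouse-style schema to raw records of the kind that might sit in a data lake. It is a minimal example in plain Python under stated assumptions: the `SCHEMA` mapping, the `apply_schema` function and the sample records are all hypothetical and are not taken from any lakehouse product's API.

```python
import json

# Hypothetical raw records, as they might land in a data lake (one JSON document per line).
RAW_LINES = [
    '{"order_id": "1001", "amount": "19.90", "country": "SG"}',
    '{"order_id": "1002", "amount": "5", "note": "gift"}',
    'not valid json',
]

# Hypothetical schema: column name -> type used to cast the raw value.
SCHEMA = {"order_id": int, "amount": float, "country": str}


def apply_schema(lines, schema):
    """Parse raw lines and return (rows, rejected), with rows cast to the schema."""
    rows, rejected = [], []
    for line in lines:
        try:
            record = json.loads(line)
            # Keep only the schema's columns; columns missing from a record become None.
            row = {
                col: (cast(record[col]) if col in record else None)
                for col, cast in schema.items()
            }
        except (ValueError, TypeError):
            # Unparseable or badly typed input is set aside rather than loaded.
            rejected.append(line)
            continue
        rows.append(row)
    return rows, rejected


rows, rejected = apply_schema(RAW_LINES, SCHEMA)
```

Here `rows` holds typed, uniformly shaped records that can be queried like a table, while `rejected` keeps the lines that did not fit, so nothing is silently lost.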

A Continuing Evolution: The Rise of Data Fabric and Data Mesh

The ways to store and process data are evolving still, and proof of this evolution is the development of data fabric and, more recently, data mesh.

The data fabric is a kind of architecture that integrates a set of technologies, services and even architectures to help an organisation manage its data and get the most value out of it. Identified by Gartner as one of the top 10 data trends of 2022, the data fabric treats data and related storage and processing solutions as a single layer, or fabric, in which everything is interrelated and which spans all environments. This makes it possible to subject data to continuous analytics and distribute the results wherever they are most relevant. Critically, it can build on existing data technologies and other related architectures.
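To make the "single layer" idea more concrete, here is a toy sketch of one access point sitting over datasets that live in different environments. Everything in it is an assumption made for illustration: the `DataFabric` class, its methods and the sample datasets are hypothetical, and a real data fabric involves far more (metadata, integration, governance) than this shows.

```python
class DataFabric:
    """Hypothetical single access layer over datasets living in different environments."""

    def __init__(self):
        self._sources = {}  # dataset name -> (environment, loader function)

    def register(self, name, environment, loader):
        """Record where a dataset lives and how to load it."""
        self._sources[name] = (environment, loader)

    def catalog(self):
        """List every dataset together with the environment it lives in."""
        return {name: env for name, (env, _) in self._sources.items()}

    def read(self, name):
        """Fetch a dataset by name, wherever it is stored."""
        _, loader = self._sources[name]
        return loader()


fabric = DataFabric()
fabric.register("customers", "on_premises", lambda: [{"id": 1, "name": "Ana"}])
fabric.register("clicks", "public_cloud", lambda: [{"customer_id": 1, "page": "/home"}])

# Consumers use one interface and do not need to know where each dataset lives.
customers = fabric.read("customers")
clicks = fabric.read("clicks")
```

The point of the sketch is only that existing stores stay where they are; the fabric adds a common layer on top rather than replacing them.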

The data mesh, on the other hand, is a type of architecture that departs from centralisation and instead decentralises data, distributing it to specific business domains. These domains (sales, marketing and customer service, among others) have different data needs and requirements, and data mesh gives them more control: they decide what data they get, establish data governance policies for it and then use it as they see fit.
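A small sketch can show what domain ownership might look like in practice. In the example below, each dataset belongs to one domain, and that domain sets the policy for who else may read it. The `DataProduct` class, its fields and the sample data are hypothetical, chosen only to illustrate the principle.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A dataset owned and published by a single business domain (hypothetical model)."""
    name: str
    owner_domain: str
    records: list
    # Governance policy set by the owning domain: which other domains may read the data.
    allowed_domains: set = field(default_factory=set)

    def read(self, requesting_domain):
        """Return the records if the owning domain's policy allows the requester."""
        if requesting_domain == self.owner_domain or requesting_domain in self.allowed_domains:
            return list(self.records)
        raise PermissionError(
            f"{requesting_domain} may not read {self.owner_domain}/{self.name}"
        )


# The sales domain publishes and governs its own product; there is no central store.
sales_orders = DataProduct(
    name="orders",
    owner_domain="sales",
    records=[{"order_id": 1001, "amount": 19.9}],
    allowed_domains={"marketing"},
)

marketing_view = sales_orders.read("marketing")  # allowed by the sales policy

try:
    sales_orders.read("customer_service")  # not listed in the sales policy
    denied = False
except PermissionError:
    denied = True
```

The design choice to note is that the access rule lives with the data product itself, so changing who may read sales data is a decision for the sales domain, not for a central data team.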

One Cloud, Different Data Architectures

There is no one-size-fits-all solution to the different data needs of organisations. For example, the data warehouse might suffice for one business, but not for another that needs something more adaptive, like the data fabric.

Regardless, what matters more is the organisation’s choice of data management provider, and this is where Cloudera comes in. A leader in delivering a hybrid data platform with secure data management and portable cloud-native data analytics, Cloudera enables organisations to extract valuable insights from their data in real time to drive value and achieve competitive differentiation.

Not coincidentally, Cloudera offers the above-discussed data architectures, having recently launched the world’s first all-in-one data lakehouse cloud service to add to its already impressive suite of solutions. These include Cloudera’s Data Warehouse, its secure and governed data lake service, its Unified Data Fabric and its scalable data mesh. Each guarantees unmatched freedom of choice—any cloud, any analytics, any data—without compromise and can help transform organisations, and even entire industries, with data.

 
