Author: Hu Yoshida, Chief Technology Officer, Hitachi Vantara
Object storage, or content addressable storage, was once an afterthought for archiving data. It has now become mission critical as the explosion of unstructured data drives more of our business decisions. While core database applications with structured data still drive much of the business today, integration with unstructured data from mobile devices, the internet, and other connected devices is driving a digital transformation through the cloud, big data, analytics, governance, and IoT.
All major public cloud storage providers, including Amazon Web Services (AWS), Microsoft, IBM, and Google, have adopted object storage as their primary platform for unstructured data, which makes it the primary storage for hybrid cloud applications.
Service providers see immediate benefits from object storage’s flexibility and scalability over file-based approaches. As more enterprises adopt public and hybrid cloud applications, object storage with RESTful cloud interfaces and APIs gives cloud applications easy access to unstructured data and simplifies its management.
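To make the RESTful access pattern concrete, here is a minimal sketch of how an application might build a request that stores an object together with descriptive metadata. It assumes an S3-style interface, in which user metadata travels in HTTP headers prefixed with `x-amz-meta-`. The endpoint, bucket, key, and the `build_put_request` helper are hypothetical, and the request is only constructed, never sent.

```python
from urllib.request import Request


def build_put_request(endpoint, bucket, key, body, metadata):
    """Build (but do not send) an S3-style PUT request for one object."""
    headers = {"Content-Type": "application/octet-stream"}
    # S3-style APIs carry user-defined metadata in x-amz-meta-* headers.
    for name, value in metadata.items():
        headers[f"x-amz-meta-{name}"] = value
    url = f"{endpoint}/{bucket}/{key}"
    return Request(url, data=body, headers=headers, method="PUT")


# Hypothetical endpoint and names, for illustration only.
req = build_put_request(
    "https://objects.example.com",
    "invoices",
    "2017/inv-001.pdf",
    b"%PDF-1.4 example bytes",
    {"department": "finance"},
)
print(req.get_method(), req.full_url)
```

Because the object is addressed by a URL and described by headers, any application that can speak HTTP can read or write it, regardless of where the storage runs.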
Unstructured data growth is far outpacing the growth of structured data, and more enterprises are struggling to store and manage multiple petabytes of unstructured data. File systems, with their hierarchical data structures, cannot scale to meet the growth of this data without creating multiple silos of isolated data. Backup, which multiplies the storage requirements, has also become untenable.
The only way to manage this big data growth is to implement a metadata-based, scale-out platform that is not dependent on infrastructure or location. The data will outlive the application that created it and the infrastructure where it initially resides. Object storage metadata will preserve the data’s content and RESTful interfaces will keep it accessible in a cloud environment. Backup can be eliminated by keeping two or more replicas of the data store.
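The idea of replacing backup with replicas can be illustrated with a toy model. The sketch below assumes a store that writes every object to several independent nodes and reads from the first surviving one. The `ReplicatedStore` class and its methods are hypothetical and leave out everything a real system needs, such as consistency checks and re-replication after a failure.

```python
class ReplicatedStore:
    """Toy store that writes every object to several independent nodes."""

    def __init__(self, node_count=3):
        # Each node is modeled as a dict; None marks a failed node.
        self.nodes = [dict() for _ in range(node_count)]

    def put(self, key, data):
        # Write the object to every node that is still alive.
        for node in self.nodes:
            if node is not None:
                node[key] = data

    def fail_node(self, index):
        # Simulate losing a node and everything stored on it.
        self.nodes[index] = None

    def get(self, key):
        # Serve the read from the first surviving replica.
        for node in self.nodes:
            if node is not None and key in node:
                return node[key]
        raise KeyError(key)


store = ReplicatedStore(node_count=3)
store.put("sensor/0001", b"reading")
store.fail_node(0)
recovered = store.get("sensor/0001")
```

The object remains readable after a node is lost because another replica still holds it. It becomes unavailable only if every replica fails.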
Analytics will be driving more critical business decisions, but analytics is only as good as the data that it analyzes. Analysts and data scientists are commonly estimated to spend as much as 80% of their time gathering, cleansing, and curating the data that goes into their analytic models. This is where the metadata in object storage is valuable. Metadata is attached to data when it is ingested and stays with the data until it is deleted and scrubbed.
The content of metadata is customizable and offers flexibility in the identification and management of the stored data. A key differentiator among object storage systems is how well the vendor’s metadata framework addresses the enterprise’s long-term needs. Another differentiator is the set of APIs available for access by analytic tools.
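As a rough illustration of metadata that is attached at ingest and later used to select data for analysis, consider the following sketch. The `MetadataStore` class, its `ingest` and `find` methods, and the field names `department` and `year` are all hypothetical. Real products expose their own metadata schemas and query APIs.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class MetadataStore:
    """Toy object store in which metadata travels with each object."""

    def __init__(self):
        self._objects = {}

    def ingest(self, key, data, **metadata):
        # Custom metadata is attached when the object is ingested.
        self._objects[key] = StoredObject(data, dict(metadata))

    def find(self, **criteria):
        # Return the keys whose metadata matches every given criterion.
        return sorted(
            key
            for key, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


store = MetadataStore()
store.ingest("scan-001", b"...", department="radiology", year=2017)
store.ingest("scan-002", b"...", department="cardiology", year=2017)
store.ingest("memo-001", b"...", department="radiology", year=2016)

matches = store.find(department="radiology", year=2017)
print(matches)
```

An analytic tool can select its input by querying the metadata instead of opening and inspecting every object, which is the kind of gathering and curating work the metadata is meant to reduce.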
The metadata in object storage also facilitates the governance of data, especially where content awareness is needed for regulatory compliance. For instance, European Union privacy regulations give individuals the right to be forgotten, which means that all records containing their private information must be found and deleted unless they are under legal hold. That would be difficult to do without metadata.
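A right-to-be-forgotten request can be sketched as a metadata query followed by a conditional delete. The example below assumes each object carries two hypothetical metadata fields, `subject_id` and `legal_hold`. The function name and data layout are illustrative and not taken from any particular product.

```python
def forget_subject(objects, subject_id):
    """Delete every object tagged with subject_id unless it is on legal hold.

    Returns a (deleted_keys, retained_keys) tuple.
    """
    deleted, retained = [], []
    for key in sorted(objects):
        meta = objects[key]["metadata"]
        if meta.get("subject_id") != subject_id:
            continue  # Belongs to someone else; leave it untouched.
        if meta.get("legal_hold", False):
            retained.append(key)  # Must be kept despite the request.
        else:
            deleted.append(key)
    for key in deleted:
        del objects[key]
    return deleted, retained


objects = {
    "crm/1001": {"data": b"...", "metadata": {"subject_id": "u-42"}},
    "mail/77": {"data": b"...", "metadata": {"subject_id": "u-42", "legal_hold": True}},
    "crm/1002": {"data": b"...", "metadata": {"subject_id": "u-99"}},
}

deleted, retained = forget_subject(objects, "u-42")
print(deleted, retained)
```

Without the subject identifier in the metadata, the same request would require scanning the content of every stored object.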
Object storage can also provide WORM (Write Once Read Many) technology to prevent data from being modified. Hitachi’s object storage solution also provides a hash to prove immutability.
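The combination of WORM behavior and a hash can be sketched in a few lines. This is a toy model under stated assumptions, not a description of how Hitachi's product works: the `WormStore` class is hypothetical, and SHA-256 is chosen here only for illustration, since the hash algorithm is not specified above.

```python
import hashlib


class WormStore:
    """Toy write-once store that records a digest for each object."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # Write once: refuse to overwrite an existing object.
        if key in self._objects:
            raise PermissionError(f"{key!r} is write-once and already exists")
        digest = hashlib.sha256(data).hexdigest()
        self._objects[key] = (bytes(data), digest)
        return digest

    def verify(self, key):
        # Recompute the digest and compare it with the one recorded at write time.
        data, digest = self._objects[key]
        return hashlib.sha256(data).hexdigest() == digest


store = WormStore()
digest = store.put("contracts/2017-001", b"contract v1")

try:
    store.put("contracts/2017-001", b"contract v2")
    overwritten = True
except PermissionError:
    overwritten = False

print(overwritten, store.verify("contracts/2017-001"))
```

The store rejects the second write, and recomputing the digest later shows that the stored bytes still match what was originally written.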
IoT is driving even more unstructured data to improve business operations. Machine-driven data has very little metadata. In order to integrate operational data into the business process, we need to address the growing issues around data management, data governance, data sovereignty, identity protection, and security breaches. Object storage metadata can help address these issues.