Creating a Centralised Data Hub
By Hubert Yoshida, Chief Technology Officer, Hitachi Data Systems
Data volumes are exploding, and data is arriving from more sources as we integrate IT and OT (operational technology). That data is also becoming more valuable as we find ways to correlate data from different sources to gain more insight, or repurpose old data for new revenue opportunities.
Data can also be a liability if it is flawed, exposed, lost, or accessed by the wrong people, especially if we are holding that data in trust for our customers or partners.
For example, GDPR (General Data Protection Regulation) is due to be implemented by May 25, 2018 and will have a major impact on organisations that do business with EU countries. The transparency and privacy requirements of GDPR cannot be managed when data is spread across silos of technology and workflows.
Data is our crown jewels, but how can we be good stewards of our data if we don’t know where it is: on someone’s mobile device, an application silo, an orphan copy, or somewhere in the cloud?
How can we provide governance for that data without a way to prove immutability, show auditors who accessed it and when, and demonstrate that the data was destroyed?
For these reasons, we see more organisations creating a centralised data hub for better management, protection and governance of their data.
This centralised data hub will need to be an object store that can scale beyond the limitations of file systems, ingest data from different sources and cleanse that data. It will also need to provide secure multi-tenancy and extensible metadata that support search and governance across public and private clouds and mobile devices.
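To make the combination of multi-tenancy and extensible metadata concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not how any particular object store is implemented; the class and method names (`MetadataStore`, `put`, `search`) and the sample metadata keys are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str
    data: bytes
    tenant: str
    # Extensible custom metadata: any key/value pairs the owner chooses.
    metadata: dict = field(default_factory=dict)


class MetadataStore:
    """Toy object store (hypothetical): objects are isolated per tenant."""

    def __init__(self):
        self._objects = {}  # (tenant, key) -> StoredObject

    def put(self, tenant, key, data, metadata=None):
        self._objects[(tenant, key)] = StoredObject(
            key, data, tenant, dict(metadata or {})
        )

    def search(self, tenant, **criteria):
        """Return the keys in one tenant whose metadata matches every criterion."""
        return sorted(
            obj.key
            for (owner, _), obj in self._objects.items()
            if owner == tenant
            and all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


store = MetadataStore()
store.put("hr", "contract.pdf", b"...", {"source": "mobile", "retention": "7y"})
store.put("hr", "payroll.csv", b"...", {"source": "cloud"})
store.put("sales", "contract.pdf", b"...", {"source": "mobile"})
hr_mobile = store.search("hr", source="mobile")  # only the HR tenant is searched
```

Because every lookup is scoped by tenant, the same key can exist in two tenants without either seeing the other's object, and new metadata fields can be added without changing the store.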
Scalability, security, data protection and long-term retention will be major considerations. Backups will be impractical and will need to be replaced by replication and versioning of updates.
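The idea of protecting data through replication and versioning rather than backups can be sketched in a few lines of Python. This is a toy model under simplifying assumptions (in-memory, synchronous replication, no failure handling), and the `VersionedStore` class is hypothetical rather than any product's API.

```python
class VersionedStore:
    """Toy store (hypothetical): every update is kept as a new version
    and copied to each replica, so no separate backup pass is needed."""

    def __init__(self, replicas=()):
        self._versions = {}  # key -> list of payloads, oldest first
        self._replicas = list(replicas)

    def put(self, key, data):
        """Append a new version and replicate it. Returns the 1-based version."""
        self._versions.setdefault(key, []).append(data)
        for replica in self._replicas:
            replica.put(key, data)
        return len(self._versions[key])

    def get(self, key, version=None):
        """Return the latest version, or a specific earlier one."""
        history = self._versions[key]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise KeyError(f"{key!r} has no version {version}")
        return history[version - 1]


replica = VersionedStore()
primary = VersionedStore(replicas=[replica])
primary.put("report", b"draft")
primary.put("report", b"final")
```

An unwanted update is undone by reading an earlier version, and the loss of the primary is covered by the replica, which holds the same version history.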
An additional layer of content intelligence can connect and aggregate data, transforming and enriching it as it is processed, and centralise the results for authorised users to access.
For example, Hitachi Content Platform (HCP) with Hitachi Content Intelligence (HCI) can provide a centralised object data hub with a seamlessly integrated cloud-file gateway, enterprise file synchronisation and sharing, and big data exploration and analytics.
Creating a centralised data hub starts with the ingestion of data, which includes the elimination of digital debris and the cleansing of flawed data.
Studies have shown that 69% of the information retained by companies is, in effect, “data debris”: information with no current business or legal value. Other studies have shown that 76% of flaws in organisational data are due to poor data entry by employees.
It is much better to move data quality upstream and embed it into the business process, rather than trying to catch flawed data downstream and then attempting to resolve the flaw in all the different applications that are used by other people.
Content intelligence software can help cleanse and correct data in one of two ways. It can apply the corrections to the aggregate index only, leaving the source data in its original state, or it can apply the cleansing permanently when the intent is to centralise the data on a content platform. In the latter case the data is stored as an encrypted single instance with safe multi-tenancy and with system and custom metadata, and is replicated for availability. The data is now centralised for ease of management and governance. RESTful interfaces enable connection to private and public clouds.
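The first option, cleansing into an aggregate index while leaving the source untouched, can be illustrated with a short Python sketch. The cleansing rules (trimming whitespace, lower-casing an `email` field), the function names and the sample record are all assumptions made for illustration; real content intelligence software applies far richer rules.

```python
def cleanse(record):
    """Return a cleaned copy of a record; the original is not modified."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if isinstance(cleaned.get("email"), str):
        cleaned["email"] = cleaned["email"].lower()
    return cleaned


def build_index(sources):
    """Aggregate cleaned copies of every record into a single index.

    sources maps a source name to a list of records. Each index entry
    records where it came from so it can be traced back to the original.
    """
    index = []
    for name, records in sources.items():
        for position, record in enumerate(records):
            entry = cleanse(record)
            entry["_source"] = (name, position)
            index.append(entry)
    return index


sources = {"crm": [{"name": " Ada ", "email": "ADA@Example.com "}]}
index = build_index(sources)
```

Searches run against the cleaned `index`, while `sources` still holds the records exactly as they were entered. Applying the cleansing permanently would instead mean writing the cleaned copy back as the stored object.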