Author: Hu Yoshida, Chief Technology Officer, Hitachi Vantara
This is part 2 of my Top IT trends for 2018. In my first post, I covered Preparing IT for IoT. This post will look at some new requirements for managing storage that will make IT more effective.
4. Data Governance 2.0
2018 will bring new data governance challenges that will require a new data governance framework. Previous data governance was based on the processing of data and metadata. The new data governance must also consider data context and be flexible enough to adapt quickly as regulators issue new rules for new processes and data types such as cryptocurrencies.
Surely one of 2018’s biggest challenges will arrive on May 25, 2018, when the EU’s General Data Protection Regulation (GDPR) goes into effect. It affects organizations anywhere in the world that process the personal data of EU residents. GDPR gives EU residents more control of their personal data. Individual controls include the ability to prohibit data processing beyond its specified purpose for collection, the right to access, the right to rectification, the right to be forgotten, the right to data portability, the ability to withdraw consent to the collection and use of personal data, and many more.
If an EU citizen invokes their right to be forgotten, a company must be able to find the individual’s data throughout its technology and application stacks (many of which are logically, if not physically, separated), evaluate the intent of each data element (as some regulations, such as financial reporting responsibilities, will likely supersede GDPR), eradicate the data, and provide proof to the EU citizen that the data has been eradicated, along with an audit log to demonstrate compliance to regulators. Responding to individual actions and enforcing individual rights can drive up costs and increase risks in collecting and storing personal data. Those costs are not limited to the working hours required to complete the requests; there are also penalties to consider. GDPR violations can draw fines of up to €20m ($21.75m) or up to 4% of total annual worldwide turnover for the preceding financial year, whichever is higher.
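To make the right-to-be-forgotten steps above more concrete, here is a minimal Python sketch of the find, evaluate, eradicate, and audit sequence. Everything in it is an illustrative assumption: the `Record` and `Silo` classes, the `legal_hold` flag, and the `erase_subject` function are hypothetical names, and real silos would be databases, file shares, and SaaS applications rather than in-memory lists.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Record:
    subject_id: str           # hypothetical identifier of the individual
    element: str              # e.g. "email" or "invoice"
    legal_hold: bool = False  # True if another regulation requires retention


@dataclass
class Silo:
    name: str
    records: list = field(default_factory=list)


def erase_subject(silos, subject_id):
    """Erase one individual's records across all silos and return an audit log."""
    audit_log = []
    for silo in silos:
        kept = []
        for rec in silo.records:
            if rec.subject_id != subject_id:
                kept.append(rec)  # someone else's data, leave untouched
                continue
            if rec.legal_hold:
                # Another obligation (e.g. financial reporting) takes precedence.
                action = "retained (legal obligation)"
                kept.append(rec)
            else:
                action = "erased"
            audit_log.append({
                "silo": silo.name,
                "element": rec.element,
                "action": action,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            })
        silo.records = kept
    return audit_log


# Example: one individual's data lives in two separate silos.
crm = Silo("crm", [Record("u1", "email"), Record("u2", "email")])
billing = Silo("billing", [Record("u1", "invoice", legal_hold=True)])
for entry in erase_subject([crm, billing], "u1"):
    print(entry["silo"], entry["element"], entry["action"])
```

The audit log records what was retained as well as what was erased, since a regulator will likely want to see why a given element was kept.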
GDPR also requires mandatory breach notification: the supervisory authority must be notified within 72 hours of the organization becoming aware of a breach, and affected individuals must be informed without undue delay when the breach poses a high risk to them. What is interesting here is the scope of the term “breach”. In IT, this word often conjures up images of clandestine or rogue groups executing various forms of network intrusion attacks to unlawfully gain access to organizational data. However, in the eyes of GDPR, a data breach is defined as a breach of security leading to the accidental or unlawful destruction, loss, alteration, unauthorized disclosure of, or access to personal data, transmitted, stored, or otherwise processed. Consider how broadly defined that is: those “hackers” certainly fit the definition, but so does your database administrator who accidentally executes a “DROP TABLE” command against your CRM system.
With a notification window as short as 72 hours, the problem is quickly compounded when an organization lacks comprehensive data processing models and appropriate checks and balances on data use. In high-profile cases like the Yahoo breach, discovery and notification took months. Meeting the deadline is impossible for most organizations simply because of a lack of data awareness: data is scattered across application and technology silos throughout the organization, and more of it is created today on the edge, on mobile devices, and in the cloud.
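To show how little room the 72-hour window leaves, here is a small Python sketch that computes the notification deadline and the hours remaining. It assumes the clock starts when the organization becomes aware of the breach; the function names and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at):
    """Return the latest time a notification can be sent."""
    return aware_at + NOTIFICATION_WINDOW


def hours_remaining(aware_at, now):
    """Return the hours left before the deadline (negative if missed)."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600


# Example: the breach is noticed Monday morning, and triage takes 36 hours.
aware_at = datetime(2018, 5, 28, 9, 0, tzinfo=timezone.utc)
now = datetime(2018, 5, 29, 21, 0, tzinfo=timezone.utc)
print(notification_deadline(aware_at).isoformat())  # 2018-05-31T09:00:00+00:00
print(hours_remaining(aware_at, now))                # 36.0
```

If finding out which data was affected takes days because it is spread across silos, half of the window is gone before the notification can even be drafted.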
5. Object Storage Gets Smart
By now most IT shops have started on their digital transformation journey, and the first problem that most run into is getting access to usable data. Application and technology decisions often lock data into isolated islands where it is costly to extract it and put it to other uses. Many of these islands contain data that is duplicated, obsolete, or dark, meaning that it is valid but no longer used because of changes in business process or ownership. Data scientists tell us that 80% of the effort involved in gaining analytical insight from data is the tedious work of acquiring and preparing the data.
While Object Storage can hold massive amounts of unstructured data and provide customized and extensible metadata management and search capabilities, what’s been missing is the ability for it to be contextually aware. Object Storage now has the ability to be “smart” with software that can search for and read content in multiple structured and unstructured data silos and analyze it for cleansing, formatting, and indexing. Hitachi Content Intelligence software can extract data from the silos and pump it into workflows to process it in various ways. It can create a standard and consistent enterprise search process across the entire IT environment by connecting to and aggregating multi-structured data across heterogeneous data silos and different locations. Additionally, it provides automated extraction, classification, enrichment and categorization of an organization's data.
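The general pattern described above (connect to several silos, extract content, classify it, and index it for a single search) can be sketched in a few lines of Python. This is not the Hitachi Content Intelligence API; the functions `extract`, `classify`, `build_index`, and `search`, the silo names, and the email-based classification rule are all hypothetical stand-ins chosen for illustration.

```python
import re
from collections import defaultdict


def extract(silo_name, documents):
    """Normalize raw documents from one silo into a common record shape."""
    for doc_id, text in documents.items():
        yield {"silo": silo_name, "id": doc_id, "text": text.strip()}


def classify(record):
    """Tag a record as 'personal' if its text contains an email address."""
    has_email = re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", record["text"]) is not None
    record["category"] = "personal" if has_email else "general"
    return record


def build_index(records):
    """Build one inverted index (token -> locations) across every silo."""
    index = defaultdict(set)
    for rec in records:
        for token in re.findall(r"[a-z0-9]+", rec["text"].lower()):
            index[token].add((rec["silo"], rec["id"]))
    return index


def search(index, term):
    """Return the sorted (silo, id) pairs whose text contains the term."""
    return sorted(index.get(term.lower(), set()))


# Example: two hypothetical silos holding differently shaped content.
silos = {
    "file_share": {"f1": "Quarterly report for ACME"},
    "crm": {"c7": "Contact: jane@example.com at ACME"},
}
records = [classify(r) for name, docs in silos.items() for r in extract(name, docs)]
index = build_index(records)
print(search(index, "acme"))  # [('crm', 'c7'), ('file_share', 'f1')]
```

A single query now reaches both silos, and the `category` tag added during classification is the kind of context that a governance process could later act on.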
In a recent evaluation of a complex stroke CT case study, a custom MATLAB DICOM parsing script was written to perform the filtering and extraction of DICOM tag data, a process that took 50 hours. Using a Hitachi Content Intelligence DICOM processing stage and the same medical image data, the query time was reduced to 5 minutes. This amounts to a 99.8% reduction in the time needed to analyze the CT cases, or roughly a 600-fold speedup.
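The case study figures can be checked with a little arithmetic. Going from 50 hours to 5 minutes is best described as a reduction in elapsed time, and the same change can also be expressed as a speedup factor. The variable names below are only for illustration.

```python
baseline_minutes = 50 * 60   # 50 hours with the custom parsing script
optimized_minutes = 5        # 5 minutes with the DICOM processing stage

# Fraction of the original time that was saved.
time_reduction = (baseline_minutes - optimized_minutes) / baseline_minutes

# How many times faster the new run is.
speedup = baseline_minutes / optimized_minutes

print(f"{time_reduction:.1%}")  # 99.8%
print(f"{speedup:.0f}x")        # 600x
```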
My next post on IT Trends will talk about new data types that IT will start addressing in 2018.