Waterline Data has announced its latest platform offering, Smart Data Catalog 4.0. The catalog expands beyond Hadoop with new automation and crowdsourcing features that quickly connect the right people with the right data.
Matt Aslett, Research Director, Data Platforms and Analytics, 451 Research, said, "It has become clear that the data catalog is a fundamental enabler not just of the management of the data within a data lake, but also for a variety of related business use cases. By creating an inventory of data and data lineage, tagging sensitive data to control access, and even identifying data redundancy, the data catalog can be used to identify data for analysis, enable data governance and rationalise excess data sets, unlocking the potential value of big data projects."
Connecting the Right Data to the Right People
Version 4.0 of the Smart Data Catalog replaces manual tagging of metadata with an automated process that rapidly classifies and organises all of an organisation's data assets and lineage, making data readily available for:
· Self-Service Analytics.
· Data Governance and access control for regulatory compliance.
· Data Rationalisation for greater storage and cost efficiency.
Smart Data Catalog 4.0 answers the fundamental questions that most organisations have about their data. Where do I find it? Where did it come from? What's in the data? Who can use the data?
Smart Data Catalog 4.0 Key Features
The enhancements in SDC 4.0 were all designed to make trusted data usable across the enterprise more quickly. New capabilities include:
· Support for directly fingerprinting and cataloging data located in Teradata, Oracle, MySQL, and other relational databases, which expands Waterline beyond the Hadoop-only data sources supported in prior versions.
· Support for data lakes operating in Amazon Web Services (AWS).
· Tag-based access control that identifies sensitive data fields and enables data tagged as "sensitive" to have its access automatically monitored by Apache Ranger and Cloudera Sentry, along with other access control tools via REST API integration.
· Improved user experience for business experts, with a new user interface "skin"; faster, more scalable search based on the industry-standard Apache Solr search platform; and enhanced crowdsourced ratings, annotation, reviews, and collaboration features.
· The industry's most extensible, open architecture that supports Hadoop, Spark, and Cloud deployment environments; an RDBMS plug-in architecture for relational sources, as well as extensive REST API partner integration and extensibility.
With its distinctive integration of automated data inventorying and crowdsourcing, Smart Data Catalog 4.0 enables data professionals to "fingerprint" data at scale by analysing actual data values. The software automatically matches data fingerprints to glossary terms, gathers further term matches through crowdsourcing, and then lets data stewards curate the results by accepting or rejecting tags. Meanwhile, business professionals can search and use data through a user-friendly interface or through various third-party applications.
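To make the fingerprint, tag, and curate workflow concrete, the following is a minimal illustrative sketch in Python. It is not Waterline's implementation or API: the glossary patterns, the 0.8 threshold, and every name in it (`GLOSSARY_PATTERNS`, `fingerprint`, `suggest_tags`, `review`) are assumptions made for illustration only. It profiles a column's actual values, proposes a glossary term when enough values match, and leaves the final decision to a data steward.

```python
import re
from dataclasses import dataclass

# Hypothetical glossary: each term maps to a pattern its values tend to match.
GLOSSARY_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_zip_code": re.compile(r"^\d{5}(-\d{4})?$"),
    "iso_date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}


@dataclass
class TagSuggestion:
    column: str
    term: str
    confidence: float           # fraction of non-empty values that matched
    status: str = "suggested"   # set to "accepted" or "rejected" on review


def fingerprint(values):
    """Return, per glossary term, the fraction of non-empty values that match."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return {}
    return {
        term: sum(1 for v in non_empty if pattern.match(v)) / len(non_empty)
        for term, pattern in GLOSSARY_PATTERNS.items()
    }


def suggest_tags(column, values, threshold=0.8):
    """Propose glossary terms whose match rate meets the (assumed) threshold."""
    return [
        TagSuggestion(column, term, score)
        for term, score in fingerprint(values).items()
        if score >= threshold
    ]


def review(suggestion, accept):
    """Record a data steward's decision on a suggested tag."""
    suggestion.status = "accepted" if accept else "rejected"
    return suggestion


if __name__ == "__main__":
    contact = ["ann@example.com", "bob@example.org", "", "carol@example.net"]
    for suggestion in suggest_tags("contact", contact):
        print(review(suggestion, accept=True))
```

A production catalog would presumably rely on much richer fingerprints than regular expressions, but the division of labour is the same: automation proposes tags, and people confirm or reject them.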
Alex Gorelik, CEO at Waterline Data, said, "Our mission at Waterline Data is to connect the right people to the right data while information is still fresh. Most organisations have more than 50% of their data stagnating in quarantine zones or lost in data swamps, because nobody has the time or expertise to identify and organise the assets and decide who should have access to them. Waterline Smart Data Catalog 4.0 delivers a unique combination of automation and crowdsourcing that allows our customers to quickly get their data out of quarantine and into use with the confidence that the data is properly tagged so it can be governed and put into use in days instead of weeks or months."
Smart Data Catalog 4.0 is available immediately for members of Waterline Data's Early Access program. More information can be retrieved here.