Waterline Data Unveils Latest Smart Data Catalog for the 'Petabyte Enterprise'

November 13, 2017


No Other Data Catalog Comes Close to Matching Waterline's Scalability, Native Support, and Rapid Addition of New Partner Integrations and Data Sources

Mountain View, Calif. – November 13, 2017 – Waterline Data today announced the release of Smart Data Catalog (SDC) 4.03, the latest version of its solution designed to help enterprises with multiple petabytes of data manage and gain insight into that data instead of drowning in it. SDC 4.03 is the most powerful and proven enterprise-class data catalog on the market, offering:

  • Multi-petabyte scalability
  • Native support for critical cloud, RDBMS, and Big Data platforms
  • Rapid expansion of support for new partner integrations and data sources

"It has become clear that the data catalog is a fundamental enabler – not just for data management within a data lake, but also for a variety of related business use cases – and Waterline Data has been at the forefront of helping enterprises unlock the value of data with its Smart Data Catalog," said Matt Aslett, Research Director, Data Platforms and Analytics, 451 Research. "The company continues to evolve its product with the addition of machine learning-based automation, improved scalability, and native support for both cloud-based and on-premises big data environments."

Tackle "Billions of Rows" with the Most Scalable Data Catalog in the Industry

The power of SDC is in its unique ability to replace the manual identification, contextualization, and rationalization of data with an automated process that uses machine learning and crowd-sourcing, so customers can:

  • Rapidly classify and organize all of an organization's data assets and lineage
  • Convert unused, stagnant, and even previously unknown data into discoverable assets
  • Easily govern data and make it readily available for self-service analytics

With its ability to automatically profile and tag billions of rows of data, SDC 4.03 is now by far the most scalable data platform in the industry, helping global organizations rapidly convert their data swamps back into data lakes. Waterline continues to focus heavily on SDC's rapid growth in performance and scale so that data analysts can spend more time using data for competitive advantage and less time looking for it, and it has helped customers cut data processing time by up to 10X.

"Waterline is the only data catalog solution we evaluated that can support the vast volumes of data in our client environments," said Jennifer Benito, Principal Consultant at Trace 3. "After we deployed their solution in a matter of days, it profiled and tagged over 30 billion rows of data in about half a day, making it possible for business users to quickly search for and find the data they need to do their jobs. Our clients need that kind of scalability to support the volume and variety of data in their growing data estates."

Universal Native Support

Waterline is the only data catalog to natively support all critical cloud and on-premises Big Data environments. With the rapid growth of the cloud, Waterline now supports Azure, Google Cloud Platform (in preview), and multiple simultaneous S3 environments, building on its prior support for AWS EMR and standalone AWS S3 environments. Waterline has also added support for MapR 5.2 in addition to its prior support for Cloudera and Hortonworks.

"The environments that organizations are trying to catalog continue to grow in size and diversity," said Eric Kavanagh, CEO at Bloor Group. "The fact that Waterline runs natively on multiple Big Data environments is a big deal as it delivers a design center that will scale with the increasing volume and variety of data."

Rapid Addition of New Partner Integrations and Data Sources

Waterline is also continuing to rapidly extend its partner integrations and data source support, with expanded integration for market-leading partner solutions, including Paxata, Collibra, and Trifacta, as data catalog-enabled applications. Extended support for additional Big Data, cloud, and relational data stores includes Azure Blob Storage, multiple Amazon S3 buckets, and Amazon SSE for secure data environments, as well as Amazon Redshift, MS ADLS, and MSSQL.

"Our joint customers are already seeing significant value with the combination of Paxata and Waterline, including one of the world's largest restaurant chains," said Nenshad Bardoliwalla, co-founder and chief product officer, Paxata. "The deeper integration with scalable cloud native services like EMR, S3, and Azure Blob Storage will make it ever easier for our joint customers to find, profile and prepare data more quickly in the rapidly burgeoning hybrid, multi-cloud technology landscape."

"It's do or die time for the petabyte organization," said Andrew Ahn, VP of Product Management, Waterline Data. "It's not how much data you house but what you're able to do with it that will determine the fate of your organization. With Waterline, organizations can quickly complete that otherwise long and painful first step of cataloging their data for analysis and use."

About Waterline Data:

Waterline Data customers spend less time looking for data and more time using it while complying with ever-changing data governance mandates such as GDPR. Waterline Data automates data discovery, compliance, and the ability to take action on data by using a combination of machine learning, ratings and reviews, and tribal knowledge to deliver the most comprehensive data catalog solutions on the market. The company is funded by Menlo Ventures, Jackson Square Ventures, Partech Ventures, and Infosys, and its solutions are implemented in large enterprises around the globe. Founded in 2013, the company is headquartered in Mountain View, California.