Centralizing all metadata from 30+ applications across the enterprise in 12 Weeks

Client Overview

An American, Ohio-based provider of equipment and services for data centers, with a portfolio of power, cooling, and IT infrastructure solutions and services that extends from the cloud to the network’s edge.

A private equity firm acquired a major network power business and a major thermal management business to form a new venture that is focused on providing services to the critical infrastructure market.

The client brings together hardware, software, analytics, and ongoing services to enable continuous and optimal running of vital applications for data centers, communication networks, and commercial and industrial facilities. Its portfolio comprises power, cooling, and IT infrastructure solutions and services, extending from the cloud to the edge of the network.

Headquartered in Columbus, Ohio, the client has 20,000 employees and more than 25 manufacturing and assembly facilities around the world. The company has regional headquarters in Maidenhead, England; Bologna, Italy; Miami, Florida; Pasig, Manila, Philippines; Nanshan District, Shenzhen, China; and Mumbai, India.

Project Scope

Centralize metadata from all enterprise systems to help business users manage business metadata and find datasets for consumption. Enable the organization to document and understand its data through system-specific data dictionaries and enterprise-wide business glossaries.

Business Situation

The client was in the midst of an enterprise-wide digital transformation, migrating from legacy mainframe applications and disparate on-premises applications to Oracle EBS regionally and Oracle Cloud globally.

Throughout the migration, both legacy and new channel systems remained operational in a complex hybrid environment across three global regions: APAC, the Americas, and EMEA.

One of the biggest challenges for business users was locating their data after it had been migrated to the new channel systems, and finding trustworthy data from other parts of the business in the new environment.

The key problems the business faced were:

Business users had no visibility into existing data sets or their contents.
Business users could not judge the quality and relevance of each data set.
Business users spent a lot of time finding data, understanding data, and recreating data sets that already existed.

Technical Situation

The technical landscape at the client consists of 30+ heterogeneous applications integrated with a big data environment comprising three Cloudera Hadoop clusters: Development, Production, and Disaster Recovery. The clusters are configured with LDAP, Kerberos, and Sentry for authorization and access control. Reporting must be performed directly off the Production data lake only, via the Silver and/or Gold layers, using standard reporting tools.

Solutions

The ChainSys data profiling algorithm centralized all structured and unstructured metadata within the ChainSys data catalog and computationally generated additional metadata, such as statistics, patterns, and probables, on the data. This helped users better understand the data in their environment and let them search and access that knowledge through a simple and intuitive user experience.
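To make the idea of computationally generated metadata concrete, here is a minimal sketch of column profiling in Python. It is not the ChainSys algorithm; the function names (`value_pattern`, `profile_column`) and the sample values are hypothetical and serve only to show what statistics and patterns for one column might look like.

```python
from collections import Counter


def value_pattern(value: str) -> str:
    """Reduce a value to its shape: digits become 9, letters become A."""
    shape = []
    for ch in value:
        if ch.isdigit():
            shape.append("9")
        elif ch.isalpha():
            shape.append("A")
        else:
            shape.append(ch)  # keep punctuation and spaces as-is
    return "".join(shape)


def profile_column(values):
    """Return simple profile metadata for one column of raw string values."""
    non_null = [v for v in values if v not in (None, "")]
    patterns = Counter(value_pattern(v) for v in non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "top_patterns": patterns.most_common(3),
    }


# Hypothetical sample: a postal-code column with a few irregular entries.
sample = ["43215", "43085", "", "OH-43", None, "43201"]
profile = profile_column(sample)
```

In a catalog, a profile like this would be stored alongside the column's technical metadata, so that a user searching for a dataset can see null counts and dominant value shapes before opening the data.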

The data catalog also helped the business define and develop a data dictionary and business glossary, which was a first across the entire organization. This was extremely useful for the business, IT, and especially the Compliance team as it allowed them to scan and identify all PII data in the organization.
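As an illustration of what scanning for PII can involve, the sketch below flags columns whose values mostly match a known sensitive pattern. It rests on assumptions: the two regular expressions, the 80% threshold, and the names `PII_RULES` and `flag_pii_columns` are hypothetical and do not describe how the ChainSys catalog detects PII.

```python
import re

# Assumed, deliberately simple rules; real scanners use far richer checks.
PII_RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def flag_pii_columns(table, threshold=0.8):
    """Map each column name to the first PII rule most of its values match.

    table: dict of column name -> list of string values.
    """
    flagged = {}
    for column, values in table.items():
        present = [v for v in values if v]  # ignore empty values
        if not present:
            continue
        for label, rule in PII_RULES.items():
            hits = sum(1 for v in present if rule.match(v))
            if hits / len(present) >= threshold:
                flagged[column] = label
                break
    return flagged


# Hypothetical sample table.
sample_table = {
    "contact": ["ann@example.com", "bob@example.org", ""],
    "tax_id": ["123-45-6789", "987-65-4321"],
    "city": ["Columbus", "Miami"],
}
flagged = flag_pii_columns(sample_table)
```

Matching on a share of values rather than on every value tolerates a few dirty rows, at the cost of occasional false positives that a data steward would need to review.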

Benefits

The data management benefits of the data catalog became apparent to the business almost immediately. By allowing the business and IT to understand the breadth and depth of their data, it enabled clear data quality standards to be applied to different datasets. The benefit to the compliance and GRC teams was also apparent: the catalog helped them identify all sensitive data and PII data, which protected the company from massive fines and audits from governmental bodies.

The most significant value was seen in the impact on analytics and data science initiatives, where the traditional IT organization was not equipped to provide the required insight and knowledge into the data. The catalog allowed the ever-increasing number of non-technical business users to browse and find the data they were looking for, which enabled them to conduct their analytics and reporting much more effectively.