IBM - Incentive Management Solution – New
Company: IBM, Armonk, NY
Company Description: IBM is a values-based enterprise of individuals who create & apply technology to make the world work better. Today, more than 400,000 IBM employees around the world invent and integrate hardware, software and services to help forward-thinking enterprises, institutions and people succeed on a smarter planet.
Nomination Category: New Product & Service Awards Categories
Nomination Sub Category: Incentive Management Solution – New
Nomination Title: IBM GSI Data Lake for making timely and data-driven business decisions.
Which will you submit for your nomination in this category, a video of up to five (5) minutes in length, explaining the features, functions, benefits, and results to date of the nominated new or new-version product or service, OR a written essay of up to 650 words describing the same? (Choose one): An essay/case study of up to 650 words
The Challenge
The amount of data associated with IBM’s $79B worth of business is astronomical. There is data corresponding to each client, product, and sale, and to every employee involved with each sale. Historically, all of the data elements required to calculate and pay commissions to IBM’s sales force of nearly 30,000 sellers were gathered from numerous sources and analyzed by IBM’s Global Sales Incentives (GSI) organization. In the past, the reporting and analytical work was done manually using either complicated spreadsheets or data pulled from complex production databases containing hundreds of schemas and tables. It took 15+ people, working manually, days to put the data together and analyze it. The analysts required in-depth knowledge of several production databases to produce accurate outcomes.
The Solution
Using data from numerous disparate IBM systems, the GSI team created a Data Lake metadata warehouse that automates and streamlines multiple IBM processes. The Lake was implemented in 3Q 2018 and now serves as a one-stop shop for all of the reporting and analytical work related to sellers’ coverage, performance, targets, etc. Furthermore, it enables fast turnaround times and simple data structures. Its data also feeds a cognitive chatbot, and the Lake is protected by a robust GDPR-compliant security scheme. The automation provides productivity gains on multiple levels and on-time delivery of various analyses and reports, which ultimately enables IBM’s management team to make timely and data-driven business decisions.
The Data Lake also contains abundant historical data, which was not readily available in the past and which contributes to the quality of the analysis. Before the introduction of the Lake, operational employees applied different rules and interpretations to data from various systems. The results were operational issues, inefficiency, and miscommunication. Today, the operational processes are streamlined and based on data from the Lake and its metadata warehouse.
Obstacles Overcome
The implementation of the Data Lake by the GSI team was not without its challenges. The hurdles that the team overcame were:
1) Reliability of the reporting and analytical databases: on average there were 140 hours of downtime per month, which affected the service provided.
2) Performance: the size and complexity of the production systems had to be translated into streamlined, analytics- and AI-ready views in the Data Lake.
3) Multilayer security: specific access levels had to be provided to ~5,000 people to meet SPI and GDPR requirements.
4) GDPR readiness: the solution needed the ability to mask and change data in close to real time.
5) The Lake was developed without direct CIO involvement. We had to learn and grow on our own in order to complete the tasks.
6) Growing data demand and refresh frequency had to be addressed: from 200+ million rows processed monthly before, to 200+ million rows processed twice a week currently.
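As an illustration of the masking requirement in item 4, the sketch below shows one common way to mask personal fields: replacing each value with a keyed hash so that records stay joinable without exposing the original value. This is a minimal sketch and not the GSI team's implementation. The field names, the sample row, and the key are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; a real system would load this from a secure store.
MASKING_KEY = b"example-key-not-for-production"

# Hypothetical set of columns treated as personal data.
PERSONAL_FIELDS = {"seller_name", "email"}


def mask_value(value: str, key: bytes = MASKING_KEY) -> str:
    """Return a stable token for a personal value using a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_row(row: dict) -> dict:
    """Return a copy of the row with personal fields replaced by tokens."""
    return {
        name: mask_value(str(value)) if name in PERSONAL_FIELDS else value
        for name, value in row.items()
    }


# Hypothetical sample record.
row = {"seller_name": "Jane Doe", "email": "jane@example.com", "quota": 1200000}
masked = mask_row(row)
```

Because the same input always yields the same token, masked records can still be grouped and joined for analysis, while non-personal fields such as the hypothetical `quota` pass through unchanged.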
Innovation
The technology used is a suite of data ETL (Extract, Transform, Load) tools and Cloud tools such as IBM PureData for Analytics, IBM DashDB, IBM Cloud applications, BigSQL, and various Python applications.
Results
After the Lake went live, the productivity gains were 2,200 man-hours per month. Addressing a data, report, or analysis request took an average of 48 hours in the past; it now takes an average of 2 hours.
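The figures quoted in this essay can be put in relative terms with a few lines of arithmetic. The sketch below uses only numbers stated in the text. The one assumption is a 52-week year for converting "twice a week" into refreshes per year.

```python
# Figures quoted in the essay.
hours_before = 48        # average turnaround per request before the Lake
hours_after = 2          # average turnaround per request after the Lake
saved_per_month = 2200   # man-hours saved per month

speedup = hours_before / hours_after
reduction_pct = (hours_before - hours_after) / hours_before * 100
saved_per_year = saved_per_month * 12

# Refresh frequency: monthly before, twice a week now.
# Assumption: 52 weeks per year.
refreshes_before = 12
refreshes_after = 2 * 52
refresh_ratio = refreshes_after / refreshes_before

print(f"Turnaround speedup: {speedup:.0f}x ({reduction_pct:.1f}% less time)")
print(f"Man-hours saved per year: {saved_per_year:,}")
print(f"Refresh frequency increase: {refresh_ratio:.1f}x")
```

In other words, the average request is handled 24 times faster, and the same 200+ million rows are refreshed roughly nine times as often as before.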
Benefits
-Historical data is readily available
-Streamlined reporting
-Consistent analysis
-Flexibility to address the ever-changing business needs
-Easy to build dashboards and visualizations from scratch
-Expanded data analytical capabilities
-Streamlined data and labeling
-Used as the “go to” source for multiple web and mobile applications
The Data Lake houses over 1 billion rows of data and occupies over 1 TB of storage space. It provides data to 12 downstream systems, and its data is actively used daily by 50,000+ IBM employees. It is estimated to provide $1M+ of value to IBM annually.