Cisco Systems IT Team
Company: Cisco Systems, San Jose, CA
Company Description: Cisco Systems, Inc. (Cisco) designs, manufactures, and sells Internet protocol (IP)-based networking products and services related to the communications and information technology (IT) industry. The Company also provides services associated with these products and their use, as well as products and services for transporting data, voice, and video traffic across intranets and extranets.
Nomination Category: Information Technology Categories
Nomination Sub Category: Information Technology Team of the Year
Nomination Title: Cisco IT Database Services Innovation Team
Tell the story about what this nominated team has achieved since January 1, 2014 (up to 650 words). Focus on specific accomplishments, and relate these accomplishments to past performance or industry norms.
Cisco IT developed a five-pillar approach (auto-detection, self-diagnosis, self-healing, auto-notification and self-service) to achieve operational excellence in database services. This is part of Cisco IT’s strategy to enable Fast IT. With end-to-end automation, these interconnected pillars help improve service availability and enable clients to be self-sufficient.
Auto-Detection: As part of our pervasive monitoring strategy, our goal is to detect issues in our services before they disrupt the business. We have deployed a central monitoring solution based on pre-defined, configurable policies. This solution replaced discrete, script-based monitoring that ran across 2,800 servers using 75,000 cron jobs. We now monitor around 326 production and 1,272 non-production Oracle databases, 158 Oracle ERP environments, and 54 Oracle Fusion Middleware environments using this comprehensive monitoring solution. Compared with an industry best of 25 monitored events (health checks), we monitor 135 events at Cisco in an effort to prevent any business impact.
Self-Diagnosis: After auto-detection identifies an issue, the self-diagnostic engine kicks in and gathers data for future root cause analysis. The engine also analyzes the data and, depending on the policy, directs the event to one of the next phases. The policy is not limited to environmental thresholds and exceptions; it also covers follow-up actions such as self-healing and auto-notification.
Self-Healing, Self-Tuning and Predictive Analysis: A commonly asked question from Cisco database administrators is “Do we really need humans to fix issues?” Today, we self-heal close to 40% of the events raised from our services. One example of self-healing is storage addition for databases, where additional storage capacity is added automatically when a threshold is exceeded. Cisco has also developed other self-healing solutions, such as auto-restart and terminator, to further improve productivity. Predictive analysis is just one step away at Cisco. If an event can’t be self-healed, it is sent to the next pillars for action.
Auto-Notification: As we continue our journey to self-heal everything, there are still events that require manual intervention. We automatically notify our business and support teams about these alerts. Auto-notification alerts are managed in two ways today: they are either published to a real-time database dashboard for action by database administrators, or they are sent to the respective IT support teams. The real-time database dashboard is a central dashboard developed at Cisco to monitor all database events and resolve most issues with the click of a button.
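As an illustration only, the hand-off from detection through diagnosis to healing or notification can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Cisco's implementation: the event names, thresholds, policy fields, and the `route` function are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Action(Enum):
    SELF_HEAL = "self_heal"            # fixed automatically
    DASHBOARD = "dashboard"            # published for DBA action
    NOTIFY_SUPPORT = "notify_support"  # sent to an IT support team


@dataclass
class Event:
    name: str      # hypothetical health-check name
    database: str  # database the reading came from
    value: float   # measured metric


@dataclass
class Policy:
    threshold: float               # value above which an issue is raised
    healer: Optional[str] = None   # hypothetical remediation name, if any
    dba_owned: bool = True         # True: dashboard, False: support team


# Hypothetical policies; real thresholds and events are not given in the text.
POLICIES: Dict[str, Policy] = {
    "tablespace_usage_pct": Policy(threshold=85.0, healer="add_storage"),
    "blocking_session_sec": Policy(threshold=300.0),
    "app_connection_errors": Policy(threshold=10.0, dba_owned=False),
}


def route(event: Event, policies: Dict[str, Policy]) -> Optional[Action]:
    """Return the next step for an event, or None if no issue is detected."""
    policy = policies.get(event.name)
    if policy is None or event.value <= policy.threshold:
        return None                      # auto-detection: nothing to raise
    if policy.healer is not None:
        return Action.SELF_HEAL          # self-healing per policy
    if policy.dba_owned:
        return Action.DASHBOARD          # auto-notification to DBAs
    return Action.NOTIFY_SUPPORT         # auto-notification to support


if __name__ == "__main__":
    reading = Event("tablespace_usage_pct", "db1", 92.0)
    print(route(reading, POLICIES))
```

Keeping thresholds and follow-up actions in a policy table, rather than in per-server scripts, is what lets a single routing function serve every monitored event in this sketch.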
Self-Service: Why wait for a DBA to respond if the tools can resolve the issue? Drawing on our extensive experience, we’ve developed many self-service tools. Examples include enabling a trace in the database, terminating a database session, and viewing database information. Tool adoption is significant (3K to 10K per tool per quarter), and it greatly helps to reduce unplanned downtime. Our database support team is a global leader in enabling click-to-chat online support for internal IT clients. These automations contribute to Cisco IT’s transformation from a Waterfall release model to an Agile model enabling continuous delivery.
Overall, this five-pillar approach has helped the Cisco IT database team significantly increase service availability. We now self-detect more than 97% of issues and self-heal 40% of events. Our IT clients receive proactive communication about system glitches at lightning speed. With the launch of many self-service tools, the number of support cases and the client wait time for issue resolution have been reduced by 26% and 79% respectively. We continue to innovate and self-heal, thus accelerating Cisco’s Fast IT initiative.
Shishir Kapoor is an IT Director in Cisco’s Global Infrastructure Services (GIS) organization. He is responsible for Enterprise Data Platform Services (EDPS) and Platform as a Service (PaaS) offerings. In this role, Shishir has pioneered the use of a five-pillar framework to manage Cisco’s data platform and cloud data platform strategy.
Shishir is a global senior IT executive with 23 years of experience in the strategy, planning and transformation of large-scale IT systems and processes in the high-tech, telecommunications and automotive industries.
In bullet-list form, briefly summarize up to ten (10) accomplishments of the nominated team since the beginning of 2014 (up to 150 words).
• Achieved operational excellence in database services, enabling Fast IT
• Deployed a central monitoring solution, replacing discrete, legacy, script-based monitoring across 2,800 servers using 75,000 cron jobs
• Capture more than 97% of database service issues through auto-detection (the Cisco database team monitors 135 events)
• Conduct self-analysis of system symptoms and collect data using intelligent tools developed at Cisco
• Self-heal 40% of events through automation
• Send proactive communications about system glitches at lightning speed to business and support teams
• Developed a real-time central monitoring dashboard for all database events to resolve most issues with the click of a button
• Implemented a click-to-chat service for internal clients, a first in IT, for an improved user experience
• Reduced the number of support cases and the client wait time for issue resolution by 26% and 79% respectively with the launch of many self-service tools
• This automation contributes to Cisco IT’s transformation from a Waterfall release model to an Agile model enabling continuous delivery