Root Cause Analysis

An Introduction

ThinkGRC developed, and offers free of charge, an issue/incident classification and root cause analysis system to help Information Technology (IT) professionals better understand how issues/incidents arising within technology relate to the management of operations, organizational structures, systems, and leadership. The goal of the ThinkGRC Root Cause Analysis System (RCAS) is to provide a structured classification system for problem identification, reporting and trend analysis that can be used to improve operational, technical and management decision making.

Getting Started

The following provides detailed instructions on how to download and use the ThinkGRC Root Cause Analysis System and how to train your personnel on it.

A Classification System

The ThinkGRC Root Cause Analysis System is structured as a top-down classification system focused mainly on Information Technology. It consists of the common problem types, programs and management systems involved in Infrastructure & Application Support Operations, Software Development, IT Project Management and other Enterprise Technology functions.

The classification system is twofold. It is designed to identify the “causes” of an issue/incident from a technology standpoint while simultaneously classifying the issue/incident as it relates to operational and organizational “Management Systems”.

The ThinkGRC Root Cause Analysis System (RCAS) will provide two important pieces for your IT Incident & RCA program.  

  1. A viewable map for RCA classifications and selections to make it easy for any individual in your organization to perform an RCA.
  2. A standard set of classifications to be used to identify problems and to report and trend issues/incidents. Over time, the classification system will help you identify issues/problems within your organization and provide the supporting data and justification for resource allocation and change.

How to use the ThinkGRC Root Cause Analysis System

First, download a copy of the ThinkGRC Root Cause Analysis System Map, shown below.

ThinkGRC Root Cause Analysis System Map

Our Incident

Next, identify the incident/issue in question and run it through the RCAS Map. We will use the following incident/issue as a sample.

“A Server entered a hung state due to a snapshot issue after a recent virtual machine upgrade. It has been determined that the snapshot issue is due to a compatibility issue with the current storage configuration after the upgrade.”

Map Usage

We will take our issue/incident and start at the top of the RCAS Map. It is important to make and record your selection at each level of the map. During this exercise you may find the need to make more than one selection at a given level of the map. This means that the issue/incident was caused by more than one variable, which is very common. If that is the case, simply run through the map from start to finish multiple times, making all of the relevant selections, and record them in association with your issue/incident.
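The rule of recording a selection at every level, with one full pass per contributing cause, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the level names are taken from the section headings of this walkthrough, the sample values are the selections the walkthrough arrives at in the sections that follow, and the dictionary layout and the `missing_levels` and `record_pass` helpers are hypothetical, not part of the RCAS itself.

```python
# Level names follow the section headings of this walkthrough; the
# dictionary-based record layout is an assumption made for illustration.
MAP_LEVELS = (
    "Problem Classification",
    "Problem Category",
    "Problem Type",
    "Causal Factor Operator",
    "Causal Factor",
    "Root Cause Operator",
    "Root Cause",
)


def missing_levels(map_pass: dict) -> list:
    """Return the map levels that have no recorded selection in this pass."""
    return [level for level in MAP_LEVELS if not map_pass.get(level, "").strip()]


def record_pass(incident: dict, map_pass: dict) -> None:
    """Attach one complete start-to-finish pass to an incident record."""
    gaps = missing_levels(map_pass)
    if gaps:
        raise ValueError("No selection recorded for: " + ", ".join(gaps))
    # Store a copy so later changes to map_pass do not alter the record.
    incident.setdefault("passes", []).append(dict(map_pass))


incident = {"summary": "Server hung after a virtual machine upgrade"}
first_pass = {
    "Problem Classification": "Software Issue",
    "Problem Category": "Application",
    "Problem Type": "Failure",
    "Causal Factor Operator": "Review/Testing/Verification Issue",
    "Causal Factor": "Quality/Assurance/Release",
    "Root Cause Operator": "Identification / Scope Issue",
    "Root Cause": "Process",
}
record_pass(incident, first_pass)
```

If the issue/incident has a second contributing cause, a second complete dictionary would be passed to `record_pass` for the same incident, so each pass stays intact for later reporting.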

Problem Classification Selection

Now let’s start with the issue/incident stated above.  Go to the top of the map and ask yourself: Was the issue/incident caused by a Software Issue or an Infrastructure Issue?  Since we are IT focused and most groups/operations are broken down along these lines, this is the first primary area of differentiation that we will make.  We will select “Software Issue” as our top level Problem.

Problem Category Selection

Next, think about and ask yourself: Where did the issue/incident originate (select one)?  These selections have been broken down into the five primary functions within an IT Corporate Enterprise operation.  We will select “Application” as our Problem Category.   

Problem Type Selection

Next think about the type of problem that you had and ask yourself: How would you classify/code this type of issue/incident?  These selections are a collection of the most common IT Operations, Software Development and IT Project Management issues.  We will select “Failure” as our Problem Type.

Causal Factor Operator Selection

We will now move on to make our Programs (Initial Cause) selections. On the left-hand side, use the key to identify the Causal Factor Operator and the Causal Factor rows. These two rows depend on each other, both for selection and for reporting. Look at the Causal Factor row to review the types of Programs that your organization is managing, then determine the issue with the Program by identifying the Causal Factor Operator.

Causal Factor Selection

To make the selection, ask yourself: The issue was caused due to a (Causal Factor Operator), in the area of (a Causal Factor/Program)? For the purposes of this example we will select the Causal Factor Operator = “Review/Testing/Verification Issue” and the Causal Factor = “Quality/Assurance/Release”. When we record and report this issue/incident we will state that there is a Review/Testing/Verification Issue within our Quality/Assurance/Release Program.
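Because the operator and the Program are selected and reported as a pair, it can help to store them as a single unit and derive the reporting sentence from it. The following Python sketch shows one way to do that; the `CausalSelection` type, its field names and the `statement` method are hypothetical names chosen for this example.

```python
from typing import NamedTuple


class CausalSelection(NamedTuple):
    """A Causal Factor Operator and Causal Factor, kept together as one unit."""

    operator: str  # what went wrong with the Program
    program: str   # the Program (Causal Factor) in which it went wrong

    def statement(self) -> str:
        """Render the pair as a reporting sentence."""
        if not self.operator or not self.program:
            raise ValueError("Both the operator and the program must be selected")
        # Pick "a" or "an" from the first letter of the operator.
        article = "an" if self.operator[0].lower() in "aeiou" else "a"
        return f"There is {article} {self.operator} within our {self.program} Program."


selection = CausalSelection(
    operator="Review/Testing/Verification Issue",
    program="Quality/Assurance/Release",
)
print(selection.statement())
```

Keeping the two values in one object makes it harder to report an operator without the Program it applies to.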

Root Cause Operator Selection

Next, we will identify the Root Cause Operator and Root Cause. As in the level above, these selections are dependent. In the ThinkGRC Root Cause Methodology, Root Causes are associated with issues related to “Management Systems”. Management Systems are core organizational areas, operations or functions which, if fixed, will have a net positive effect on all previous levels of the map and, most importantly, fix or improve the issues identified by our selections.

Root Cause Selection

Now ask yourself: This issue was caused by our Management Systems; how would you classify the issue type (Root Cause Operator)? Then ask: Which Management System area does the Root Cause of the issue best relate to (Root Cause)? Review both rows and pick the combination that best represents the issue as it relates to your Management Systems.

For the purposes of this example, we will select the Root Cause Operator = “Identification / Scope Issue” and the Root Cause = “Process”. Therefore, we would say that the issue ultimately originated from an issue with our Management Systems, where “we” the “organization” have a problem properly identifying and scoping the Processes needed to manage our Programs and, ultimately, the delivery of business processes and functions. This is an issue with Organizational Management and should therefore be the responsibility of Organizational Management to resolve.

In practice, all of the ThinkGRC RCAS Map selections should be recorded and presented for reporting and trend analysis.

Record the selections as follows for reporting purposes.  In a future article, we will discuss how to record these selections and develop reports and trends for long-term analysis.

ThinkGRC Root Cause Analysis System Map Selections
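Once selections are recorded consistently, a first trend report can be as simple as counting how often each Root Cause Operator and Root Cause pair occurs. The Python sketch below is a minimal illustration and not the reporting approach of the RCAS; the record layout and the `count_pairs` helper are assumptions, and the third sample record uses placeholder values that do not come from the map.

```python
from collections import Counter

# Sample records: the first two reuse the selections from this walkthrough,
# the third uses placeholder values that are NOT taken from the RCAS Map.
records = [
    {"Root Cause Operator": "Identification / Scope Issue", "Root Cause": "Process"},
    {"Root Cause Operator": "Identification / Scope Issue", "Root Cause": "Process"},
    {"Root Cause Operator": "Placeholder Operator", "Root Cause": "Placeholder Area"},
]


def count_pairs(rows, operator_level, cause_level):
    """Count how often each (operator, cause) pair appears in the records."""
    return Counter((row[operator_level], row[cause_level]) for row in rows)


trend = count_pairs(records, "Root Cause Operator", "Root Cause")

# List the pairs from most to least frequent.
for (operator, cause), occurrences in trend.most_common():
    print(f"{occurrences} x {operator} / {cause}")
```

The same helper can be pointed at the Causal Factor Operator and Causal Factor levels to see which Programs are most often involved.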

Download the associated presentation or view on SlideShare.


Thank you in advance. Please contact us if you have questions or feedback on usage, selections and improvements.

ThinkGRC

 
