Hi there! We're in Course One. We're starting lesson five, which is called “Risk Rating Criteria”. Here's what we're going to cover in this lesson.
- We're going to be reviewing the rating criteria for FMEA (severity, occurrence, and detection): how we come up with these rating criteria and what they mean,
- We'll be contrasting prevention and detection controls and looking at how they affect FMEA, and
- we'll use FMEA measures to define and prioritize actions.
So, we've identified some risks we want to control and problems we want to fix. Sometimes we have so many problems we don't know what to focus on first. Other times we don't have the capability to solve all the problems. We know we need to fix something, but we don't have the buy-in from others to do it, and we need that.
We want to understand what's a priority and which feature takes priority over another: what will control the most risk and get us the biggest risk reduction for our efforts? We want to understand those details with our cross-functional team. We want to get cross-functional input, have discussions, and come to a conclusion.
This is why we use rating scales in our risk analyses: to compare risks to each other and to criteria that we've defined.
Rating Criteria
Let's continue to build out our FMEA by applying ratings and talking about the rating criteria used with FMEA and how they're used to make decisions. Applying a numerical rating turns our ideas into quantitative information: numbers that we can use to prioritize by rank, multiply together to assess significance, or compare against a predetermined acceptance criterion. (There's a small sketch of this idea in code after the list below.)
There are three rating scales in FMEA: severity, occurrence, and detection.
- S in our FMEA table is short for severity. It's the severity of the potential effect of the failure.
- O in our table represents occurrence, or how often we think this failure will occur because of that cause.
- D is detection: our ability to detect the cause of the failure before the failure happens.
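To make that idea concrete, here's a minimal sketch in Python of one FMEA line item as data. The field names, the example failure, and the 1-to-10 scales are assumptions for illustration only, not from any particular standard.

```python
from dataclasses import dataclass

@dataclass
class FmeaLineItem:
    """One row of an FMEA table; field names are illustrative."""
    failure_mode: str
    effect: str
    cause: str
    severity: int    # S: 1 (least severe) to 10 (most severe)
    occurrence: int  # O: 1 (unlikely) to 10 (almost inevitable)
    detection: int   # D: 1 (almost certain to detect) to 10 (almost impossible)

# A made-up line item: now we have numbers we can rank, multiply together,
# or compare against a predetermined acceptance criterion.
item = FmeaLineItem(
    failure_mode="Seal leaks",
    effect="Fluid loss during operation",
    cause="Seal compressed outside its tolerance",
    severity=7,
    occurrence=4,
    detection=3,
)
```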
Each rating has a scale with defined meanings. Your team could be using a scale of 1 to 10, or perhaps a scale of 1 to 4 or 5. Your company may predefine these scales for you to use as a company standard, or you may be able to define your own scales for your project.
No matter what your scale is, all FMEA rating scales follow the same basic structure: the lowest rating is the best case and the highest rating is the worst. (A sample scale is sketched in code after the list below.)
- Severity ratings are given definitions based on what's important to the industry, the company, or the project. This could be based on combinations of things like safety to users, maintaining production, scrap, lost business, and/or damage to the environment.
- The lowest occurrence rating corresponds to something that's unlikely, whereas the highest rating corresponds to a situation where the failure due to that cause is almost inevitable.
- The lowest detection rating corresponds to a situation where we are almost certain to detect the cause before the failure occurs, before the next step, or before the product is shipped. The highest rating is assigned when it's almost impossible for us to detect the cause.
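As an example of what those defined meanings can look like, here's a hypothetical pair of 5-point scales captured as data. The descriptions are invented for this sketch; your company standard or your project's own definitions take precedence.

```python
# A hypothetical 5-point occurrence scale. The wording is invented for
# illustration; real teams write descriptions that fit their industry.
OCCURRENCE_SCALE = {
    1: "Remote: failure due to this cause is unlikely",
    2: "Low: isolated failures seen on similar designs",
    3: "Moderate: occasional failures expected",
    4: "High: repeated failures expected",
    5: "Very high: failure due to this cause is almost inevitable",
}

# Detection runs the other way: the lowest rating is the best case.
DETECTION_SCALE = {
    1: "Almost certain to detect the cause before the failure",
    2: "High chance of detecting the cause at the next step",
    3: "Moderate chance of detection before the product ships",
    4: "Low chance of detection",
    5: "Detection of the cause is almost impossible",
}
```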
I've seen these rating scales applied in different ways, depending on the industry and even just the individual business or design project.
- For severity, the traditionally accepted approach is to assign a severity rating associated with the worst-case effect for that failure.
- For example, I've seen projects assign a severity rating based on the worst-case severity, and
- I've seen teams assign a severity based on the most likely outcome.
- I've also seen teams break apart potential effects into separate line items and evaluate the risk of each event.
- The most conservative approach in all these cases is to choose the worst-case severity.
- For occurrence, the traditional approach is assigning a rating based on the likelihood of a failure due to a specific cause. I have seen variations on this, too.
- Teams have rated occurrence based on how often that cause happens at all, regardless of the failure mode, which is a conservative approach because it results in a higher rating.
- Or there are teams that want to use conditional probabilities, which factor into the likelihood rating the probability of the serious effect given that the failure occurs.
- Other teams have taken great care, when a cause repeats, to add up its contributions into an occurrence rating that reflects how often it happens.
No matter what, the team needs to be consistent and document how they apply the rating scales. They also need to meet their company standards.
What is important about these rating scales is that:
- they're defined before FMEA working meetings begin,
- each rating level follows the MECE principle (meaning the levels are mutually exclusive from each other and, together, collectively exhaustive, covering the breadth of situations the scale needs to describe), and
- the team uses the scales consistently throughout the project and clearly communicates how they're applied.
For all of these ratings, many teams get stuck on what number to assign. Where we can, we use real-life information to predict what could potentially happen. We can use:
- complaint information,
- published studies and
- our own test results
to help us determine any of the data used in the FMEA table.
Lacking that, we’ll just need to use our judgment and experience.
I've seen teams add a column to the FMEA to track the data source of their ratings, and that's an acceptable practice. You and your team want to decide on a rating to apply to each FMEA line item and move forward with it so you can use the analysis.
As you learn new things and obtain the latest information, then go ahead and revise your ratings. For posterity, you can document why you decided to change it. Estimations are acceptable.
Use FMEA measures to define and prioritize actions
The whole reason for all of these FMEA ratings is to help teams define and prioritize actions. So let's talk about the different ways we can use these ratings to do just that.
We don't just want a table of information. We want to use it to help drive decisions about our system: how we want users to use it, the design itself, and how it's manufactured. The way we can help our team do that is through the severity, occurrence, and detection ratings.
Before we started the FMEA, we set up acceptance criteria for ourselves.
- We could use the severity, occurrence, and detection ratings themselves.
- Maybe our team doesn't want any of the highest severity ratings.
- If the occurrence is a rating of eight or higher, we're going to consider that unacceptable.
- If our detection is dismally high, indicating we can almost never detect an issue, we'll want to work to eliminate that.
- There are other common FMEA measures that we can base acceptance criteria on (a quick sketch of these calculations follows this list):
- RPN (risk priority number) is the product of severity, occurrence, and detection. The higher the RPN, the higher the risk. We may define a limit on RPN.
- Criticality (which is the product of severity and occurrence) is another measure against which we can base acceptance criteria. The higher the criticality number, the more risk we have.
- Risk index (or criticality matrix, which is the intersection of severity and occurrence): usually this criterion is set up with three levels of risk in a matrix diagram. Low severity and low occurrence levels may be an acceptable risk and are usually indicated with green. High severity and high occurrence may be an unacceptable risk, meaning we're not going to accept this scenario and must take action to resolve it; it's usually indicated with red. In the middle is a yellow area where we need to decide if the risk is acceptable or not.
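Here's a minimal sketch of those three measures in Python. The RPN and criticality products follow directly from the definitions above; the matrix thresholds, though, are invented for this example, since every team defines its own boundaries.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number: the product S x O x D."""
    return severity * occurrence * detection

def criticality(severity: int, occurrence: int) -> int:
    """Criticality: the product S x O."""
    return severity * occurrence

def risk_zone(severity: int, occurrence: int) -> str:
    """Place a line item in a simple three-level risk matrix.

    Thresholds assume 1-to-10 scales and are invented for illustration;
    real projects define their own matrix boundaries.
    """
    if severity >= 7 and occurrence >= 7:
        return "red"     # unacceptable: we must take action to resolve it
    if severity <= 3 and occurrence <= 3:
        return "green"   # acceptable risk
    return "yellow"      # the team decides whether this risk is acceptable

# Made-up ratings: S=7, O=4, D=3.
print(rpn(7, 4, 3))       # 84
print(criticality(7, 4))  # 28
print(risk_zone(7, 4))    # yellow
```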
If our FMEA measures are indicating that we have a situation with an unacceptable risk, then we need to start to define actions to reduce our risk (see the sketch after this list). We need to either:
- redesign something to eliminate the risk,
- change something to make the risk less severe or less likely to happen or
- improve the way we're detecting it.
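To show how these criteria can drive the worklist, here's a short sketch that flags line items violating hypothetical acceptance criteria and ranks them by RPN. The thresholds and the three line items are made up for illustration; the pattern is what matters: flag what violates the criteria, then rank so the team acts on the highest risk first.

```python
# Hypothetical acceptance criteria; both thresholds are invented here.
MAX_ACCEPTABLE_RPN = 120        # limit on S x O x D
MAX_ACCEPTABLE_OCCURRENCE = 7   # an occurrence of 8 or higher is unacceptable

# (line item, S, O, D): ratings are made up for this example.
items = [
    ("Seal leaks",         7, 4, 3),
    ("Connector corrodes", 5, 8, 6),
    ("Label smears",       2, 3, 2),
]

# Flag line items that violate either criterion, then rank the worklist
# by RPN so the team defines actions for the highest-risk items first.
flagged = [
    (name, s * o * d)
    for name, s, o, d in items
    if s * o * d > MAX_ACCEPTABLE_RPN or o > MAX_ACCEPTABLE_OCCURRENCE
]
for name, score in sorted(flagged, key=lambda pair: pair[1], reverse=True):
    print(f"Action needed: {name} (RPN={score})")
```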
This is why so many industries adopt FMEA. When used proactively early in the design process, it can be used to drive decisions and communicate with other departments so we get the best product that we can deliver.
- That information can help us work with our suppliers so they understand which features are most important.
- This information can also help our supplier management people evaluate potential suppliers' capabilities to provide the parts that we need.
- We can communicate risk information with our quality departments so they're inspecting the right features in the right way and can define their level of inspection based on the criticality of the feature they're inspecting.
- Our manufacturers can design their process and define process validations based on the critical features of our design.
- And when we're working with the test lab to test our designs, we can set the level of confidence we need in the test results based on risk.
Our FMEA measures and criteria are only as good as the ratings from which they came and how they're applied. However, teams can take this to extremes.
I have worked with teams that assigned every event a severity in the top two ratings: the worst of the scale. This may seem conservative, but what happens is that we lose granularity in our FMEA. Nothing stands out because everything is severe, and we lose some of the capability to rank things for improvement. A way to help our team find effects and assign severity ratings that make sense is to take them back to the tree diagram concept. Chances are there are effects in the causal chain that happen before the worst of the worst, and the worst happens because of other failures or issues layered on top of it. Map it out with your team to get agreement on an appropriate effect and severity to work from in your FMEA.
An occurrence rating can also be taken to extremes. Teams are afraid to choose something because of a lack of information, especially early in concept development. Or teams nit-pick occurrence ratings with complicated calculations. This is where I've seen teams find comfort in assigning a rating when an extra column is added, just to record what information was used to determine occurrence. It is also helpful to make the occurrence rating descriptions clear for everyone on the team to use. We'll see some occurrence scales with two or more levels of descriptions: a 10-point numerical scale, a matching 10-point descriptive scale, and, on top of that, another matching 5-point descriptive scale.
Just make sure that everyone on your team has the same understanding of the levels and that they're applying them consistently. And team judgment is an acceptable way to assign an occurrence rating when lacking tangible data. We want to spend time improving the design and its controls, not trying to calculate the most accurate occurrence ratings that we can, especially in early concept development.
We don't want to wait until our engineering design is finished to start FMEA. We want to use it iteratively to help us gather, organize, and rank information that directly relates to the risk of our product.
We finish the FMEA when the project is done. By finished, I mean all the information in the table is captured and current, and we use it to make our final benefit-risk decision about our product before we release it to the market and start selling it.
Even after the product is in the market, we continue to monitor its performance in the field and compare it against the rating values we assigned in the FMEA.
- Is our benefit-risk decision still valid or is the effect in the field more severe than we thought?
- Is it happening more often than we thought it would?
- Is it the case that we can't really detect the problem before it happens?
Things may change and we need to have a contingency plan to address it. Press releases, direct customer communications, and recalls make up part of the contingency plan. But using the FMEA proactively will help mitigate those types of things.
An FMEA is an analysis that should be used by the design team and the business. It's not a math problem where we can plug-and-chug and it gives us an answer. People make decisions about the product and the project. FMEA is an analysis that can be used as a valuable source of information to make decisions based on the best data available.
And we're at the end of this FMEA lesson. Here's what we talked about today:
- We described the different rating criteria that are typically used for FMEA, and
- we looked at how different controls, whether they're prevention or detection controls, can affect those rating scales and the overall effect on our FMEA analysis.
- We also looked at evaluating FMEA measures to decide on our next steps, and we reviewed some practical applications for applying risk ratings in our FMEAs.
We covered a lot of ground today. I appreciate you being with me. I'll see you in the next lesson.