The Safety of Work

Ep.74 Is a capacity index a good replacement for incident-count safety metrics?

Episode Summary

Thanks for joining us for this episode of the Safety of Work podcast! Today, we are discussing whether the capacity index is a good replacement for incident-count safety metrics.

Episode Notes

This topic interested us mainly because of a paper we encountered. It’s a very new peer-reviewed study that has only just been published online. We will use that paper as the framing device for our conversation.

Join us for this interesting and exciting conversation about the capacity index.

Topics:

The belief in required metrics.
Low injury rates and what they actually mean.
The regulator paradox.
The six capacities.
Due diligence.
The problem with the study’s names for metrics.
Measuring activities.
Practical takeaways.

Quotes:

“Injury rates aren’t predictive of the future, so using them to manage safety, using them as your guide, doesn’t work.”

“And while I think you could always argue that there are different capacities that you could measure, as well, I don’t think there is anything inherently wrong with the capacities that they have suggested.”

“Basically, what we’re doing is we’re measuring activities and all of those things are about measuring activities. Now, unless you already know for sure that those activities provide the capacity that you’re looking for, then measuring the activity doesn’t tell you anything about capacity.”

Resources:

A Capacity Index to Replace Flawed Incident-Based Metrics for Worker Safety

Feedback@safetyofwork.com

Episode Transcription

Drew: You are listening to The Safety of Work podcast episode 74. Today we are asking the question, is a capacity index a good replacement for incident count safety metrics? Let's get started.

Hey, everybody. My name is Drew Rae. I'm here with David Provan. We're from the Safety Science Innovation Lab at Griffith University coming to you from Brisbane and Melbourne. Welcome to the Safety of Work podcast. In each episode, we ask an important question in relation to the safety of work or the work of safety and examine the evidence surrounding it. In this episode, I don't think we've really got a question or we just have a paper that we're keen to talk about. I’ll throw it straight to you, David.

David: Yeah, Drew. Thanks. Let's start by introducing the paper straight off the bat. Normally, we have a bit of a general overview, but the paper that we're going to review today is a recent paper by Professor Sidney Dekker and Michael Tooma. The paper is called, A capacity index to replace flawed incident‐based metrics for worker safety. It's very recent. It's only just been published online, maybe only in the last couple of weeks.

We usually start by talking about the authors. I suspect that for most of our listeners, Professor Sidney Dekker doesn't need too much of an introduction. He's been behind a lot of the new view safety theories, particularly safety differently and resilience engineering, for more than two decades. Michael Tooma is a very well-credentialed and influential Australian-based OHS lawyer who works at the law firm Clyde & Co. When it comes to matters of safety science and matters of safety law, the authors are very well-credentialed to talk about both.

Today, we're just talking about this paper. The paper was published in the International Labour Review. I hadn't come across this journal before, maybe somewhat embarrassingly. It's actually a peer-reviewed multi-disciplinary journal of the International Labour Organization. It's actually managed and administered by the ILO Research Department. 2021 is its 100th year anniversary. It was established in 1921 and it's published every quarter in English, French, and Spanish. The topics that the journal covers are all of those matters related to work—law, industrial relations, management science, economics, and health and safety.

Interesting journal choice, but I did speak with Sidney before this paper was submitted and it was designed to be a paper that contributed to the policy debate and really took the flawed issue of incident metrics to every corner of the world, particularly back into developing countries as well.

Drew, we've talked about measuring safety performance a couple of times on the podcast. In episode 35, we've talked about leading and lagging indicators. In Episode 55, we also discussed the white paper out of the US particularly on The Statistical Invalidity of TRIFR.

Drew, my recollection, whether it's accurate or not, is that in both of those previous episodes when we've talked about incident metrics, we've both been critical of the use of incident count measures as an indicator of the level of safety management or the level of (I suppose) capacity to make work go well within an organization.

Drew: Yes, and one of the dangers of being critical of incident counts is that people immediately ask you what you should use instead. Not surprisingly, this is a question that often gets thrown at people who are presenting new ideas in safety. What would they give to replace current metrics?

I think I've been fairly clear with my own position, which is that I don't really like the use of metrics. If we can't have good metrics, I would rather have none at all, but that's not an answer that flies with a lot of businesses. They feel a need to have metrics. One of the interesting things they point out in this paper is that there may be a belief that businesses are required to have metrics in order to demonstrate that they are appropriately managing safety.

There have certainly been hints of that coming out of a couple of recent mining things in Queensland, signs that actually collecting appropriate metrics may be an active obligation on organizations for safety. If someone like Michael says that businesses use lagging indicators because they think it's a legal requirement, I would certainly believe him; that's probably his direct experience.

Anyway, the paper has just come out. We're already getting questions on LinkedIn and various places on what we think about it. We thought we'd do a podcast just directly reading it, reviewing it, and giving our own opinions.

David: Yeah, Drew. Thanks. Ben Hutchinson did a review of this and does a lot of good reviews on LinkedIn and his own blog of the safety science literature. I think it was Katrina Grey who asked us to podcast about this. To pay back, Katrina, you'll have to come and present your research on the podcast in the not too distant future.

Drew, section one of this paper (I suppose) rehashes very old ground about the problems with lost time injury reporting, but it puts a bit of a slant on it from a governance and due diligence perspective. Between Michael and Sidney, there's a heavy flavor throughout this paper of not just thinking about the safety of the work but thinking of the way that officers and managers within organizations also demonstrate the discharging of their obligations. There are these intertwined drivers for the work in this paper. Do you want to just go back through and highlight how the paper talks about the problems with injury counts as performance measures?

Drew: Sure. Let's go through the list. As you said, David, I quite like the spin that they've put on this. Normally, criticisms about lost time indicators start with how they get skewed or how they lack statistical validity. The arguments here are based more around the fact that they’re not fit for purpose if you are on a board or you are in the senior management of an organization trying to achieve due diligence. I'll just read out the list.

The first one is that lost time indicators are unsuitable for comparisons. You can't use them to compare one business to another or one industry to another because everyone has different base rates and different denominators under the injury counts. Everyone uses different definitions of what counts as an injury of different types.

You create these numbers, you publish them in your annual report, but you can't pick up two annual reports, see different numbers, and decide that one company is safer than another company which absolutely defeats the purpose of a number if you can't compare two numbers and even decide whether one's bigger than another.

The second, they say, is that they're unsuitable for trends. You can't use them to decide if you're getting more dangerous or getting safer because of the low statistical power that they have. Because of that low statistical power, you also can't link any variation to any management action.

It's inappropriate to reward managers for a number that goes up and down due to random noise rather than with the performance of the manager. They're also unsuitable for insights into what's going on because lost time injuries really track the loss of productivity, not the causes of those losses. We've got much better measures of productivity. You're measuring productivity and income directly. If it's safety, what we really want is to track something that is separate from productivity, which lost time indicators don't.
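To make the "random noise" point concrete, here is a minimal sketch, not taken from the paper. It assumes annual lost time injury counts behave like a Poisson process with a constant underlying rate; that model, the figure of four expected injuries per year, and all function names are illustrative assumptions.

```python
import math


def poisson_pmf(k: int, expected: float) -> float:
    """Probability of exactly k events when `expected` events are expected."""
    return math.exp(-expected) * expected ** k / math.factorial(k)


def poisson_cdf(k: int, expected: float) -> float:
    """Probability of k or fewer events."""
    return sum(poisson_pmf(i, expected) for i in range(k + 1))


# Hypothetical site: the true risk never changes and averages 4 injuries a year.
EXPECTED_PER_YEAR = 4.0

# Chance that a year shows 2 or fewer injuries (looks like a 50% improvement).
p_looks_better = poisson_cdf(2, EXPECTED_PER_YEAR)

# Chance that a year shows 6 or more injuries (looks like a 50% deterioration).
p_looks_worse = 1.0 - poisson_cdf(5, EXPECTED_PER_YEAR)

print(f"Looks 50% better by chance: {p_looks_better:.2f}")
print(f"Looks 50% worse by chance:  {p_looks_worse:.2f}")
```

Under these assumptions, close to half of all years would show a swing of 50% or more in either direction with no change in the underlying risk, which is why tying rewards to the number is problematic.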

There's another couple that clearly come in through Tooma’s work. The next one I think is really quite interesting, which is that low injury rates don't tell you whether you're actually meeting your legal and safety obligations. They are two totally separate things. I don't know about you, David, but it never occurred to me that you wouldn't believe that. But I can understand how, if you don't think about it clearly, you might think, well, this is my defense. We must be safe because we've got a low injury rate, when really you should be proving you're safe because you're meeting the requirement.

David: I think there was a general idea in the industry for a while (and maybe there still is; I'm not that well connected to it) that if you don't have an incident, you don't have a legal problem, because the core obligations in legislation are to manage hazards to prevent incidents. Clearly, other than the administrative requirements, I know that there has been for a long time [...] in some industries which is if you don't have the incidents, then you can't possibly have any legal compliance issues or problems because the regulators are only going to take interest in an organization that has an incident.

Drew: Tooma actually cites directly to some court cases here. David, I don't know if you looked up any of the cases. I have to admit that I haven't, but he cited them in support of the fact that low injury rates are not a defense if an incident happens. I presume people have actually tried to run that argument in court to say, yes, we had an incident, but look, our overall rates are low so we're not guilty of being unsafe. We just had bad luck. That argument doesn't fly because the court doesn't care about the low injury rates. They care about this specific incident and whether you were doing enough to prevent this specific incident.

The final one is just fairly obvious that injury rates aren't predictive of the future. Using them to manage safety, using them as your guide doesn't work. If you've got an indicator that is not predictive, then you can't use it to plan your actions. David, I don't know about you but I think these are all fair criticisms. I quite like having them documented. They're well-known but that doesn't mean that everyone's read them, everyone knows about them, everyone accepts them. I think we can't too often say these things in different ways that might appeal to different audiences.

David: Yeah, Drew. What we're going through now, I think, is one of the reasons that I mentioned earlier about the selection of the journal: just making sure that these perspectives or realities on these indicators make their way to as many stakeholders as possible. I think, also, it doesn't take away from that other point that you raised at the start, which is that organizations want something to provide them with an indication of whether things are safe or not safe, or getting better or getting worse. That's the way that companies are run. They're run with metrics and indicators.

The question becomes, if not these metrics that we've got, then what metrics should we have? I suppose that's the stepping-off point for this paper, like I said, to try and do two things which are to actually think about how to demonstrate some level of compliance and also to understand the safety of work. We know these metrics aren't good or helpful for us, but does that mean they're actually bad? Why is using bad metrics a bad thing to do?

Drew: To be honest, I kind of thought that this was obvious, but I guess if you are someone who thinks lost time indicators aren't great but they're the only thing we've got so we'll use them. It is actually worth spelling out the arguments for why having a bad indicator actively causes harm. The second half of section one goes through, why are bad metrics bad?

They start off with the obvious which is that bad metrics don't directly cause accidents. It's not like, oh no, you've used TRIFR, therefore someone's going to get hurt. They say that using those metrics does create conditions within your organization that can be bad for safety. It's a second-order effect.

One of the reasons is that focusing on metrics acts as a decoy. They cite lots of safety theories that in turn cite lots of accidents that show these. If you're using bad metrics, then that can actively prevent you from recognizing the problems that cause accidents. Because you're focusing on the wrong things, you're seeing improvement when things are actually going bad. Your attention is being drawn to things going wrong in minor ways instead of to the big accidents that are looming. An unhelpful metric can actually disguise an accident that's on its way.

The second thing is that if you connect those bad metrics to incentives—things like rewards and bonuses— either explicitly or even just in the board always talking about those things and making them implicit in things like CEO appointments, retention, and promotion decisions, then that creates a culture of secrecy. I know Andrew Hopkins has published a lot of work about this, the idea that once you start using metrics to judge people and those metrics are not accurate, then you have huge pressure within the organization to manage the metrics effectively, which leads to the hiding of accidents.

The final problem is the regulator paradox, which we've talked about on the show before. If you've got a metric that you're driving down towards zero, then the better your performance against that metric, the less information that metric provides to you. You become blind to what's actually going on. David, I noticed in the show notes, you've put in a reference to Rene Amalberti's paper, The paradoxes of almost totally safe transportation systems, which is a recommended read for anyone who wants to know more about the regulator paradox.
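The loss of information near zero can be shown with a small calculation. This is a sketch only, again assuming incident counts follow a Poisson distribution; the expected counts used (0.5 and 1.0 per year) are hypothetical and do not come from the paper or from Amalberti's work.

```python
import math


def prob_zero_year(expected: float) -> float:
    """Poisson probability of recording no incidents in a year."""
    return math.exp(-expected)


# Hypothetical site expecting one lost time injury every two years.
baseline = prob_zero_year(0.5)

# The same site after its underlying risk has quietly doubled.
doubled = prob_zero_year(1.0)

print(f"Zero-incident year at baseline risk: {baseline:.2f}")
print(f"Zero-incident year at doubled risk:  {doubled:.2f}")
```

On these assumed numbers, a site whose real risk has doubled still reports a clean year more than a third of the time, so a "zero" on the dashboard says very little about which of the two situations you are in.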

David: Yeah, Drew. I think we've probably put to bed that these incident count metrics are bad again and having bad metrics is bad for safety. It's a bit of that irony that we wrote about in a safety [...] paper as well, Drew, which is that just because they don't work, if you still want to use them, use them. No, they don't work, and using them can ironically actually be worse for safety than maybe not using anything at all.

Drew: In terms of our review of this paper then, section one, I can't see anything objectionable. I can see lots of stuff that I agree with. If you're writing a paper where you want to excite me and get me nodding along the way, then having the long list bash against injury rates and against bad use of bad metrics is going to have me always on board. How about you, David?

David: Yeah. Well, I skipped down because I got more interested once it started getting into section two and what the [...] idea of a capacity index was.

Drew: You're really interested in that question: accept what's wrong with TRIFR, but what else are we going to do?

David: Exactly right. Now, we're in section two. Basically, section two provides this explanation and theoretical justification for a new measurement in safety which is referred to as this capacity index. This has a very safety differently flavor. It’s coming from a new view of safety theory flavor, but then it's also coming from this direction of measurement in order to demonstrate due diligence. We've got two concepts going on here that I'll quickly talk about.

Due diligence is this legal defense or framework for legal defense for demonstrating that an individual acted appropriately in relation to their legal obligations and provides a safe system of work. We'll talk about the due diligence framework that's used, which has got six elements. One of the problems that the paper was trying to solve is, how can we create measurements in the organization to know as responsible people that we're doing the things that we're expected to do under our legal obligations framework? That's concept number one.

The second concept is this idea of capacity, which is lifted straight out of the new view literature, so HRO theory, resilience engineering, safety differently, Safety II, human organizational performance. All of these theories propose that there are capacities that organizations need to have or enable for them to perform work successfully and safely. This is the study of having capacities (and I think that many of our listeners will be familiar with this idea). Capacity to create the safety of work or to manage work in a way where safety is an emergent property.

But we're starting to mix these two ideas around, Drew. I don’t know how you feel about whether these two ideas belong together or don’t belong together, but I struggled to have these two ideas interchanging throughout the paper.

Drew: David, whenever I'm trying to make sense of this sort of thing, I go back to our safety of work model. One of the reasons why we invented the model in the first place was to have a framework that we could hang these discussions around. Just a reminder for our listeners, we talk about this in a lot more detail. I think it's in episode 50. We're talking about different purposes you have for safety. One of the purposes is we do safety to demonstrate to people outside the organization that we’re safe.

That's where due diligence comes in. Due diligence is externally facing. It’s facing away from the point of the hazard towards showing outsiders that we’re managing that hazard. It's a real and legitimate concern for organizations. I'm actually not very comfortable with the idea that measurement is an appropriate way of demonstrating safety to outsiders. I think that is perhaps a misuse of measurement in the first place. But I can accept that due diligence is something that is necessary. The people who do due diligence might be made comfortable by a metric. I'm just happy to just accept that as a business reality.

Then we've got these other forms of safety. We've got social safety, which is convincing ourselves that we care about safety. We’ve got administrative safety, which is the operation of a safety management system. We have physical safety, which is direct changes to work or the workplaces to improve safety. Then we have operational safety, which are the outcomes from operational work.

Whenever we're trying to introduce new measures, they're always going to fit somewhere inside this framework. Lost time indicators fit fairly clearly. Lost time indicators are measures of outcomes. They are absolutely a direct measure of safety. The problem with lost time indicators is not that they're failing to measure the right thing. It's just that they are really, really bad measurements of that thing. The idea of capacity is resilience engineering's proposal for a better measure of that same thing.

The idea is to have a measure that comes really close to measuring properties of operational work that are present in that operational work but aren’t the outcomes of that work. You find lots of papers written about proposing new measurement systems based on resilience that are based around measures of capacity. The reason why I find this a bit confusing is that if you think about things in terms of the safety of work model, demonstrated safety and operational safety are right at the far ends of the model. You can't measure both at once.

If you want to measure operational safety for the purpose of demonstration, you've got to go through the other types of work first. The risk is that you end up measuring those other types of work instead of measuring operational safety. I think that's the thing that we're going to come back to. It's the common mistake that pretty much everyone falls into, and any time I try to do this myself I fall into the same trap. David, I know you’ve called me out in meetings for this before.

You start off thinking that you're going to create some new measure of operational safety, and you end up just measuring safety work. You end up measuring the administration instead of measuring the safety. I think that's where we’re going to see ourselves heading in this paper. Are we actually succeeding in measuring properties of work, or are we just ending up back in that trap of measuring safety activities?

David: The paper does call that out a few times like measuring the safety of work rather than safety work. It is very hard to get out of that trap, and I don’t think this paper managed to get out of that trap either.

Drew: Yeah. I don't know about you, but when I was reading the paper, it was like someone wandering through the woods saying I'm not going to step in the bear trap. Click, snap. They’ve stepped right in the bear trap that they’re telling you that they're trying not to fall into.

David: Yeah, a couple of ways. The paper mentioned six capacities, and these are capacities which are statements of what organizations need to do to demonstrate the six elements or criteria for demonstrating due diligence. I just want to run through what the six capacities are that make up the capacity index if you like.

Drew: Yeah, sure. I'm happy to read them off. Each one of these has got a keyword, which doesn't seem to match the description. I'll just give them to you as they are on the paper. The first capacity they call know, which they say is the capacity for things to go well under variable conditions. This is the thing we often look for is work needs to adapt to local conditions, how successfully is it adapting? Are the conditions local to a particular place or local to a particular time?

The second one they call understand, which is the capacity to anticipate through risk competence and risk appreciation. The third one is called resource, which is the capacity to make resources available and to identify goal conflicts. Then we have monitor, which is the capacity to monitor, identify issues, and communicate about those issues. Then comply, which is the capacity to assure the effectiveness of the monitoring. Then verify, the capacity to learn from failure and success.

David: This is where the first mis-integration of these two ideas shows up. You're right, they don’t quite line up, because the keywords relate to the six due diligence principles, where know is about maintaining appropriate levels of knowledge of health and safety. That's being used just as a category, if you like, to throw this capacity in there about things going well.

They don’t really line up that well because I think what they're trying to do is look at the requirements on individuals to maintain knowledge of health and safety, then think about what that might mean as an organizational capacity. They just don’t line up as well as they might need to.

Drew: Okay. Now that you’ve explained that, I can see where it's coming from and I can see the mismatch. Because if the requirement for due diligence is to know what's going on, that is a totally separate thing from the capacity for things to go well under variable conditions. In fact, there’s a good argument that a lot of the capacity for things to go well comes from the freedom of not having someone looking over your shoulder.

In fact, one of the big tensions in safety differently is how do you just let teams vary locally and safely. Because any form of oversight tends to standardize things. Already, we've almost got a little direct contradiction in these objectives here.

David: Yeah. These capacities, there's a couple of things intertwined together. The capacity to monitor and identify issues and communicate are at least three broad areas of capacity and nothing specific in there. I see this as very much a theory paper, Drew. I don’t know if that's exactly what it would be called. This is proposing a new idea in safety. Theory building is something that needs to be thought through very carefully.

Where is your starting point? What existing theory are you stepping off from? What are you responding to? We talked about that a lot of times in the podcast. What is the theory being provided in response to? I think the starting point with the due diligence elements is just not the right starting point to understand capacities to create the safety of work.

Drew: Yeah. I don't want to read too much into the intentions of the authors because I think the fairest thing to do is to take a paper directly on its own merits. But I think this is a common problem with metrics. Very often, the reason why people propose metrics is they are trying to provide a solution to a problem. They try to provide something that they can give to industry, possibly sell to industry, or at least sell as an idea that other people can pick up and use.

Often, creating a productized solution is inconsistent with good theory building. To create products, to create packages that people can take, you've got to make all of these compromises. But the compromises then end up shoehorning things together that don't really go together, so you have a bad theory. Often, a good theory is very unhelpful. Often, the good theory just says here’s a problem. We've got no solution to it, but we've described the problem fairly and well.

David: Yeah, absolutely. We’ve got to the end of section two where these capacities are introduced. There's actually quite a considerable section on each of these capacities. It's not a bad read. It cherry-picks the literature, but it does tie in a lot of the different aspects of the safety science literature and puts them into these six different bucket areas. Some of the capacities that we've mentioned are not too far from, for example, the resilience potentials, Erik Hollnagel’s capacities to monitor, anticipate, respond and learn.

You can read through it; it's very accessible language. It’s well-referenced, but I just don't think that the framework here in section two is the right place to start creating performance measures. We’re going to see that, I think, very clearly in section three.

Drew: No, I agree with that. I don't think cherry-picking is an entirely fair criticism. It's not intended to be a comprehensive review. The stuff that they pull out gives you a good sense of a body of literature. It doesn't give you a misleading impression of that literature, I don't think. You could question why certain things have or haven't been included.

David: I think my comment there about cherry-picking is in relation to the fact that the starting point for identifying the capacities to create safety was to have six due diligence principles and then pick the safety science literature in discussion that could line up with those capacities. The starting point with this section was not to say, what does the literature say are the capacities that are needed to create safety of work? Then how does that literature line up? Then let's see how this literature lines up against the six areas.

There's not enough in there about some of the things that we know are capacities to create safety.

Drew: Thank you for explaining that. That much I definitely agree with. It's trying to shoehorn things that fit these due diligence principles. I agree that there are some things that don't get included because of that.

Let's move on to section three. If this is leading up to the proposal of a suite of metrics, then section three is that proposal. Particularly there is a table in this section, which highlights what the metrics package looks like.

We've gone on from these vague descriptions of capacities down into some really quite specific suggestions. Just to describe what this table looks like, it's got a list of capacities which are the ones we've already talked about. Against each of those capacities, it lists off the due diligence requirements, which I think is taken directly from that statement of what boards are required to do for due diligence. Then they have two columns, initial measures and developing measures. The reason for these two columns is that they refer multiple times to this suite as a work in progress.

The initial measures are things that we can measure now. The developing measures are where they would like the capacity index to go perhaps with further work and further definition of measures.

David: Just to frame this, there's a couple of things that are said in this paper prior to this section. One is that it is easier to measure safety work than it is to measure the safety of work. That’s one of our approaches, Drew, and that was in there. It absolutely is. You can count and you can see safety work, so it is very much easier. But then the paper also quotes some of Deming's work from the 1980s to say that outcome measures are not variables that organizations should set out to control.

The measurement of the activity itself as opposed to maybe the input measures. The problem is the paper sets out this landscape well as what needs to happen, but section three goes on to basically do all those things that the paper says not to do, which is the bear trap that you mentioned earlier. Drew, your opinion on this table? I suppose some credit goes to the authors here. They’ve actually put metrics on the table and said to measure this.

Drew: I guess going through the columns in order, we've already talked about the capacities. I think you can always argue that there are different capacities that you could measure as well. I don't think there's anything inherently wrong with the capacities that they've suggested. I agree that all of those capacities are things that do work for safety. That having measures relating to those things would be helpful and certainly would be better than measuring outcomes.

The trouble is that once we start thinking of those capacities in terms of due diligence, I think there's a fundamental theoretical conflict because due diligence is a lot about measuring the work of safety rather than the safety of work. I think that's fundamental in the definition of what boards are supposed to do. They're supposed to focus on measuring administration and representing the organization to outside stakeholders. They're not actually supposed to focus directly on operational work that much.

If all of these capacities are properties of operational work then we have this tension. When we move into the next column, which is about the initial measures that they're proposing, they’re all about just measuring volumes of certain types of activities. Perhaps we should give some examples there, David. Do you have the table in front of you?

David: Yeah. I've got a couple of examples. Take this idea of building the capabilities in people so that things go well even under variable conditions. The metric for that is the number of worker insights per million hours worked. Another one would be the capacity to anticipate through risk competence and risk appreciation at all levels of the organization. Measuring the capacity to anticipate future risks is about the number of learning reviews per million hours worked, for example.

You're seeing all of the capacities boil down to a particular activity. Let's say a safety work activity or operational work activity (not the work itself) per million hours worked to get a rate.
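The normalization described here, a count of activities divided by hours worked and scaled to a million hours, can be sketched in a few lines. This is only an illustration of the arithmetic: the function name and the sample figures are hypothetical and do not come from the paper.

```python
def rate_per_million_hours(activity_count: int, hours_worked: float) -> float:
    """Normalize an activity count to a rate per million hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return activity_count * 1_000_000 / hours_worked


# Hypothetical figures: 120 worker insights logged over 2.4 million hours.
print(rate_per_million_hours(120, 2_400_000))  # 50.0
```

Note that the result says how often the activity was performed, not whether it produced any capacity.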

Drew: Yeah. My concern is that the moment you start doing measures on these activities, then all you do is drive up the activities and drive down the quality. You tell a supervisor that you must produce 100 worker insights and they'll produce 100 worker insights. Every one of those worker insights will be absolutely less than insightful.

You tell people that they need to conduct a certain number of learning reviews. That says nothing about whether anything is learned in those learning reviews. It's just measuring an activity and driving activity without measuring the actual capacity that that activity is supposed to create. I don't think that what we're doing here is actually measuring capacity at all.

We’re measuring things that are neither creators of that capacity nor symptoms of that capacity existing. They’re merely activities that exist vaguely in the same topic area as the capacity. I think that’s a real problem. I don't think it's enough for the authors to say, well, this is just a work in progress, we're going to get better. Because every one of the actual proposals is a measure of the volume of activity. And then all of the developing measures are just placeholder names for metrics that don't exist.

They've got some wonderful names for metrics that they don't define. Instead of just counting the number of learning reviews, they're going to have a resilience control score—a control implementation assessment. These are not metrics that exist. It's like, well, we propose a bad way and we’re promising to give you a better way, just wait till the next paper is published.

David: Yeah. Drew, I think I'd probably need to be a little bit directly critical of this table and this work here because I think the table is mostly unhelpful and particularly when very credible authors put something into a peer-reviewed publication that can be done in an organization. I think part of the role that I take seriously on a Safety of Work podcast is to say, please don't do this because I don't think this table moves us forward. We'll talk a bit about that more before we finish the podcast.

But it also shows a bit of a lack of understanding about what is done in the industry at the moment because a lot of those developing metrics are around a significant event rate or severity rate. Organizations have been using injury severity rates, [...], or significant incident rates for two decades. Some of the other things are just developing metrics like safety plan implementation as a developing metric. But these are not helpful things to put in concrete inside a table. It feels to me like it was a draft table. It just was an initial discussion that just never ever got finished.

Drew: I think it goes beyond being unfinished. I don't think this is deliberate, but it is certainly a little bit disingenuous. So under the idea of resourcing, they mention a thing called a resili score, which they say is a measure of the resilient state. Now, I don't know about you, David, I don't know what a resili score is. It's not defined in this paper. It's not spelled out. I don't think this exists in the peer-reviewed literature.

But what's going to happen is we now have a published paper that has been peer-reviewed that says resili score is a good way of measuring things. Even though the peer reviewers have never peer-reviewed resili score because it's just a placeholder name. Someone is now going to publish a technique called resili score and that's not going to be peer-reviewed. That's going to be put out by a company that is presumably selling it to organizations off the back of the fact that it's been documented in a peer-reviewed paper.

I think that is really quite blurring the grounds between sort of industry products and academic products, when we have the industry products mentioned in an academic article but not defined in that article, so they don't get peer reviewed. But the overall article, with the table in the package, claims peer review.

David: Yeah. And there's another big claim, which is why I get particularly nervous. Even though the work-in-progress statement is made, there is also another claim that the measures in the capacity index (those measures that we’re talking about) are not only consistent with existing and emerging research in safety and resilience engineering but, direct quote, “They also exhaustively cover the due diligence requirements under work safety legislation in many Western countries.” And this is untrue and unsubstantiated.

If you're following these metrics in this table, in this index, you are not covering your due diligence requirements or your legal requirements. I'm definitely not as credible as Michael Tooma to talk about WHS law, but that claim is not true.

Drew: I'm just going to have to accept your word for that one, David, because I know even less about the application of WHS law than you do.

David: If you're measuring six items, if you're measuring the number of worker insights, you’re doing the number of learning reviews, you're doing the severity of your incidents and the other four or five points in that table, that is not comprehensively and exhaustively covering your due diligence obligations or your legal compliance obligations.

Drew: I can certainly say that it’s not consistent with emerging research in safety and resilience engineering. I’ll let you say the one that it doesn't cover the due diligence and I'll say that it doesn't cover the safety science either. Basically what we're doing is we're measuring activities and all of those things are about measuring activities.

Now unless you already know for sure that those activities provide the capacity that you're looking for, then measuring the activities doesn't tell you anything about the capacity. But how do we know that the activity provides the capacity? We could only know that if we have some better measure of capacity, which we currently don't have. I mean I'd love to have it. That would be the holy grail of safety measurement, is a set of good measures of capacity. But without them, falling back on just measuring the activities creates this sort of circular argument.

Even in this paper, they talk about things like a number of audits. On the number of audits, we know that audits are not effective at creating safety capacity. I think there is both solid existing and certainly emerging evidence that audits don't find the safety issues that we need them to find. Now, maybe from a compliance point of view, audits still have enough of a reputation, enough credence that you can claim that you're doing your due diligence because you've done the audit.

But from a safety science point of view, you absolutely cannot say that. You cannot say we are safe because we have audited it. If you believe that, then let me throw you an entire shelf of accident reports. Every one of which talks about the audits that happened before the accident, and how the issues that caused the accident were not found by the audits. Some of them even say that the audits gave a direct sense of false security.

David: Yeah, Drew, I just want to follow one example through just to pick up on what you said there because I think it's really important. We've got a capacity in an organization that we want to understand. Let's choose, for example, that building the capability in people so that things go well. That's capacity number one. How do we measure the capability that people have to make sure that things go well? That's what we're trying to understand.

First of all, we need to think (from your point, Drew) how do we define that, how do we understand that, and how do we measure that? When the paper goes on to describe the metric, it says the number of worker insights per million hours worked. Now, I can't (in my head) know the link between someone doing a worker insight (like a task observation or something like that) and building capability in people to make sure things go well.

Maybe if the person is doing electrical work, maybe knowing that they're an electrician tells me more about their capability to make things go well in that electrical work job than a worker insight. We've got these measures that don't tell us about the capacity, and even if we could figure out the mechanism that we talk about, okay, the worker insight understands work as done. Then that means that a change gets made to the physical work environment. That means people then get trained in this change, and that means that they've got a new capability.

If we drew that link, then there’ll be plenty of other things first to measure without having to measure the activity that's suggested.

Drew: I think I can see the sort of inherent logic. You might assume that companies that care about building capacity in people so that things go well might also be the sort of companies that are trying to do things like hold learning teams. You might even presume that the learning team actually creates increased capacity for things to go well. As a board, you want to know that your company is doing things like learning teams. That your company is trying to understand what makes work go well. I think that much is reasonable.

But I think no one who does learning teams would tell you, we know it's been a good learning team if we have the right number of worker insights. We've got this pro forma that every learning team has to come up with 10 worker insights, and we know it's been a success if we've got all 10. It's a terrible measure of the success of your learning teams.

Much as I hate to throw this into the world of listeners who probably believe a lot in safety differently, there is currently no good evidence that learning teams build capacity in people so that things go well. This hasn't been scientifically studied. It’s taken as an article of faith. And to take this article of faith and suddenly turn it into a metric I think is going to destroy any possibility of learning teams helping work go well.

David: I think the institutional theory and definitely my own ethnographic research, Drew, showed this. This reminds me of the HRO literature in the ‘80s: HRO theory was never designed to be a how-to model. It was designed to be a descriptive model of some things that were seen in organizations that seemed to be highly reliable. Once it turns into a top-down formula, we know (in relation to the corruption of role and task in hierarchical institutions) that it becomes compliance activities.

The learning teams become nothing about learning and just about doing one a month. We know that with so much of our safety work activity that this paper just walks straight into the wall of those [...] about the way that organizations function.

Drew: David, I know we've joked before about safety differently audits and the way every new thing about safety gets captured into the old patterns. Are we seeing here two leaders of safety differently steering us straight into the path of turning it back into those same compliance-focused, activity-driven safety frameworks that safety differently started off by complaining about?

David: I think this paper tries to do too many things. It's trying to do something that extends the safety dip and operationalizes some of the safety differently ideas. It’s trying to do that in a way that someone can practically do something with it right now. It’s trying to do something that not only helps improve operational safety but also satisfies the need for demonstrated safety and the discharging of directors’ due diligence obligations.

It's trying to solve all these really big issues in safety. As a starting point, that's not the job of an individual paper.

Drew: Let me throw this at you as a direct question. This is probably less to do directly with the paper. We see this requirement for due diligence that is built into the way institutions currently work. It's built into the way we think about how companies run, it's built into a whole series of court cases that have established the requirements, it's built sometimes into black letter law.

We've got this idea of safety differently, which comes from a very almost anarchist position about safety, which tries to focus purely on the operational end and doesn't start with the legal requirements. Is it the case that once you try to bridge the gap between those two ways of looking at the world—the due diligence way of looking at the world, the safety differently way of looking at the world—that it's inevitable that you’re just going to see this conflict and this trying to shoehorn things together that don't really fit?

David: I think, Drew, it’s probably my turn to, I suppose, think through this in the safety of work model because it's an “and” need, but it's something that shouldn't be conflated together. This need to demonstrate due diligence and the need to create operational safety, they are both needs of organizations. But this paper conflates the two and I think that's the problem.

Drew: It's possible that this sort of project of creating metrics for both could work better if instead of trying to line the two up, we simply say every organization has to have two sets of metrics. One set of metrics about capacity and one set of metrics about due diligence.

David: If you go back to due diligence in some of the work that I've done with boards, there are individual directors knowing and maintaining a knowledge of safety. How many hours a year do you spend learning about safety? How many Safety of Work podcast episodes have you listened to? How many industry briefings have you been to? How many hours have you spent per year, per director?

Knowing and understanding the risks that your organization faces and the control frameworks in place, how many sites have you visited? How many risk outputs of major hazard risk workshops have you reviewed as a board? What risks have you directly verified yourself that the controls are in place and working in your organization?

You could have a set of activity metrics for directors because discharging their due diligence is about stuff that they do. And then deal with operational safety in a whole different way that actually relates to the context of the way that work happens in your organization. So yeah, maybe. I don't see a single set serving multiple purposes being kind of useful for any.

Drew: What I love about what you just said, David, is that you have actually reconciled the two. You showed how you could apply the safety differently principles and the idea of capacity to a board. But you'd be talking about getting the board to think about themselves as workers and thinking about their capacity to meet their due diligence. But then you would separately measure the capacity of the organization to meet its safety.

David: Hadn't thought about it as elegantly as that. But yes, because it's what you are trying to create where in your organization? You're trying to create due diligence in part of your organization—the board or senior executive—and you're trying to create safety in the front line. I think the paper's trying to do too much. What happens, unfortunately, is it doesn't really do any of it that well.

Drew: David, I don't know if you'd like to close off this discussion on that charitable note. Because I do know that you’ve got a couple of perhaps slightly more harshly worded conclusions. I don't know if you want to throw any of them in.

David: Publishing work is a big responsibility. I know that Michael and Sidney don't take that for granted at all, but I worry the critical review may not happen and people will take some of this as the next management fad or fashion that we've talked about on one of the episodes of the podcast. Okay, this is great. This is what I have to do. I don't think it's going to help people.

Let's move on to practical takeaways, Drew, because I think there are some takeaways in here. Maybe if I just run through practical takeaways and then we can leave some requests of our listeners?

Drew: Absolutely, let's do that.

David: All right. Practical takeaway number one, yes, move on from injury rates and counts as your measures of safety performance. Yes, do do that. Also, we need to move on (in my opinion, Drew) from the so-called leading indicators, which really just count the frequency of the performance of safety work activity, which is really what this paper does as well. Whether it’s audits, investigations, inductions, the things that we spoke about in episode 35, the number of learning reviews, the number of compliance audits, or the number of worker insights, which are discussed in this paper.

We've spent nearly 74 episodes now talking about this really tenuous, difficult, and complex link between safety work and the safety of work that suggests those safety work indicators are really not going to be that helpful or useful for us about the safety of the work.

I think, Drew, the third one, if we’re going to go beyond that, then yes, do identify the capacities that your organization needs to have to make work go well. That is what this paper set out to do in terms of these capacities, do that. Don’t come from a due diligence point of view, but come from a work perspective.

And then, I think what you're saying, Drew, is that when you've got those capacities, the things that are important in creating safety as an outcome of your work in your organization, then identify ways to directly understand and if desired, measure that capacity. If the measurement is qualitative and it's not a frequency rate or it's not a count, then so be it.

That's the sequence I’d say—move on from TRIFR, move on from safety work-related leading indicators, do take an attempt to identify your capacities, do think about how you're going to understand the extent to which that capacity exists in your organization, and do something with that information.

Drew: I don't want to pretend that steps four and five there are easy. I think part of this project of improving measurement is recognizing that any measures we think of are going to be flawed and we're going to keep falling into the same traps. Very, very big brains have followed that path of moving on from TRIFR, identifying capacities, trying to understand and measure that capacity, oh damn, realize that actually what we're doing is measuring safety activity again, back to the drawing board.

Do (when you come up with measures) ask yourself, are we actually measuring capacity, or are we just measuring volumes of activity? And if we fall into the trap, don't beat yourself up about it. We all do it, try again. Eventually, one of us is going to crack this problem and actually find some good measures of capacity.

David: Drew, you’re right. You can pick any of the theories. You say, okay, high-reliability organization theory, which is also topical in your part of the world at the moment, Drew, particularly the mining sector. You say, right, I want to measure my capacity around deference to expertise. How am I going to know how that happens?

Or say sensitivity to operations. I've tried to do this before to say, well, I want my senior leaders to be sensitive to operations, so I'm going to count the number of days that they spend in the field every month and set them a target around that. There you go, falling straight into that trap of just the volume of the activity itself rather than what it’s actually creating in your organization. It is very hard to break out of this trap that I've sort of criticized the authors here for, but we need to put some horsepower behind this.

Drew: It is frustrating that it's so hard to break out because if we just picked that example of deference to expertise, it is important and it's a real thing. People in your organization probably know where the experts are deferred to. If you could ask everyone and trust yourself to get an honest answer, you could probably tell. Who gets listened to? Is it senior management or is it the people who have the expertise? If you were getting an honest answer, that's a real question that could be answered and could be measured. It's just that measuring stuff well and getting honest answers is so hard.

David: Yeah, it is. One of the things about performance metrics (just to throw it out there) is any indicator, any representative measure will only ever tell you what questions to ask. It’s never going to give you the answers. You might not even need to measure at all if you've got a set of really good questions in your organization because you can go and actually get all that descriptive insight without necessarily needing to be tabulating an indicator as well.

But Drew, invitations for the listeners. Let's try and have some of this discussion. Let's try and get a bit of horsepower behind this. What are organizations doing to try to measure the capacities that they believe they need to have to impact the safety of work? Where have you got to with this whole metrics and measurement?

I'm going to assume that pretty much everyone in the safety community (it’s probably too big a claim, but lots of people) is on board with this need to move on from incident rates. I know there are lots of people in organizations who are doing work to change the way that they have safety performance conversations. Thinking about that in your organization, I'd love to hear your stories. If you've had a go at understanding your capacities and measuring them, please tell us what you're doing.

Drew, the question that we asked (we did ask a question at the start) is: is a capacity index a good replacement for incident count safety metrics?

Drew: I think measuring capacity is the right direction to go. I think this paper is a self-admitted work in progress and needs to be treated as a work in progress rather than an answer at this point. If you're interested in seeing work as it's progressing, then have a look at this as work in progress, but it's not ready to be a replacement.

David: Thanks, Drew. That's it for this week. We hope you found this episode thought-provoking and ultimately useful in shaping the safety of work in your own organization. Please join in this discussion with us on LinkedIn or send any comments, questions, or ideas for future episodes directly to us at feedback@safetyofwork.com.