The Safety of Work

Ep. 68 Are safety cases an impending crisis?

Episode Summary

Safety cases have been around since the inception of nuclear power. Now, safety cases have spread all the way to amusement parks. They are always linked to major accidents.

Episode Notes

Today, we plan to discuss whether safety cases are headed towards an impending crisis.

Join us as we figure out if the work safety community is headed for disaster.

 


Quotes:

“...It’s a little bit paradoxical: Because why do we try to identify hazards, if not making the implicit claim that by trying to identify hazards and control them, we are making our system safer?”

“People don’t share their safety case data with anyone they don’t have to share it with.”

“And if we can turn the reasons why people do things into theories, and then test those theories, then we’ve got good potential for changing how people do things…”

 

Resources:

Safety Cases: An Impending Crisis?

Feedback@safetyofwork.com

 

Episode Transcription

David: You're listening to the Safety of Work podcast episode 68. Today, we're asking the question, are safety cases in an impending crisis? Let's get started. 

Hey, everybody. My name is David Provan and I'm here with Drew Rae. We're from the Safety Science Innovation Lab at Griffith University. Welcome to the Safety of Work podcast. Each fortnight, we ask a question in relation to the safety of work or the work of safety, and we examine the evidence surrounding it.

Drew, what's today's question?

Drew: David, today we're going to be talking about safety cases. I think this is the first episode we've discussed safety cases or at least the first we've directly discussed them. 

I thought we'd start off with just a little bit of background before we get into the paper that we're talking about. Safety cases have been around for a long time. The paper that we're reading today suggests that they were first introduced for nuclear power in the 1950s, which sounds about right. They've subsequently spread through different industries: very popular originally in nuclear, then chemical, then oil and gas, and now defense and rail. I've even seen some construction safety cases.

David: Drew, we mentioned safety cases as a recommendation or as an idea when we did the episode on the Dream Boat incident. We're seeing it at least in the Australian domains now that safety cases are spreading through rides or amusement park devices. There are a couple of big accidents that started shaping the formal thinking around these ideas. Do you want to give some background to them?

Drew: Yes. Safety cases have always been linked heavily to major accidents. The reason is that when we have a major accident, we're faced with the fact that regulators are very limited in their knowledge, resources, and access. People are trying to work out, how do we make it possible for regulators to control this thing which is out of control?

The response is to shift the burden of proof and to say, regulators can't control safety. The company needs to control safety. What we'll do is we'll make the company have to convince the regulator that they are managing safety correctly.

Ultimately, that's the central idea of a safety case. It's an obligation on a company to convince a client, an independent assessor, or a regulator that the system is safe. That convincing usually takes fairly standardized formats mostly around producing a bunch of evidence—evidence about the system or evidence about the processes you're following—and then an argument on top of that evidence to show how it all fits together.

Safety cases then end up being called a living document, which is just a way of getting around the fact that it's a document, but a document that's supposed to be regularly updated so that it constantly reflects the current argument, the current evidence, and the current state of the system.

David: Drew, for our listeners that aren't familiar with safety cases or aren't in any of those industries that use safety cases, it's like you said, it's literally an organization making the case for safety for their asset or their operation.

Like you said, it takes some fairly consistent formats, but this paper points out that all of these safety cases start with a description of the system, a description of the facility, or whatever it is the argument is actually being made about. Then, after that description of the system the safety case is covering, comes a risk process: what are the hazards in that system? What are all of the control measures being applied for all of those hazards? And why is the overall risk, as a sum of all the individual risks, a tolerable and safe assessment?

Drew, would you give any more description for someone who hadn't picked up a safety case document before? This can be hundreds and hundreds of pages or maybe thousands of pages. Any more to provide a bit of context?

Drew: As we'll get onto a little bit later, it's not essential that the safety case be based around hazards and controls. That's a very common format particularly with major hazards, particularly in oil and gas, that they base it around assessing the risks, showing that we've controlled the risks.

The fundamental idea is that you can make any type of argument at least in principle. Instead of showing that you've assessed the risk, controlled the risk, you might also make an argument that says, our previous system was safe and we haven't changed anything major. Therefore, the new system is safe. Anything that counts as a logical argument with supporting evidence supposedly could make up a safety case.

In practice, they tend to rely heavily on evidence of having identified, assessed, and managed hazards just because that's the way most people try to achieve safety, and safety cases tend to reflect the safety management processes.

David: With that as some context, the start of the paper that we're going to discuss today takes the position that there's quite a large amount of research on safety cases, and this research focuses on a lot of the tactical aspects of safety cases—how to structure them, how they should be reviewed, how these arguments should be formalized mathematically, or how these arguments might be better generated automatically—but then it goes on to point out that there's very little research that evaluates safety case methods and practices as a whole, and that answers the question that we're all about on this podcast, which is: do safety cases actually work in terms of improving safety?

Drew: David, in preparation for this episode, I actually went back and looked at a few papers, and even a couple of PhDs, that deal with safety cases just to see how they position their work and justify it. It's fascinating because safety cases didn't come from safety theory. They came from this practical regulatory need. You can see it in how people justify the work.

They basically start off by saying, safety cases are a thing. We have safety cases. We have to do safety cases. My question then is how do we do safety cases better? Or how do I do safety cases for autonomous systems? Or how do I move from safety cases to security cases? Or how do I automate safety cases?

They start with this assumption that safety cases must work or at least must exist because they do exist. That's fairly common to a lot of safety activities. The activity comes first and then the research into the activity comes later. The research is always trying to build and expand the activity rather than to answer fundamental questions.

We get all these implicit claims that safety cases improve safety or safety cases don't improve safety, or safety cases are cost-effective or they're not cost-effective, but we start with those claims. We don't come up with those claims as the result of having conducted research.

David: There are some claims in the literature about safety cases. There's a group of claims that safety cases work to improve safety. Some authors will claim (like you said) not just that safety cases exist, but that they also improve safety. Then there's almost a group of claims that say that safety cases may improve safety, but they're not worth the cost that's involved in developing and maintaining them in terms of safety benefits, so the resource that goes into those thousands of pages of documentation might be better allocated to safety improvement elsewhere in the system. Then there are some claims that safety cases actually might have negative effects on safety.

Do you want to talk a little bit about how some of these claims might work? I find it so fascinating how people claim that safety cases can have a negative impact: if people have made a case for safety and it's been accepted by a regulator, they might have an overconfidence that their system is safer than it actually is.

Drew: The argument that is made is that trying to show that your system is safe sounds like a recipe for confirmation bias and sounds like the opposite of hazard identification which is supposed to find out how your system might not be safe. You can understand where people who are instinctively anti-safety cases are coming from.

In fact, there was a major accident involving the Nimrod aircraft. Haddon-Cave in his report afterwards suggests that maybe we should call them unsafety cases and focus on how we prove that our system is not safe.

I understand the superficial level where they're coming from, but it's a little bit paradoxical because why do we try to identify hazards if we're not making the implicit claim that by trying to identify hazards and control them, we are making our systems safer?

I think for any safety activity, there is always an implicit claim that it makes things safer. That's what the people who promote safety cases would say. They'd say, we've always got these implicit claims that we're safe. Surely, it is better to make those implicit claims explicit so that the regulator can check them, so that they can argue about them, so that we can find the gaps in them.

As we'll get into with this paper, the point is that having high-level arguments about what logically should or shouldn't work is not the way to find out what does or doesn't work. Why don't we just work out what does or doesn't work rather than what should or shouldn't? I thought this paper gave us a good discussion about why we can't just do that, and maybe simultaneously what we can do instead.

David: I'll get you to introduce the paper in a second, Drew. Like you said, and as regular listeners of this podcast will know, it can be very hard to answer this question of, do safety cases work? Do they prevent major incidents? We never have enough major incidents to know, and we never know the impact that the safety case was having on the causality of a major accident.

It might be a question that we never answer, but that's not the point because the question that we want to answer is how do the safety case approaches actually work inside an organization? Is the safety case approach better or worse than alternative methods for checking and validating the safety of our systems?

To do that, we actually need to know how people currently develop a new safety case in their organization. What are the work practices, concerns, needs, constraints, problems of these practitioners when they're developing, managing, maintaining, improving, and reviewing safety cases in their organization?

Drew, I'll get you to introduce the paper because you know this domain very, very well. You know the people very well. You know the universities very well where these ideas first formalized. Do you want to introduce the paper?

Drew: Sure. The paper is called Safety Cases: An Impending Crisis? The authors are Ibrahim Habli, Rob Alexander, and Richard Hawkins. They're from the University of York which is my old hangout. In fact, they're from the research group that I used to be part of there six or seven years ago. 

It's worth pointing out, and I think this is a really interesting thing about the paper and one of its limitations. This entire research group is dependent on the existence of safety cases. They pretty much invented the safety case in its modern form and they've done a lot of work advocating the adoption of safety cases.

It'd be really interesting to see whether they are capable of following through on some of the intended forms of research suggested in this paper. It's a bit of a conflict of interest when you're asking someone, why do you do safety cases, and their answer is, because you told us to do it. It then makes it a little bit circular when you try to uncover how they work.

These three particular authors I have a lot of respect for. I would say that they are the three people within the group who most consistently—whilst following along with the overall research mission—have maintained their healthy skepticism about the evidence base. They've always been concerned about making sure that what they teach and promote is grounded not just in ideas but is grounded in evidence for what works and doesn't work.

It's not in itself an empirical paper. They haven't done a study here. What they're doing is basically setting out almost like a manifesto or a research agenda for how they might go about studying safety cases. This is, you could imagine, the first step in a research program: setting out their philosophical position. I thought that position itself was interesting and worth discussing.

David: I agree, Drew. This is a February 2021 conference paper and it's available on ResearchGate, so our listeners will be able to access it. It's about as current as it gets. The paper lays out the state of affairs when it comes to safety cases: where safety cases have come from, where they're at, and what might be the research opportunities to move forward.

The paper talks about three important aspects that any research and theorizing around safety cases needs to be able to understand. The first is that we need to establish the real-world safety assurance needs.

You mentioned the regulators earlier, Drew. What are the actual needs that safety cases have the potential to meet? How do we know exactly what those needs are, and the context in which those needs get fulfilled by safety cases? What do engineers need? What do regulators need? What do organizations need? What do communities and societies need from safety cases? How do safety cases actually address those needs?

The second aspect that the authors talk about is creating theories that capture how safety case practices lead to delivering on those outcomes and needs. So really trying to understand how safety cases are working inside organizations, how they're delivering on what the stakeholders need from those processes, and to theorize around that: to look for generalizations, patterns, and consistent practices, and make sense of that practical safety case world that's happening in organizations every day.

The third question is where (I suppose) we start to get really interested: moving a little bit beyond the descriptive side of establishing the needs and how safety cases meet them, and really looking at research methods that answer efficacy- and efficiency-type questions. Is there evidence now that these safety assurance needs are actually being satisfied in a meaningful, valid, and reliable way? Drew, are these the important aspects that you'd think are directly relevant for research going forward?

Drew: I think they are very relevant if you come from a mindset that research in safety is about creating safety processes and activities that people can use to meet their needs. I have a meta-concern which is that I think this is a little bit disingenuous about the role that academics play in creating those needs in the first place.

It's one thing to say what are the needs that we have to satisfy regulators? But if those regulators are being educated by our universities or are being persuaded that we need safety cases, then it's a little bit circular because we have created the need by our advocacy over particular techniques.

You can say the same thing about risk assessment. We have a need to create a risk assessment. Why do we need a risk assessment? Because the regulator insists that we have a risk assessment. Why does the regulator insist we have a risk assessment? Because safety people told regulators that that's the way to do safety.

I personally think that to understand safety, we need to go a level up from this instrumental "we have a need, what process or activity can meet that need" framing, to understand the whole ecosystem of assurance. That's not what most people are interested in. Most people are much more instrumental and demand-driven.

Certainly, if you're in an organization, you're not interested in the deeper forces that cause the regulator to ask you for something. If the regulator's asking you, you want to know what process will adequately meet that demand. So yeah, if you're being functional about it, then asking what's the need, how do we meet that need, and does it in fact work, are the right questions to ask.

David: Rob was one of the co-authors on our manifesto for reality-based safety science paper. You can see this is an effort to get the research community around safety cases to think beyond maybe some of the tools, techniques, and practices within safety cases, and start to look at case studies and descriptive research—really looking at the practitioner perspective on these things rather than just the theoretical perspective.

Drew: David, we talked—I'm not sure whether that was last episode or a couple of episodes ago—about how to make sense of arguments in the safety community. I thought we might practice a little bit of that here by talking about the context in which people started advocating for safety cases.

As we mentioned, safety cases have been around since the 1950s. Two particular accidents that drove them forward were Piper Alpha in 1988 and Seveso in 1976. Seveso pushed Mainland Europe towards safety cases and Piper Alpha dragged the UK kicking and screaming into that same space. In both cases, it was this demand for accountability towards the public and the regulator rather than trusting the regulator to come in, inspect companies, and do so effectively.

Then there was this movement in the 1990s to start going beyond just the broad notion of a safety case to really quite formally representing them. This was a big trend in engineering because engineering was moving much more towards model-based development.

The 1990s was when everyone had software tools to do things. People were coming up with bright new software tools to support anything: to support engineering, to support software development, to support documents. This is when your Microsoft Word started going through version after version, adding more and more functionality.

The idea was basically to do the same thing with safety: you create a model of the system, you describe formally what safety looks like, and then that lets you very thoroughly—possibly even automatically—check that the system you're developing is safe.

The dream was we could press a button on our design and it would come back saying, ding, ding, ding, no, it's not safe because of part number 57 in this spot. Fix that and you've made your system safe.

We never quite got there, but that trend towards making safety analysis—this formal thing with its own notations, its own rules, its own mathematics—was becoming very popular. Safety cases were a way of making safety arguments into that shape. We had these theories about how arguments are formally constructed and we turned them into notations such as claim-argument-evidence or goal structuring notation.

Two particular groups drove this: Adelard—who did a lot of selling of software tools, particularly John Bishop and Robin Bloomfield—and the safety research group at the University of York. The original people there were Tim Kelly and John McDermid. They're both quite senior and have moved on now. We've got a new generation coming up and starting to challenge those original ideas in this current paper.

David: Drew, you've gone back to the source. Those are the first two things, if we go back to that episode on how to resolve debates in safety science: you've gone back to the original source, and you've explained what the original sources have said about safety cases. Let's move forward now to the different research that's done in the safety case area. Do you want to talk about the research that goes on across the domain?

Drew: Sure. I'm extracting this directly from the paper. They give quite a good list of all of the different types of research people do on safety cases. We've got research that gets done on notations, different ways of writing safety cases, different ways of structuring arguments, different ways of representing those.

There's research on processes. How should you produce safety cases? They get big and complex, so maybe how do we modularize them? How do we reuse them? How do we check them? There's work on, fundamentally, what is a safety case? What's the underlying meaning, metamodel, or representation?

Work on automation. How do we create tools to make safety cases, tools to analyze safety cases? Possibly even formalize them so we can start to prove properties of them and do automatic checking.

There's a lot of work on producing domain-specific safety cases, particular patterns of arguments, or particular ways of arguing about certain things. There are ways of going beyond cases for safety to general assurance cases, maybe cases for security or reliability cases.

Then stuff on confidence. How to represent uncertainty in safety cases?

If you've been keeping track, notice that every one of these bits of research makes safety cases bigger, more complex, or more broadly applicable, but it never goes backwards and asks, should we do safety cases? It always starts from the assumption that this is a solution and asks, how do we make this a better solution?

As the paper points out, there's been very, very little serious research into whether safety cases are valuable. That's just taken as a given. They point out that it's not a totally empty research space. It's just that the research that's done is done by people who are heavily invested in safety cases, so they're just pointing out where they've used them successfully. Or, it's done by people who have not heavily invested in safety cases who are saying, here is an accident where they had a safety case, and claiming that safety cases don't work. It's not what you'd call convincing evidence to actually change your mind. It's by people that you'd probably already agree with.

David: I found section three of the paper interesting, Drew. It's titled The Value of Safety Cases and covers those two kinds of claims: people who claim to use safety cases successfully, and people who point to accidents that happened in organizations and operations that had safety cases that were approved by regulators and supposedly convincing.

It makes you want to ask, what is all this evidence about safety cases for? Do they make our systems safer or not? I started to cherry-pick a little from some of the other safety theories around. Maybe we spend a lot of effort on safety cases, lots of money. I spent a bit of time in the oil and gas industry, and there are a lot of consultants, a lot of risk engineers, and a lot of money preparing, maintaining, and updating safety case documents.

The authors are just flagging the possibility in this paper that if we don't know whether we're getting the value out of it, and there are other ways to realize safety value in our systems, maybe the resource that we're spending on safety cases could be used more beneficially elsewhere in the organization. Again, the claim is not made strongly enough in the paper to say, hey, let's redirect our efforts away from safety cases. It just puts the flag out there as a possibility.

Drew: David, I think I know exactly why because of the conference they're presenting this at. They can't directly say what they're probably thinking which is that everyone who criticizes safety cases has their own alternative to plug. Every criticism of safety cases comes with don't spend all of your time producing all of these big safety cases. Spend all of your time using my technique instead. They can't say that explicitly, but that is the implied difficulty here. 

It's a valid thing. Of course, we should be spending our time on whatever thing works best but at the moment, the research is always done by people who are making claims in support of their preferred method. We've got very little by way of fair comparison.

David: The paper then talks about impediments. They ask, why don't we know more about the effectiveness of safety cases? They point to issues like the lack of public-domain safety case examples. In many places around the world, you can't get these documents. They don't exist in the public domain.

People don't share their safety case data with anyone they don't have to share it with, beyond regulators, which means it's very hard to experiment with something that is perceived as safety critical and central to safety in major hazard facilities. Imagine saying, let's start running some experimental trials on what we do with safety cases. There's a lot of headwind to making progress in researching safety cases.

Drew, you've had a couple of past and current students dabbling in this area. Is it as hard as it's made out to be to research safety cases?

Drew: Yes it is, and no it isn't. Yes, it is hard to get enough examples to start doing things like direct comparisons, but the biggest problem is that the people who have most access to that material have most vested interest in not rocking the boat.

A great example is that the UK Health and Safety Executive is sitting on a massive database of safety cases for major hazard installations. They've got access not just to the safety cases. They've got access to the review history, the audit history of all of those installations, the incident history, and the inspection history. The last thing they want is to generate any internal evidence that suggests that they're wasting people's time and money. They have no interest at all in making that available in a form that could be researched, or in researching it themselves.

Interestingly, the people at York probably have better access than most. There are 20 or 30 academic careers built on writing about safety cases, but there's not a lot of interest in a unified effort to gather all of the examples and start thoroughly testing them at that fundamental level that might shake the foundations.

I don't mean to be cynical about that. People research what they think is going to create benefit. Anyone who's got this vested interest has a strong belief that what they're doing works. It seems like a waste of time to challenge something that you think already is working. The people who are challenging tend to be more of the outsiders or people without the techniques who don't have access to the data.

David: I thought it was quite good that the paper pointed out some of these challenges and some of the realities of research, but then they went on to talk about what to do about it. We both liked the fact that they didn't throw their hands up in the air here. The rest of the paper after this section is really laying out example research that could be done, talking about case study research, really trying to actually solve some of these challenges.

Drew: A lot of people, particularly at the engineering end of safety, think that we have to do randomized controlled experiments that prove that things are safe or not safe, or else we can do nothing. When we can't do the big experiment, they don't want to do anything at all.

I love the fact they—actually in this paper—start thinking what can we do? How do we make what we can do rigorous and worthwhile? I really like the way that they've laid out a plan forward.

Possibly, I might just go into a little bit of their way of thinking about it to present that plan. Their starting point is to say that when you're doing something that is as big, complex, and far-reaching as safety cases, the chance that it does exactly nothing is pretty small. Even if you hate them, you probably think that they do bad things. Even if you're skeptical about them, you probably think maybe they do a mix of good things and bad things. If you like them, you think that they're doing good things.

That's not actually the question. The question is not does it do nothing or does it do something? We don't need to do that null hypothesis testing that we do when we're asking, does a drug work? The way they talk about that in the paper is they talk about variance theories versus process theories. I don't quite know, David. Do you know where this language comes from?

David: No. It's new to me, this language. And I'm not even sure I completely understand it myself, Drew.

Drew: I don't know if it's something to do with the fact that they're coming out of an engineering computer science world. We have very different labels for this in social science, but their idea is that variance theory is about the relationship between two variables.

If you do more A, you'll get more B. If you do a safety case, your system will have lower risk. If you do a safety case, the regulator will be happier. If you'll do a review, you'll have fewer errors. So a relationship between two things. That's the thing where you do a blind experiment and test what the relationship is.

Whereas a process theory is something that explains how something works. They say we don't need to create variance theories in order to create process theories, and process theories will take us most of the way we want to go. Instead of asking the question, do safety cases work, you change that question to, how do safety cases work, or what do safety cases do? Those 'how' theories don't need experiments. They can be studied descriptively, or with theory-testing-type case studies or intervention-type studies.

David: Yeah, Drew. Maybe this is a language that's used to describe theories as opposed to research methods because we'd use descriptive research methods, experimental research methods, or whatever we were doing. I liked that they’ve drawn out that distinction because it does open up new possibilities for the rest of the paper now. Actually, we said in the manifesto for reality-based safety science first describe what happens in the real world. They're saying, let's talk a bit about process theories and explain how safety cases actually work, rather than us spending all that time talking about how we think that they should work in a normative kind of way.

Drew: If you're just interested in inputs and outputs, then they say there are too many possible relationships between them, and every one of them would need a massive experiment. Whereas if you've got theories about how things work, then you can do much smaller, subtler tests of those. They do say (and they're still clearly coming from this very science-and-engineering idea that research is about testing theories) that they're still following this process: you put a theory out there and you test it, which is not quite how we would describe doing it.

They do say that you can't just randomly test theories. You pick out theories that are plausible, and you pick out those plausible theories by first asking how people think things work, or what the reasons are why people do things. If we can turn the reasons why people do things into theories and then test those theories, then we have good potential for changing how people do things by disrupting the ways they think things work.

If you imagine that vaccines cause harm because vaccines contain little green men that are going to run through your bloodstream, then it makes sense to do research that tests whether those little green men exist, because if we can disrupt your idea that little green men exist, then we can disrupt the fear of vaccines. Likewise, if we can disrupt your belief that doing a safety case magically makes your system safer, then we can give you better reasons to believe that it makes your system safer, such as by increasing your ability to spot errors.

We can then test whether this does increase your ability to spot errors, and maybe say, no, it doesn't quite work like that. It doesn't help you spot errors, but it makes you spend more time, and maybe spending more time on design makes your system safer. You just progressively work through ideas and test them, so you find out how something actually works, or doesn't.

David: Yeah, and I think they also suggested ways of testing some of those different theories and approaches. One example was doing three parallel safety cases, using three different methods for the same system, looking at the outcomes generated by those different methods, and then tracking that system forward through time.

You'd be running three processes concurrently over the same system and seeing whether you get similar or different outcomes, why they might be similar or different, what happens with that system over time, and which of those safety case techniques was better able to link to the way the system is actually operating.

Drew: We might talk a little bit about that example, David. I don't know if it's obvious just how cheeky they’re being with the selection of this example. It's a thing called the McDermid Square. To fully understand the context, what you need to know is that McDermid was the senior researcher and head of group, and then head of school for these researchers for most of their career. The McDermid Square is something that has been appearing in teaching slides for at least 30 years.

These people have had to present it in course after course, with that square up on the slides. Now they put it in this paper as an example of something which needs to be tested as a theory. They're showing how we can take current explanations for how things work and, instead of just teaching them as fact, turn them into a theory with a set of hypotheses and then test those hypotheses.

The McDermid Square is just really a fairly simple piece of guidance. It says that where you have an unfamiliar problem and an unfamiliar solution to that problem, then you've got to have a much more extensive safety case. Where you’ve got a well-understood problem with a well-understood solution, then you can get away with just a very simple safety case based on existing standards.
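As an aside for readers of the transcript, the guidance Drew describes can be sketched as a simple two-axis lookup. The rigor labels and function name below are illustrative, not taken from the paper.

```python
# Illustrative sketch of the McDermid Square: safety-case rigor as a
# function of how familiar the problem and the solution are.
# The rigor labels here are our own wording, not from the paper.

RIGOR = {
    # (problem_familiar, solution_familiar) -> suggested depth of argument
    (True, True): "simple case based on existing standards",
    (True, False): "moderate case focused on the novel solution",
    (False, True): "moderate case focused on the novel problem",
    (False, False): "extensive, bespoke safety case",
}

def suggested_rigor(problem_familiar: bool, solution_familiar: bool) -> str:
    """Look up the suggested depth of safety argument for a system."""
    return RIGOR[(problem_familiar, solution_familiar)]
```

For example, a well-understood braking problem solved with a novel software controller would land in the "moderate case focused on the novel solution" cell.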

It seems to make sense, but as always with things that seem to make sense, you can start to deconstruct them. Is that really true? What does it imply? It says there is a sort of optimal size and complexity of safety case. If your safety case is too complex, it's going to be costing you money and causing disadvantages. If it's too simple, it's not going to be doing a good-enough job.

Let's actually test it out. Let's see if we can see when a safety case is too complicated. Let's see if we can see when a safety case is too simple. Let's go and find those too simple or too complex safety cases and look for the harms that they're causing. Let's see if we change that situation, if we make it more optimal. According to the McDermid Square, do we get rid of the harms and increase the benefits?

David: I quite like the McDermid Square. When I looked at the paper, I thought, okay, it's not a bad way of thinking about how we need to approach different problems and different solutions with our safety case or our safety arguments, arguments in terms of the actual preparation of the case for the safety of our systems. But as with much of this space, the authors called out that there just hasn't been a lot of research to test any of these models, or to test the claims that any of these models make.

Drew: I think you'll appreciate the point that they go on to make which could easily be a line straight from our own manifesto saying that it's clear that such studies are infeasible to conduct without the direct engagement of practicing engineers, users and regulators, and without access to the real development and operational settings. They're very embedded in this idea that to do this research it can't just be researchers coming in from the outside and testing things or setting up trials, experiments, and case studies.

It needs to be a collaborative effort with people who are currently doing the work, with people who are going to come in and ask some of these harder questions about how the work is working, perhaps with a sympathetic understanding of how the people who do the work think the work works, capturing some of those folk theories, turning them into formal hypotheses, and then making changes and interventions to start testing them out.

David: If I may, I might put you on the spot with something that I've thought about for a while with safety cases. I don't know how familiar you are with the way that therapeutic goods and food and drug approvals get made, particularly how pharmaceutical and medical device approval decisions get made for products going to market, where they do independent clinical trials.

Those independent trials are run not by the company that has developed the technology or developed the drug. They're run by independent certified researchers, who the company then contracts to go and work with the doctors and the hospitals, actually run the trials and report back on the efficacy of the device or the drug. We see this model with independent certifiers in the building and construction industry and things like that, and they're making a lot of that information publicly available. 

What would be interesting to know is whether some of those ideas could be applied to safety cases, wherein the safety case actually had to be facilitated by an independent, independently certified party, as opposed to by the company and their consultants.

Drew: I think that's one of the pieces of folk wisdom that creates a perfect hypothesis for testing. A lot of the advocates for safety cases would say safety cases lose their value when you outsource them. If you just get someone else to create a safety case for your system, then that becomes very much demonstrated safety. You're doing assurance for the sake of satisfying the regulator not for the sake of making your system safer. They would say the important thing is that safety cases start early so that the co-creation of the safety case influences and improves the development. 

Okay, so there we have two different theories about how they work. That's a perfect opportunity for doing the type of mechanism-based testing that they're advocating. One mechanism says independence allows a degree of checking, evaluation, and finding things. The other says that having insight is what creates that checking, evaluation, and influences the design. Let's see which one actually happens. Both theories claim ultimately some sort of design improvement as a result of the investigation. Let's look for which mechanism actually links to the design improvement.

David: And while I recall that example, Drew, I think Haddon-Cave mentions this in the Nimrod review: the nature of the commercial relationship means that even if companies outsource it, like you said, it's still demonstrated safety, because the company being outsourced to is getting paid by the company they're providing the service back to. That's a little like a company getting certified against ISO 45001: they're paying a company to come and certify them, and it's in the certification company's interest to give the client what they want in terms of that certification.

Whereas, I suppose, it's very different in that medical model: the person gets paid to run the trial, not to confirm a particular result. I think even then you might have three or four different models to go and test. What happens when a company runs it internally? What happens when it's run by a consultant the company engages? What happens when it's run or facilitated by a completely independent party? Like you said, we should be able to test some of these things.

Drew: I think one of the difficulties with something like a safety case compared to, say, a drug trial—this is one of the reasons why I am personally skeptical that they work in the way that people think that they work—is that a regulator might not be able to run a drug trial. But a regulator can probably read and understand the results of the trial. The difficulty with safety analysis of complex systems is that it is possible that our regulators lack the capability even to look at the finished product of the safety analysis and to understand it.

The risk is that the safety case, which is supposed to abstract away from the design in order to represent it to the regulator, becomes itself so complex that only experts can understand it. We end up with it secondhand: the regulator then needs to ask an independent third party just to read the safety case on their behalf. Once we get to that point, we know that the original regulatory theory is not holding up, and we should be fairly skeptical about anything that rests on that regulatory theory.

David: A great conclusion and insight. I like the way you've described that: again, all things that we should be testing and understanding. Let's talk about the conclusions of the paper and then a few practical takeaways. We've got this paper, Safety Cases: An Impending Crisis?, which talks about all the research that has been done on the tactics and the practices, but also about the absence of research asking the big questions, the how-does-it-actually-work-in-reality questions.

Drew: Quickly just before we move on, I want to ask you a quick question about the title of the paper. They said an impending crisis, and then they pointed out this problem with the lack of evidence for the efficacy of safety cases. What do you think is the crisis that they were referring to?

David: That's a great question. Safety cases: an impending crisis. One assumption I'd have is that safety cases are maybe built like a house of cards, and the crisis comes if we suddenly realize that safety cases don't do what we think they do; then we've got all these open questions about these hugely hazardous systems where we have been taking comfort in the safety case. If we find out that we shouldn't be taking comfort in that, then that's a massive crisis of confidence in our major hazard systems around the world.

Drew: But if we've been using them since the 1950s and we've never had the body of evidence for them, where is the impending crisis coming from? I guess I asked the question because I want to tell you my pet theory.

David: Okay.

Drew: I think it's us. I think there is a new school of safety scientists who are starting to ask fundamental questions about the field. When I say us, I don't mean you and me personally. Maybe I'm being a bit optimistic here, but I think we are part of a movement which is starting to come back to asking some of the fundamental questions. I think that is the crisis they're referring to: a crisis for the body of safety work that's based on an attitude of:

This is the way we've always done it, let's make it more complex, let’s make it more sophisticated, let's make it more model-driven, let's make it automated, let's apply it to more situations. That house of cards is finally being challenged by safety researchers who are coming back and saying, yeah but does it work? How does it work? Let's look at what's going on and let's try to fundamentally understand and build some theory from the ground up. 

I think that's the crisis. It's a crisis for people who've been doing that sort of research and people who have been relying on the products of that research. Crisis is always an interesting framing of things because it's a disaster for some people and success or opportunity for other people.

David: Quite possibly, yes. Whether it's a crisis for a section of the practitioner and research community whose work is based on these practices in organizations, or a crisis of confidence, because if things that everyone just believes work are shown not to do what we think they do, that creates a lot of uncertainty. We know in safety that uncertainty generally becomes a crisis. It will be interesting to see. Maybe listeners who get a chance to read the paper can also share what they think the authors might be referring to in terms of the crisis.

Drew: Let's move on to conclusions. I'm a little bit disappointed. They start with the usual sort of weasel words when people talk about evidence in safety: they say that the fact that we don't have evidence for the value of something doesn't automatically mean it doesn't have value. Coming from a scientific point of view, I think our default assumption should be that, absent evidence of the value of something, we should not assume it has value. But I can understand why they are cautious about wanting to say, let's throw this out because it doesn't have evidence.

Particularly given that what they're trying to preach is for the community of people who use these practices to be involved in researching them. They're trying to convert people who might be persuadable.

David: I don't think you could say safety cases don't do anything inside organizations, whether administratively, socially, or culturally. You've been involved in safety cases a lot in different roles in your career, and I've been involved in safety cases for a long time in roles in mine. We know that they have impact inside organizations at management levels, at engineering levels, at safety professional levels.

We know some of the social, political, and organizational impacts. In my experience at least, they're sometimes more impactful in those areas than in actual physical risk reduction of the operation. They definitely do things inside companies.

Drew: Thanks for that, David. That's a good way of putting it. When we talk about whether something has value or is effective, that is itself a value statement. As researchers, what we should be interested in is we know that it does something. There's obviously something valuable to be found out by studying it. We shouldn't even try to influence practice until we understand better what it is that is happening that we are trying to influence.

What we need is a way to theorize about how they work and to learn from comparisons between them and other approaches. The way the authors phrase it in the paper, they clearly still have this means-end mindset: they think there is a need to do something functional here, and it's really a competition between different practices to see which are capable of achieving this end. Even when they talk about descriptive theorizing, they're doing it with the ultimate goal of finding out what works best and then trying to persuade practitioners to adopt what works best.

David: The authors laid out their final thoughts at the end of the conclusion: look, we can only really move forward, like you said, with collaboration between safety engineers and researchers. We need access to these industrial settings. We need access to the safety cases. We need to understand what assurance needs are being met by the safety cases. We need hypothesis-driven studies, whether descriptive or experimental, and probably the more practical, the better. We need a genuinely cumulative research program, so we actually build on the research; we don't need everyone running off with different techniques and trying to develop new ones.

Basically, when researchers are running off and talking about new notation, new formalisms, and new tools, they should have to justify what safety assurance need is being met by that research effort going into developing these new ideas or supplementary approaches to existing safety case approaches. And then it's really about getting practicing engineers to report their experiences and get involved in empirical research. I think that's a fairly good list of considerations for safety case research. Do you think that's what needs to happen?

Drew: I think it is useful, certainly for researchers who are coming into engineering-type safety research programs, where those programs are very much driven around the creation of tools. I think this is going to be a very attractive way of slipping in research that's more rigorous and, as they say, progressive, so that we can build up theories, rather than each new technique adding to the weight of things that need to be explained rather than to the weight of explanation that we have.

I personally am uncomfortable with the degree of techno-optimism. I think there's a massive underlying assumption that it matters whether safety cases are effective in some almost implied objective sense. What I personally find fascinating is that there are hardly any researchers who care about measuring effectiveness. That says to me that effectiveness is not, in fact, the object of interest. If effectiveness hasn't driven what the techniques are for 30 or 50 years, it's not going to suddenly start to as researchers get more interested in effectiveness.

What we really need to understand is how do we keep adopting these techniques and how is effectiveness so irrelevant? That's why I think the really interesting thing is the broader theories of safety that say maybe the reason for doing these activities never was about making the system safer. Maybe it's to meet some other organizational need or some other psychological need or some other social need.

David: I think it’s absolutely right. I think that's a perfect segue into the practical takeaways because what you've talked about there, I suppose, we spent a lot of time thinking about and we discussed in episode 50 on safety work versus the safety of work because we saw these other needs in organizations that are related to safety work activities. 

The first practical takeaway that I'd probably suggest to you, if you are in an organization that uses safety cases or even safety management plans, even though we talked about safety cases today, pay attention to the language that gets used in your organization around those plans.

I've sat in organizations where the discussion about the safety case has all been about regulatory approval, not about what we are learning about our system and what decisions we need to make to make our systems safer. If the management, engineering, and regulatory conversation you're having is actually about the timeline of approvals, when it goes to the regulator, and how to make sure it gets approved, then that's a flag for you. Not necessarily a red flag, but a flag that the mindset around safety case activity in your organization is one of achieving approval and demonstrating safety, not necessarily one of learning about the system and identifying safety improvements.

Drew: I'd add to that: watch for when you find yourself saying things like, we need to do this because that's what the regulator expects, and using that approval as your justification for doing or not doing things. I have directly heard language in organizations like, why are you spending time on that? The regulator doesn't care about that. If you don't do this, then it's not going to get approved. You have to ask yourself, okay, what does that say about why we do what we do? Why is the language that, and not do this because it's safer, or do this because the evidence says it works?

David: Yeah. Particularly an example like, we need to go and talk to the workforce because we need a record to demonstrate to the regulator that we've done consultation on this revision of the safety case, rather than, we're going to go and talk to the workforce because we want to learn about the current state of the system, so that we know what we need to explore further in this safety case revision.

Pay attention, and use our episode on ethnographic interviewing techniques, to notice how safety work activities in your organization actually get talked about. That'll give you some insight. I don't think we've spoken on the podcast yet about my ethnography paper, where I looked at what safety professionals say when asked why they're doing a certain work activity, really just listening for what objective they're actually trying to meet with that particular activity. It's really insightful as to the motivation and the drive behind the work outcome that people are trying to achieve.

Drew: I guess the next one is: where you have a practice like a safety case, perhaps in an organization that is adopting them or expanding their use, don't just focus on how to do it. Very often when there's a new technique around, the focus is: how do I learn to do this? How do I learn to review this? How do I do this properly? Think instead about what need within your organization this practice is capable of meeting. Think about whether the practice meets your needs, not whether you are correctly implementing the practice.

We know that with safety cases, there is not a lot of evidence that they meet specific needs. It's very likely that they're going to need to be adapted. Even if they do work, it's going to be very context-dependent: how they work for you, in your organization, with your problems.

David: Yeah, given the evidence base, if people are very strong in their claims about safety cases, very convinced that they work or that a particular technique is the most effective, then you should ask them a few more questions, because the evidence doesn't really stack up. The evidence doesn't support holding that strength of belief, unless, like Drew mentioned, there's a bit of a vested interest in why a certain technique is being promoted or a certain belief is being held.

Drew: The final takeaway we've got here, if you'll excuse me mangling your phrasing: when people say there's a lack of evidence for this, the other side of that is lots of opportunity to learn really, really interesting things. If you're looking for research to do, or you're a professional looking to learn more about safety, safety cases are a wide-open field. There are lots of really interesting things to learn, not just about safety cases but because of the broad approach they take.

They have the potential to teach us a lot about how we think about safety and how other safety work activities fit together, if we spend a bit of time trying to understand exactly how they work in our organizations.

David: I totally agree. As a researcher or a practitioner, the development and maintenance of a safety case is a safety work activity that is meant to be the consolidated safety viewpoint on our major hazard systems and technologies all around the world. As safety practitioners and researchers, understanding exactly how these practices work, what needs they meet, how people experience those processes and what they get out of them, and what the limitations of these processes are should be at the forefront of our agenda, I think.

Drew: Let's throw that directly as a question to our listeners and invite you to share: what is your experience with safety cases? How have they influenced your work? Have you had external demands to do them? Have you tried them because you had a need you thought they might meet? Is there a particular way you've decided to do them, or been asked to do them? Rather than asking the big effectiveness question, what are the local effects and mechanisms? What has changed as a result of starting to do the safety case? What impacts have you seen them have?

David: I know there are some organizations that actually adopt this idea of voluntary safety cases, where even when there's no regulatory requirement, they have adopted the practice voluntarily, almost as a self-imposed requirement for their operations. I'd love to hear from people who have done that, because in some respects it takes a lot of the regulatory influence out of the process. It would be really interesting to see whether a voluntary safety case process runs differently from a regulatory safety case process at a local level inside the company.

That’s it for this week. We hope you found this episode thought-provoking and ultimately useful in shaping the safety of work in your own organization. Thanks for everyone’s understanding with the move to fortnightly. We’ll make this work for the next 70 or so episodes, and then we might revisit it again. Please send any comments, questions or ideas for future episodes to feedback@safetyofwork.com.