AI Roles: Mitigator

Cap the Downside

Dec 01, 2023

“If anyone says that the best life of all is to sail the sea, and then adds that I must not sail upon a sea where shipwrecks are a common occurrence and there are often sudden storms that sweep the helmsman in an adverse direction, I conclude that this man, although he lauds navigation, really forbids me to launch my ship.”

- Seneca

AI technology is increasingly showing its depth of power. It has the potential to cause very good things and very bad things. In the current conversation around AI, this is evidenced in the polarizing views of existential fear and savior transcendence. Both are unlikely to happen but the great potential in either direction is interesting. Ideally, we find ways to cap the downside so that we minimize the impact of negative events while keeping the upside of positive ones. That's where the mitigator role on AI teams comes in.

As written previously in the post on AI team roles, the mitigator is described as follows:

Mitigator: Responsible for reducing system level risks. AI is best used in an automated system, and trusting such automation requires finding ways to prevent destruction from such automation. Some of the role may [focus] around safety, ethics, and/or alignment, while other parts [revolve] around providing guardrails in automation.

Let's dig in and understand this role more. We'll look at the various responsibilities of a mitigator and how to differentiate between a good and bad mitigator.

The Why

Why is the mitigator role a requirement for AI teams? Simply, any system, whether automated or not, has ways in which it can fail. The goal of the mitigator is to make sure that the failure modes are known and controlled so that mitigations can be triggered when failures ultimately occur. Controlled failure is common practice across every field of engineering, and successful AI systems will be no different. For example, gunpowder production facilities have a loose wall so that if and when a large explosion of gunpowder does occur, the wall falls down while leaving the rest of the facility mostly intact. If the wall wasn't loose, pressure would build up and cause a catastrophic failure. A similar difference occurs if you light a firecracker on an open palm vs a closed fist.

The secondary goal of the mitigator is to minimize failures overall. While this goes hand in hand with improving model performance, it also includes building additional layers to prevent bad outcomes from ever reaching a user. Some of these layers may be additional modules and checks that are added to the system, while others may be included in the creation of the underlying model.
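One way to picture such an additional layer is a thin wrapper that sits between the model and the user. The sketch below is illustrative only: `guarded`, the two checks, `toy_model`, and the fallback message are all hypothetical names made up for this example, not part of any particular system. It assumes the model is a plain function from prompt to text and that each check returns `True` when the output is acceptable.

```python
from typing import Callable, List

# Hypothetical safe response returned whenever an output is rejected.
FALLBACK = "Sorry, I can't help with that request."


def guarded(model: Callable[[str], str],
            checks: List[Callable[[str], bool]],
            fallback: str = FALLBACK) -> Callable[[str], str]:
    """Wrap a model so its output reaches the user only if every check passes."""
    def run(prompt: str) -> str:
        try:
            output = model(prompt)
        except Exception:
            # A crash inside the model is a failure mode too, so fail closed.
            return fallback
        if all(check(output) for check in checks):
            return output
        return fallback
    return run


# Hypothetical checks; a real system would choose its own.
def not_too_long(text: str) -> bool:
    return len(text) <= 200


def no_blocked_terms(text: str) -> bool:
    blocked = {"password", "ssn"}
    lowered = text.lower()
    return not any(term in lowered for term in blocked)


def toy_model(prompt: str) -> str:
    # Stand-in for a real model call.
    return f"Echo: {prompt}"


safe_model = guarded(toy_model, [not_too_long, no_blocked_terms])
```

The design choice here is to fail closed: if a check rejects the output or the model itself raises, the user sees the fallback rather than the raw result.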

Concerns

A good way to frame roles is to understand what questions they need to be asking. Definition through focus. A mitigator needs an understanding of how the underlying system works along with the technical details of each component. Additionally, the mitigator needs to understand where and how errors can be made, along with the level of error. He or she also needs to understand how people can break the system. Some questions that mitigators should be continually asking are:

If the system fails, what is the impact?
What constitutes a failure?
What are the planned and potential unplanned impacts that the system has or could have externally?
Where do biases exist in the system and how can they be corrected?
What are the second order effects of releasing this system?
How do we cap the downside?
How do we ensure that the system is aligned with our interests and the interests of humanity?
What human touch points should be built into the system?

Mitigators should be engaged across the product lifecycle: proof of concept, build out, and performance tuning. During the proof of concept stage, they should be involved so that they are alerted early to issues that may arise during build out. The build out phase is the most important for the mitigator because it lays the foundation of safeguards and guardrails, which may alter architectural decisions. Limiting the impact of negative events is easiest before a system gets larger. While the build out phase matters most, mitigators will likely do the majority of their work during the performance tuning stage. This is when they figure out which additional mitigations and guardrails need to be developed to deal with real world interactions.

Activities

How should a mitigator effectively spend their time? Their main deliverables are a catalog of errors and impact, safeguards, guardrails, and mitigations. However, there is additional work that needs to be performed to get to those effective measures. A non-comprehensive list of activities they perform includes:

Gain a deep understanding of how the system works
Understand the technical models that underpin the system
Understand how the system and components can be exploited
Perform experiments to examine the outputs of the system for bias, alignment, and error
Develop preventative measures
Perform experiments to test the effectiveness of measures
Find ways to limit impact when failures do occur
Create mitigations in the event of a failure

What does good look like?

Unfortunately for the mitigator, it's much easier to identify bad ones than good ones. That's because when a mitigator does their job poorly, bad things happen. You get things like the chatbot Tay or large financial losses in the case of high frequency trading. When a mitigator is doing their job well, the best-case scenario is that things operate as expected, without issue. The saving grace is that you can benchmark your system against other competing systems or measure the amount of effort it takes to break the system.

With AI systems (and most systems in general) there are too many ways things can go wrong. So instead of trying to understand every path, good mitigators will focus on outcome states which tend to fall into a small number of categories. They will then find ways to limit the impact of the negative ones, capping the downside of a failure. At the same time, they know in their heads the types of errors, level of errors, and causes of errors that can crop up in the system. Great mitigators will find ways to create simple but effective measures that do not bloat a system.
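As a rough illustration of working from outcome states rather than from every failure path, the sketch below maps a handful of outcome categories to a mitigation. The categories, the mitigation names, and `mitigation_for` are hypothetical and chosen only for this example. The one deliberate choice is that any outcome without an explicit entry falls back to the most conservative mitigation.

```python
from enum import Enum


class Outcome(Enum):
    # Hypothetical outcome categories for illustration.
    OK = "ok"
    DEGRADED = "degraded"
    HARMFUL = "harmful"
    UNKNOWN = "unknown"


# Hypothetical mitigation names; UNKNOWN is intentionally left unmapped.
MITIGATIONS = {
    Outcome.OK: "deliver",
    Outcome.DEGRADED: "deliver_with_warning",
    Outcome.HARMFUL: "block_and_escalate",
}


def mitigation_for(outcome: Outcome) -> str:
    """Return the mitigation for an outcome, defaulting to the safest one."""
    return MITIGATIONS.get(outcome, "block_and_escalate")
```

Because the table is keyed by outcome rather than by cause, it stays small even as the number of ways to reach each outcome grows.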


Themes

As mitigators practice their function, there are several themes they should think about to be successful. These themes are operating principles to get one in the right mindset for the role. While each mitigator will develop their own working themes and principles, the list below is a good starting point.

Cap the downside. Realize that you cannot predict, forecast, or control unknown unknowns. Therefore, limiting the exposure or blast radius of a failure can help prevent catastrophe.
Rigor in prevention. Automation decreases the ability to react, which increases the robustness required of the pipelines you create.
Impact over frequency. The frequency of an event matters less than the impact of an event.
Think like a Chaos Engineer. Creating mitigations and bias removal might require the creation of synthetic data and modifications when training the system.
Failures as information. Unfortunately, not every issue is going to be caught in advance, but we learn information when things break. Airplanes crash, but they crash less and less. Each airplane crash has made airplanes safer overall. Treating failures in the same manner will improve your system.
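To make the first theme concrete, here is a minimal sketch of a cap on the downside, assuming each automated action reports a numeric loss after it runs. `LossBreaker` and `run_actions` are hypothetical names for this example. The breaker does not try to predict which action will fail; it only stops the automation once cumulative loss crosses a limit.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LossBreaker:
    """Stops automation once cumulative loss reaches a fixed cap."""
    max_total_loss: float
    total_loss: float = 0.0
    tripped: bool = False

    def record(self, loss: float) -> None:
        # Only losses count toward the cap; gains do not buy back headroom.
        self.total_loss += max(loss, 0.0)
        if self.total_loss >= self.max_total_loss:
            self.tripped = True

    def allow(self) -> bool:
        return not self.tripped


def run_actions(losses: Iterable[float], breaker: LossBreaker) -> int:
    """Simulate a stream of actions; return how many ran before the halt."""
    executed = 0
    for loss in losses:
        if not breaker.allow():
            break
        executed += 1
        breaker.record(loss)
    return executed
```

With a cap of 100 and losses of 10, 50, 60, 500, 500, the third action trips the breaker and the two large losses never run. Note that the total can overshoot the cap by one action, so the cap should be set with the largest plausible single loss in mind.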

The mitigator is a crucial role that's required to establish trust in the use of a system. This is no easy feat, as it requires the mind of both an attacker and a defender. Mitigators are there to ensure safe usage or to funnel usage to the right areas. They are responsible for ensuring the positive impact of a system and reducing its negative impact.

Embracing Enigmas
