Governance as the Facilitator in Adopting Technology
Achieving societal acceptance requires safety, reliability, and predictability
governance - noun - gov·er·nance
the act or process of governing or overseeing the control and direction of something
- Merriam-Webster
"We don't know enough about the impact to healthcare subscribers yet for this to be safe"
The year was 2015, and I was sitting in a meeting about how risk models could affect consumers. Key stakeholders were worried about the interpretability of black-box models and about understanding the outputs the models were providing. Flash forward a few months:
"There are a lot of ways to use chatbots but the low hanging fruit seems to be customer support. Other than that, people don't feel that comfortable with other use cases".
That was 2016, and chatbots were all the rage. However, for all the promise, people weren't sure how to use them and didn't completely trust the outputs. Sound a little familiar? Fast forward to today:
"We really want to use generative AI but we can't really trust the outputs at scale."
What's old is new again. The same issues keep cropping up. How has AI reached a state where everyone is simultaneously blown away by what it can do, yet companies are leery of using it? Let's discuss the factors that limit the adoption of new technology.
Factors Affecting the Adoption of Technology
“We don't know enough about X to use it yet” is code for “we aren't sure how to control it in the ways we want.” This assertion, in its various forms, is applied to every new technology at one point or another. If people have doubts, they are unlikely to adopt a new technology. Broadly speaking, two things control the adoption rate of a technology:
How can the technology be governed? That is, can the outputs be controlled in a reliable, predictable, and safe way? Do we trust our ability to use it?
What can the technology be used for? If people don't understand the value of a technology or what new capabilities it provides beyond the status quo, they are much less likely to use it.
The first point is really a controlling factor for the second, because if you can't control a technology in a reliable way, it is hard to envision how it can be used. For instance, if I gave you a "car" that could take you anywhere in less than a minute 0.1% of the time, but the other 99.9% of the time either wouldn't move, took you somewhere random, or exploded, would you use it? Would you plan for its use and structure your entire life around it? Doubtful, because it is not reliable or predictable enough to achieve the outcome you want.
There's a large lag between when a technology is created and when it is widely adopted by society. This happens across every sector. While we may know how to achieve a goal with a technology, we still have to solve the engineering challenge of governing the outcome, that is, controlling it in a safe and reliable manner. We know how to make a fusion reaction. We don't know how to make a fusion reaction last for a useful period of time, which really comes down to controlling the reaction in a useful way. Cars didn't become effective until we added reliable brakes and a steering wheel. Leaders aren't effective if they aren't in control of their teams. Effectiveness comes from being able to control things in such a way as to get a desired output.
The first Prius came out in 1997, and the first Tesla came out in 2008. Only in 2023 are we reaching the point where over 10% of new cars sold are electric vehicles. The path to get there was filled with questions of technological governance aimed at improving system reliability and predictability. How can I recharge on a long trip? How do I get more life out of the battery? Can I go as far? Most of these questions related to being able to use an electric car in the same ways as a regular car, which means the perceived or real output of an electric car wasn't there yet. Now we've reached the threshold of adoption: whether through better batteries or supercharger stations, the technological governance has improved.
We’ve seen a similar process with machine learning models over the past two decades. Statistical models have been used in decision making for centuries, but as structured data became more prevalent thanks to computer systems, machine learning models appeared to provide much better results. However, there was a constant question of whether the results could be trusted. This forced systems of model governance to emerge to ensure that models were behaving appropriately and as expected. First, the development phase would have several stages of testing and validation to ensure models didn't overfit the data. Next, they would go through a human review. Once models were put into the real world for contact with consumers, they were only exposed to a small subset of users, in the 1-10% range of total traffic, to ensure no large damaging effects if the model performed incorrectly. As the model continued to perform as expected, it was given a larger and larger chunk of user traffic until it was interacting with all of it. While this was just one scheme people used to govern models, entire frameworks and techniques were established to make sure models were safe, reliable, and gave expected outputs.
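To make that gradual-rollout idea concrete, here is a minimal sketch of percentage-based traffic splitting. The `champion_model` and `candidate_model` objects (assumed to expose a `predict` method), the hashing scheme, and the 1% starting fraction are all illustrative assumptions, not a prescribed implementation.

```python
import hashlib

def assign_to_candidate(user_id: str, rollout_fraction: float) -> bool:
    """Deterministically route a fraction of users to the candidate model.

    Hashing the user id keeps each user's experience stable while the
    rollout fraction is ratcheted up (e.g. 1% -> 10% -> 100%).
    """
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 10_000
    return bucket < rollout_fraction * 10_000

def score(user_id: str, features, champion_model, candidate_model,
          rollout_fraction: float = 0.01):
    # Route a small, fixed slice of traffic to the new model; everyone else
    # continues to see the champion model's output.
    model = candidate_model if assign_to_candidate(user_id, rollout_fraction) else champion_model
    return model.predict(features)
```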
AI Governance
The situation we are facing right now with AI is that everyone sees the potential and wants to use it, but a lack of control over the outputs is preventing it from being used at scale. Sure, many people see ways to use it to augment an individual, but putting it into a sustained, well-functioning product comes with a lot of challenges. Humanity has spent millennia developing processes to check outputs and verify decisions. We should not cast aside that knowledge as we adopt AI. While the methods of governing AI won't be the same, they will likely be similar in form to what we have created in the past.
There are a few questions that aid in examining our systems to make sure we are governing them to our needs:
How do we define an unwanted output? Alternatively, it might be easier to define an acceptable output.
To what degree does our system currently create unwanted outputs? You can't change what you can't measure (a minimal measurement sketch follows this list).
What controls and guardrails currently exist to prevent users from interacting with unwanted outputs? An inventory of output checks and modifier systems is helpful for understanding how you are controlling the base system.
Do the current controls and guardrails in place provide adequate protection from unwanted outputs? How do you quantify and assess proper governance?
What fallback mechanisms are in place if an unwanted output does occur? When a failure occurs, and it will occur, how is your system minimizing the impact?
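For the measurement question above, something as simple as the following can be a starting point. The `is_unwanted` callable is a placeholder for whatever definition of an unwanted output your team agrees on, whether a rules check, a classifier, or human labels.

```python
def unwanted_output_rate(logged_outputs, is_unwanted) -> float:
    """Estimate how often the system currently produces unwanted outputs.

    `logged_outputs` is a sample of recorded responses; `is_unwanted` is
    the agreed-upon check (rules, classifier, or human labels).
    """
    if not logged_outputs:
        return 0.0
    flagged = sum(1 for output in logged_outputs if is_unwanted(output))
    return flagged / len(logged_outputs)
```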
By walking through these questions, you and your team can assess whether your AI system is being governed properly or whether additional adjustments need to be made. If your current methods are not adequate, there are numerous things that can be done to ensure that the outputs of your models are safe and reliable.
Variability assessments: What variability in outputs is acceptable for a similar input? If the same message gets across, does it matter? Think of how your various reports communicate results to you. An individual may phrase the same point in different ways, but the underlying message is the same. Does the variability in their language affect how you think about the issues? Sometimes yes, sometimes no.
Tests of repeatability: If you trigger the system with the same input, do you get the same response or many different responses? Are the responses acceptable? Understanding where variation arises within your system helps you choose the proper guardrails to put in place (the repeatability check sketched after this list measures this, and doubles as a variability assessment).
Sensitivity analysis: If you slightly alter the inputs, how does that affect the final output? You generally want your AI to be insensitive to trivial variations of a similar input while still being flexible enough to handle nuance (a perturbation sketch follows this list).
Automated output checking: Checking the output of an AI system before it reaches a user allows you to trigger interventions. The latency added by checking outputs is minimal compared to the time taken to generate them. This is probably one of the more important guardrails, because if you aren't detecting unwanted outputs, it's very hard to correct them (a minimal checking gate is sketched after this list).
Guardrail enforcement: Have you established certain lanes within which your AI should perform? Measuring deviations and re-triggering the system with modified inputs or randomness can help keep your AI system in the right lane (a detect-and-retry sketch follows this list).
Output modification: If an unwanted output is detected, how simple is it to change it into something acceptable? Some things, like profanity, are easy to modify, while others, like the subtle conveying of unwanted topics such as racial differences, are more difficult (a toy scrubbing pass is sketched after this list).
Fallback systems: When your AI system fails through unwanted outputs, how are you gracefully mitigating the impact and managing the risk? Are you routing improper outputs to a modification system? If you detect a bad output, do you trigger your system in a different way, or do you ask the user for further information? How does the false positive rate of your detection system affect the user experience? This is a deep topic where a lot of prior engineering work exists across multiple fields (a layered fallback sketch ties the examples together after this list).
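As a starting point for the variability and repeatability items, here is a minimal repeatability check. It assumes a `generate(prompt)` callable that wraps your model and uses a sentence-embedding model to compare responses; the embedding model name and the 0.85 similarity threshold are illustrative assumptions, not standards.

```python
from itertools import combinations
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def repeatability_report(generate, prompt: str, n_runs: int = 10,
                         threshold: float = 0.85) -> dict:
    # Trigger the system repeatedly with the same input.
    outputs = [generate(prompt) for _ in range(n_runs)]
    embeddings = embedder.encode(outputs, convert_to_tensor=True)
    # Compare every pair of responses; low similarity signals high variability.
    sims = [float(util.cos_sim(embeddings[i], embeddings[j]))
            for i, j in combinations(range(n_runs), 2)]
    return {
        "min_similarity": min(sims),
        "mean_similarity": sum(sims) / len(sims),
        "flagged": min(sims) < threshold,  # worth a human look if True
    }
```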
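For sensitivity analysis, one simple approach is to perturb the prompt in small, meaning-preserving ways and measure how far the output drifts. The `generate` and `similarity` callables and the specific perturbations are assumptions made for this sketch.

```python
def sensitivity_check(generate, similarity, prompt: str) -> list:
    # Small edits that should not change the meaning of the request.
    perturbations = [
        prompt + " ",                        # trailing whitespace
        prompt.replace("please", "kindly"),  # minor word substitution
        prompt.lower(),                      # casing change
    ]
    baseline = generate(prompt)
    results = []
    for variant in perturbations:
        drift = 1.0 - similarity(baseline, generate(variant))
        results.append({"variant": variant, "drift": drift})
    # Large drift on trivial edits suggests an over-sensitive system.
    return results
```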
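For automated output checking, the gate can be as simple as classifying the candidate response before release. The `classify_output` callable and the label names are hypothetical stand-ins for whatever moderation or policy checks you run.

```python
BLOCKED_LABELS = {"profanity", "pii", "off_topic"}  # illustrative labels

def release_or_intervene(candidate: str, classify_output) -> dict:
    """Check a candidate response before it reaches the user."""
    labels = set(classify_output(candidate)) & BLOCKED_LABELS
    if labels:
        # Do not show the raw output; hand it to a modification or fallback path.
        return {"release": False, "labels": sorted(labels)}
    return {"release": True, "labels": []}
```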
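Guardrail enforcement often amounts to detect-and-retry: if the output leaves its lane, regenerate with tighter instructions or different sampling. The `generate(prompt, temperature=...)` and `in_lane(output)` helpers are assumptions, and the retry policy shown is only one reasonable choice.

```python
def generate_within_guardrails(generate, in_lane, prompt: str,
                               max_attempts: int = 3):
    temperature = 0.7
    for _ in range(max_attempts):
        output = generate(prompt, temperature=temperature)
        if in_lane(output):
            return output
        # Tighten the constraints and reduce randomness before retrying.
        prompt = prompt + "\nStay strictly on the topic of the user's question."
        temperature = max(0.0, temperature - 0.3)
    return None  # signal the caller to fall back
```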
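For output modification, the easy cases really can be handled with string-level scrubbing, as in the toy pass below; the deny list is a placeholder, and subtler issues usually need a model-based rewrite rather than word masking.

```python
import re

DENY_LIST = {"darn", "heck"}  # placeholder terms for illustration

def scrub(text: str) -> str:
    """Mask deny-listed words before release (first letter kept, rest starred)."""
    def mask(match: re.Match) -> str:
        word = match.group(0)
        return word[0] + "*" * (len(word) - 1)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, DENY_LIST)) + r")\b",
                         re.IGNORECASE)
    return pattern.sub(mask, text)
```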
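Finally, a layered fallback path might tie the previous sketches together: regenerate within guardrails, check the result, modify it when that is easy, and otherwise degrade gracefully by asking the user for more information. The helpers and the ordering of fallbacks are assumptions, not a prescribed design.

```python
CLARIFYING_QUESTION = "Could you tell me a bit more about what you're looking for?"

def answer_with_fallbacks(generate, in_lane, classify_output, prompt: str) -> str:
    output = generate_within_guardrails(generate, in_lane, prompt)
    if output is None:
        # Repeated regeneration failed; ask the user instead of guessing.
        return CLARIFYING_QUESTION
    verdict = release_or_intervene(output, classify_output)
    if verdict["release"]:
        return output
    if verdict["labels"] == ["profanity"]:
        return scrub(output)        # easy case: modify and release
    return CLARIFYING_QUESTION      # hard case: degrade gracefully
```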
These are just a start to the steps that can be taken. Working through these safeguards will spark additional ideas for how to better govern your AI systems. Building better AI governance systems will allow for faster adoption of this powerful technology. The more people trust the outputs of AI because those outputs are reliable and controlled, the more it will be used, and that increases the pace at which improvements can be made to society. Improve governance, improve adoption, improve society.