“The purpose of computation is insight, not numbers.”
― Richard Hamming
Here's an interesting problem - the prison warden chessboard puzzle. It goes like this. You are a prisoner along with one other individual. The prison warden will let both of you out if you can solve a puzzle that contains the key to your escape. The warden hides the key under one square of a chessboard, then places a coin on each square with either the heads or tails side facing up. He brings one of you into the room with the chessboard and reveals where the key is hidden. He allows you to "communicate" with your fellow prisoner by letting you flip over exactly one coin (changing it from heads to tails or tails to heads). The other prisoner then enters the room and must figure out the location of the key for you both to go free. The prisoners can discuss and devise a strategy beforehand. However, the prison warden can listen in on the strategy before setting up the coins and the key, and the prisoners can't communicate once the "game" begins. How can the prisoners go free? How would you solve the problem?
There's a great breakdown of the problem by 3blue1brown, along with a solution.
The solution is rooted in error correcting codes. Error correcting codes are a way to embed redundant data within a message so the receiver can determine whether the message was altered in transit. These codes were primarily designed for bit flips, where a single binary digit (or bit) is changed during transmission due to hardware faults, cosmic rays, or electrical interference. How do error correcting codes solve this puzzle? They provide a way to measure how data has been modified and to identify the location of errors. In this problem, the location of the key is treated as the error to be identified.
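To make the connection concrete, here is a minimal sketch of the parity (XOR) strategy that solves the puzzle, assuming squares are numbered 0-63 and True means heads; the function names are mine for illustration, not part of any standard library.

```python
import random

def xor_of_heads(board):
    """XOR together the indices of all squares showing heads."""
    acc = 0
    for square, is_heads in enumerate(board):
        if is_heads:
            acc ^= square
    return acc

def prisoner_one(board, key_square):
    """Flip exactly one coin so the heads-XOR equals the key's square."""
    flip = xor_of_heads(board) ^ key_square  # the one coin to flip
    board[flip] = not board[flip]
    return board

def prisoner_two(board):
    """Recover the key's location from the board alone."""
    return xor_of_heads(board)

# Quick check over random boards and key placements.
random.seed(0)
for _ in range(1000):
    board = [random.random() < 0.5 for _ in range(64)]
    key = random.randrange(64)
    assert prisoner_two(prisoner_one(board, key)) == key
```

Flipping the coin at index `xor_of_heads(board) ^ key_square` cancels the current parity and replaces it with the key's index, which is exactly how a parity-based code pins down the location of a single error.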
Why are error correcting codes necessary? Because the physical world is an unforgiving place. While you might think of algorithms and computations as executing flawlessly, too many things can affect the physical hardware that software runs on. In a similar manner, you might believe that AI models should be able to interpret something the same way you do, or build the right logical argument given the same information, but there are too many differences in how data and knowledge are represented to make this an errorless process.
Cognitive Biases
A big problem for current generative AI models is hallucinations. In some cases, hallucinations can be a feature, such as when you are trying to generate ideas. However, in cases where you are relying on factual accuracy, hallucinations are errors. Hallucination is simply a polite term for error. Unfortunately, just about every AI algorithm and model makes errors, though the types and degrees of errors differ. This isn't as bad as it sounds. Humans make errors all the time. In fact, your brain is hardwired to make errors: the shortcuts it takes are called heuristics, and the systematic errors they produce are called cognitive biases. Part of the reason for these errors is that a lot of tradeoffs are made during decision-making. Your genes and brain have made tradeoffs for survival - tradeoffs such as deciding quickly but less accurately versus slowly but more correctly, or being accurate about how often something happens versus how much it matters when it does. These tradeoffs are tuned to help you live longer and procreate. The funny thing is that even if you know about them, you can't really turn them off; you can only apply some mitigations. The easiest way to see this for yourself is with optical illusions. Let's look at this one of chess pieces:
Figure 1. A chess piece optical illusion
Look at the above image. On the left side is the illusion: the pieces on top appear white and the pieces on the bottom appear black. On the right, the backgrounds are removed to show that the colors of the pieces are actually the same. However, knowing the pieces are the same color does not change the error of your perception when you look back at the left side. No matter how much information you are given or how long you stare at the right side, your brain keeps producing the optical illusion on the left. It keeps making an "error". Unfortunately or not, it's very unlikely your mind will ever correct this error, because it is tied into the greater mechanisms of how your mind works and the tradeoffs it has been optimized for.
What you do have, though, are ways to detect errors, build logical arguments, and mitigate the consequences when errors are made. In the chess example above, we deconstructed the image to check whether the pieces on top and bottom were indeed the same color. This extends not just to how an individual operates, but to how entire organizations operate: multistep processes exist to ensure that errors are minimized, and typically no critical process depends on a single person. In the same manner, we need ways for AI models to identify and correct their errors. This is particularly true because the bar is higher for AI models: they have to make demonstrably better decisions before a person is willing to cede control.
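You can even run that deconstruction check programmatically. Here is a minimal sketch, assuming the Pillow imaging library is installed; the file name and pixel coordinates are hypothetical stand-ins for a saved copy of the illusion.

```python
from PIL import Image  # assumes the Pillow library is available

def sample_colors(path, top_xy, bottom_xy):
    """Read the raw RGB values at two points of the image."""
    img = Image.open(path).convert("RGB")
    return img.getpixel(top_xy), img.getpixel(bottom_xy)

# Pick a point on a "white-looking" top piece and a "black-looking" bottom piece.
top, bottom = sample_colors("chess_illusion.png", (120, 80), (120, 320))
print(top, bottom)  # near-identical RGB triples, despite what your eyes report
```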
AI Error Correcting Codes
AI models attempt to create structures around knowledge, but they do so imperfectly. In designing error correction for AI, we need methods that can discern what information is trustworthy. We need a way to identify and provide anchors of knowledge that can be referenced. However, this problem is much harder than error correction for communication. In communication, we have the original source-of-truth message and we are attempting to duplicate it on the other side; a strict verification of exactness is required. But if I tell you a fact such as "my car is red", do you envision red, fire engine red, brick, burgundy, scarlet, crimson, maroon? Does the exactness of the color matter? In some cases, such as if I'm a painter, it might; in other cases, such as if you're trying to find my car in the parking lot, it won't. Both context and hierarchy of knowledge matter. These knowledge constraints are an issue we see right now with generative AI models. Asking an AI to "draw a red car" will result in different shades of red depending on the model you use and the randomness embedded in the model.
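To make the contrast concrete, here is a toy sketch of strict, communication-style verification versus context-dependent verification. The color names and the COLOR_FAMILY mapping are invented for the example, not any standard vocabulary.

```python
# Map specific shades to a broader color family (illustrative and incomplete).
COLOR_FAMILY = {
    "red": "red", "fire engine red": "red", "brick": "red",
    "burgundy": "red", "scarlet": "red", "crimson": "red", "maroon": "red",
    "navy": "blue", "sky blue": "blue",
}

def exact_check(claimed: str, observed: str) -> bool:
    """Communication-style check: the values must match exactly."""
    return claimed == observed

def contextual_check(claimed: str, observed: str) -> bool:
    """Looser check: only the color family has to agree."""
    return COLOR_FAMILY.get(claimed) == COLOR_FAMILY.get(observed)

print(exact_check("red", "crimson"))       # False - fails strict verification
print(contextual_check("red", "crimson"))  # True  - good enough to find the car
```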
How can we build error correction into AI models? It's a fascinating issue that I expect to become a hot topic this year, as companies begin focusing on methods of verification. I have outlined below what I believe are some directions for the path forward:
Understand the tradeoffs inherent to the modeling process: Speed vs accuracy is a typical modeling tradeoff, but the context of use matters. How a model is trained, such as the loss and cost functions used, determines how a given AI will make decisions and how it balances between different types of errors. When models become very large, it can be difficult to untangle the tradeoffs that have been made. (A small weighted-cost sketch after this list illustrates how a cost function encodes such a tradeoff.)
Understand how much current models can self-correct: Some successful prompting techniques for large language models currently involve asking the models if they are sure of the answer and asking them to break their thinking into steps (chain of thought). There is an even more advanced technique called tree of thought that helps LLMs explore entire solution spaces by allowing them to return to previous thought steps. (A self-check loop sketch after this list shows the basic shape of this.)
Use external processes outside the model to find errors: The use of multiple models, each with its own knowledge and view of the world, is likely needed to identify errors and make corrections. Just as a colleague or manager reviews your work for errors, AI models likely need two-step or multistep processes with different models to help identify and correct errors. (See the generator/reviewer sketch after this list.)
Provide proofs or checks against "trusted" knowledge: If a model draws on a data store of truths, it needs a way to test its output against the information it accessed and assess whether it pulled the right information and used it correctly. (See the trusted-store check sketch after this list.)
Multiple test methods for validation: There is no perfect certainty in any decision, but multiple checks and methods provide a greater probability that you've caught errors. Having multiple sets of checks and references to "trusted" knowledge frameworks provides ways to assess outputs.
Benchmarking and large-scale testing: AI models will continue to make errors, just as humans make errors in their judgment. It's important to benchmark and test these models at scale to understand what types of errors they make. Many leaderboards and datasets have been released to benchmark models against, but less work has been published on the common types of errors made. Knowing the common patterns makes it easier to design mitigations. (See the error-tally sketch after this list.)
Mathematical progress: While it might be unknown at this point how to solve this issue in a clean mathematical or algorithmic way, research will continue. It took generations to arrive at error correcting codes and even more time to develop more advanced algorithms. There might be ways to make these assessments cleanly, but they will likely require advances in mathematics.
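Below are a few minimal sketches for the items above; every function name, prompt, and data value in them is an illustrative assumption rather than an established API. First, the tradeoff point: a cost function that weights false negatives more heavily than false positives encodes a deliberate preference about which errors a model should avoid.

```python
# A minimal sketch of how a cost function encodes an error tradeoff.
# The weights are hypothetical; real systems tune them to the relative
# cost of each error type in the context of use.
def weighted_error_cost(false_positives: int, false_negatives: int,
                        fp_weight: float = 1.0, fn_weight: float = 5.0) -> float:
    """Penalize missed detections (false negatives) more than false alarms."""
    return fp_weight * false_positives + fn_weight * false_negatives

# Two models with the same total number of errors can have very different costs.
print(weighted_error_cost(false_positives=10, false_negatives=2))  # 20.0
print(weighted_error_cost(false_positives=2, false_negatives=10))  # 52.0
```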
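Next, the self-correction point: a simple loop that asks for step-by-step reasoning and then asks the model to re-check its own answer. The `ask_model` callable and the prompt wording are stand-ins for whatever LLM client and prompt style you actually use.

```python
from typing import Callable

def answer_with_self_check(question: str,
                           ask_model: Callable[[str], str],
                           max_rounds: int = 2) -> str:
    """Chain-of-thought style prompt followed by self-review rounds."""
    # Ask for explicit intermediate steps before the final answer.
    answer = ask_model(f"Think step by step, then answer:\n{question}")
    for _ in range(max_rounds):
        # Ask the model to review its own reasoning for mistakes.
        verdict = ask_model(
            f"Question: {question}\nProposed answer: {answer}\n"
            "Re-check each step. Reply CORRECT, or give a corrected answer."
        )
        if verdict.strip().upper().startswith("CORRECT"):
            return answer
        answer = verdict  # adopt the correction and check it again
    return answer
```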
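For external checks, a two-step generator/reviewer process with different models might look like the following; `generator` and `reviewer` are placeholders for two independently trained models (or even the same API with different system prompts).

```python
from typing import Callable

def cross_checked_answer(question: str,
                         generator: Callable[[str], str],
                         reviewer: Callable[[str], str]) -> str:
    """Have a second model review the first model's draft for errors."""
    draft = generator(question)
    critique = reviewer(
        f"Question: {question}\nDraft answer: {draft}\n"
        "List any factual or logical errors, or reply NO ERRORS."
    )
    if "NO ERRORS" in critique.upper():
        return draft
    # Feed the critique back to the generator for one round of correction.
    return generator(
        f"Question: {question}\nYour earlier answer: {draft}\n"
        f"A reviewer found these problems: {critique}\nWrite a corrected answer."
    )
```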
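For checks against trusted knowledge, the crude sketch below compares a model's claim to a tiny hand-built store of facts; a real pipeline would use retrieval plus a more careful comparison than substring matching.

```python
# Toy "trusted" store; the entries and the containment test are simplified
# placeholders for a real retrieval and verification pipeline.
TRUSTED_FACTS = {
    "chemical symbol for gold": "Au",
    "boiling point of water at sea level": "100 degrees Celsius",
}

def verify_claim(topic: str, model_claim: str) -> str:
    """Label a model claim as supported, contradicted, or unverified."""
    reference = TRUSTED_FACTS.get(topic)
    if reference is None:
        return "UNVERIFIED: no trusted reference for this topic"
    if reference.lower() in model_claim.lower():
        return "SUPPORTED by the trusted store"
    return f"CONTRADICTED: trusted store says '{reference}'"

print(verify_claim("chemical symbol for gold", "The symbol for gold is Au."))
print(verify_claim("boiling point of water at sea level", "It boils at 90 degrees Celsius."))
```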
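Finally, for benchmarking error patterns, even a simple tally over evaluation records starts to reveal which mitigations matter most; the records and error labels below are invented for the example.

```python
from collections import Counter

# Hypothetical evaluation records: each pairs a test case with a label for
# the kind of error the model made (None means the answer was acceptable).
results = [
    {"id": 1, "error_type": None},
    {"id": 2, "error_type": "fabricated citation"},
    {"id": 3, "error_type": "arithmetic mistake"},
    {"id": 4, "error_type": "fabricated citation"},
]

# Tally errors by type so the most common failure patterns surface first.
error_counts = Counter(r["error_type"] for r in results if r["error_type"])
for error_type, count in error_counts.most_common():
    print(f"{error_type}: {count}")
```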
Having ways to correct the errors in AI outputs will get us to a better place, where we can begin to trust the results from AI models. Only when that trust is earned can AI be used at scale to automate processes. An equivalent of error correcting codes for AI will lead us there.