Many hidden truths are often unobserved, not invisible.
-Matthew A. Petti
Try to describe a color. Any color will do. Now try describing that color to a blind person, someone who has no reference for what a color is. Is it even possible?
Now try describing how to use a screwdriver. How explicit do you need to be to direct how one’s fingers and hands should be positioned and how they should move? Can you describe the exact force needed to turn a locked screw without a measuring device? Is that even how you think about using a screwdriver?
There are things that cannot be described solely using language. For some, these things have to be observed with different sensors. For others, they can only be experienced and learned through trial and error. This is a problem for those who believe large language models (LLMs) will solve everything; they will not. Understanding color solely through language is an unsolvable problem. The question becomes: how do you get information that does not exist in the chosen mode into your model? The way current AI solutions get around this is through multimodal models that find ways to combine different modes. For instance, language combined with visual information. Some models combine numerous modes, such as images, video, thermal imagery, text, audio, depth, and inertial measurement units.
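To make ‘combining modes’ slightly more concrete, here is a minimal sketch of late fusion in PyTorch: embeddings from two modes are projected into a shared space and concatenated before a prediction head. The embedding dimensions and the upstream text and image encoders are assumptions for illustration; real multimodal systems use far more sophisticated alignment, but the core idea of mapping each mode into a common representation is the same.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Combine embeddings from two modes by projecting each into a
    shared space and concatenating them before a small prediction head."""

    def __init__(self, text_dim: int, image_dim: int,
                 hidden_dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.image_proj = nn.Linear(image_dim, hidden_dim)
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(2 * hidden_dim, n_classes))

    def forward(self, text_emb: torch.Tensor, image_emb: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.text_proj(text_emb), self.image_proj(image_emb)], dim=-1)
        return self.head(fused)

# Hypothetical usage: a batch of 4 items with 384-dim text embeddings
# and 512-dim image embeddings produced by upstream encoders.
model = LateFusionClassifier(text_dim=384, image_dim=512)
logits = model(torch.randn(4, 384), torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 2])
```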
However, even with multimodal models, there are things that still can't be captured in available data. Either the data doesn't exist, what we think of as the data is really a side effect of the true data generator, or the true data is unobservable. Some things we only learn through trial and error. We instead try to understand these unseen or unobserved items indirectly, through proxies: things that we can observe.
Specialists and practitioners of all types have created vast vocabularies to articulate and convey information to peers. This language is typically referred to as jargon, but it serves the purpose of quickly conveying complex topics to the initiated. Normal vernacular does not include the reference points and tacit knowledge that have been accumulated through experience. Above all, though, specialists and practitioners have amassed a body of knowledge that cannot be readily conveyed in language, even in jargon, and is best learned through apprenticeship. Why? How is it possible to transmit information that is only experienced through one’s own perceptions? So much is learned through doing, through experimentation, through trial and error.
Building quality machine learning models requires asking deep questions about how data is produced. Questions such as ‘does this data properly model this problem?’ or ‘does the information exist that can model the generator producing the data?’ At the time when neural networks were the hot thing, I recall talking to many people who thought they could simply throw a neural network at a problem and all of their issues would be solved. It turns out the results were hit or miss. The reason tended to boil down to the following example: if you want me to predict what color shirt you are going to wear tomorrow, I won’t be able to if you’re providing me with data on the movement of penguins in Antarctica. I see a similar trend today, with people applying current AI methods and failing for similar reasons.
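One cheap way to pressure-test the question ‘is there any usable signal in this data?’ is to compare a model trained on the candidate features against a dumb baseline. The sketch below uses invented data standing in for the penguin example; the specific libraries and numbers are illustrative, not a prescribed workflow.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical task: predict tomorrow's shirt color (3 classes) from
# features that have nothing to do with the wearer (penguin movement stats).
X_irrelevant = rng.normal(size=(500, 4))
y = rng.integers(0, 3, size=500)

baseline = cross_val_score(DummyClassifier(strategy="most_frequent"), X_irrelevant, y, cv=5)
model = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                        X_irrelevant, y, cv=5)

print(f"baseline accuracy: {baseline.mean():.2f}")
print(f"model accuracy:    {model.mean():.2f}")
# If the model cannot beat the baseline, the features carry no usable
# signal about the target, no matter how sophisticated the model is.
```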
During modeling, I try to understand a more nuanced question: what do people know that our model can't consider? What information should we be capturing to improve our model's performance and get closer to emulating the generator of the data? These types of questions create bounds around the maximum performance of the model. They also provide an understanding of how to frame errors, infer new variables, determine other types of data to capture, and find other ways to approach building the model.
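One way to turn ‘bounds around the maximum performance’ into an actual number is to ask how often identical feature values appear with different outcomes: no model restricted to those features can beat that ceiling. The toy data below is invented purely to show the calculation.

```python
import pandas as pd

# Invented data: two coarse features and an outcome. When the same
# feature combination appears with different outcomes, the missing
# information lives outside these columns.
df = pd.DataFrame({
    "weather": ["sunny", "sunny", "sunny", "rainy", "rainy", "rainy"],
    "weekday": ["mon",   "mon",   "mon",   "tue",   "tue",   "tue"],
    "shirt":   ["blue",  "blue",  "red",   "green", "green", "green"],
})

# For each distinct feature combination, the best any model can do is
# predict the most common outcome for that combination.
per_group_best = (
    df.groupby(["weather", "weekday"])["shirt"]
      .agg(lambda s: s.value_counts().max() / len(s))
)
group_sizes = df.groupby(["weather", "weekday"]).size()
ceiling = (per_group_best * group_sizes).sum() / len(df)

print(f"accuracy ceiling given these features: {ceiling:.2f}")  # 0.83 here
```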
An alternative way to reach this understanding is by asking questions such as ‘where can this go wrong?’ and ‘what will make this fail?’ These questions are not meant to downplay anything that has been built; instead, they help us get a feel for how well we understand what we've built and figure out how to mitigate the risk around possible failure modes. For instance, if I know that our model tends to predict high values in crucial situations, we're going to have to control for that in the user experience. The research question then becomes: what information, or lack thereof, is making us predict high in these crucial situations?
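One concrete way to surface a failure mode like ‘predicts high in crucial situations’ is to slice prediction errors by segment. Everything below, including the column names and the notion of a ‘crucial’ flag, is invented to illustrate the slicing, not taken from any real system.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Invented evaluation set: true values, model predictions, and a flag
# marking the situations we care most about.
n = 1000
crucial = rng.random(n) < 0.2
y_true = rng.normal(50, 10, n)
# Simulate a model that overshoots specifically in the crucial situations.
y_pred = y_true + rng.normal(0, 5, n) + np.where(crucial, 8, 0)

errors = pd.DataFrame({
    "crucial": crucial,
    "residual": y_pred - y_true,  # positive residual = predicting high
})

# Mean residual per slice: a large positive value in the crucial slice
# is exactly the kind of bias we would need to control for downstream.
print(errors.groupby("crucial")["residual"].agg(["mean", "std", "count"]))
```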
The best place to solve modeling issues may not be in the model itself. It may be in the data capture step, or it may be in the way the model is used. Obviously, improving data quality or identifying data that provides more signal will produce a better-performing model. In model usage, the improvements can be more subtle and harder to measure due to confounding variables. However, just as no one can really tell you how to kick a football or shoot a basketball, since the skill doesn't instantly download into your brain and someone has to show you and you have to practice, the same is true of model usage. There are nuances to model application that can let a strong wielder of a mediocre model outperform a weak wielder of a top-notch model.
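As a small illustration of a usage nuance, consider choosing a classifier's decision threshold from the real costs of its mistakes instead of accepting the default of 0.5. The probabilities and cost numbers below are invented; the point is that the same model performs very differently depending on how its output is applied.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented validation data: predicted probabilities and true labels.
y_true = rng.integers(0, 2, 2000)
y_prob = np.clip(y_true * 0.35 + rng.random(2000) * 0.65, 0, 1)

# Assumed business costs: a missed positive hurts ten times more than
# a false alarm.
COST_FN, COST_FP = 10.0, 1.0

def expected_cost(threshold: float) -> float:
    y_hat = (y_prob >= threshold).astype(int)
    fn = np.sum((y_true == 1) & (y_hat == 0))
    fp = np.sum((y_true == 0) & (y_hat == 1))
    return (COST_FN * fn + COST_FP * fp) / len(y_true)

thresholds = np.linspace(0.05, 0.95, 19)
best = min(thresholds, key=expected_cost)
print(f"default 0.50 threshold cost: {expected_cost(0.5):.3f}")
print(f"best threshold {best:.2f} cost:  {expected_cost(best):.3f}")
# Same model, different operating point: how it is wielded changes the outcome.
```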
When discussing indirect modes, we really mean two different things. First, can the data we are trying to convey be interpreted and understood by our methods, and does it provide signal? Second, is there a learned experience we can employ to improve performance? Both of these topics require an understanding of the bounds of transferability and of the use of real-world feedback to improve. Finding ways to harness the unobserved will provide superior results.