The single biggest problem in communication is the illusion that it has taken place.
-George Bernard Shaw
Imagine this. You and your loved one sit down to watch a movie. Except the movie you want to watch doesn't yet exist. One of you tells your device that you want to watch a combination rom-com and heist movie that has four plot twists, a giant elephant, a washed-up chef, and a happy ending, and that stars George Clooney and Julia Roberts. One of you goes to make the popcorn, and five minutes later your new personalized movie is ready to watch. As you watch the movie, you laugh, you cry, your adrenaline pumps, and your content desires are fulfilled. You end the night happy and satisfied.
Does that sound farfetched? It's a future that colleagues of mine who are working on aspects of this technology think will happen within five years. This is a future of intense personalization and customization. Are people ready to continually articulate their desires?
Believing People
A surprising thing about a world where people can articulate their desires and have them actualized is that people usually don't know what they want to a high degree of detail. In statistics training, you are taught to be wary of survey data because it can be fraught with errors in the sampling and question process, and because people might not be truthful (yes, there are some survey methodologies that deal with this, but they are not that common). It is better to use data on observed behaviors than to rely on what people think they would do.
Why is this? Why do people behave differently than they themselves think they will? Part of it arises from people not being able to put themselves in a theoretical situation, part of it stems from not having all the stimuli of the actual decision-making environment, some of it is random, part of it is how questions are laid out, and part of it is that people change their minds. The understanding of these psychological effects has molded our decision-making processes and systems. Machine learning and AI systems have adapted to this situation by relying on a combination of the actions people take and their personal characteristics instead of relying on people's rough descriptions of what they want.
Let's look at recommender systems to show how individual uncertainty manifests in the machine learning world. Recommender systems provide a way to match similar types of content to similar types of people to provide a quality recommendation, as seen in services such as Netflix, Spotify, and Amazon. Before streaming services, a person had to go to Blockbuster to rent a movie, and the movies would be categorized in very broad genres: Action, Romance, Drama, etc. If you went to the store and asked for a recommendation, a mediocre rep would ask you what genre you liked and recommend the latest title in that category. That process was very hit or miss. If you spoke to an exceptional rep, they'd first ask you questions like "What are some of your favorite movies?", "How are you feeling tonight?", or "What are the last few movies you've hated?". Then the rep would typically provide a great recommendation of a movie you may or may not have heard of. Note that no rep asked you for the exact experience you wanted to have; they merely reasoned by association from available choices. Recommender systems work in the same way, except they are even more personalized because they have a better understanding of the intricacies of you and people like you.
In either case, the recommender system or the Blockbuster rep, you don't really know what you want, either because you aren't sure, because you don't know what's available or possible, or because you want many things and are trying to tease out your feelings to make a choice. This nebulous swirling of uncertainty is easier to collapse into a decision when limited choices are put in front of you to assess. This brings up a few questions:
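To make "matching similar content to similar people" concrete, here is a minimal sketch of one classic approach, user-based collaborative filtering. It is not how Netflix, Spotify, or Amazon actually work; their systems are far more elaborate. The people, the titles, and the function names similarity and recommend are all made up for illustration.

```python
from math import sqrt

# Hypothetical data: the titles each (made-up) person watched and liked.
likes = {
    "ana": {"Heist Night", "Rom-Com Road", "Chef's Return"},
    "ben": {"Heist Night", "Rom-Com Road", "Elephant Walk"},
    "chloe": {"Space Saga", "Robot Wars"},
}


def similarity(a, b):
    """Cosine similarity between two sets of liked titles (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(person, likes, top_n=3):
    """Rank titles the person has not seen by how similar their fans are."""
    seen = likes[person]
    scores = {}
    for other, titles in likes.items():
        if other == person:
            continue
        weight = similarity(seen, titles)
        # Each unseen title earns credit equal to the similarity
        # between this person and the person who liked it.
        for title in titles - seen:
            scores[title] = scores.get(title, 0.0) + weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    # Drop titles that only dissimilar people liked.
    return [title for title, score in ranked[:top_n] if score > 0]


print(recommend("ana", likes))  # ['Elephant Walk']
```

Notice that nobody in this sketch is ever asked what they want. Like the exceptional rep, it reasons by association from what similar people chose.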
If we gave people the ability to continually ask for the content they wanted so that it could be created, would people want to?
Do people want to live in a world where they have to continually articulate their desires from nothing?
Do people prefer to make a selection among choices or do they prefer to find the exact thing they want?
In a world of infinite possibilities, it seems like it will be difficult for people to continually dictate what they want in a timely manner. It takes far more mental overhead to create something from scratch, or even just to articulate the idea, than it does to pick among choices. Most people prefer to take the less energy-intensive route.
Creating Content
Creating content is hard work. The hard part of creating content is crafting something from nothing. Some people might call this an NP-hard problem. It is a combination of showing people what's possible, artistic expression, understanding the desires of others, making something that hasn't existed before, and creating something people actually want that they themselves didn't know they wanted. Steve Jobs described this excellently when he stated:
Some people say, "Give the customers what they want." But that's not my approach. Our job is to figure out what they're going to want before they do. I think Henry Ford once said, "If I'd asked customers what they wanted, they would have told me, 'A faster horse!'" People don't know what they want until you show it to them. That's why I never rely on market research. Our task is to read things that are not yet on the page.
Bringing AI into the mix doesn't change this fact. It can make the process faster, usually when steered by an expert. What AI services enable is the ability to generate various ideas to try out along with the generation of a large amount of content. This process works even better if you provide seed ideas of what to try. However, you as the individual creative still need to make the final determination about what your audience wants. This is compounded by the fact that most AI systems behave like mediocre individuals relying on past patterns, so it can be hard for them to generate something that is simultaneously novel and exceptional. The steering of the expert helps to tease out those pieces that are both novel and exceptional.
The question then becomes, are normal people able to articulate their desires to an AI such that they will get something that satisfies their desires? Let's assume that AI can make a defect-free multi-hour-long movie (it can't as of the time of this writing). Would someone who isn't Quentin Tarantino be able to describe what he or she wants to watch effortlessly and well enough in one go that ultimately Pulp Fiction is created? How about someone who isn't Christopher Nolan describing The Dark Knight? For movies of this caliber to be created by AI from a simple description by an individual in a single go requires a large leap forward from where AI is today. Even if we were there, would people be able to articulate what they wanted? Would they want to?
Directing AI
Give me something funny.
I want to see a Rom-Com set in Seattle where most of the flirtation happens over email.
Show me an action movie that travels through mountains and forests in which a band of heroes tries to destroy an item. It should have elves and orcs.
Make me a heist movie that plays in reverse where a band of criminals is being tried for a crime they didn't commit and there are seven plot twists.
Give me a story that’s Nordic in origin about the love between two sisters as they try to navigate the world. The older sister should be able to freeze things. Oh, and I want an animated snowman that sings songs. And trolls that turn into rocks. And a fake love interest. And a crotchety, slimy antagonist. And it should contain multiple songs. Make one of the songs a hit.
Each of these describes a popular movie. However, you might use a description different from the one listed above. Instructing AI and agents is going to be a continual challenge that will not be easily resolved in the future. This is a communication issue. Two people can talk and still not be on the same page about a topic on which they both agree. Going from human to machine becomes even harder. The closest thing we have today to improving how we talk to AI is called prompt engineering.
People are trying different approaches to accomplish tasks through agent systems such as Auto-GPT and MetaGPT. However, these systems tend to be lacking in full execution and in breadth of ability and understanding. The following system components need to be created in order for AI to be able to carry out your desires based on a brief description:
A listening system that understands the vocabulary and expression of a single individual compared to other individuals. The way you describe a house is different from how I describe a house and an AI needs to be able to distinguish the subtleties of those differences.
An understanding of what an individual wants that draws on context outside the request itself. If you see a review, a comment, or a tweet on the internet, it is hard to know how to interpret what that person is saying without the greater context of what is going on in the world and with that individual. AI needs a way to understand the context of the world and the context of the individual to properly fulfill a request made in a single sentence.
The ability to reason by analogy while still allowing for the novel.
The ability to ask for clarification or follow-up based on the AI’s internal understanding and direction. When working with clients, we will routinely hit a point where one of three or more paths could be taken based on prior conversations. A discussion is then set up to clarify the desire that needs to be fulfilled and to weigh tradeoffs.
The ability to make tradeoffs between conflicting values and requirements.
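The clarification component in the list above can be sketched in a few lines. This is a toy under heavy assumptions: the Request class, the REQUIRED_DETAILS tuple, and the clarifying_questions function are hypothetical names, and the details are filled in by hand here, whereas a real system would have to infer them from the person's words, which is the hard part.

```python
from dataclasses import dataclass, field

# Hypothetical details a movie request might need; not a real schema.
REQUIRED_DETAILS = ("genre", "setting", "tone")


@dataclass
class Request:
    text: str
    # Details already understood from the request (hand-filled in this sketch).
    details: dict = field(default_factory=dict)


def clarifying_questions(request):
    """Return one follow-up question for each detail the request leaves open."""
    return [
        f"What {name} do you have in mind?"
        for name in REQUIRED_DETAILS
        if not request.details.get(name)
    ]


vague = Request("Give me something funny.", {"tone": "funny"})
specific = Request(
    "I want to see a Rom-Com set in Seattle where most of the "
    "flirtation happens over email.",
    # The tone value is an assumption made for this example.
    {"genre": "rom-com", "setting": "Seattle", "tone": "light"},
)

print(clarifying_questions(vague))     # two questions: genre and setting
print(clarifying_questions(specific))  # [] (nothing left to ask)
```

Even this toy shows the cost the essay describes: the shorter the command, the more follow-up questions the person has to answer before anything can be made.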
As these modules are built into AI and agent systems, I expect performance to greatly improve. They might even become magical in their interaction. When we look at technological advances, the hope is that we can reduce mental overhead, maximize fulfillment, and reduce waiting time. While the vision is that we can give short commands and have our desires fulfilled, the reality is that people spend a lot of time shaping the requirements of what they actually want. The world is built on choices that address a majority of these needs through time and survivorship, but new content is continually being created to address desires. As we progress with AI, it will be interesting to see how our psychology changes in interacting with these systems and with others to obtain what we want.