“No matter how brilliant your mind or strategy, if you’re playing a solo game, you’ll always lose out to a team.” - Reid Hoffman
There is currently a race going on to build artificial general intelligence (AGI). As a byproduct of that race, many companies are building, releasing, and deploying machine learning algorithms to monetize this research process. Many of these companies will fail. Those that have currently established defensible moats are converging towards models that can produce similar results. What will set the winners apart from the losers? Partnerships and acquisitions to gain access to either better data or better technology.
Businesses are hungry for automation and AI is hungry for data. Some researchers are postulating that AI systems will run out of quality data to use by 2027. This increases the value of high-quality, hard-to-acquire data. We are entering a period where AI technology is improving rapidly with lots of AI companies spinning up (lots of tech to improve businesses), overall valuations of companies are decreasing (so they are cheaper to acquire), and most companies do not know how to value their data (opportunity to find data cheaply). Let’s examine how partnerships and acquisitions can help both businesses and AI builders succeed.
Partnerships
Probably the most prominent partnership in generative AI at the moment is between OpenAI and Microsoft. While Microsoft has made and is planning to make large investments in OpenAI, the partnership is not exclusive. What is the advantage of this partnership for both parties? Microsoft gets access to cutting-edge technology; the MIT Technology Review has written about all the ways the tech will help Microsoft. OpenAI obtains access to a large amount of data that no one except Microsoft has access to. What type of data is that?
Word documents
Excel spreadsheets
PowerPoint decks
Other data from Microsoft Office tools
Draft iterations of all of the above
Azure cloud usage information
Xbox game data and CGI libraries
Skype conversations
Code (potentially private GitHub repos)
How does this type of data improve the GPT or DALL-E systems? If you think about marching towards AGI, you want to be able to have lots of different types of actuators or things that the system can create. Business documents are a logical next step from language models, as they are a medium with which a lot of the world currently interacts. As these models improve, they will need more diversity and depth of expertise in the data used. OpenAI creates a more powerful AI and Microsoft gets to improve its business.
Partnerships provide a great way for both parties to win. Businesses that have hard-to-gather data can share it with AI providers to improve the underlying models and in return get access to, or cheaper use of, those models to improve their business. This enables a lot of win-win situations. I expect a lot of interesting business alliances to form. While OpenAI has aligned with Microsoft, I would anticipate Google partnering with a few healthcare and biotech companies as they have already begun building models in this area. AlphaFold and health imaging models are examples where Google has already reached into the healthcare/biotech space.
Acquisitions
If no one is going to partner with you to give you better models, or better data to improve your models, the next obvious step is to acquire a company that provides the asset you want. Given the current recessionary landscape, the next 18-24 months are going to be ripe with opportunities for acquisition. Companies that have valuable sources of data will become cheaper and academics/startups constrained by the huge cost of creating models with current methods will find new model structures to get around this constraint. Private equity is set to have a field day.
Let's say you want to make a better generative model, either by building a foundation model or by modifying the outputs of one. How would you do that? You would look to acquire new data sources that others are not using, typically in an area that requires deep expertise or where data is expensive to acquire. Data from a data broker is normally available to anyone who can pay, so it offers little advantage. What are some interesting ways you could acquire data that others would not readily have access to? You could buy one of the following:
call center for audio and conversation data
hospital to improve medical expertise
engineering company to generate product designs
media company to gain access to complex content and viewer data not widely available
bio-lab with samples or clinical trial data to gain an understanding of different drug interactions
genetic testing company to understand gene response
chemical manufacturer for chemical process, sensor, and material data
ad agency with a large number of clients to get customer response data
loan provider to understand the attributes of a loan that people respond to
satellite company for geospatial images over time
As the stock market continues to slide, all of these could become economically viable ways to purchase large amounts of data at scale. Typically, the value of the data in these companies is not factored into the valuation, which means you could potentially buy a revenue-generating asset that improves a model, with that improvement potentially becoming worth more than the purchased asset itself. I think a lot of interesting deals are going to be done around this.
If instead you are a company that feels it is falling behind with regard to AI, you could acquire one or more small, novel companies doing impressive AI work. The current state of tech layoffs means a lot of startups will be created, and some of these will find ways around the AI moats of big players by finding more energy-efficient model structures. I'm expecting many small leaps forward that start to drive down the compute cost and size of various AI models. These will likely occur in places such as academia or small companies that cannot afford the large cost of training a state-of-the-art model and have no choice but to innovate.
Future state
Thinking about where AI is headed, a few things are true:
more data is needed
more expert data is needed
models need to do more things
model compute costs need to decrease
AI is a great way to automate processes
data is undervalued in organizations
If you combine these factors, the fastest way for AI providers to improve is through partnering with organizations that have scarce but important data sources or finding ways to acquire those data sources. If you are a business trying to maintain its market advantage, you can increase your capabilities or reduce costs by partnering with an AI provider or acquiring small AI teams to bring the capability in house. In either case, the near-term future of AI will revolve around aggregating many components in one form or another.