“Even if you have constrained resources, don’t cut corners. People will feel it.”
— Tony Fadell
The greatest amount of work in an endeavor goes into its scarcest and costliest resources. You put far more effort into figuring out which home to buy than where to go for lunch, because the cost of a home is so much greater, and so much more impactful, than the cost of a lunch. Get lunch wrong and you might be unsatisfied for an afternoon. Get your home wrong and you could be unhappy for years.
One can also see physical equivalents of this phenomenon all around:
Electrical Engineering: Elements with higher resistance drop more voltage in a series circuit and carry less current in a parallel one. The most resistive elements control how the circuit operates, and designs have to account for them (a small sketch after this list makes the series case concrete).
Structural Engineering: In statically indeterminate systems, stiffer structural elements attract more load and become the focus of the design.
Fluid Dynamics: In piping systems, pressure is the resource that is managed and controlled. You need high pressure for long-distance transport but low pressure at the end points for the fluid to be usable.
Network Theory: The largest and most connected nodes control the organization and flow of information in a network. Because these large hubs are scarce, all other nodes end up operating according to their preferences. Think of how the entire SEO industry organizes around Google, which links to most other nodes on the internet.
Thermodynamics: Heat accumulates in the components with the highest thermal capacity, but the components most sensitive to heat define the usage limits and the cooling design.
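As a concrete toy example of the series case in the electrical item above, here is a minimal Python sketch of a voltage divider; the supply voltage and resistor values are made up purely for illustration.

```python
# Series circuit: the same current flows through every resistor, so the
# element with the highest resistance drops the most voltage (V = I * R).

supply_voltage = 12.0                # volts (illustrative value)
resistances = [10.0, 47.0, 470.0]    # ohms (illustrative; the last one dominates)

current = supply_voltage / sum(resistances)  # Ohm's law applied to the whole loop

for r in resistances:
    drop = current * r
    print(f"{r:>6.0f} ohm resistor drops {drop:5.2f} V "
          f"({drop / supply_voltage:.0%} of the supply)")
```

The 470-ohm resistor ends up absorbing roughly 90% of the supply voltage, which is the sense in which the most resistive element dominates the design.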
Let's extend that understanding and improve the mental model a bit. Thirty years ago, a significant amount of effort in software engineering projects was dedicated to data compression. Memory was the constrained resource, and it controlled how projects were designed and managed. Approximately 80% of the work focused on compressing data efficiently, because memory was roughly 1,000,000 times more expensive than it is today. Yes, you read that correctly: a million times more expensive.
Here's the math, assuming Moore's Law for computing performance:
Compute performance doubles every 18 months.
Over 30 years:
\(Number\ of\ doublings=\frac{30\ years\times 12\ months/year}{18\ months/doubling}=20\)
Therefore, the multiplier increase in performance is:
\(2^{20}\approx 1{,}048{,}576\)
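As a quick sanity check, here is the same arithmetic as a minimal Python sketch; the 30-year window and 18-month doubling period are the assumptions stated above.

```python
# Moore's Law back-of-the-envelope: how many doublings fit in 30 years
# if compute performance doubles every 18 months?

years = 30                    # time window from the text
months_per_doubling = 18      # Moore's Law doubling period (assumed above)

doublings = (years * 12) / months_per_doubling  # = 20.0
multiplier = 2 ** int(doublings)                # = 1,048,576

print(f"Doublings over {years} years: {doublings:.0f}")
print(f"Performance multiplier: {multiplier:,}")  # roughly a million
```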
These days, with compute and storage a million times cheaper, data and compute requirements are often given little thought. Memory efficiency is rarely a concern, which allows for extensive data logging. That led to the world of big data, where the focus is on what insights can be extracted from quantities of data whose storage would have been incomprehensible 30 years ago. This shift has moved the industry bottleneck from the cost of storing data to the speed of data access, retrieval, and delivery. The constraint is now how quickly the relevant data can be moved, and that is where most of the effort and energy is being applied.
Let's define a concept I call the Constrained Resource Amplification Maxim:
Constrained Resource Amplification Maxim: The resources that are the most constrained receive the highest level of effort and attention, typically in a non-linear fashion. This layers additional indirect costs on top of the constrained resources themselves. The amount of effort expended is proportional to how constraining the resource is.
Why does the Constrained Resource Amplification Maxim hold? Because expensive resources carry significant costs: any mistake or loss is much more costly. From an insurance perspective, extra attention is given to constrained resources to prevent additional losses, avoid increased costs, and ensure operations can continue smoothly. In other instances, there are physical or supply limits to what can be accomplished. These limits demand much more mental load to deal with and can produce designs that seem absurd once scarcity levels change.
To illustrate the point, let's examine a large and highly constrained project: creating a tunnel. The entire project is built around how well one single piece of equipment performs, the tunnel boring machine (TBM). These machines cost somewhere on the order of $10-60M, and entire teams of people exist whose job is to make sure the machine stays operational. These teams typically run in three shifts, working around the clock, and cost somewhere between ~$3M and $12M+ per year to employ. Most of those team members spend their time ensuring the machine is in good working order and doesn't break down. Ultimately, it costs 15-30% of the total machine cost per year to keep this single resource operational. The entire project hinges on the performance of this single device, and therefore maximal effort is applied to it.
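To make that ratio concrete, here is a minimal sketch that pairs the low and high ends of the illustrative ranges quoted above; the dollar figures are the rough numbers from the text, not data from any particular project.

```python
# Rough share of a TBM's purchase price spent each year just keeping it
# running, pairing the low/high ends of the ranges quoted in the text.

small_tbm_cost, large_tbm_cost = 10e6, 60e6    # machine purchase price ($)
small_crew_cost, large_crew_cost = 3e6, 12e6   # round-the-clock crew cost ($/year)

share_small_project = small_crew_cost / small_tbm_cost  # 0.30 -> 30% per year
share_large_project = large_crew_cost / large_tbm_cost  # 0.20 -> 20% per year

print(f"Small project: {share_small_project:.0%} of machine cost per year")
print(f"Large project: {share_large_project:.0%} of machine cost per year")
```

Both pairings land inside the 15-30% band mentioned above, an enormous recurring premium paid solely to protect one constrained resource.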
If you look at the development of large AI models, the biggest constraint is GPU compute. GPUs were originally designed to process computer graphics quickly, unlike CPUs, which are general-purpose chips. Graphics processing relies on the same math (linear algebra) that drives most AI algorithms, which is why GPUs were adapted for training deep learning models. At the time of this writing, it takes at least 10,000 high-end GPUs to create a state-of-the-art foundation model, due to the amount of experimentation and compute load required. A lot of work has gone into reducing the GPU load models require, which has produced techniques such as LoRA and quantization for cutting the necessary compute, or, as in the case of Phi-3, reducing the amount of data required. Microsoft put teams to work for years to find the smallest competitive model they could; the result, Phi-3, was trained in 7 days on only 512 GPUs. However, that figure covers only the final run and doesn't account for all of the experiments run in parallel to test different methodologies and compare results. If the compute problem gets solved, the next scarcest resource is likely to be either high-quality data or evaluation of outputs.
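To see why a technique like LoRA eases the compute constraint, here is a minimal sketch of the parameter-count arithmetic; the layer dimensions and rank are hypothetical illustrative values, not figures from any specific model.

```python
# LoRA replaces a full weight update (d x k trainable parameters) with two
# small low-rank factors (d x r and r x k), so far fewer parameters are trained.

d, k = 4096, 4096   # hypothetical weight-matrix dimensions for one layer
r = 8               # hypothetical LoRA rank

full_update_params = d * k     # 16,777,216 trainable parameters
lora_params = r * (d + k)      # 65,536 trainable parameters

reduction = full_update_params / lora_params
print(f"Full fine-tune params per layer: {full_update_params:,}")
print(f"LoRA params per layer:           {lora_params:,}")
print(f"Reduction factor:                {reduction:.0f}x")  # about 256x fewer
```

Shrinking the number of trainable parameters by a couple of orders of magnitude is exactly the kind of work the maxim predicts: enormous effort aimed squarely at the scarcest resource.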
If you want to understand where money and effort will be spent as new technologies emerge, or what skill sets will be required, look to determine what the scarcest resources will be. If you want to take it one step further, look to understand what the next constraint will be once the supply of the current scarcest resource increases. After a while, scarce resources tend to increase in supply precisely because of the amount of effort applied to dealing with them: either people find ways to create more of the resource, or they find ways to reduce its consumption. In cases where the supply can't be affected, such as with physical laws, designs will be created to channel these constraints.
One of the great sayings in Stoic philosophy is "the obstacle is the way." The Constrained Resource Amplification Maxim shows how the obstacle becomes the way: by dictating how time will be spent. To use this maxim is to understand the forces that influence behavior, decision making, design, and action. You may not be able to change the state of the world with this idea, but it will allow you to understand and navigate it better.