My recent conversation with Rodney Brooks reminded me that he is that rarest of humans: an exceptional mind who also loves to build things. But more than this, he is obsessed with solving problems where there is a real market need, and with doing so in ways that appeal to the humans the solutions were designed to help. Throughout his storied career, he has applied the state of the art in AI and robotics in ways that maintain human agency, always giving people the ability to control his systems in ways that are natural to them.
The urge to build something, combined with an entrepreneurial spirit, drove Brooks to found a startup in the robotics space, and one that is known the world over: iRobot, which produced the Roomba vacuum cleaner. Brooks likes to point out that the Roomba was the result of 13 failed attempts by the company to produce a commercially viable robot—a classic tale of persistence and trial and error that is the stuff of startup legend. Since then, Brooks has founded further pioneering intelligent robotics companies: Rethink Robotics, which produced a human-assisting two-armed robot called "Baxter," and Robust.AI, which is currently working on human-assisting warehouse cart robots known as "Carter."
The golden thread throughout has been the premise that people only accept things when they don't lose agency. In our recent conversation, Brooks pointed out that humans have invented a phenomenal number of machines and software systems that we are willing to adopt without understanding their internal operation or engineering, as long as they behave reliably and predictably and we retain a measure of control over them, so we can step in and override them when things are going awry or not meeting our expectations.
To illustrate his point, he frequently uses the example of driving a car; few of us understand exactly how a car operates, and we initially find the experience of driving unfamiliar and disconcerting. However, after a relatively short time, we not only become proficient at driving a car, but we treat it as an extension of ourselves: it becomes "second nature" to us. And when we encounter unusual circumstances—for example, a pothole in the road or slippery conditions—we are able to intuitively understand and adapt to these scenarios, as we have incorporated the car's operation into our own "world model" because it behaves in a consistent way, and in a way that we can reliably control.
Plugging Into Our World Models
As I discussed in my recent article, this ability to create new tools and to extend our skill sets is rooted in the capability of our neocortex to expand our world models beyond our physical limits and purviews, to allow us to adapt to different circumstances, and to learn from others, in order to cumulatively build knowledge and efficiently master new environments.
We have essentially "hacked" this same ability to operate and even tele-operate a wide variety of physical machines and systems of our own invention that let us travel farther and faster, and perform superhuman tasks. The same can be said of the intellectual domain; we are able to leverage computing machines and software running complex algorithms to enhance our ability to think and to increase our access to information beyond what is stored in our brains.
However, Brooks points out that these unique human capabilities have some unexpected side effects that have come to the fore in our recent interactions and fascination with large language models (LLMs), such as ChatGPT.
Don't Believe It's Magic
First, Brooks points out that we are guilty of anthropomorphizing machines, so when an AI system or machine does something surprisingly well, we assume it has broader, human-like competence; "when humans explain how to get from one place to another in English, we know that they are broadly able to give directions between places, and are also able to broadly communicate in English." But no such assumption holds for any machine, because machines lack generalizable models and depend on their training data and internal algorithms, which are incomplete relative to the vastness of human experience and knowledge.
Our inability to perceive this limitation is amplified for LLMs, as their facile use of human language suggests to us that all the other brain capabilities (Steering, Learning, Simulating and Mentalizing) must exist alongside the Communicating function, since in humans these functions form a single, foundational cognitive basis set. In short, LLMs' facility with language deceives us into believing that they possess all the other constituent human cognitive abilities.
Second, when we don't understand how a machine or AI is doing what it does, we are unable to predict its limits and so we treat it as "indistinguishable from magic," in the famous words of Arthur C. Clarke. This can similarly be attributed to our brain's creative and imaginative capabilities and the attendant sense of exhilaration and wonder upon encountering something new.
Brooks observes that the current hyperbole around LLMs is the direct product of such deceptive or delusional magical thinking.
Don't Believe the Hype
Brooks goes on to point out that this is not just individual behavior but a collective one, best described, he thinks, by the (not so catchy) acronym FOBAWTPALSL: Fear of Being a Wimpy Techno-Pessimist and Looking Stupid Later. In short, he argues that people don't want to be seen as Luddites or technology deniers, and this is particularly true for seemingly magical technologies, where the inference is that, if only they were more intelligent, they would have seen the light and invested earlier. This societal or group pressure results in people not applying the level of judgment and precision they would otherwise employ when checking the claims being made. And this effect is compounded by the overt advocacy of the vested interests who have invested vast sums to develop the associated technologies and supporting infrastructure, many of whom also own social media channels that further amplify the technophilic hysteria. Brooks makes the amusing observation that "the herd mentality that results is like a group of 5-year-olds playing soccer—everyone flocks to the ball and ignores everything else."
The downside of this behavior, he observes, is that investment tends to become focused on a limited set of technologies and approaches, and other promising alternatives and avenues are overlooked, hindering longer-term progress in favor of short-term rewards.
But despite these clarion words of caution, Brooks remains optimistic about the potential for AI to manifestly assist and augment human endeavors in the coming years and decades.
LLMs: Friend or Foe?
Brooks is a fan of Daniel Kahneman's elemental model of the brain's operation, in which two systems work together: a fast, automatic, reactive "System 1" tasked with instant recognition and autocompletion-type functions, and a slower, more deliberative cognitive-processing "System 2." System 1 is in turn populated by System 2's processing of experiences in the world and its distillation of what works, and therefore of what the next best action (or word) is.
With this backdrop, all current AI models are System 1-type, with LLMs like OpenAI's ChatGPT, Anthropic's Claude, Meta's Llama or Google's Gemini models being text-autocompletion machines that do not have an associated System 2. Brooks points out that what is remarkable is that these models seem to have captured knowledge and even some "meaning" in a way that was unexpected, based solely on analyzing trillions of words from webpages and a more limited set of reference books and sources; it is as if they possess an intermediate "System 1.5" functionality. Note: Diffusion models like OpenAI's DALL-E, Stability AI's Stable Diffusion and Midjourney are analogous image-autocompletion machines, but it is not clear that they have yet captured meaning in the same way as LLMs.
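To make the distinction concrete, here is a minimal, purely illustrative Python sketch of my own (not Brooks's or Kahneman's formulation): a "System 1" that answers instantly from memorized patterns, much as an autocompletion machine does, paired with a deliberative "System 2" that slowly verifies the answer from first principles. The memorized table, the toy arithmetic domain and all function names are assumptions for illustration.

```python
# Toy "System 1": a lookup table of memorized question -> answer patterns,
# standing in for fast autocompletion. It answers instantly but has no way
# to know when a memorized pattern is wrong.
MEMORIZED = {
    "2 + 2": "4",
    "7 * 8": "54",   # a confidently wrong, "hallucinated" answer
    "10 - 3": "7",
}

def system1_answer(question: str) -> str:
    """Fast, reactive recall: return whatever pattern was memorized."""
    return MEMORIZED[question]

def system2_verify(question: str, proposed: str) -> bool:
    """Slow, deliberate checking: actually evaluate the arithmetic."""
    left, op, right = question.split()
    a, b = int(left), int(right)
    truth = {"+": a + b, "*": a * b, "-": a - b}[op]
    return proposed == str(truth)

for question in MEMORIZED:
    guess = system1_answer(question)
    verdict = "accept" if system2_verify(question, guess) else "reject"
    print(f"{question} = {guess}? System 2 says: {verdict}")
```

Run on the table above, System 2 accepts the two correct recalls and rejects "7 * 8 = 54"; the point is that the fast pattern-matcher cannot make that judgment on its own.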
Importantly, there are many such System 1 AI tools already in use. For example, visual Simultaneous Localization and Mapping (vSLAM) systems allow a set of cameras on a robotic system or car to build a map of the world; convolutional neural networks (CNNs) have shown remarkable proficiency at image processing and interpretation; and we now have an impressive set of diffusion models for image generation. In addition, a massive number of control systems running different types of AI or machine learning logic underpin nearly every machine system we use today, each with elements of System 1 logic, and even some limited System 2 programmatic "understanding" of the world in which it operates.
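As one concrete instance of such a System 1 tool, here is a minimal CNN sketch in Python (assuming PyTorch is installed; the architecture and layer sizes are arbitrary choices for illustration): a single fast forward pass maps a camera frame straight to class scores, with no deliberation or reasoning loop anywhere.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """A deliberately tiny image classifier: pure reactive recognition."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # detect local patterns
            nn.ReLU(),
            nn.MaxPool2d(2),                             # downsample 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                     # pool to one vector per image
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

model = TinyCNN()
image = torch.randn(1, 3, 32, 32)   # a stand-in for a camera frame
scores = model(image)               # one forward pass, no reasoning loop
print(scores.shape)                 # torch.Size([1, 10])
```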
So, looking forward, Brooks sees a future in which we invent System 2-type machines and models that can be combined with LLMs acting as natural, intuitive human "front ends." And, just as we do for such systems today, for example AI assistants for home control or the prompting of LLMs, we will learn to use different levels of language based on the observed intelligence. He argues that this will come naturally, as it is precisely what we do with other humans: babies, for whom language is nascent; foreign-language speakers, for whom simpler speech is optimal; and, similarly, those whose cognition is impaired.
But even in this future, Brooks points out that humans will always be "in the loop" for any nontrivial system or scenario, in order to manage all the "edge cases" or unique scenarios that dominate our world and which cannot be described by any formal model. Human experience and our ability to build and update our multidimensional, multisensory world models simply cannot be replicated for the foreseeable future. So, humans will have to check these AI systems and machines to verify they have the right "credentials" in a given domain.
In fact, combined machine/AI System 1 + human System 2 solutions are already the norm, for example (see the sketch after this list):
◼ Autonomous vehicles are remotely assisted by human operators who intervene when unexpected circumstances occur.
◼ Autonomous warehouse robots increasingly assist human workers with human override/control when needed.
◼ Advanced image processing is used to assist human diagnosticians, resulting in earlier medical diagnoses.
◼ Human scientific understanding is assisted by AI systems that analyze complex scientific problems such as protein folding prediction, particle physics trajectories or novel materials design.
◼ Novel mechanical structures are predicted by AI systems to assist human engineers and architects.
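The pattern common to all of these examples can be sketched in a few lines of Python (a purely hypothetical illustration; the threshold, scenarios and function names are my assumptions, not any deployed system's design): the System 1 model acts autonomously while it is confident, and hands control to a human System 2 when it is not.

```python
# Hypothetical confidences a perception model might report for each scene.
SCENE_CONFIDENCE = {
    "clear aisle": 0.97,
    "toppled pallet": 0.41,   # an unfamiliar edge case
    "open dock door": 0.88,
}

CONFIDENCE_THRESHOLD = 0.8    # assumed cutoff for autonomous action

def system1_act(scene: str) -> tuple[str, float]:
    """Stand-in for a fast perception-and-control model: propose an
    action along with a self-reported confidence score."""
    return "proceed", SCENE_CONFIDENCE[scene]

def ask_human(scene: str) -> str:
    """Stand-in for a remote human operator's decision."""
    return "stop and wait for guidance"

for scene in SCENE_CONFIDENCE:
    action, confidence = system1_act(scene)
    if confidence < CONFIDENCE_THRESHOLD:
        action = ask_human(scene)   # the human retains agency on edge cases
    print(f"{scene}: {action} (confidence {confidence:.2f})")
```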
In all likelihood, Brooks thinks that this future will not be realized by a single omniscient system, but rather by building a set of "expert" cognitive System 2 models that have the required knowledge, perspective and ability to reason in a given area. Then, the optimum set of models will be dynamically integrated and orchestrated for any given task. At the same time, our physical world will be enhanced with changes to our infrastructure that will allow better sensing and coordination of autonomous machines and robots in our world.
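A minimal sketch of this orchestration idea, under my own assumptions (the Expert structure, the domain tags and the expert names are invented for illustration, not Brooks's design), might look like this:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Expert:
    name: str
    domains: set[str]              # what this expert is "credentialed" for
    solve: Callable[[str], str]    # its narrow, domain-specific reasoning

EXPERTS = [
    Expert("route-planner", {"warehouse", "logistics"},
           lambda task: f"pick path planned for: {task}"),
    Expert("harvest-scheduler", {"farming"},
           lambda task: f"harvest scheduled for: {task}"),
]

def orchestrate(task: str, tags: set[str]) -> str:
    """Dynamically select the experts whose declared domains match the
    task, run each, and combine their answers; escalate to a human when
    no expert has the right credentials."""
    chosen = [e for e in EXPERTS if e.domains & tags]
    if not chosen:
        return "no credentialed expert: escalate to a human"
    return "; ".join(e.solve(task) for e in chosen)

print(orchestrate("move pallets to dock 4", {"warehouse"}))
print(orchestrate("diagnose a skin rash", {"medicine"}))
```

The design choice worth noting is the fallback branch: when no expert matches, the task routes to a human rather than to a best-guess model, which is exactly the "human in the loop" principle Brooks describes.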
Keep Calm and Carry On...for Longer Than You Think
But this won't all happen tomorrow. Brooks points out that humans adopt machines when the value is greater than the effort associated with the change, in terms of the cost in goods, energy, time or some other measure of value; in other words, when there is a positive Return On Investment (ROI). And the same will be true for new AI and robotic systems and solutions. He cautions that we must not be fooled by two beliefs that have come to characterize modern business culture:
i) We believe that there will be exponential improvement in performance.
ii) We believe that market adoption will be equally rapid.
The former is rarely true for any prolonged period because the starting point is frequently not far from optimal, given the succession of past innovations on which the technology is built, and as Brooks puts it, "if exponentials continued forever, they would eat the universe."
Even the most dramatic example of prolonged exponential improvement, the Moore's Law increase in digital device density, is finally coming to an end after 50 years, as we have reached the physical limit for reliably producing a "1" or a "0."
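A quick back-of-the-envelope calculation, assuming the classic Moore's Law doubling period of roughly two years, shows just how extreme that 50-year run was and why it could not continue indefinitely:

$$2^{50/2} = 2^{25} \approx 3.4 \times 10^{7}$$

That is a roughly 30-million-fold increase in device density; only a few more decades of the same doubling would demand features smaller than individual atoms.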
Notably, in the physical world, we have iteratively improved the performance of machines over the past 100+ years, so only incremental gains are typically possible. And some goods or systems actually get bigger, more complicated and more expensive. Brooks gives the example of the Model T Ford, which was simpler and much cheaper, with a price equating to ~2 percent of average wages in 1924, compared to more than 80 percent of average wages for an average car today.
On the second point, we tend to have an unfounded belief that "magical" technologies must be adopted everywhere immediately. But adoption dynamics are defined by the timescales of human and physical-world transformation, and these changes and build-outs, e.g., in telecommunications networks, transportation networks or energy and computing infrastructure, take decades to complete.
The net effect of these two misconceptions is that we habitually overestimate the near-term impact and underestimate the long-term impact, a phenomenon known as Amara's Law. Brooks points out that Amara's Law has been observed in both the AI and robotics domains, with more than seven decades of hype cycles that have overpromised and under-delivered in terms of ROI.
Where's the ROI?
Like me, Brooks doesn't see that there is any logic to a blanket refusal to adopt AI "machines," pithily noting that "rejection of the [future] role of AI machines is like saying we're all going to be Amish."
Instead, he advocates that we should look for areas where there is a critical problem to be solved—where there is real economic or human value to be gained—and then evaluate what combination of machine intelligence and human skills and oversight will provide a solution that delivers this value, with the required reliability and predictability.
Based on his more than 50 years of experience, he sees multiple areas where we are likely to see real progress in combined AI and robotic systems that augment humans in the next few years, for example:
1. In goods movement in warehouses, factories, large retail stores and malls, as well as in distribution hubs, intelligent machines will optimize the stacking, packing and movement of goods.
2. In volume farming, intelligent machines will autonomously fertilize, remove weeds, pick berries, fruits and vegetables, and pack the produce into shipping containers for transport.
3. In eldercare, intelligent machines will allow people to stay in their own homes longer with an acceptable level of independence and dignity.
He also believes that there will be manifest benefits to humanity along the way, as AI will allow us to discover better world models that describe and advance human understanding, either explicitly or as "emergent properties" of these models.
Most importantly, despite some doomsday predictions that many jobs will be automated by AI and robotics, Brooks is firmly in the camp that the majority of jobs will be augmented, not automated and replaced, in line with the recent report by the World Economic Forum. So, he concludes that the future is not one where "we will all be able to retire and sit around and eat grapes while humanoid robots do everything for us."
This scenario—a sort of perpetual hedonistic indolence—is more dystopian than utopian since, as I have previously argued, humans have evolved to value the effort we apply, both individually and collectively.