LLMs, anthropomorphic thinking, accuracy, and self-driving
If you’ve been comfortably ensconced at the bottom of a deep well for the past couple of months, you might not be aware that certain VCs -- as well as, for that matter, Microsoft and Google -- are convinced that ChatGPT and other large language models (LLMs) are on the verge of taking over the world. OpenAI's GPT-3 — the most prominent large language model, so named because of the vast scale of its training data and parameter space — is a few years old at this point, but OpenAI seems to believe they've figured out how to make it passable for public access, mostly by recruiting poorly paid Kenyan gig workers to hard-code a lot of fail-safes that keep it from saying things that are wildly racist or sexist. They want the training feedback that's only accessible by opening it up to a wide audience, so now it is public, and in the public consciousness. There have been thousands of words written about these models, some by the models themselves, but I'm interested in what they might tell us about other applications of "AI"1, specifically autonomous cars.
Large language models are different from earlier generations of deep networks both in architecture and in scale of training data.
These models are a style of neural network called a "transformer". Architecturally, I don't think I could do a terribly good job explaining how they're different from the convolutional neural networks that had a burst of popularity starting a decade ago, but there are lots of good explanations out there. Here is the paper that launched the boom; the short form is that the authors threw away most of what had seemed to be important in the architecture of neural networks and vastly scaled up the streamlined core component that pays attention to long-distance connections. The result is that transformers are very good at encoding the ways that widely separated parts of data — the beginning and end of a paragraph, or the top left and bottom center of an image — relate to each other, which was a weakness of convolutional neural networks and other earlier architectures.
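To make that slightly more concrete, here is a minimal sketch (my own toy illustration, not anything from the paper or a real model's codebase) of scaled dot-product attention, the core operation that the transformer paper scaled up. The thing to notice is that every position in a sequence directly weighs its relationship to every other position, however far apart they are.

```python
# A toy sketch of scaled dot-product attention, the heart of the transformer.
# Not a real model -- just the core operation, with random inputs.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (sequence_length, d) arrays of query, key, and value vectors."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # how strongly each position attends to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over positions
    return weights @ V                                # each output mixes information from the whole sequence

# Toy example: 5 token positions, 8-dimensional vectors.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)   # (5, 8)
```

Stack many of these attention layers, add some plumbing, and train the whole thing on an enormous corpus, and you have, at a cartoon level, the recipe for GPT-3 and its relatives.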
The other way that large language models — and, for that matter, the new generation of image generators like Dall-E and Stable Diffusion — are different from previous approaches is that the architecture is amenable to being trained with far, far more data. The largest GPT-3 model was trained with about half a trillion word-like tokens, comprising a good fraction of all the writing that has ever been posted online in human history.
For these so-called generative models -- those that produce an image, or a block of text -- the shift to transformer architectures and vastly larger datasets has made for enormous leaps in the perceived quality of output. Generative image models like Dall-E and Midjourney are capable of producing extremely credible images in any style whatsoever, and large language models like GPT-3 can credibly answer English questions with multiple paragraphs of coherent, readable text. When people first experience these models, the experience can be startling and even overwhelming. Famously, a Google engineer convinced himself that one of Google's internal large language models was in fact conscious, working himself up into an ethical dither that eventually led to his leaving the project and being fired by Google. More recently, an oddly credulous New York Times reporter worked himself up into a lather after getting Microsoft's new LLM-based Bing search chat function to, for lack of a better phrase, get all freaky with him.
GPT-3 is not conscious. The psychologist — or, I guess, AI gadfly, at this point? — Gary Marcus has written, perhaps a little trollishly, that you could make a better case for the consciousness of a Roomba than for a large language model. But the fact that people interpret these models as conscious is illustrative. It is illustrative of how machine learning systems work as actual implemented systems in the world. It is also illustrative of some of the ways they fail to work, and will probably continue to fail to work.
All of this has some pretty substantial implications for self-driving cars.
One of the things that is going on in people’s reaction to these models is a variety of the anthropomorphic fallacy. That's our tendency, as humans, to impart human-like internal states to non-human (and even non-living) things. We humans are fundamentally extremely social -- more so than almost all other primates -- and much of what we think of as "human-like" cognition revolves around how to accomplish complicated reasoning about (and interacting with) other people. The social nature of human cognition makes us extremely prone to imputing comprehensible internal states to other agents in the world. It's something we do constantly and basically automatically. This is incredibly helpful when we're interacting with other people -- it is another way of describing "theory of mind", our ability to intuit what other people are thinking -- and it's really at the center of how we're able to be cooperative. Arguably (if you ask comparative animal cognition expert Michael Tomasello) it's the core ability that led to human language and intelligence. This happens so robustly, and so automatically, that we impute comprehensible internal states to agents in the world that probably lack them (lobsters, say) and agents in the world that _definitely_ lack them (stuffed animals, an IKEA lamp). Large language models like GPT-3 are fundamentally in the business of exploiting the anthropomorphic fallacy.
In the user experience (UX) world, people talk about design metaphors. The "folders" containing "files" on a Mac "desktop" are exploiting the metaphor of the physical desk. By giving these abstract organizational structures familiar names, you cause the people using them to import their existing understanding of how these structures relate to each other. You know, without having to learn anything new, that files go in folders, folders on a desktop are visible and can be arranged relative to each other, and so on.
"AI", when it comes to applications like LLM chatbots (and autonomous cars) can be thought of as a design metaphor for machine learning. You have a system that can synthesize information from an unimaginably vast pool of raw material, and the way you interact with it is as "an intelligence": you talk to it like a person, and it responds like a person.
That's a really powerful design metaphor, if you can pull it off. People are extraordinarily good at interacting with intelligences, certainly much better than they are at getting their desks organized. To a first approximation it's what our big human brains are for — what they evolved to do — and it's something most every human has been practicing since birth. If your automated system passes the bar of seeming enough like a human, then people automatically know how to interact with it. It's probably the single richest UX metaphor you can use.
It's worth digging into the sheer richness of these interactions. The reason we so immediately understand how to interact with humans is that we can intuit what they are probably thinking. If we read a paragraph somebody else wrote, we put ourselves into something like the frame of mind that person had when they wrote it. We know the point they're trying to make, we probably have some insight into what they think about that point, we might even have some insight into their emotional state. It's not too much to say that we have a subjective, conscious experience that is shared with the author when we read something.
So when we read something written by a machine learning system, what happens? Well, roughly the same thing. Our brains do what they were evolved to do and put us in the headspace of the agent that wrote what we're reading. We intuit what the machine learning system was thinking. Except that the machine learning system wasn't thinking. Not in anything like the kind of way we imagine it to have been. That's the anthropomorphic fallacy: the machine learning system has no particular internal representation of the world — or at least, not one that's coherently similar to a human representation of the world — so all the narrative, perceptual understanding of the world that we imagine to have gone into the machine's output is imparted by us, the reader. We automatically fill in a huge amount of, well, thinking — internal goals, opinions, experiences — that wasn't involved in the production of the text. That filling in process is a lot of what makes the output seem so impressive. It really brings home the idea that this text was the product of intelligence. But all the aspects of the text that give us that impression are things that we as readers brought to the party. It's an illusion of cognitive depth.
That illusory cognitive depth explains a lot of the aspects of the public reaction to ChatGPT, and also a lot of the things that are tricky about it. Here's one example. We have a strong bias to assume that when we ask somebody a factual question, either they'll answer with a basically correct factual answer or there will be strong indications that they don't really know. Their intention is signaled in all sorts of ways — there are whole books about it — but in general if you ask somebody a question and they answer with a clear, authoritative answer, that's because they are pretty sure they actually know the relevant thing. They could be wrong, of course, but most of the time when we're wrong about something we suspect that we might be, and that fact gets signaled. ChatGPT, on the other hand, has no such metaknowledge. When you ask it a question, it draws on its inexhaustibly huge reservoir of training data and synthesizes the most likely response to your query. Is that the correct response? It really has no idea. All it can say is that it's the most likely response. So any question you ask it will get a completely straightforward answer, which may or may not be correct.
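To see why that matters, here is a toy sketch of the loop an autoregressive language model runs when it answers you: score every candidate next word, emit a likely one, repeat. The vocabulary and probabilities below are entirely made up (this is not any real model's API), but the shape of the loop is the point: nothing in it ever asks whether the emerging claim is true. Likelihood under the training data is the only criterion, which is why the answer arrives with the same fluent confidence whether it is right or wrong.

```python
# A toy stand-in for a language model's generation loop. Real models work over
# tens of thousands of tokens and learned probabilities; this one is deliberately fake.
import numpy as np

vocab = ["Paris", "Lyon", "is", "the", "capital", "of", "France", "."]

def toy_next_token_probs(tokens):
    """Return an arbitrary, fixed probability distribution over the vocabulary."""
    rng = np.random.default_rng(len(tokens))          # deterministic toy "model"
    logits = rng.normal(size=len(vocab))
    return np.exp(logits) / np.exp(logits).sum()      # softmax: likelihoods, nothing else

def generate(prompt_tokens, n_steps=6):
    tokens = list(prompt_tokens)
    for _ in range(n_steps):
        probs = toy_next_token_probs(tokens)
        tokens.append(vocab[int(np.argmax(probs))])   # emit the most likely token; no step
                                                      # ever checks whether it is *correct*
    return " ".join(tokens)

print(generate(["What", "is", "the", "capital", "of", "France", "?"]))
```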
The illusion of cognitive depth also happens with autonomous cars. I've said for years that one of the big issues with autonomous cars is that their behavior has to be comprehensible in human terms. When they make driving decisions, if those decisions don't make sense to a human, then the metaphor will be broken, and people won't know how to act around them. People will start probing them in much the same way they are trying to probe and "jailbreak" ChatGPT. But it's actually when autonomous cars work well enough that the metaphor mostly holds that things get really dangerous. Because like with ChatGPT, we expect other road users to signal to us their confidence in the correctness of their actions — somebody who is trying to determine if it's safe to enter a vehicle's path behaves very differently from somebody who is confident it is safe to do so, or somebody who doesn't know the vehicle is there. Autonomous cars will, like ChatGPT, always make the driving decision that their models find to be the modal correct decision, but they do not have an underlying theory of social driving that leads to that decision. They will never be making driving moves because they want to yield to somebody; they will be making driving moves by very, very sophisticated pattern-matching. The lack of the social contextual signals that humans provide when driving will lead to a similar difficulty — on the part of human drivers, pedestrians, and others interacting with autonomous cars — in knowing whether those vehicles are operating from good or poor information.
In consequence, you'll see similar kinds of fallacies in how people evaluate autonomous cars. The early reaction — like Kevin Roose's first reaction to Bing search — will vastly overestimate the capabilities of these vehicles, because it will impute the level of internal deliberation and sophistication a human driver would need to do the same things. Only gradually will people realize that the actual performance of these vehicles, their ability to do the "right" thing, to be accurate in the ways that matter, is vastly insufficient to the task at hand.
My reason for using the scare quotes will become more apparent later, but it's always good to keep track of the distinction between "machine learning" -- a robust set of algorithmic techniques -- and "artificial intelligence" -- a loose, often misused metaphor.