Is artificial intelligence as a term per se racist?
Artificial intelligence is an old term. It came into common usage, famously, in 1956, at the Dartmouth Summer Research Project on Artificial Intelligence. I've written about this before, about how the field was born in a time of tremendous optimism about the usefulness of information processing metaphors for understanding all kinds of systems in the world. The attendees of the summer conference believed that implementing human-like intelligence in a computer system would be a relatively straightforward matter of coding up the models of cognition that the nascent field of cognitive science—founded on the premise that the human brain can be modeled as an information processing system—had developed, models that seemed to explain both behavioral and neuroscientific data from humans and animals. That this was wrong proved to be an extended and painful indictment of cognitive scientists' overconfidence in their models of human intelligence. But in 1956 that reckoning lay in the future, and at the time of the Dartmouth Conference the question of whether human intelligence was well characterized was likely pretty low on the list of concerns.
You can get a sense of this optimism by looking at one of the more famous interdisciplinary papers of the period, Newell and Simon's "General Problem Solver". In it, two of the founding lights of both Artificial Intelligence and Cognitive Science presented a working computational model that solved the logical, symbol-manipulation problems they understood to be central to human intelligence. The question "what if these kinds of logical inferences are not actually central to human intelligence?" didn't get asked. Which is too bad, because one of the things the nascent AI industry learned to its dismay is that they very much are NOT central to intelligence, and the failure of these kinds of approaches led to the first "AI Winter", which almost killed the field.
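If you want a feel for what GPS-style problem solving actually looked like, its core technique was means-ends analysis: pick an operator whose effects reduce the difference between the current state and the goal, recursively achieve that operator's preconditions, apply it, and repeat. Here's a minimal sketch in Python; the "monkey and bananas" domain and the operator format are my own invention for illustration, not Newell and Simon's actual notation.

```python
# Each operator: (name, preconditions, delete-list, add-list),
# all expressed as sets of symbolic facts.
OPERATORS = [
    ("push-chair-under-bananas", {"at-door"}, {"at-door"},
     {"chair-under-bananas", "at-bananas"}),
    ("climb-chair", {"chair-under-bananas", "at-bananas"}, set(),
     {"on-chair"}),
    ("grab-bananas", {"on-chair"}, set(), {"has-bananas"}),
]

def achieve(state, goals, plan):
    """Return (state, plan) with every goal satisfied, or None on failure."""
    missing = goals - state
    if not missing:
        return state, plan
    for name, pre, delete, add in OPERATORS:
        if not (add & missing):          # operator doesn't reduce the difference
            continue
        sub = achieve(state, pre, plan)  # subgoal: achieve its preconditions
        if sub is None:
            continue
        mid_state, mid_plan = sub
        done = achieve((mid_state - delete) | add, goals, mid_plan + [name])
        if done is not None:
            return done
    return None

_, plan = achieve({"at-door"}, {"has-bananas"}, [])
print(plan)  # ['push-chair-under-bananas', 'climb-chair', 'grab-bananas']
```

Everything here is symbol manipulation all the way down: the "world" is a set of tokens, and "intelligence" is search over operators that rewrite those tokens. That's the conception of thought the field started from.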
I've told this story before. One aspect that I have been thinking about recently, though, is that the confidence the early AI researchers had in extant models of human intelligence mirrored a general confidence in those models, particularly on the part of psychologists and the organizations that employed them to quantify human behavior. In 1956 it was common knowledge that not only was "intelligence" a well-defined concept, it was easily testable—via IQ tests—and effectively quantifiable on an individual level. You could say, in other words, not only what intelligence was but how much of it somebody had, and you could do that empirically, reliably, and with complete scientific confidence. When the Dartmouth researchers set out to create "artificial intelligence", they had a very specific, well-defined, largely universal target to hit.
What I've been thinking about lately is that this specific, well-defined target was also deeply, intentionally, comprehensively racist. The history of IQ measurement is ignominious at best; Charles Spearman, who developed the modern concept of an underlying intelligence factor, g, that could be elicited from a battery of different but correlated tests, was explicitly a racist and eugenicist looking for a biological basis for the inferiority of other races. He's not an outlier: intelligence testing has been a poisoned well from its origins, and its products have been broadly discredited.
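To make Spearman's statistical move concrete: start from scores on several tests that all correlate positively with one another (the "positive manifold") and extract the single common factor that best explains those correlations. Here's a minimal sketch using the first principal component of the correlation matrix as a stand-in for a proper factor analysis; the test names and the data are simulated, not drawn from any real battery.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people = 2000
g = rng.normal(size=n_people)                 # the latent "general" factor
loadings = np.array([0.8, 0.7, 0.6, 0.5])     # how strongly each test taps g
tests = ["vocabulary", "arithmetic", "matrices", "digit-span"]

# Each simulated test score = loading * g + independent noise.
noise = rng.normal(size=(n_people, len(loadings)))
scores = g[:, None] * loadings + noise * np.sqrt(1 - loadings**2)

R = np.corrcoef(scores, rowvar=False)         # the "positive manifold"
eigvals, eigvecs = np.linalg.eigh(R)          # eigenvalues in ascending order
first = eigvecs[:, -1] * np.sign(eigvecs[:, -1].sum())  # fix arbitrary sign

print(dict(zip(tests, first.round(2))))       # loadings on the common factor
print("share of variance:", round(eigvals[-1] / eigvals.sum(), 2))
```

Note what this does and doesn't show: any set of positively correlated scores will yield a dominant first factor. The existence of g in this sense is a fact about the arithmetic of correlation matrices, not evidence that some single biological quantity is being measured.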
It wouldn't be fair to say that Spearman-style intelligence measures were the only thing the early AI researchers were considering in their early definitions of intelligence. Certainly Turing's imitation game was part of their thinking, as was the ever-present dream of a computer that could beat a human at chess. But central to all of their conceptions was the idea that an artificial intelligence should be good at the things that a person who does well on IQ tests is good at. It should be good at the things that (white, male, Western) humans think of as indicative of intelligence. Fluent speech. Symbol manipulation. Complex zero-sum games.
There are two main problems with all of those measures. The first is that none of them, as we have learned, is central to human intelligence. Chess was the first domino to fall (sorry), with human-competitive chess algorithms emerging that were pretty evidently in no way intelligent. DeepMind has had similar success with all sorts of zero-sum video games. LLMs can produce fluent speech but have no idea whether the language they are producing makes sense. As I've written before, it's pretty easy to get a sense of what all of these approaches are missing, and it's the things that humans—every human—are so effortlessly good at that we don't even notice how good we are. Things like social cognition. There's even a name for this: Moravec's Paradox. The pioneering roboticist Hans Moravec—one of the founders of transhumanism, as hard core a robotics enthusiast as there's been—noticed that the things computers were bad at seemed specifically to be the things humans were effortlessly good at, and vice versa.
The second problem is that these kinds of measures are sociologically fraught. Linguistic fluency, pattern matching on tests, chess: all of these are skills strongly linked to socioeconomic and demographic factors like family income and race. So when you design systems around the ability to perform these tasks, you bake into them a very specific view of which cognitive skills matter in the world, and that view is one that is deeply, historically racist and sexist. But artificial intelligence, as a field, has consistently focused on them, at least in part because the intellectual framework of what intelligence is continues to be derived from Spearman, Thurstone, and the other eugenicists who developed IQ testing. There isn't another coherent model of intelligence to work from.
So when you see a large language model being touted for its performance on standardized tests like the GRE, or for its ability to do symbol matching, you are seeing the long shadow of the bad old eugenicist intelligence measurement. There are other problems with the development and deployment of AI which serve—as researchers like Timnit Gebru have noted forcefully for years—to reinforce existing hierarchies and prejudices. But alongside them, the lack of good measurement tools for the actual cognitive abilities of large language models and the like leads to the acceptance of a sort of cod-intelligence that was born from—and continues to support—racial hierarchy and prejudice.