Thoughts on raising intelligent software

2023-03-17

Ted Chiang's story "The Lifecycle of Software Objects" is a literary exploration of what might happen if artificially intelligent beings were raised, instead of "trained". In the story, digital entities are given a virtual environment to explore, and they interact with humans and with each other. Over the years, they develop much as people or animals do.

The dominant paradigm for training artificial intelligences is, of course, based on a very different idea. Even the terminology differs in important ways. When we talk about natural intelligences, i.e. about people or animals, training has to do with attaining a specific ability. But we would never say that general intelligence — or sentience, or consciousness — is a result of training. When it comes to artificial intelligence, though, some prominent voices in the field believe that we can generate general intelligence through training. Training for machine learning is not that different from training for people or animals: in all cases, the trainee is asked to do the same things over and over, improving from feedback on each attempt, until he/she/they/it is competent.
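
That repetition-plus-feedback loop is easy to see in code. The sketch below is a toy of my own, not any particular system's training code: it fits a line to noisy data by attempting the same prediction task over and over and nudging its parameters after each attempt.

```python
import random

# Toy data: y = 3x + 1 plus a little noise. The "task" never changes.
data = [(x, 3 * x + 1 + random.gauss(0, 0.1)) for x in [i / 50 for i in range(50)]]

w, b = 0.0, 0.0   # the trainee's current guess
lr = 0.05         # how strongly feedback adjusts the guess

for epoch in range(500):             # do the same thing over and over...
    for x, y in data:
        pred = w * x + b             # attempt
        err = pred - y               # feedback
        w -= lr * err * x            # improve from the feedback
        b -= lr * err

print(f"learned w={w:.2f}, b={b:.2f}")  # ends up close to w=3, b=1
```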

A lot of people have pointed out that it's a bit strange to think that general intelligence will arise from training a model on a handful of tasks, or even a few score of them. In the case of recent large-scale models, those tasks are things like predicting the most likely next word given some context, or predicting a textual description of an image. The training data matters to the training process, but so does the task itself. Given the same training data, it's possible to train models on very different tasks, with very different outcomes.
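
To make that last point concrete, here is a small illustration of my own (not drawn from any particular model or library): the same tokenized sentence can be turned into training examples for two quite different tasks, continuing the text versus filling in a hidden word. The "[MASK]" string is just an illustrative placeholder token.

```python
# One tokenized "document"; the identical data feeds two different training tasks.
tokens = "the cat sat on the mat".split()

# Task 1: next-word prediction — (all words so far) -> (the following word)
next_word_examples = [
    (tokens[:i], tokens[i]) for i in range(1, len(tokens))
]

# Task 2: masked-word prediction — (sentence with one word hidden) -> (hidden word)
masked_examples = [
    (tokens[:i] + ["[MASK]"] + tokens[i + 1:], tokens[i]) for i in range(len(tokens))
]

for context, target in next_word_examples[:2]:
    print("next-word:", context, "->", target)
for context, target in masked_examples[:2]:
    print("masked:   ", context, "->", target)
```

A model trained on the first set of pairs learns to continue text; one trained on the second learns to fill in gaps. The corpus is the same in both cases; the task determines what gets learned.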

Advocates of training-based approaches think that the data, seen through the lens of the relevant tasks, contains enough information about the process of thinking that in mastering those tasks, an artificial intelligence can also learn to "think". Skeptics of this idea point out that there are kinds of reasoning fundamental to intelligence that cannot be captured by this kind of training. Two examples are causal reasoning — leading thinkers on causality sometimes disparagingly call this kind of machine learning "curve fitting" — and symbolic reasoning.
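
The "curve fitting" critique can be made vivid with a toy simulation of my own (not a reproduction of anyone's published example): when a hidden common cause drives two variables, a fitted curve predicts one from the other very well, yet tells us nothing about what happens if we intervene.

```python
import random

random.seed(0)

# Hidden common cause Z drives both X and Y; X has no effect on Y at all.
def observe(n=10_000):
    xs, ys = [], []
    for _ in range(n):
        z = random.gauss(0, 1)
        xs.append(z + random.gauss(0, 0.1))       # X "listens" to Z
        ys.append(2 * z + random.gauss(0, 0.1))   # Y "listens" to Z, not to X
    return xs, ys

def fitted_slope(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

xs, ys = observe()
print("slope learned from observation:", round(fitted_slope(xs, ys), 2))  # close to 2

# Intervention: we set X ourselves, ignoring Z. Y does not budge.
def intervene(x_value, n=10_000):
    # x_value is deliberately unused: in the true causal structure, Y never depends on X.
    ys = [2 * random.gauss(0, 1) + random.gauss(0, 0.1) for _ in range(n)]
    return sum(ys) / n

print("average Y when we force X = 5:", round(intervene(5.0), 2))  # close to 0
```

The fitted slope is an excellent predictor under observation and a useless guide to action; nothing in the curve distinguishes the two situations.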

I think these skeptics are right, and that it will be important to find ways to integrate these kinds of reasoning. But I also think that to some degree they're missing the point. Integrating causal or symbolic reasoning into the training process will just expand the number of skills the artificial intelligence is capable of learning. I don't know of any methods in these fields that would allow a model to develop its own symbolic taxonomies, or its own strategies for learning causal relationships. My belief is that it will be difficult to develop such methods without reorienting our thinking about what it means to develop artificial intelligence.

I'm curious about what it would mean to approach artificial intelligence from the perspective of "raising" a model, instead of "training" it. What kinds of models are amenable to being "raised"? What kind of simulated environment can we use to raise them? How much of it can we automate? As implied by Ted Chiang's story (and actually, by another of his stories as well), I suspect it's less than we might think.