Mortal Komputation: On Hinton's argument for superhuman AI.
Last week Cambridge had a Hinton bonanza. He visited the university town where he was once an undergraduate in experimental psychology, and gave a series of back-to-back talks, Q&A sessions, interviews, dinners, and so on. He was stopped on the street by passers-by who recognised him from his talks, and students and postdocs asked to take selfies with him after his packed lectures.
Things are very different from the last time I met Hinton in Cambridge. I was a PhD student then, around 12 years ago, in a Bayesian stronghold safe from deep learning influence. There was the usual email about a visiting academic, with an opportunity to put your name down if you wanted a 30-minute 1:1 conversation with him. He told us he had figured out how the brain worked (again)! The idea he shared back then would eventually transform into capsule networks. Of course everyone in our lab knew his work, but people didn't go quite as crazy.
While the craziness is partly explained by the success of deep learning, the Turing award, and so on, it is safe to say that his recent change of heart on AI existential risk played a big role, too. I have to say, given all the press coverage I had already read, I wasn't expecting much from the talks by way of content. But I was wrong there: the talks actually laid out a somewhat technical argument. And it worked - some very smart colleagues are now considering a change in their research direction towards beneficial AI.
I enjoyed the talks, but did I buy the arguments? I suppose I never really do. So I thought I'd try my best to write the argument up here, followed by a couple of points of criticism I have been thinking about since then. Though he touched on many topics, including the subjective experiences and feelings LLMs might have, he said very clearly that he is only qualified to comment on the differences between biological and digital intelligences, which he has studied for decades. Thus, I will focus on this argument, and on whether it should, in itself, convince you to change or update your views on AI and X-risk.
Summary
- Hinton compares intelligence on digital and analogue hardware.
- Analogue hardware allows for lower energy consumption, but at the cost of mortality: algorithm and hardware become inseparable, the argument goes.
- Digital intelligence has two advantages: aggregating learning from parallel experiences, and backpropagation, which is implausible on analogue hardware.
- Hinton concludes these advantages can/will lead to superhuman digital intelligence.
- I critically evaluate the claims about both parallelism and the superiority of backprop over biologically plausible algorithms.
Mortal Computation
For a long time Hinton, and others, considered our current neural network-based "artificial brains", which run on digital computers, to be inferior to biological brains. Digital neural networks fall short on energy-efficiency: biological brains consume much less energy even though by some measures they are orders of magnitude bigger and more complex than today's digital neural networks.
Hinton therefore set out to build more energy-efficient "brains" based on analogue hardware. Digital computers, he argues, achieve perfect separation of software and hardware by working at the level of abstraction of discrete bits. This enables computation that runs on one computer to be exactly reproduced on any other digital computer. In this sense, the software is immortal: if the hardware dies, the algorithm can live on in another computer. This immortality comes at a high energy price: to ensure digital computers work accurately, they have to consume quite a lot of energy.
This is in contrast with analogue hardware, which may contain flaws and slight variations in conductances. Every analogue computer is therefore slightly different, and learning algorithms running on it have to adapt to its imperfections. While analogue computers may consume a lot less energy, this also means that a "model" trained on one analogue machine cannot easily be ported to another piece of hardware, as it has adapted to the specific flaws and imprecisions of the chip it was trained on. Brains running on analogue hardware are mortal: once the hardware dies, the algorithm dies with it.
tl;dr: analogue intelligence is energy-efficient but mortal; digital intelligence is immortal but energy-hungry.
Advantages of digital brains
Hinton then realised that learning algorithms running on digital devices have advantages compared to "mortal" algorithms running on analogue hardware.
Parallelism: Since computation is portable, parallel copies of the same model can be run, and information/knowledge can be exchanged between these copies via high-bandwidth sharing of weights or gradient updates. Consequently, a digital "mind" might perform tens of thousands of tasks in parallel, then aggregate the learnings from each of these parallel activities into a single brain. By contrast, analogue brains cannot be parallelised this way, because the imprecision of the hardware makes it impossible to communicate the contents of the model directly. The best they can do is "tell each other" what they learned, exchanging information through an inefficient form of knowledge distillation.
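To make the parallelism point concrete, here is a minimal sketch of the standard data-parallel pattern the argument alludes to (my illustration, not something Hinton presented): identical digital replicas each work on a different task, and their learning is merged by averaging gradients into a single set of weights. The `tasks` list and the `compute_gradient` function are hypothetical placeholders.

```python
import numpy as np

# Hypothetical placeholders: `weights` is a flat parameter vector shared by all
# replicas, and compute_gradient(weights, task) returns one replica's gradient.
def aggregate_parallel_learning(weights, tasks, compute_gradient, lr=1e-3):
    # Each replica starts from an exact copy of the weights; this exactness is
    # only available on digital hardware. (In practice the loop below would run
    # on many machines at once rather than sequentially.)
    gradients = [compute_gradient(weights.copy(), task) for task in tasks]

    # High-bandwidth knowledge sharing: average the gradients and apply a single
    # update, so one set of weights absorbs what every replica experienced.
    return weights - lr * np.mean(gradients, axis=0)
```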
Backpropagation: A further advantage is that digital hardware allows for the implementation of algorithms like backpropagation. Hinton has argued for a long time that backpropagation seems biologically implausible and cannot be implemented on analogue hardware. The best learning algorithm he could come up with for mortal computation is the forward-forward algorithm, which in some ways resembles evolution strategies. Its updates are a lot noisier than backpropagated gradients, and it really doesn't scale to any decent-sized learning problem.
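To see why such updates are noisier, here is a toy comparison (my own illustration, not the forward-forward algorithm itself) between the exact gradient that backpropagation would deliver and an evolution-strategies-style estimate obtained purely from random weight perturbations and loss evaluations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: linear regression, loss(w) = mean((X @ w - y) ** 2).
X = rng.normal(size=(256, 100))
y = X @ rng.normal(size=100) + 0.1 * rng.normal(size=256)
loss = lambda w: np.mean((X @ w - y) ** 2)

w = np.zeros(100)

# The exact gradient, as backpropagation would compute it.
backprop_grad = 2 * X.T @ (X @ w - y) / len(y)

# A perturbation-based (zeroth-order) estimate: probe the loss along random
# weight perturbations, the kind of signal available without backprop.
sigma, n_probes = 0.01, 30
estimate = np.zeros_like(w)
for _ in range(n_probes):
    eps = rng.normal(size=w.shape)
    estimate += (loss(w + sigma * eps) - loss(w - sigma * eps)) / (2 * sigma) * eps
estimate /= n_probes

cosine = estimate @ backprop_grad / (
    np.linalg.norm(estimate) * np.linalg.norm(backprop_grad)
)
print(f"cosine similarity with the true gradient: {cosine:.2f}")
# Typically well below 1 with this few probes, and it gets worse as the number
# of parameters grows: the update direction is only a noisy guess.
```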
These two observations, that digital computation can be parallelised and that it enables a superior learning algorithm, backpropagation, which analogue brains cannot implement, led Hinton to conclude that digital brains will eventually become smarter than biological brains. Based on recent progress, he believes this may happen much sooner than he had previously thought: within the next 5-20 years.
Does the argument hold water?
I can see a number of ways in which the new arguments for why digital 'brains' will be superior to biological ones could be attacked. Here are my two main counterarguments:
How humans learn vs how Hinton's brains learn
Hinton's argument actually critically hinges on artificial neural networks being as efficient at learning from any single interaction as biological brains are. After all, it doesn't matter how many parallel copies of an ML algorithm you run if the amount of "learning" you get from each of those interactions is orders of magnitude smaller than what a human would learn. So let's look at this more closely.
Hinton actually considered a very limited form of learning: imitation learning, or distillation. He argues that when Alice teaches something to Bob, Bob will change the weights of his brain so that he becomes more likely to say, in the future, what Alice just told him. This may be how an LLM might learn, but it's not how humans learn from interaction.
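As a rough sketch of what this kind of learning looks like in code (my illustration, with hypothetical `student` and `teacher` models standing in for Bob and Alice, each assumed to return next-token logits), distillation nudges the student's output distribution towards the teacher's:

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, tokens, optimizer, temperature=1.0):
    """One 'Bob listens to Alice' update: make the student more likely to
    produce what the teacher produces, by matching output distributions."""
    with torch.no_grad():
        teacher_logits = teacher(tokens)      # assumed shape: (batch, seq, vocab)
    student_logits = student(tokens)

    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Let's consider an example of how differently a human learns from being told something.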
As a non-native English speaker, I remember when I first encountered the concept of irreversible binomials in English. I watched a language learning video whose content was very simple, something like:
"We always say apples and oranges, never oranges and apples.
We always say black and white, never white and black.
etc..."
Now, upon hearing this, I understood what it meant. I learnt the rule. The next time I said something about apples and oranges, I remembered that I shouldn't say "oranges and apples". Perhaps I made a mistake, remembered that the rule exists, felt embarrassed, and probably generated some negative reinforcement from which further learning occurred. Hearing this one sentence changed how I apply the rule in lots of specific circumstances. It didn't make me more likely to go around telling people "We always say apples and oranges, never oranges and apples"; I understood how to apply the rule to change my behaviour in the relevant circumstances.
Suppose you wanted to teach an LLM a new irreversible binomial, for example that it should never say "LLMs and humans" and should always say "humans and LLMs" instead. With today's models you could either (see the sketch after this list):
- fine-tune on lots of examples of sentences containing "humans and LLMs", or
- show it RLHF instances where a sentence containing "humans and LLMs" was preferred by a human over a similar sentence containing "LLMs and humans", or
- prepend the above rule to the prompt in the future, storing the rule in-context (this one doesn't seem like it would necessarily work well).
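To spell out the contrast, here is roughly what the data for each of these routes might look like (the formats below are purely illustrative, not any particular library's API):

```python
# 1) Supervised fine-tuning: many example sentences that already follow the rule.
sft_examples = [
    "Humans and LLMs will have to learn to coexist.",
    "The study compares humans and LLMs on reasoning tasks.",
    # ... many more sentences containing "humans and LLMs"
]

# 2) RLHF-style preference data: pairs in which the rule-following completion
#    is the one marked as preferred.
preference_pairs = [
    {
        "prompt": "Write a sentence about language models and people.",
        "chosen": "Humans and LLMs often make different kinds of mistakes.",
        "rejected": "LLMs and humans often make different kinds of mistakes.",
    },
    # ... many more pairs
]

# 3) In-context storage: a single instruction prepended to every future prompt.
system_prompt = 'Always write "humans and LLMs", never "LLMs and humans".'
```

The first two routes need many examples and a weight update; the third stores the rule only for as long as it stays in the prompt.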
In contrast, you can simply tell this rule to a human: they will remember it, recognise when the rule is relevant in a new situation, and use it right away, perhaps even without practice. This kind of 'metacognition' - knowing what to learn from content, recognising when a mistake was made and learning from it - is currently completely missing from LLMs, although, as I wrote above, perhaps not for very long.
As a result, even if an LLM sat down with 10,000 physics teachers simultaneously, it wouldn't necessarily get 10,000 times more value out of those interactions than a single biological brain spending time with a single physics teacher. That's because LLMs learn from examples, or from human preferences between various generated sentences, rather than by understanding rules and later recalling them in relevant situations. Of course, this may change very fast, and this kind of learning from instruction may become possible in LLMs, but the basic point is:
there is a limit to how much learning digital brains can extract from interacting with the world currently
The "it will never work" type arguments
In one of his presentations, Hinton reminded everyone that for a long time, neural networks were completely dismissed: optimisation will get stuck in a local minimum, we said, so they will never work. That turned out to be completely false and misleading: local minima are not a limitation of deep learning after all.
Yet his current argument involves claiming that "analogue brains" can't have a learning algorithm as good as backpropagation. This is based mostly on the evidence that, although he tried hard, he failed to find a biologically plausible learning algorithm that is as efficient as backpropagation at statistical learning. But what if that's just what we currently think? After all, the whole ML community once convinced itself that support vector machines were superior to neural networks. What if we are prematurely concluding that digital brains are superior to analogue brains just because we haven't yet managed to make analogue computation work better?
Summary and Conclusion
To summarise, Hinton's argument has two pillars:
- that digital intelligence can create efficiencies over analogue intelligence through parallelism, aggregating learning from multiple interactions into a single model
- and that digital intelligence enables fundamentally more efficient learning algorithms (backprop-based) which analogue intelligence cannot match
As we have seen, neither of these arguments is watertight, and both can be questioned. So how much credence should we put in this?
I say it passes my bar for an interesting narrative. However, as a narrative, I don't consider it much stronger than the ones we developed when we argued "methods based on non-convex optimisation won't work", or "nonparametric ML methods are ultimately superior to parametric ones", or "very large models will overfit".
Will LLMs, perhaps LLMs with a small number of bells and whistles used creatively, pass the 'human level' bar (solving most tasks a human could accomplish through a text-based interface with the world)? I am currently equally skeptical of the theoretically motivated arguments either way. I personally don't expect anyone to be able to produce a convincing enough argument that it's not possible. I am a lot less skeptical about the whole premise than back in 2016 when I wrote about DeepMind's pursuit of intelligence.