I can’t be the only person who is confused by the many, and often contradictory, definitions of AGI floating around. I came pretty late to the party, way later than most people who are deeply involved in this. I partly blame that on the fact that AlphaGo was released when I was a child, and that I didn’t grow up with a supercomputer in the house. Just like everyone else, I have been constantly thinking about AGI timelines. I am deeply convinced that this is the most important century (yet) in human civilization, and that if we play our cards right, we will unlock capabilities that even the most imaginative fiction writers could never have dreamed up. However, I have been repeatedly dismayed by the lack of consensus on what important terms mean, starting with AGI.
Going by the classical definition, an AGI was a system that could pass the Turing test against as large a sample of the general population as possible. However, once we learned how to fake conversation through simple hard-coded instructions, we moved the goalposts a bit: AGI is a machine that can perform any intellectual task a human can do. Well, that conception also started falling apart as super-specialized ML models took by storm a lot of tasks that were considered markers of human ingenuity: chess, Go, StarCraft, etc. We moved the posts again and defined AGI as a machine that outperforms humans on economically significant tasks; however, the rise of industrial automation and the digitalization of everyday life changed the paradigm yet again. We were left with one definition: a machine that can surpass individual human intelligence. With the release of GPT-2, and even more dramatically GPT-3, the world was introduced to models that were just plain better than most humans at most tasks that can be expressed through text. The concepts of zero-shot and few-shot learning pretty much revolutionized how we think about AI, as AI shifted from a black box that could do only one thing to a black box that can do many things if prompted right. A growing body of papers documents some of those extraordinary abilities, such as solving math olympiad problems, playing chess at a grandmaster level, and writing entire codebases from scratch autonomously.
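To make that shift concrete, here is a minimal sketch of zero-shot versus few-shot prompting. The `complete` function is a hypothetical stand-in for whatever text-completion model you have on hand, not any specific vendor's API:

```python
# Minimal sketch of zero-shot vs. few-shot prompting.
# `complete` is a hypothetical placeholder, not a real API.

def complete(prompt: str) -> str:
    """Stand-in for a call to a text-completion language model."""
    raise NotImplementedError("wire this up to your model of choice")

# Zero-shot: the task is described, with no worked examples.
zero_shot = "Translate to French: 'The cat sleeps on the mat.'"

# Few-shot: a handful of demonstrations go in the prompt, and the
# model infers the task from the pattern alone.
few_shot = (
    "English: Good morning.\nFrench: Bonjour.\n"
    "English: Where is the station?\nFrench: Où est la gare ?\n"
    "English: The cat sleeps on the mat.\nFrench:"
)

# The same frozen weights handle both; no gradient update happens.
# That is the shift described above: one black box, many tasks.
```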
I feel like, as the capabilities of LLMs keep expanding, so do the definitions of AGI. Some theorize that AGI is still many, many years away, while others confidently claim that models on the level of GPT-4 are AGI, or at least carry the seeds of AGI. I am writing this article largely to figure out which view resonates with what I have observed over the last few years.
One advantage of being so late to the conversation is that I carry little to no intellectual baggage about what different terms mean or why. I became interested in AI during the Deep Learning boom, so I am, sadly, part of a generation that never thought statistical learning could lead to general intelligence. My world model had therefore constructed AGI as a form of extremely competent deep reinforcement learning algorithm à la AlphaZero. However, I have had to readjust my beliefs after ChatGPT took off, built on imitation learning and preference tuning through RLHF and DPO. This technology was just too good not to be AGI, in my opinion, as it satisfied my definition of AGI at the time: a system as good as the median human at 50% of economically valuable tasks.
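Since I just leaned on RLHF and DPO, a quick aside for the curious: DPO collapses the usual RLHF pipeline into a single classification-style loss over preference pairs. Below is a minimal sketch of that objective (Rafailov et al., 2023); the toy numbers and the beta value are illustrative choices, not canon:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is the summed log-probability that the trainable
    policy or the frozen reference model assigns to the chosen /
    rejected response in a preference pair.
    """
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # Push the policy to prefer the chosen response more strongly
    # than the reference model does, scaled by beta.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with made-up log-probabilities for a batch of 3 pairs.
pol_w = torch.tensor([-12.0, -8.5, -20.1])
pol_l = torch.tensor([-13.2, -9.0, -19.5])
ref_w = torch.tensor([-12.5, -8.7, -20.0])
ref_l = torch.tensor([-12.9, -8.8, -19.8])
print(dpo_loss(pol_w, pol_l, ref_w, ref_l))
```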
You might be asking yourself right now: Freeman, why did you set the standards so low?
Well, the simple answer is that 99% of humans are not as good as the median human at 50% of economically valuable tasks. Heck, I am skeptical such remarkable individuals exist at all. General intelligence is a bit of a misnomer, as humans do not really perceive intelligence through such a broad lens. We celebrate “intelligent” individuals mostly for their contributions to a very narrow field. For instance, a person who is well above average at playing the piano will be considered pretty intelligent in most circles. The same is true for someone who holds a post-graduate degree. Rarely do we observe polymaths so incredible that their contributions span more than five disciplines. Using human intelligence as a proxy for general intelligence would mean that something like AlphaGo or AlphaFold is not just general intelligence but superintelligence, since each outperforms every human in its narrow domain. I am pretty sure very few people would agree with that statement. Instead, many want AGI to be as good as the best humans at ALL tasks.
The problem with an AGI that is good at all tasks is that the search space for intelligence itself is much, much vaster than the search space required to, say, play Go or chess. It is very hard to perfectly articulate a reward function for intelligence itself, short of pointing at human intelligence. That’s why LLMs are so dumb and yet work so well: they model intelligence through a proxy, humanity’s collective written tradition. That is a great strategy; however, like many skeptics, I don’t see it ever reaching a level above what the best humans can produce, as it is limited by human output.
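That "written tradition as proxy" point is not a metaphor; it is literally the training objective: maximize the probability of the next human-written token. A minimal sketch of that loss, with toy shapes and a made-up vocabulary size for illustration:

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits, tokens):
    """Standard language-modeling objective: cross-entropy between the
    model's prediction at position t and the human-written token at t+1.

    logits: (batch, seq_len, vocab_size) model outputs
    tokens: (batch, seq_len) the human-written text as token ids
    """
    # Predict token t+1 from everything up to and including token t.
    preds = logits[:, :-1, :].reshape(-1, logits.size(-1))
    targets = tokens[:, 1:].reshape(-1)
    return F.cross_entropy(preds, targets)

# Toy usage: batch of 2 sequences, length 8, vocabulary of 100.
logits = torch.randn(2, 8, 100)
tokens = torch.randint(0, 100, (2, 8))
print(next_token_loss(logits, tokens))
```

The ceiling of this objective is exact imitation of the training distribution, which is exactly why I argue it is bounded by the best human output it has seen.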
Even though I am critical of LLMs as a path to superintelligence, I still hold my ground that they are indeed generally intelligent. Computer programs don’t need to understand any of the things they are outputting to be intelligent or good at a task (sorry, Yann LeCun). As long as they can do it as well and as reliably as a median human, all that matters is that the task gets done. However, my prediction is that we will soon hit the upper asymptote of the capabilities unlocked by vanilla Transformer models. We may have already done so, as the jump from GPT-4 to GPT-4o didn’t feel as significant as the one from 3.5 to 4. There is a chance some clever bastard will discover a way to make LLMs 50% more efficient and smarter, but given the way these models are trained, I still don’t think they can achieve much more than fitting 99% of the data they were trained on, even accounting for their emergent capabilities. Fitting 99% of the data and extrapolating from there would be an incredible achievement, probably the most significant in human history, but it would still not be enough to count as superintelligence. The only way to truly reach that level is to train on superintelligent data, which, as far as I know, we do not have yet.