How far are we from human-level AGI?

Since Homo sapiens first walked the face of the Earth approximately 200,000 years ago, our species has also been navigating a realm of ideas and knowledge that is less tangible but no less significant. Human history is marked by a series of discoveries and inventions that have shaped its course. Some of these have not merely influenced our narrative but may also have left an imprint on our biology. For instance, the discovery of fire allowed early humans to cook their food, freeing calories to fuel the brain rather than the gut and contributing to the development of our intelligence.

From the invention of the wheel to the creation of the steam engine, human ingenuity eventually ushered in the Industrial Revolution. Electricity paved the way for technology as we know it, while the printing press expedited the dissemination of new ideas and culture, hastening the pace of innovation.

However, progress does not solely result from new material discoveries; it also emerges from new ideas. The history of the so-called Western world unfolds from the fall of the Roman Empire through the Middle Ages, experiencing a rebirth during the Renaissance and the Enlightenment, which emphasised the centrality of the human mind over an omnipotent deity. Yet, as human knowledge advanced, our species began to comprehend its own insignificance. Over two millennia following Socrates, we started “to know we knew nothing”, and our Earth was no longer considered the centre of the universe. The universe itself expanded, with us merely inhabiting a tiny speck of dust within it.

A change of perspective on reality

However, the 20th century may have been the most tumultuous yet in terms of reshaping our understanding of the world. In 1931, Kurt Gödel published his incompleteness theorem. Merely four years later, in a continuation of the theme of “completeness,” Einstein, Podolsky, and Rosen presented the EPR paper, “Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?”, which Niels Bohr promptly countered in defence of the completeness of quantum mechanics.

While Gödel’s theorem showed that not even mathematics can aspire to prove everything (any sufficiently expressive formal system will contain true statements that cannot be proved within it), quantum theory showed that our world lacks determinism at its most fundamental level, leaving us unable to pin down certain quantities, such as an electron’s exact position and momentum at the same time, despite Einstein’s famous objection that “God does not play dice with the universe.” In essence, our limitations extend beyond merely predicting or comprehending what happens in the physical realm. However we attempt to devise an alternative mathematical universe governed by whatever rules we conceive, such an abstract universe will invariably remain incomplete, harbouring true statements that elude proof.

However, beyond mathematical statements, our world is also replete with concepts and realities that we find ourselves incapable of fully articulating, comprehending or even simply defining.

In a manner reminiscent of how the concept of “Truth” was unsettled in the early 20th century, notions such as “Art,” “Beauty,” and “Life” still lack even a rudimentary consensus on their definition. Nor are these isolated cases: “Intelligence” and “Consciousness” undoubtedly belong on the same list.

Defining Intelligence

In an attempt to bridge this gap, in 2007 Legg and Hutter formulated a definition of intelligence in Universal Intelligence: A Definition of Machine Intelligence, which posits that “intelligence measures an agent’s ability to achieve goals in a wide range of environments.” Similarly, in Problem-Solving and Intelligence, Hambrick, Burgoyne, and Altmann argue that the ability to solve problems is not just an aspect or feature of intelligence: it is the essence of intelligence. The two statements are closely related, since achieving goals can be framed as solving problems.
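
For the mathematically inclined, Legg and Hutter also condense this verbal definition into a formal measure. As a rough sketch (the notation below paraphrases their paper), the universal intelligence of an agent \pi is the reward it can expect to accumulate across all computable environments, each weighted by its simplicity:

\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi}

Here E is the set of computable environments, K(\mu) is the Kolmogorov complexity of environment \mu (so simpler environments carry more weight) and V_{\mu}^{\pi} is the expected cumulative reward agent \pi obtains in \mu. An agent only scores highly by performing well across many environments, which is precisely the “wide range of environments” of the verbal definition.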

Broadening the perspective, in Mainstream Science on Intelligence: An Editorial with 52 Signatories, Gottfredson summarises the definition of several researchers with: “Intelligence is a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience. It is not merely book learning, a narrow academic skill, or test-taking smarts. Rather, it reflects a broader and deeper capability for comprehending our surroundings – ‘catching on,’ ‘making sense’ of things, or ‘figuring out’ what to do.”

This definition augments the construct of intelligence beyond a mere “problem-solving skill” by introducing two pivotal dimensions: learning from experience and the ability to comprehend our surroundings. In other words, intelligence should not be seen as an abstract ability to find solutions to general problems, tout-court, but specifically as the ability to apply solutions learned from prior experiences to potentially distinct situations arising from our environment.

This underscores the intrinsic interrelation between intelligence and learning. In How We Learn, Stanislas Dehaene states that “learning is to form a model of the world,” thereby implying that intelligence also requires the capacity to comprehend our surroundings and build an internal model that describes them. Thus, while it may not be sufficient on its own, the ability to create a model of the world is a necessary ingredient of intelligence.

How intelligent are current machines?

When discussing Artificial General Intelligence (AGI) versus narrow AI, we often highlight their differences. Narrow AI, or weak AI, is widespread and successful, often outperforming humans in specific tasks. An illustrative example is AlphaGo, a narrow AI that defeated the Go world champion Lee Sedol 4-1 in 2016. Nonetheless, the 2023 victory of the amateur player Kellin Pelrine against a top-level Go program, using a tactic the AI failed to detect, illustrates the limitations of narrow AI in certain situations: it lacked the human ability to recognise an uncommon strategy and adapt accordingly.

In fact, at a very basic level, every data scientist, even the most inexperienced, understands that any machine learning model, including the simplest, needs to strike a balance between bias and variance. This means learning from the data in a way that lets solutions be understood and generalised rather than memorised. Narrow AI, leveraging the computational power and memory capacity of computers, can relatively effortlessly build intricate models from extensive observed data. However, these models often fail to generalise once conditions change even slightly.
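
As a toy illustration of this trade-off (a minimal sketch; the scikit-learn setup and the synthetic data below are purely illustrative and not drawn from any cited work), a high-degree polynomial can fit noisy training points almost perfectly yet do far worse on fresh points from the same underlying curve, whereas a lower-degree model generalises better:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Noisy observations of a simple underlying function (one period of a sine).
x_train = np.sort(rng.uniform(0, 6, 20)).reshape(-1, 1)
y_train = np.sin(x_train).ravel() + rng.normal(0, 0.2, 20)
x_test = np.linspace(0, 6, 200).reshape(-1, 1)
y_test = np.sin(x_test).ravel()

for degree in (4, 15):
    # The polynomial degree controls the bias/variance balance.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(x_train))
    test_mse = mean_squared_error(y_test, model.predict(x_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")

# The high-degree model typically "memorises" the training noise (tiny training
# error, much larger test error), while the low-degree model generalises better.
```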

It is like formulating a gravitational theory, based on observations, that only works on Earth, only to realise that objects are much lighter on the Moon. If we express our theory of gravity with variables rather than fixed numbers, we can quickly predict the gravitational strength on any planet or satellite simply by plugging in the correct values. If, however, we rely exclusively on numerical equations devoid of symbols, we cannot generalise those equations to other worlds without rewriting them.
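
To make the analogy concrete, here is a hypothetical sketch (the constants are standard physical values; the code itself is purely illustrative): a model that has only memorised the figure of roughly 9.81 m/s² observed on Earth gives the same answer everywhere, while the symbolic law g = GM/r² generalises to the Moon or Mars simply by substituting the right mass and radius:

```python
# Two ways of "knowing" gravity, echoing the analogy above.

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def surface_gravity(mass: float, radius: float) -> float:
    """Symbolic model: surface gravity from a body's mass and radius."""
    return G * mass / radius**2

# "Memorised" model: a single number fitted from Earth observations only.
G_EARTH_MEMORISED = 9.81  # m/s^2

bodies = {
    # name: (mass in kg, radius in m)
    "Earth": (5.972e24, 6.371e6),
    "Moon": (7.342e22, 1.737e6),
    "Mars": (6.417e23, 3.390e6),
}

for name, (mass, radius) in bodies.items():
    symbolic = surface_gravity(mass, radius)
    memorised = G_EARTH_MEMORISED  # same prediction everywhere
    print(f"{name:5s}  symbolic={symbolic:5.2f} m/s^2  memorised={memorised:5.2f} m/s^2")

# The symbolic law predicts roughly 1.62 m/s^2 on the Moon and 3.72 m/s^2 on Mars;
# the memorised Earth-only number gets both badly wrong.
```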

To rephrase, AI may not truly “learn” but rather distils information or experiences. Instead of forming a comprehensive model of the world, AI creates a synopsis.

Despite the hype, have we actually reached AGI?

The definition of AGI, as it is now commonly understood, is that of an AI system that can understand and reason across many cognitive domains at the human level or beyond. This is in contrast to current narrow AI systems that are specialised for specific tasks (like the AlphaGo example we used). AGI denotes an AI system endowed with comprehensive, human-level intelligence spanning diverse realms of abstract thought.

This, as mentioned, requires the ability to create a model of the world that is consistent with experience and permits accurate predictions to be made.

Aligned with the views of most AI researchers and authorities, we remain several years away from attaining genuine AGI, although projections for its arrival vary widely. In the AGI Safety Literature Review, Everitt, Lea, and Hutter state: “Surveys among AI researchers have found median predictions for AGI between 2040 and 2061, with estimates varying widely, from never to just a few years into the future.” What is certain is that AGI is not among us just yet.

A recent paper titled Sparks of Artificial General Intelligence: Early experiments with GPT-4 states: “We contend that […] GPT4 is part of a new cohort of LLMs […] that exhibit more general intelligence than previous AI models. We discuss the rising capabilities and implications of these models. We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4’s performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4’s capabilities, we believe that it could reasonably be viewed as a nearly (yet still incomplete) version of an artificial general intelligence (AGI) system”.

The catch? The paper is from Microsoft, an OpenAI partner.

As quoted in a May New York Times article, Carnegie Mellon professor Maarten Sap said: “The Sparks of AGI is an example of some of these big companies co-opting the research paper format into PR pitches.” In an interview with IEEE Spectrum, researcher and robotics entrepreneur Rodney Brooks underscored that in evaluating the capabilities of systems like ChatGPT, we often “mistake performance for competence.”

Mistaking performance for competence means, to put it another way, building a synopsis of the world instead of a model of the world.

One of the most glaring issues has to do with the data the AI model is trained on. Most models are trained on text alone and do not have the ability to see, hear, smell or otherwise experience the real world. As I have previously posited in Chat-GPT, alias Johnny-come-lately, this scenario bears resemblance to Plato’s allegory of the cave, in which individuals perceive mere shadows on a wall rather than the beings that cast them. Even if such models manage to build a model of the world, theirs is a text-only world, syntactically correct but not necessarily semantically comprehensive. It lacks the “common sense” that comes from direct perception of reality, and that is a glaring deficiency.

What are the main limitations of current Large Language Models?

Another of the most debated challenges of Large Language Models (LLMs) such as ChatGPT or GPT-4 is their tendency to hallucinate. Hallucination refers to the tendency of these models to fabricate references and facts, or to produce output that is at times completely nonsensical. Hallucinations stem, at least in part, from a lack of understanding of the cause-and-effect relationships between events.

In Is ChatGPT a Good Causal Reasoner? A Comprehensive Evaluation, the authors conclude that “ChatGPT has a serious causal hallucination issue, where it tends to assume causal relationships between events, regardless of whether those relationships actually exist”. They further state that “ChatGPT is not a good causal reasoner, but a good causal interpreter”, highlighting again its capacity to distil causal connections when they are spelled out, but its inability to infer them by constructing a world model in which such relationships naturally reside. While the article focuses on ChatGPT, the finding can arguably be extended to any LLM.

Essentially, we can discern that LLMs are good at recognising and extracting causal relationships from data, but lack the ability to actively reason about novel causal scenarios on their own. They possess capabilities for causal induction through observation, but not for causal deduction.

The distinction highlights a limitation: the system can recognise causal patterns, but lacks the capacity for abstract causal reasoning. It is not generating new causal insights, merely interpreting causal links already present in the data.
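
A small numerical sketch of the same point (the data is synthetic and purely illustrative): in a simple linear setting, data generated with X causing Y can be fitted equally well in either direction, so the associations a model extracts from passive observation cannot, by themselves, reveal which variable is the cause:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Ground truth: X causes Y (Y = 2*X + noise).
x = rng.normal(0, 1, n)
y = 2 * x + rng.normal(0, 1, n)

# Fit a line in both directions by ordinary least squares.
slope_xy = np.polyfit(x, y, 1)[0]  # predict Y from X
slope_yx = np.polyfit(y, x, 1)[0]  # predict X from Y

corr = np.corrcoef(x, y)[0, 1]
print(f"OLS slope Y~X = {slope_xy:.3f}, OLS slope X~Y = {slope_yx:.3f}")
print(f"R^2 predicting Y from X = {corr**2:.3f}")
print(f"R^2 predicting X from Y = {corr**2:.3f}")  # identical: squared correlation is symmetric

# Both regressions fit the data equally well, so nothing in the observations alone
# says whether X causes Y or Y causes X. Telling them apart requires assumptions or
# interventions, i.e. causal deduction rather than pattern extraction.
```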

However, if intelligence entails learning from experience, and learning translates into creating a model of the world that we can use to understand our surroundings, then causal deduction constitutes a fundamental element of learning and, consequently, of intelligence, a facet that existing models are missing. It stands as a pivotal milestone, or at least one of the crucial ones, on the path towards AGI.

Conclusion

As the early 20th century demonstrated, reality is often different from the intuition we develop through daily observation. Just as physicists struggled to grasp quantum mechanics at the beginning of the 20th century because it defied human intuition, we are now building entities confined to a slice of reality even narrower than the one we humans can experience. And just as we eventually came to grips with a reality that defied our experience, we will truly reach AGI when, and only when, we are able to create systems that can question their own reality, that are capable of self-enquiry and that, at the very least, can use causal deduction to formulate a plausible model of the world.

This prospect could signify another stride in human history toward acknowledging our progressively diminishing significance within this universe.

 

Appendix: 

Definitions of Narrow AI and AGI by the Stanford Institute for Human-Centered Artificial Intelligence (HAI)

  1. Narrow AI is intelligent systems for one particular thing, e.g., speech or facial recognition. 
  2. Human-level AI, or Artificial General Intelligence (AGI), seeks broadly intelligent, context-aware machines. It is needed for effective social chatbots or human-robot interaction.

Definitions of Narrow AI and AGI by the three main LLMs

ChatGPT (OpenAI):

  1. Narrow AI (Weak AI): Narrow Artificial Intelligence, commonly referred to as Narrow AI or Weak AI, pertains to artificial intelligence systems that are designed and trained to perform specific, well-defined tasks within a limited domain. These systems demonstrate intelligence and exhibit proficiency in the designated task, often surpassing human capabilities, yet they lack the capacity to generalise their knowledge or adapt to tasks outside their predetermined scope. Narrow AI excels at specialised functions such as language translation, image recognition, and game playing, but its abilities do not extend to comprehending context beyond its programmed expertise.
  2. AGI (Artificial General Intelligence): Artificial General Intelligence, denoted as AGI, signifies a form of artificial intelligence that embodies human-like cognitive capabilities across a diverse array of intellectual tasks. Unlike Narrow AI, AGI possesses the capacity to understand, learn, and apply knowledge across various domains, simulating human reasoning, problem-solving, and adaptability. AGI systems have the potential to tackle novel situations, understand natural language, and engage in creative thinking, paralleling the depth and breadth of human intelligence. Achieving AGI represents a milestone where machines attain a level of cognitive versatility akin to human minds, capable of autonomously navigating a spectrum of tasks without domain-specific constraints.

 Bard (Google):

  1. Narrow AI (ANI) is a type of AI that is designed to solve a specific problem or task. For example, a narrow AI chatbot can be used to have conversations with humans, but it cannot perform other tasks, such as driving a car or playing chess.
  2. Artificial general intelligence (AGI) is a hypothetical type of AI that would have the ability to perform any intellectual task that a human being can. AGI would be able to learn, reason, and solve problems in a way that is indistinguishable from human intelligence.

Here is a table that summarises the key differences between narrow AI and AGI:

| Feature | Narrow AI | AGI |
| --- | --- | --- |
| Purpose | Solves a specific problem or task | Can perform any intellectual task |
| Intelligence | Limited to the task it is designed for | General intelligence, similar to humans |
| Ability to learn | Can learn to improve its performance on a specific task | Can learn new tasks and adapt to new situations |
| Ability to reason | Can reason within the context of the task it is designed for | Can reason in a general way, similar to humans |
| Ability to solve problems | Can solve problems within the context of the task it is designed for | Can solve problems in any domain, similar to humans |

It is important to note that narrow AI is a real and existing technology, while AGI is still a hypothetical concept. However, there is a lot of research and development happening in the field of AI, and it is possible that AGI may be achieved in the future.

Claude (Anthropic):

  1. Narrow AI: Narrow artificial intelligence (narrow AI) is an AI system designed and trained to perform a single, specific task. Narrow AI systems do not have general intelligence and cannot transfer their skills to other tasks. Common examples of narrow AI include chess engines, self-driving cars, and chatbots. The formal definition of narrow AI is: An artificial intelligence system that is focused on one narrow task and goal. Narrow AI systems have a limited range of capabilities and cannot transfer knowledge or skills to other domains or tasks. Narrow AI systems are not capable of general intelligence comparable to humans.
  2. AGI: Artificial general intelligence (AGI) refers to an AI system with general cognitive abilities and the capacity to understand or learn any intellectual task that a human being can. AGI systems have a broad range of capabilities and can apply knowledge and skills to a variety of domains and tasks. The formal definition of AGI is: An artificially intelligent system that exhibits intelligence and capabilities at least equal to that of a human across a wide range of cognitive domains and tasks. AGI systems display general intelligence that allows for knowledge transfer across different domains and the ability to understand or learn any intellectual task at the level of the average human. AGI systems are capable of fluid reasoning, knowledge retention, self-improvement, and human-level comprehension and problem solving skills.
