AI has long struggled with truth and correctness, and ironically, much of the problem stems from the way human thinking shapes these systems. But a new wave of artificial intelligence is emerging—one that sheds the constraints of human logic and ventures into the unknown, potentially advancing machine learning far beyond human capabilities.
DeepMind’s AlphaGo marked a turning point in AI development. The original system still leaned on human expert games, but its successor, AlphaGo Zero, learned to play Go with no human game data at all—given only the rules—through a process called self-play reinforcement learning. It discovered strategies entirely on its own by playing millions of games against itself and analyzing the outcomes.
This self-teaching approach was later generalized to chess with AlphaZero, another milestone in AI history. In 100 matches against Stockfish, then the world’s strongest chess engine, AlphaZero won 28, drew 72, and lost none. Unlike traditional chess engines, which had been tuned on human strategies, AlphaZero developed its own understanding of the game without knowing famous openings like the Queen’s Gambit or any grandmaster’s playbook. It relied solely on cold, hard logic—learning from millions of games through trial and error.
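The self-play loop described above can be sketched on a much humbler game. The toy below (my own illustration, not DeepMind’s code) uses a plain lookup table to learn Nim by playing against itself: 21 stones, each turn a player removes 1–3, and whoever takes the last stone wins. AlphaGo Zero and AlphaZero use deep neural networks and Monte Carlo tree search instead of a table, but the core idea—both sides share one improving policy and learn from game outcomes alone—is the same.

```python
import random

random.seed(0)

Q = {}        # (stones_left, action) -> running-average game outcome
N = {}        # visit counts for the averaging update
EPSILON = 0.1 # fraction of moves spent exploring at random

def actions(stones):
    return [a for a in (1, 2, 3) if a <= stones]

def choose(stones):
    if random.random() < EPSILON:                        # explore
        return random.choice(actions(stones))
    return max(actions(stones), key=lambda a: Q.get((stones, a), 0.0))

def self_play_episode():
    """Play one game against itself; return per-player move histories."""
    stones, player = 21, 0
    history = {0: [], 1: []}
    winner = 0
    while stones > 0:
        a = choose(stones)
        history[player].append((stones, a))
        stones -= a
        winner = player                                  # last mover wins
        player = 1 - player
    return history, winner

for _ in range(20000):
    history, winner = self_play_episode()
    for p in (0, 1):
        reward = 1.0 if p == winner else -1.0
        for key in history[p]:                           # Monte Carlo update
            N[key] = N.get(key, 0) + 1
            old = Q.get(key, 0.0)
            Q[key] = old + (reward - old) / N[key]

best_first_move = max(actions(21), key=lambda a: Q.get((21, a), 0.0))
print(best_first_move)
```

Given enough self-play games, a learner like this tends to rediscover the classic Nim strategy of leaving the opponent a multiple of 4 stones—knowledge it was never told, only earned through wins and losses.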
AlphaZero went on to master shogi the same way, and related systems—OpenAI Five in Dota 2 and DeepMind’s AlphaStar in StarCraft II—continued to break ground because they were no longer bound by the limits of human thinking. Without trying to emulate human strategies, these systems developed their own playstyles, taking advantage of their unique cognitive strengths. As a result, they didn’t just beat humans—they surpassed them in unexpected ways.
The lesson? Human knowledge, while vast, can be a limiting factor for AI. When AIs are allowed to explore and experiment independently, they can achieve things that no human mind would ever predict.
Large language models (LLMs) like ChatGPT have been trained on vast amounts of human-generated text. While they excel at language and communication, they often “hallucinate,” confidently providing incorrect information. This is partly because language, unlike facts, lives in gray areas: LLMs rely heavily on reinforcement learning from human feedback (RLHF), where human raters select the best-sounding responses, regardless of whether they are objectively correct.
But OpenAI’s latest o1 model is starting to break away from this dependence on human input. Inspired by AlphaGo’s experimental approach, o1 incorporates trial and error to refine its answers. Unlike previous models that acted as advanced autocomplete systems, o1 goes through a process called “chain of thought,” reasoning its way through problems during a brief “thinking time” before answering.
This shift introduces reinforcement learning into o1’s problem-solving process, allowing the model to experiment and learn from its successes and failures, much like AlphaGo. In areas where clear right or wrong answers exist—like coding or factual questions—o1 is beginning to outperform even experts, building its understanding from scratch rather than relying on human-dictated steps.
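OpenAI has not published o1’s training method, so the snippet below is only a loose analogue of the general principle: in domains with a cheap, exact verifier, a system can propose many candidates and keep only the ones that check out. Here the “model” is just a random guesser and the task is integer factoring, both hypothetical stand-ins.

```python
import random

random.seed(1)

def propose_factors(n, attempts=1000):
    """Propose random factor pairs for n and verify each candidate.

    The guesser stands in for a model sampling answers; the modulo
    check is the verifier that makes trial and error pay off.
    """
    for _ in range(attempts):
        a = random.randint(2, n - 1)   # a "guess" in place of a model sample
        if n % a == 0:                 # the verifier: cheap, exact check
            return a, n // a
    return None

result = propose_factors(91)
print(result)
```

The point of the sketch is the asymmetry: generating a candidate is easy and unreliable, but verifying one is exact—so search plus verification can reach correct answers no single guess would.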
While LLMs are still confined to the limitations of language, a deeper shift is underway in AI research—one that moves beyond text and into reality itself. AIs embedded in robot bodies are starting to develop their own understanding of the physical world through trial and error, similar to AlphaGo’s approach to games.
Freed from the framing of human disciplines like physics or chemistry, these embodied AIs will probe and experiment with the world directly, creating knowledge that no human could ever uncover. Without the biases of human thought, they may soon stumble upon new scientific truths and technologies that lie beyond our imagination.
Despite the potential for rapid advancements, real-world learning comes with challenges. Unlike simulations, which can run at high speeds, reality operates at one minute per minute, and robots must physically interact with the world. Nevertheless, AI’s ability to share knowledge and pool insights through swarm learning could significantly accelerate progress.
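Real robot fleets would pool learning by sharing neural-network weights or raw experience; the tiny sketch below (agent names and values are made up for illustration) shows the underlying principle with a tabular merge: each agent explores independently, then their value estimates are combined, weighted by how often each agent actually visited a state.

```python
from collections import defaultdict

def merge_value_tables(tables):
    """Average value estimates across agents, weighted by visit counts.

    Each table maps state -> (estimated_value, visit_count).
    """
    total_value = defaultdict(float)
    total_visits = defaultdict(int)
    for table in tables:
        for state, (value, visits) in table.items():
            total_value[state] += value * visits
            total_visits[state] += visits
    return {s: total_value[s] / total_visits[s] for s in total_visits}

# Two hypothetical agents that explored overlapping regions of a building:
agent_a = {"door": (0.8, 4), "hall": (0.2, 1)}
agent_b = {"door": (0.6, 1), "lab": (0.5, 2)}

merged = merge_value_tables([agent_a, agent_b])
print(merged["door"])  # (0.8*4 + 0.6*1) / 5 = 0.76
```

Because the merge is weighted by experience, an agent that has seen a state many times dominates the pooled estimate—so the swarm effectively learns at the speed of all its members combined, even though each body still moves at one minute per minute.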
Companies like Tesla, Figure, and Sanctuary AI are racing to develop humanoid robots capable of functioning in the real world. Once this happens, these robots could begin experimenting and learning about the world in ways that surpass human understanding. Although their discoveries may seem bizarre to us, they will undoubtedly push the boundaries of science and technology.
OpenAI’s o1 model represents more than just another step in AI development—it offers a glimpse into the future, where AIs will evolve past human knowledge. As they move beyond human language and reasoning, these alien machines will begin to solve problems and unlock truths in ways we can’t even comprehend.
With AI’s continuous advancements, the question is no longer whether machines will surpass humans in every conceivable way—only when.
What a time to be alive!
By Impact Lab