Demis Hassabis (DeepMind Co-founder) – Towards General Artificial Intelligence | MIT (Jul 2016)


Chapters

00:00:17 AI Development at DeepMind and the AlphaGo Project
00:03:08 DeepMind: Fusing Silicon Valley and Academia for Artificial General Intelligence Research
00:08:51 Deep Reinforcement Learning and Grounded Cognition: A Framework for Building General Intelligence
00:16:19 Neural Turing Machines for Symbolic Reasoning
00:21:07 Unlocking the Mysteries of Go: From Rules to the Deep Blue Challenge
00:31:15 Training AlphaGo: From Supervised Learning to Reinforcement Learning and Monte Carlo Tree Search
00:39:53 AlphaGo's Performance Against Computer Programs and Human Players
00:42:08 AlphaGo: Defeating Professional Go Players
00:45:21 AlphaGo's Creative Play and Intuition in Go
00:52:11 AlphaGo's Surprising Moves and Cultural Impact
00:55:57 AlphaGo's Impact on Go and AI's Potential for Real-World Applications
01:02:58 Supervised Learning vs. Reinforcement Learning in AI Development

Abstract

The Evolution of AI: From Chess to Go, and Beyond

Breaking Ground in AI: Demis Hassabis and the DeepMind Odyssey

Demis Hassabis, a figure synonymous with contemporary AI advancements, recently delivered an enlightening talk at MIT. Hassabis, whose journey from chess prodigy to AI luminary encompasses extensive studies in computer science, successful gaming ventures, and neuroscience research, is the mastermind behind DeepMind, an organization that has redefined the landscape of artificial intelligence.

DeepMind’s Inception and Ambitious Mission

DeepMind, founded in 2010 and later joining forces with Google in 2014, embarked on a mission to unravel the enigma of intelligence. Its goal is to replicate the versatility and adaptability of human intelligence in machines, thereby enabling them to autonomously tackle a diverse array of challenges. The organization has grown into a large team of over 200 research scientists and engineers. Demis Hassabis views DeepMind as an “Apollo program for AI,” aiming to fundamentally solve intelligence and subsequently use it to address various challenges.

Philosophical Underpinnings and Methodological Approaches

DeepMind’s philosophy centers on creating general-purpose learning algorithms: systems that learn automatically from raw inputs and experience, without pre-programming, and that remain flexible enough to handle unforeseen situations. This stands in stark contrast to narrow AI systems such as the chess-playing computer Deep Blue, which are hand-crafted for a single task. DeepMind’s goal is an AGI that is flexible, adaptive, and potentially inventive across a wide range of tasks.

Advancing the Frontiers of AI Through Reinforcement Learning

Reinforcement learning (RL), where an AI agent learns from interactions within an environment guided by reward signals, is at the heart of DeepMind’s strategy. This approach, inspired by the brain’s dopamine-driven learning mechanisms, is seen as pivotal in achieving general intelligence.
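The reward-driven loop Hassabis describes can be sketched in its simplest tabular form. This is a minimal illustration, not anything from the talk: the chain environment, constants, and the Q-learning update rule (rather than DeepMind's deep-network agents) are my own choices for clarity.

```python
import random

random.seed(0)  # deterministic toy run

# Minimal tabular Q-learning sketch (illustrative only; DeepMind's agents use
# deep networks, but the reward-driven update is the same idea).
# Hypothetical environment: a 5-state chain where moving right eventually
# reaches a rewarding terminal state.
N_STATES = 5
ACTIONS = [0, 1]                      # 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Toy dynamics: reward 1.0 only on reaching the final state."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0), nxt == N_STATES - 1

for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit current estimates, sometimes explore
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward, done = step(state, action)
        # temporal-difference update toward reward + discounted future value
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = nxt

# After training, the greedy policy from the start state is "right".
```

The reward signal alone shapes the policy: no move is ever labelled correct, which is the dopamine-like learning mechanism the talk alludes to.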

From Virtual Playgrounds to Real-World Challenges

DeepMind posits that true AI must be grounded in rich sensorimotor experience. Virtual worlds and games are ideal platforms for developing and testing AI algorithms: they supply effectively unlimited training data, controlled and reproducible conditions, and independent benchmarks of progress, all within challenging and diverse environments.

Deep Reinforcement Learning: The Fusion of Learning and Perception

DeepMind’s approach to AI, combining deep learning with reinforcement learning, is exemplified in Deep Reinforcement Learning (DRL). DRL, initially tested on the Atari 2600 platform, demonstrated AI’s ability to master various games, signaling a significant leap in AI capabilities. The approach involves an agent interacting with an environment, receiving rewards for positive actions and penalties for negative ones, and learning to maximize rewards. DeepMind’s DQN algorithm, a deep reinforcement learning system, plays Atari games without prior knowledge of rules or structure, achieving human-level performance.
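The core of the DQN update can be illustrated by its Bellman regression target: the Q-network is trained toward the observed reward plus the discounted best value the (frozen) target network assigns to the next state. A simplified sketch, not DeepMind's implementation; the function name and batch shapes here are assumptions.

```python
import numpy as np

GAMMA = 0.99  # illustrative discount factor

def dqn_targets(rewards, q_next, dones, gamma=GAMMA):
    """Bellman targets for a batch of transitions (sketch, not DeepMind's code).

    rewards: (B,) observed rewards
    q_next:  (B, n_actions) Q-values of next states from the target network
    dones:   (B,) booleans; terminal transitions get no bootstrap term
    """
    max_q_next = q_next.max(axis=1)
    return rewards + gamma * max_q_next * (~dones)

rewards = np.array([1.0, 0.0])
q_next = np.array([[0.2, 0.5],    # best next-state value 0.5
                   [1.0, 0.3]])   # terminal, so its values are ignored
dones = np.array([False, True])
targets = dqn_targets(rewards, q_next, dones)  # [1.0 + 0.99*0.5, 0.0]
```

In the full system these targets are the labels for a regression loss on the Q-network's predictions; the screen pixels-in, joystick-actions-out part is handled by the convolutional network.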

Artificial Hippocampus and Neural Turing Machines (NTMs)

DeepMind’s exploration into systems neuroscience has led to the development of the Neural Turing Machine (NTM), a model combining neural networks with an external memory store, enabling symbolic reasoning and complex problem-solving. NTMs address the need for large-scale, controllable memory in neural networks. They comprise a recurrent neural network controller, analogous to a CPU, coupled to a large external memory bank, with reading and writing learned end-to-end through gradient descent. NTMs open up new avenues for symbolic reasoning and can tackle classic AI problems such as the SHRDLU class of tasks, involving block manipulation and scene understanding. Mini-SHRDLU, a simplified 2D blocks-world problem, demonstrates NTMs’ problem-solving capabilities. Trained using reinforcement learning, they improve over time and are tested on unseen scenarios. They show promise on logic puzzles and graph problems, with a significant publication on NTMs expected later in the year.
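What makes the NTM's memory learnable by gradient descent is that reads and writes are soft, weighted operations rather than discrete lookups. A sketch of content-based read addressing under the assumption of cosine-similarity matching as in the NTM design (the function name, sharpness parameter, and toy data are mine):

```python
import numpy as np

def content_address(memory, key, beta=5.0):
    """Soft read weights over memory slots (sketch of NTM content addressing).

    memory: (N, M) matrix of N memory slots; key: (M,) query emitted by the
    controller; beta: sharpness of the focus. Returns a (N,) weighting that
    sums to 1, so the whole operation is differentiable.
    """
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sim = memory @ key / norms          # cosine similarity per slot
    w = np.exp(beta * sim)
    return w / w.sum()                  # softmax -> soft read weighting

memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.7, 0.7]])
key = np.array([1.0, 0.1])              # query resembling the first slot
w = content_address(memory, key)
read_vector = w @ memory                # blended read across all slots
```

Because every slot contributes a little to the read, errors can flow back through the weighting to train both the controller and what gets stored.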

The AlphaGo Phenomenon: A Milestone in AI

AlphaGo, combining neural networks with advanced planning techniques, marked a watershed moment in AI history. By mastering Go, a game with an enormous branching factor and a deep reliance on intuition as well as calculation, AlphaGo demonstrated unprecedented AI sophistication. It used two deep neural networks, a policy network and a value network, trained through supervised learning and reinforcement learning, respectively. The policy network predicts promising moves, while the value network evaluates board positions. Guiding Monte Carlo Tree Search with these networks made the search far more efficient, underpinning both AlphaGo’s strength and its originality.

Training AlphaGo: A Blend of Human Mimicry and Self-Learning

AlphaGo’s training involved a blend of human mimicry and self-learning. It utilized a policy network trained through supervised learning to mimic human moves, and a value network to predict game outcomes. Additionally, it underwent millions of self-play games, constantly improving its strategies and generating a new dataset of expert-level games.

Monte Carlo Tree Search: Enhancing AI Strategy

The incorporation of Monte Carlo Tree Search (MCTS) in AlphaGo’s design revolutionized its strategy. MCTS, augmented by neural networks, allowed for a more efficient search process, evaluating the desirability of moves based on a blend of action value and prior probability.
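The blend of action value and prior probability can be sketched as a selection rule of this shape: each candidate move's score is its current mean value plus an exploration bonus scaled by the policy network's prior and discounted by visits. The constants, data, and function name are illustrative assumptions, not AlphaGo's actual parameters.

```python
import math

def select_move(edges, c_puct=1.0):
    """Pick the edge maximising Q + u(P) (sketch of AlphaGo-style selection).

    edges: list of dicts with Q (mean action value from past simulations),
    P (prior move probability from the policy network), N (visit count).
    """
    total_visits = sum(e["N"] for e in edges)

    def score(e):
        # exploration bonus: high prior and few visits -> large bonus
        u = c_puct * e["P"] * math.sqrt(total_visits) / (1 + e["N"])
        return e["Q"] + u

    return max(range(len(edges)), key=lambda i: score(edges[i]))

edges = [
    {"Q": 0.5, "P": 0.2, "N": 10},   # well-explored move with decent value
    {"Q": 0.3, "P": 0.6, "N": 1},    # high-prior move, barely explored
]
chosen = select_move(edges)          # the under-explored high-prior move wins
```

As visit counts grow, the bonus term shrinks and the search increasingly trusts the measured action values, which is how the neural networks' intuition and the tree's calculation are combined.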

AlphaGo’s Astonishing Achievements and Future Implications

AlphaGo’s stunning victories, including a 5-0 win against Fan Hui and a 4-1 triumph over Lee Sedol, not only shocked the Go community but also ignited global discourse on AI’s potential and limitations. Its original play, especially the famous “Move 37” against Lee Sedol, showcased an AI’s ability to generate novel strategies that overturned long-established Go wisdom. DeepMind’s vision extends beyond mastering games: technologies developed through projects like AlphaGo are being directed towards real-world problems in healthcare, robotics, and personal assistance. The success of AlphaGo has also spurred interest in demystifying deep learning models, with initiatives like virtual brain analytics at the forefront. Plans are underway to train AI using only reinforcement learning, starting from a blank slate; this approach could unlock new fields of expertise and understanding, while ensuring AI generalizes beyond games to real-world applications.

Pioneering a New Era of Artificial Intelligence

DeepMind, under Hassabis’s leadership, is not just transforming our understanding of games; it is reshaping our perspective on artificial intelligence. From the depths of ancient Go strategies to the potential of AI in addressing complex real-world challenges, DeepMind’s journey is a testament to the boundless possibilities within the field of artificial intelligence.


Notes by: datagram