Demis Hassabis (DeepMind Co-founder) – Presentation at Cambridge Society for the Application of Research (Apr 2017)


Chapters

00:00:05 Record-Breaking Cambridge Science Festival Event
00:02:54 Explorations at the Edge of Knowledge: DeepMind's Quest for Artificial General Intelligence
00:10:07 General Intelligence Through Deep Reinforcement Learning
00:13:28 Machine Intelligence in the Ancient Game of Go
00:19:42 How AlphaGo Revolutionized the Game of Go
00:26:37 AlphaGo's Innovative Strategies in the Game of Go
00:29:06 Surprising Moves in AlphaGo vs. Lee Sedol
00:32:15 Intuition and Creativity in AlphaGo's Gameplay
00:37:07 AI as Meta-Solution: Innovations in Go, Energy, and Medicine
00:48:42 AI for Accelerating Scientific Innovation

Abstract

“The Future of Intelligence: Insights from the Cambridge Science Festival”

Introduction

Speaker Introduction: David Wallace warmly welcomed everyone, especially those attending as part of the Cambridge Science Festival. He noted that no alarm test was planned, explained the safety procedure should an alarm sound, and asked attendees to switch off their mobile phones so they would not interfere with the recording.

Event Popularity: The talk’s popularity exceeded expectations, with over 700 poster downloads from the website. The theatre’s maximum capacity of 300 was reached, prompting the arrangement of an overspill room. Wallace expressed his gratitude to the team, particularly Valerie Anderson, Edward Briffa, and Andrew Shepard, for their efforts in managing the large turnout.

Demis Hassabis Introduction: Hassabis’s reputation and achievements made an elaborate introduction unnecessary. Wallace suggested that anyone unfamiliar with Hassabis could search for “Demis Hassabis” or “DeepMind” online.

Invitation to Demis Hassabis: Wallace invited Hassabis to the stage, expressing his gratitude for his presence.

The Cambridge Society for the Application of Research (CSAR) recently hosted an extraordinary event as part of the Cambridge Science Festival, featuring renowned AI researcher and DeepMind co-founder Demis Hassabis. The event drew an overwhelming response, with more than 700 downloads of the poster alone, and gave Hassabis a platform to share his insights into artificial intelligence (AI), focusing on DeepMind's work on general learning systems and reinforcement learning.

The Essence of DeepMind’s Mission

DeepMind’s Unique Research Culture and Mission: DeepMind combines the best aspects of academia and start-up culture to create a unique research environment that fosters innovation and scientific progress. Its mission is to solve intelligence, with the belief that once intelligence is understood, it can be used to solve various other problems.

General Purpose Learning Machine: DeepMind aims to create the world’s first general purpose learning machine, a system that can learn automatically from raw inputs and operate across a wide range of tasks without being explicitly programmed.

Artificial General Intelligence (AGI) vs. Narrow AI: AGI is a system that can learn and operate across a wide range of tasks, while narrow AI is an expert system built on heuristics and rules hand-crafted by human programmers. Narrow AI, exemplified by IBM's Deep Blue, can be superhuman at a single task but lacks generality and, in Hassabis's view, true intelligence.

Reinforcement Learning: The Path to AGI

Reinforcement Learning as a Framework for Intelligence: Demis Hassabis describes reinforcement learning, a learning approach inspired by the way animals learn from reward, as a framework that is, in principle, sufficient for general intelligence.

Mathematical and Biological Basis: AIXI, a mathematical framework developed by Marcus Hutter and explored in the doctoral work of DeepMind co-founder Shane Legg, suggests that a reinforcement-learning agent with unlimited compute and memory would be generally intelligent. Reinforcement learning is also biologically plausible: the brain's dopamine system implements a form of it, signalling errors in reward prediction.
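
For readers who want the flavor of the formal argument, the idea behind AIXI can be written schematically (this is a simplified paraphrase, not Hutter's exact definition, and it glosses over the alternation of maximization over future actions and expectation over future observations): the agent picks the action maximizing expected future reward under a complexity-weighted mixture of all computable models of its environment,

\[
a_t \;=\; \arg\max_{a}\; \sum_{\nu \in \mathcal{M}} 2^{-K(\nu)} \; \mathbb{E}_{\nu}\!\left[\, \sum_{k=t}^{m} r_k \;\middle|\; a_t = a \right],
\]

where \(\mathcal{M}\) is the class of computable environments, \(K(\nu)\) is the description length (complexity) of environment \(\nu\), and \(r_k\) are future rewards. The optimality results hold only with unbounded computation, which is why the framework is a proof of principle rather than a practical algorithm.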

Combining Deep Learning and Reinforcement Learning: Deep reinforcement learning combines deep learning, which builds perceptual representations from raw input, with reinforcement learning, which handles planning and goal-directed action selection. DQN, DeepMind's deep reinforcement learning system, achieved human-level performance on classic Atari games.

Key Features of DQN: DQN took raw pixel inputs from the screen, around 30,000 numbers per frame. The system had no prior knowledge of what the controls did, how points were scored, or what the video stream represented. It learned by playing each game many times, discovering the rules, the controls, and the game's structure through experience.

Success of DQN: The same DQN architecture mastered dozens of Atari games, playing many of them at or above the level of expert human players. This breakthrough demonstrated the potential of general learning systems. The code for DQN was open-sourced, allowing researchers to explore the algorithm further.
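
To make the recipe concrete, here is a minimal sketch of the DQN training loop, assuming a toy environment and a linear Q-function in place of the convolutional network and Atari emulator used in the real system (the names and hyperparameters below are illustrative placeholders, not DeepMind's released code):

```python
import random
from collections import deque

import numpy as np

# Minimal sketch of the DQN recipe: Q-learning with function approximation,
# an experience-replay buffer, and a periodically synced target network.
# A linear Q-function and a toy random environment stand in for the
# convolutional network and Atari emulator of the real system.

STATE_DIM, NUM_ACTIONS = 8, 4       # illustrative sizes; an Atari frame is ~30,000 pixels
GAMMA, LR, EPSILON = 0.99, 0.01, 0.1

class ToyEnv:
    """Stand-in environment: random observations, reward 1 only for action 0."""
    def reset(self):
        return np.random.randn(STATE_DIM)

    def step(self, action):
        reward = 1.0 if action == 0 else 0.0
        done = np.random.rand() < 0.05
        return np.random.randn(STATE_DIM), reward, done

def q_values(w, state):
    return w @ state                              # one Q-value per action

def act(w, state):
    if random.random() < EPSILON:                 # epsilon-greedy exploration
        return random.randrange(NUM_ACTIONS)
    return int(np.argmax(q_values(w, state)))

weights = np.zeros((NUM_ACTIONS, STATE_DIM))      # the "Q-network"
target_weights = weights.copy()                   # slow-moving copy for stable targets
replay = deque(maxlen=10_000)                     # experience-replay buffer

env = ToyEnv()
state = env.reset()
for step in range(5_000):
    action = act(weights, state)
    next_state, reward, done = env.step(action)
    replay.append((state, action, reward, next_state, done))
    state = env.reset() if done else next_state

    if len(replay) >= 32:                         # learn from a random minibatch
        for s, a, r, s2, d in random.sample(list(replay), 32):
            target = r if d else r + GAMMA * np.max(q_values(target_weights, s2))
            td_error = target - q_values(weights, s)[a]
            weights[a] += LR * td_error * s       # gradient step on the squared TD error

    if step % 500 == 0:
        target_weights = weights.copy()           # periodic target-network sync
```

Even in this sketch the key ingredients are visible: learning purely from (state, action, reward) experience, replaying stored transitions to break correlations, and computing targets from a slowly updated copy of the network for stability.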

DeepMind’s pathway to achieving AGI is through reinforcement learning (RL), where an agent interacts with its environment to attain goals, guided by observations, rewards, and a continuously evolving statistical model. This approach enables the agent to make real-time decisions aimed at maximizing its goal achievement. RL’s potential is underpinned by both mathematical frameworks, like AIXI, and biological evidence, suggesting its central role in achieving general intelligence.

Go: An Ancient Game of Strategy and Elegance

Game Overview: Go is an ancient game of strategy and elegance that originated in China over 3,000 years ago. It is played on a 19×19 grid, where two players take turns placing black and white stones with the aim of surrounding territory and capturing the opponent's stones. The player controlling the most territory at the end of the game wins.

History and Cultural Significance: Go holds a significant place in Asian culture, particularly in China, Korea, and Japan, where it is considered an art form. Confucius included Go among the four essential arts for scholars to master.

Professional Go Players: Professional Go players dedicate their lives to the game, beginning training at a young age and undergoing rigorous daily training. They live with their Go master and train 12 hours a day, 7 days a week. The top Go players are highly skilled and respected in their respective cultures.

Complexity and Challenges: Go's complexity stems from its vast search space and from the difficulty of evaluating mid-game positions. In chess, a handcrafted evaluation function can assess a position and estimate who is winning; in Go, the game's constructive nature, with players gradually building up territory rather than removing the opponent's pieces, makes it far harder to judge who holds the advantage during the game. The back-of-the-envelope numbers below give a sense of the scale.
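
Some commonly quoted approximations make the search space concrete (rough figures, not exact counts): Go has a branching factor of roughly 250 legal moves and games run about 150 moves, versus roughly 35 and 80 for chess, and a 19×19 board admits at most 3^361 configurations, far more than the number of atoms in the observable universe:

```python
# Rough comparison of Go and chess search spaces, using the commonly
# quoted approximations (branching factor ** typical game length).

def order_of_magnitude(n: int) -> int:
    """Return k such that n is on the order of 10**k."""
    return len(str(n)) - 1

go_tree = 250 ** 150                   # ~10^359 possible games
chess_tree = 35 ** 80                  # ~10^123 possible games
go_positions_bound = 3 ** 361          # each point empty, black, or white: ~10^172

print(f"Go game tree     ~ 10^{order_of_magnitude(go_tree)}")
print(f"Chess game tree  ~ 10^{order_of_magnitude(chess_tree)}")
print(f"Go board configs ~ 10^{order_of_magnitude(go_positions_bound)}")
```

Brute-force search is hopeless at these scales, which is why AlphaGo needed learned networks to cut down both the breadth and the depth of the search, as the next section describes.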

AlphaGo: Mastering the Complex Game of Go

Intuition and Creativity in Go: Human Go professionals rely on intuition, instinct, and creativity, rather than calculating abilities, to excel at the game. They often make moves based on feeling rather than explicit plans, using pattern matching and intuition to guide their decision-making.

AlphaGo’s Neural Network Systems: AlphaGo utilized neural networks and reinforcement learning to mimic human intuition and achieve mastery in Go.

Policy Network: The policy network, trained on roughly 100,000 games played by strong amateur players, learned to predict the move a human would make in a given position; by ranking the most probable moves, it drastically narrowed the search space.

Value Network: The value network, trained to predict the outcome of a game from any board position, output a single value between 0 and 1, the estimated probability that a given side would go on to win.

Combining Networks with Monte Carlo Tree Search: AlphaGo combined the policy and value networks with Monte Carlo Tree Search, a search algorithm, making an apparently intractable problem tractable. This combination enabled AlphaGo to navigate Go's vast search space effectively, as sketched below.
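
A minimal sketch of how the two networks can steer Monte Carlo Tree Search, using hypothetical class and function names rather than DeepMind's actual implementation: each node stores the policy network's prior for the move that reached it and a running average of value-network estimates, and a PUCT-style rule trades off exploiting moves that look good so far against exploring moves the policy network rates as promising:

```python
import math

class Node:
    """One board position reached during the tree search."""
    def __init__(self, prior):
        self.prior = prior      # policy network's probability for the move leading here
        self.visits = 0         # how many simulations have passed through this node
        self.value_sum = 0.0    # accumulated value-network estimates from this subtree
        self.children = {}      # move -> Node, filled in when the node is expanded

    def q(self):
        """Mean estimated win probability (0..1) seen so far."""
        return self.value_sum / self.visits if self.visits else 0.0

def select_move(node, c_puct=1.5):
    """PUCT rule: prefer children with high value AND high prior that are under-visited."""
    total_visits = sum(child.visits for child in node.children.values())

    def score(child):
        exploration = c_puct * child.prior * math.sqrt(total_visits + 1) / (1 + child.visits)
        return child.q() + exploration

    return max(node.children, key=lambda move: score(node.children[move]))

def backup(path, value):
    """Propagate a value-network estimate up the path of visited nodes."""
    for node in path:
        node.visits += 1
        node.value_sum += value
```

The policy prior cuts down the breadth of the search (only a few moves are worth trying), while the value estimates cut down its depth (positions need not be played out to the end), which is what makes the search tractable.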

AlphaGo's Achievements: AlphaGo first defeated the reigning European champion, Fan Hui, 5-0, and then beat the legendary world champion Lee Sedol 4-1 in a public $1 million challenge match in Seoul. These victories garnered enormous media attention and marked a paradigm shift in AI research.

AlphaGo: A Testament to DeepMind’s Vision

AlphaGo: AlphaGo, an advanced system that progressively evolved into AlphaGo Zero, is a testament to DeepMind’s mission and capabilities. Its mastery of Go, a game previously considered insurmountable for AI due to its complexity and the subtleties involved in gameplay, highlighted the potential of AI in tasks requiring strategic depth and long-term planning.

Beyond Technical Achievement: AlphaGo’s triumph went beyond technical achievement. Its play illuminated aspects of intuition and creativity, revealing a profound understanding of the game that surpassed conventional strategies. This challenged existing notions of creative and intuitive play in AI and sparked renewed interest in Go, propelling the game to greater heights.

AlphaGo: A Revolutionary Move and Insights into Intuition and Creativity

AlphaGo’s Revolutionary Move 37 in Go

Move 37's Significance: AlphaGo's move 37, played in game two of the Lee Sedol match, was considered pivotal and decisive in winning the game. It was a creative move that surprised and astounded professional Go players. Its importance became fully apparent only about 50 moves later, when the stone proved to be perfectly positioned to decide a crucial fight in that part of the board.

Professional Commentary: Professional Go players were surprised by Move 37, which showcased AlphaGo’s advanced strategic thinking and creativity.

Revolutionary Impact: Move 37 came to symbolize the original, unconventional thinking AlphaGo displayed throughout the game.

Lee Sedol's Role: Lee Sedol's world-class expertise in Go strategy is what made AlphaGo's victory so significant; his responses to AlphaGo's unconventional play shaped the most memorable moments of the match.

AlphaGo’s Recognition of the Unusual

Lee Sedol's Divine Move: In game four, Lee Sedol played an extraordinary move (move 78) that confused AlphaGo's value network, leading it to mis-evaluate the position. Chinese commentators called it a "God move", a term that carries an almost spiritual significance in Go.

AlphaGo's Recognition of the Rare: AlphaGo itself had rated Lee Sedol's move as extremely unlikely, assigning it a probability of only 0.007% (roughly 1 in 14,000). The move exposed a limitation of AlphaGo's training: having never encountered anything like it in its games or self-play, it mis-judged the position that followed.

Hassabis’ Vision: AI Beyond Games

While AlphaGo’s triumph in Go is monumental, Hassabis envisions AI’s role far beyond games. He believes AI can push the boundaries of human knowledge, collaborating with humans to discover new solutions in fields like drug discovery, climate change, and personalized education. This collaboration, he posits, can unlock human potential and lead to groundbreaking innovations.

Ethical Considerations and the Future

Hassabis emphasizes the importance of responsible AI deployment, ensuring that its benefits are equitably distributed. He dreams of leveraging AI to accelerate scientific discovery, addressing not only specific challenges but also the systemic complexity that characterizes modern society’s greatest issues.

The Ethical Use of AI and the Importance of AI-Assisted Science:

– Hassabis emphasizes that technology itself is neutral and its impact on society depends on how it is deployed and the distribution of its benefits.

– He believes that AI can be a powerful tool for scientific innovation and medical advancements.

– He expresses his desire to make AI scientists, or at least AI-assisted science, possible in order to accelerate the pace of scientific discovery and innovation worldwide.

– Hassabis’s personal dream has always been to contribute to the field of AI and harness its potential to advance science and medicine.

– He concludes his presentation by reiterating the importance of using technology ethically and responsibly for the benefit of everyone.

Conclusion

The insights shared at the Cambridge Science Festival paint a picture of a future where AI, driven by a deep understanding of reinforcement learning and general intelligence systems, collaborates with humanity. This partnership holds the promise of solving intricate problems, advancing scientific knowledge, and enhancing our world, provided it is guided by ethical principles and a commitment to the common good.


Notes by: OracleOfEntropy