Demis Hassabis (DeepMind Co-founder) – Presentation at Cambridge Society for the Application of Research (Apr 2017)
Chapters
00:00:05 Record-Breaking Cambridge Science Festival Event
Speaker Introduction: David Wallace welcomed everyone, especially those attending as part of the Cambridge Science Festival. He noted that no alarm tests were scheduled, explained the evacuation procedure in case of a genuine alarm, and asked attendees to switch off their mobile phones to avoid interference with the recording.
Event Popularity: The talk’s popularity exceeded expectations, with over 700 poster downloads from the website. The theatre’s maximum capacity of 300 was reached, prompting the arrangement of an overspill room. Wallace expressed his gratitude to the team, particularly Valerie Anderson, Edward Briffa, and Andrew Shepard, for their efforts in managing the large turnout.
Demis Hassabis Introduction: Hassabis’s reputation and achievements made an elaborate introduction unnecessary. Wallace suggested that anyone unfamiliar with Hassabis could search for “Demis Hassabis” or “DeepMind” online.
Invitation to Demis Hassabis: Wallace invited Hassabis to the stage, expressing his gratitude for his presence.
00:02:54 Explorations at the Edge of Knowledge: DeepMind's Quest for Artificial General Intelligence
DeepMind’s Unique Research Culture and Mission: DeepMind combines the best aspects of academia and start-up culture to create a unique research environment that fosters innovation and scientific progress. Its mission is to solve intelligence, with the belief that once intelligence is understood, it can be used to solve various other problems.
General Purpose Learning Machine: DeepMind aims to create the world’s first general purpose learning machine, a system that can learn automatically from raw inputs and operate across a wide range of tasks without being explicitly programmed.
Artificial General Intelligence (AGI) vs. Narrow AI: AGI is a system that can learn and operate across a wide range of tasks, while narrow AI is an expert system that relies on heuristics and rules programmed by humans. Narrow AI, exemplified by IBM’s Deep Blue, lacks generality and true intelligence.
Reinforcement Learning as a Framework for Intelligence: DeepMind uses reinforcement learning to develop intelligent systems that can interact with their environment, receive observations and rewards, and select actions to achieve goals. This involves building statistical models of the environment, planning and imagining future actions, and executing actions in real time.
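The observe-act-reward loop described above can be sketched in a few lines of Python. This is a toy illustration rather than DeepMind code: the three-action "environment" and the agent's running-average value model are invented for the example.

```python
import random

random.seed(0)  # deterministic for illustration

class Agent:
    """Toy agent illustrating the reinforcement-learning loop:
    observe, select an action, receive a reward, update the model."""

    def __init__(self, actions):
        self.actions = actions
        self.values = {a: 0.0 for a in actions}   # crude statistical model of the environment
        self.counts = {a: 0 for a in actions}

    def act(self, epsilon=0.1):
        # Mostly exploit the current model, occasionally explore.
        if random.random() < epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental average: the model sharpens with experience.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]

def environment(action):
    # Hypothetical environment: action "b" is the most rewarding.
    return 1.0 if action == "b" else 0.2

agent = Agent(["a", "b", "c"])
for _ in range(500):
    a = agent.act()
    agent.update(a, environment(a))

print(agent.act(epsilon=0.0))  # the learned greedy choice: "b"
```

The `epsilon` parameter captures the exploration-exploitation trade-off that any such agent faces: always exploiting the current model risks never discovering the better action.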
00:10:07 General Intelligence Through Deep Reinforcement Learning
Introduction: Demis Hassabis describes reinforcement learning, a learning approach inspired by animal learning, as sufficient for general intelligence.
Mathematical and Biological Basis: AIXI, a mathematical framework developed by Marcus Hutter, with whom DeepMind co-founder Shane Legg worked, shows that a reinforcement-learning agent with unbounded compute time and memory would constitute a general intelligence. Reinforcement learning is also biologically plausible: the dopamine system implements a form of it in the human brain.
Combining Deep Learning and Reinforcement Learning: Deep reinforcement learning pairs deep learning, which handles perception, with reinforcement learning, which handles planning and goal selection. DQN, DeepMind's deep reinforcement learning system, achieved human-level performance on iconic Atari games.
Key Features of DQN: DQN took raw pixel inputs from the screen, processing roughly 30,000 numbers per frame. The system had no prior knowledge of the game controls, the scoring, or even what the video stream represented. It learned by playing each game many times, discovering the rules, controls, and structure purely through experience.
Success of DQN: DQN mastered dozens of classic Atari games, reaching or exceeding the level of expert human players on many of them. This breakthrough demonstrated the potential of general learning systems, and the DQN code was open-sourced so that other researchers could explore the algorithm further.
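The learning rule at DQN's core can be shown without any deep network. The sketch below runs tabular Q-learning on an invented five-state corridor; the real DQN replaces the table with a deep neural network reading raw pixels, and adds experience replay and a target network to keep training stable.

```python
import random

random.seed(1)

# Tabular Q-learning on a toy 5-state corridor: walk right to reach the goal.
N_STATES, ACTIONS = 5, ["left", "right"]
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.3

def step(s, a):
    """One environment transition: reward 1.0 only on reaching the goal state."""
    s2 = min(s + 1, N_STATES - 1) if a == "right" else max(s - 1, 0)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

for _ in range(200):                      # episodes of pure trial and error
    s, done = 0, False
    while not done:
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda a: Q[(s, a)])
        s2, r, done = step(s, a)
        # Bellman update: nudge Q(s,a) toward reward + discounted best future value.
        target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

# After training, the greedy policy marches right from every non-goal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The agent starts with no knowledge of the corridor, the controls, or the goal, which mirrors in miniature the "no prior knowledge" setup described for DQN above.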
AlphaGo: AlphaGo, an extension of the techniques used in DQN, is the focus of the rest of the presentation.
00:13:28 Machine Intelligence in the Ancient Game of Go
Game Overview: Go is a game between two players who take turns placing black and white stones on a 19×19 grid. The aim is to surround territory and capture the opponent’s stones; the winner is the player who controls the most territory at the end of the game.
History and Cultural Significance: Go has a long history of over 3,000 years, originating in China. It is particularly popular in Asia, where it is considered an art form. Confucius included Go among the four essential arts for scholars to master.
Professional Go Players: Professional Go players begin training at a young age and dedicate their lives to the game. They live with their Go master and train 12 hours a day, 7 days a week. The top Go players are highly skilled and respected in their respective cultures.
Complexity and Challenges: Go is a complex game with an enormous search space and an average of roughly 200 legal moves per position, compared with about 20 in chess. Unlike chess, it is difficult to write an evaluation function that assesses a position and determines who is winning. The game’s constructive nature, in which players build up territory rather than whittling down material, makes mid-game positions especially hard to evaluate.
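The scale of that search space is easy to make concrete. The back-of-envelope sketch below uses the branching factor of about 200 together with an assumed typical game length of around 200 moves; the game length and the commonly cited 10^80 estimate for atoms in the observable universe are illustrative assumptions, not figures from the talk.

```python
import math

branching_go, game_length_go = 200, 200      # ~200 legal moves, ~200-move games (assumed)
branching_chess, game_length_chess = 20, 80  # rough chess figures for comparison

go_tree = branching_go ** game_length_go
chess_tree = branching_chess ** game_length_chess

# math.log10 handles arbitrarily large Python ints, so these huge values don't overflow.
print(f"Go game tree    ~ 10^{math.log10(go_tree):.0f}")     # ~10^460
print(f"Chess game tree ~ 10^{math.log10(chess_tree):.0f}")  # ~10^104
print("Atoms in universe ~ 10^80 (rough estimate)")
```

Even with generous rounding, the Go tree dwarfs both the chess tree and the atom count, which is why brute-force search alone cannot crack the game.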
00:19:42 How AlphaGo Revolutionized the Game of Go
Intuition and Creativity in Go: Human professionals excel at Go through intuition, instinct, and creativity rather than brute-force calculation. Unlike chess players, Go players often choose moves on feeling rather than through explicit plans, relying on pattern matching and intuition to guide their decisions.
AlphaGo’s Neural Network Systems: AlphaGo utilized neural networks and reinforcement learning to mimic human intuition and achieve mastery in Go.
Policy Network: The first neural network, the policy network, was trained to predict human moves by analyzing 100,000 games from amateur players. It narrows down the search space by identifying the top probable moves in a given position.
Value Network: The second neural network, the value network, was trained to predict the outcome of a game from any board position. It outputs a single number between 0 and 1, the estimated probability that the current player will go on to win.
Combining Networks with Monte Carlo Tree Search: AlphaGo combined the policy and value networks with Monte Carlo Tree Search, a searching algorithm, to make seemingly intractable problems tractable. This combination enabled AlphaGo to effectively navigate the vast search space of Go.
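The interplay of the three components can be sketched with a stripped-down version of the selection rule (often called PUCT) used by AlphaGo-style searches. The priors and leaf values below are invented stand-ins for the two networks' outputs on a single position; the real system evaluates positions many moves deep in a full tree.

```python
import math

# Hypothetical network outputs for one position (illustrative numbers only):
# the policy network proposes priors over candidate moves, and the value
# network scores the position reached after each move (0 = loss, 1 = win).
priors = {"A": 0.5, "B": 0.3, "C": 0.2}          # policy network
leaf_values = {"A": 0.45, "B": 0.62, "C": 0.40}  # value network

visits = {m: 0 for m in priors}
value_sum = {m: 0.0 for m in priors}

def puct_score(move, c_puct=1.5):
    # AlphaGo-style selection: exploit the mean backed-up value, and
    # explore where the prior is high and the visit count is still low.
    total = sum(visits.values())
    q = value_sum[move] / visits[move] if visits[move] else 0.0
    u = c_puct * priors[move] * math.sqrt(total + 1) / (1 + visits[move])
    return q + u

for _ in range(400):                  # simulations
    move = max(priors, key=puct_score)
    visits[move] += 1                 # "expand" and back up the value-net score
    value_sum[move] += leaf_values[move]

best = max(visits, key=visits.get)    # play the most-visited move
print(best, visits)
```

Note how the search starts by following the prior (move "A") but shifts its visits to "B" as the value estimates accumulate: the policy network narrows the search, and the value network corrects it.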
AlphaGo’s Achievements: AlphaGo defeated the strongest handcrafted Go programs and then the European champion in a private match. In a public $1 million challenge match, AlphaGo beat Lee Sedol, one of the greatest players in the world, 4-1, roughly a decade ahead of expert predictions. The games were watched by an estimated 280 million people and attracted enormous media attention.
Significance of AlphaGo’s Victory: AlphaGo’s victory was notable not only for its triumph but also for its creative approach to the game. It demonstrated the program’s ability to learn from and surpass human experts, leading to a paradigm shift in AI research.
00:26:37 AlphaGo's Innovative Strategies in the Game of Go
Move 37: AlphaGo played an unexpected move on the right-hand side of the board, called a shoulder hit. This move surprised professional Go players and commentators.
The Significance of the Fifth Line: In Go, there are two critical lines: the third and fourth lines. Traditionally, playing on the third line indicates taking territory on the side of the board. Playing on the fourth line signifies taking power and influence in the center of the board. For 3,000 years, it was believed that playing on the third line and the opponent playing on the fourth line was an equal trade.
AlphaGo’s Innovation: AlphaGo played on the fifth line, one line higher than the traditional fourth line. This move gave away territory to the opponent but gained power and influence in the center. AlphaGo’s move challenged the traditional wisdom of Go and showed a new way to gain an advantage.
Reevaluation of Power and Influence: AlphaGo’s move suggests that humans may have underestimated the value of power and influence in the center of the board. This move opened up new possibilities and strategies in Go.
00:29:06 Surprising Moves in AlphaGo vs. Lee Sedol
Move 37’s Significance: AlphaGo’s Move 37 was considered pivotal and ultimately decisive in winning the game. It was a creative move that surprised and astounded professional Go players. Roughly 50 moves later, the stone proved to be perfectly positioned to decide a crucial fight in that part of the board.
Professional Commentary: Michael Redmond, a 9-dan professional Go player commentating on the match, was so surprised by Move 37 that he initially misplaced the stone on the demonstration board. His co-commentator Chris Garlock thought the computer operator had misclicked, given how unexpected the move was.
Revolutionary Impact: Move 37 was a revolutionary move that showcased AlphaGo’s advanced strategic thinking and creativity. There were numerous examples of AlphaGo’s innovative moves throughout the game.
Match Progress: AlphaGo won the first three games against Lee Sedol, clinching the match amid intense tension, especially during the first game.
00:32:15 Intuition and Creativity in AlphaGo's Gameplay
AlphaGo’s Uncertain Victory: Internal testing showed AlphaGo was stronger than the European champion, but because much of its subsequent training came from self-play, the team worried its rating might be inflated by blind spots. Winning the first game against Lee Sedol anchored the rating analysis and confirmed that the predicted strength was real.
Lee Sedol’s Divine Move: In game four, Lee Sedol played an incredible move that confused AlphaGo’s value network, leading to a mis-evaluation of the position. Chinese commentators called it a “God move,” highlighting the spiritual significance of such moves in Go.
AlphaGo’s Recognition of the Unusual: AlphaGo recognized the rarity of Lee Sedol’s move, assigning it a probability of only 0.007%. This move highlighted the limitations of AlphaGo’s self-play training, as it had never encountered such a move before.
Impact of the Match: The match garnered immense attention, with 280 million viewers and 35,000 press articles. There was a surge in Go board sales and increased interest in Go clubs, indicating a renewed enthusiasm for the game.
Operationalizing Intuition and Creativity: Intuition is regarded as implicit knowledge acquired through experience, which cannot be consciously accessed or expressed. The quality of intuition can be tested by evaluating the moves chosen in a given Go position. Creativity is the ability to synthesize existing knowledge to produce novel ideas in pursuit of a goal. AlphaGo demonstrated both intuition and creativity within the constrained domain of the game of Go.
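One way to read "testing the quality of intuition" above is as a move-prediction score: show the subject a set of reference positions and count how often their instinctive choice matches the expert move. Everything below (positions, board coordinates, and the subject) is hypothetical.

```python
# Hypothetical reference positions, each paired with the expert's move.
expert_moves = {"pos1": "D4", "pos2": "Q16", "pos3": "C3"}

def intuition_score(choose_move, positions):
    """Fraction of reference positions where the subject's instinctive
    (i.e. non-calculated) choice matches the expert move."""
    hits = sum(choose_move(p) == expert_moves[p] for p in positions)
    return hits / len(positions)

# A hypothetical subject whose instinct matches the expert in 2 of 3 positions.
subject = {"pos1": "D4", "pos2": "Q16", "pos3": "E5"}.get
score = intuition_score(subject, expert_moves)
print(score)  # 2/3
```

The point of the definition is that implicit knowledge, while not consciously expressible, is still behaviourally measurable: the same test applies unchanged to a human professional or to a program like AlphaGo.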
00:37:07 AI as Meta-Solution: Innovations in Go, Energy, and Medicine
Exploring the Edge of Knowledge: Demis Hassabis’s mission at DeepMind is to explore the absolute boundary of knowledge and push back the frontier of human understanding.
Perfecting AlphaGo: DeepMind continued developing AlphaGo after the Lee Sedol match, aiming to fix gaps in its knowledge, the “delusions” that opponents could exploit by pulling it into unexplored areas of the knowledge landscape.
Interpreting AlphaGo’s Knowledge: DeepMind ran analyses analogous to fMRI experiments, comparing AlphaGo’s internal representations with those of human Go players in search of insights into Go motifs and strategies.
Achieving Perfection: The improved system won all 60 of its online games against top professional players, demonstrating its strategic superiority and unveiling new innovations in the game.
Human-AI Collaboration: Ke Jie, the Chinese and world number one Go player, emphasized the potential for humans and AI to collaborate in discovering new truths in Go.
AlphaGo as a Meta-Solution: Hassabis sees AlphaGo as a meta-solution to information overload and system complexity, potentially aiding in drug discovery, material design, medical diagnosis, and other fields.
Real-World Applications: AlphaGo-like techniques are being applied to data centre optimization, reportedly cutting the energy used for cooling Google’s data centres by around 40%.
Mastering Complex Systems: Hassabis believes that AI can help humans master complex systems like disease, macroeconomics, and climate, which are beyond the full grasp of even the most intelligent individuals.
Ethical Considerations: Hassabis emphasizes the need for ethical and responsible use of AI, given its potential power.
00:48:42 AI for Accelerating Scientific Innovation
The Neutrality of Technology: Demis Hassabis emphasizes that technology itself is neutral and its impact on society depends on how it is deployed and the distribution of its benefits.
Benefits of AI for Science and Medicine: Hassabis believes that AI can be a powerful tool for scientific innovation and medical advancements.
Accelerating Scientific Innovation: He expresses his desire to make AI scientists or AI-assisted science possible to accelerate the pace of scientific discovery and innovation worldwide.
Personal Motivation for Working in AI: Hassabis’s personal dream has always been to contribute to the field of AI and harness its potential to advance science and medicine.
Conclusion: Hassabis concludes his presentation by reiterating the importance of using technology ethically and responsibly for the benefit of everyone.
Abstract
“The Future of Intelligence: Insights from the Cambridge Science Festival”
Introduction
The Cambridge Society for the Application of Research (CSAR) recently hosted an extraordinary event as part of the Cambridge Science Festival, featuring renowned AI researcher and DeepMind co-founder, Demis Hassabis. This gathering, attracting over 700 enthusiasts with its overwhelming response, served as a platform for Hassabis to share his insights into artificial intelligence (AI), specifically focusing on DeepMind’s revolutionary work in general learning systems and reinforcement learning.
Reinforcement Learning: The Path to AGI
DeepMind’s pathway to achieving AGI is through reinforcement learning (RL), where an agent interacts with its environment to attain goals, guided by observations, rewards, and a continuously evolving statistical model. This approach enables the agent to make real-time decisions aimed at maximizing its goal achievement. RL’s potential is underpinned by both mathematical frameworks, like AIXI, and biological evidence, suggesting its central role in achieving general intelligence.
Go: An Ancient Game of Strategy and Elegance
Game Overview: Go is an ancient game of strategy and elegance that originated in China over 3,000 years ago. It is played on a 19×19 grid, where two players take turns placing black and white stones with the aim of surrounding territory and capturing the opponent’s stones. The winner is the player controlling the most territory at the end of the game.
History and Cultural Significance: Go holds a significant place in Asian culture, particularly in China, Korea, and Japan, where it is considered an art form. Confucius included Go among the four essential arts for scholars to master.
Professional Go Players: Professional Go players dedicate their lives to the game, beginning training at a young age and undergoing rigorous daily training. They live with their Go master and train 12 hours a day, 7 days a week. The top Go players are highly skilled and respected in their respective cultures.
Complexity and Challenges: Go’s complexity stems from its vast search space and the difficulty in evaluating mid-game positions. Unlike chess, where an evaluation function can assess the position and determine who is winning, Go’s constructive nature, where players build up territory rather than destroying pieces, makes it harder to assess the advantage during the game.
AlphaGo: Mastering the Complex Game of Go
AlphaGo’s Achievements: AlphaGo first defeated Fan Hui, the European champion, in a private match, and then beat Lee Sedol, one of the world’s strongest players, 4-1 in a public $1 million challenge match. These victories garnered enormous media attention and marked a paradigm shift in AI research.
AlphaGo: A Testament to DeepMind’s Vision
AlphaGo: AlphaGo, an advanced system that progressively evolved into AlphaGo Zero, is a testament to DeepMind’s mission and capabilities. Its mastery of Go, a game previously considered insurmountable for AI due to its complexity and the subtleties involved in gameplay, highlighted the potential of AI in tasks requiring strategic depth and long-term planning.
Beyond Technical Achievement: AlphaGo’s triumph went beyond technical achievement. Its play illuminated aspects of intuition and creativity, revealing a profound understanding of the game that surpassed conventional strategies. This challenged existing notions of creative and intuitive play in AI and sparked renewed interest in Go, propelling the game to greater heights.
Hassabis’ Vision: AI Beyond Games
While AlphaGo’s triumph in Go is monumental, Hassabis envisions AI’s role far beyond games. He believes AI can push the boundaries of human knowledge, collaborating with humans to discover new solutions in fields like drug discovery, climate change, and personalized education. This collaboration, he posits, can unlock human potential and lead to groundbreaking innovations.
Ethical Considerations and the Future
Hassabis emphasizes the importance of responsible AI deployment, ensuring that its benefits are equitably distributed. He dreams of leveraging AI to accelerate scientific discovery, addressing not only specific challenges but also the systemic complexity that characterizes modern society’s greatest issues.
Conclusion
The insights shared at the Cambridge Science Festival paint a picture of a future where AI, driven by a deep understanding of reinforcement learning and general intelligence systems, collaborates with humanity. This partnership holds the promise of solving intricate problems, advancing scientific knowledge, and enhancing our world, provided it is guided by ethical principles and a commitment to the common good.