Demis Hassabis (DeepMind Co-founder) – The Power of Self-Learning Systems | Institute for Advanced Study (May 2019)
Chapters
00:00:13 A New Approach to Artificial Intelligence: Self-Learning Systems
Introduction: Demis Hassabis, co-founder of DeepMind, presents his vision for artificial intelligence, focusing on self-learning systems. DeepMind’s mission is to understand intelligence, recreate it artificially, and use it to solve a wide range of real-world problems.
Expert Systems vs. Learning Systems: Traditional AI approaches rely on hard-coded knowledge, limiting their ability to handle unexpected situations. Learning systems, inspired by neuroscience, learn solutions directly from raw data and can generalize to new tasks.
Reinforcement Learning: In reinforcement learning, an agent interacts with an environment, receives rewards or punishments, and learns to take actions that maximize reward. The agent builds a model of the world, evaluates potential actions, and selects the one best suited to achieving its goal (a minimal sketch of this loop appears at the end of this chapter).
Active Learning: Self-learning systems engage in active learning, where their actions influence the experiences they encounter and the data they can learn from. The agent’s choices drive its learning process, allowing it to adapt and improve over time.
Key Points: DeepMind’s focus on self-learning systems aims to create a general-purpose learning system capable of solving a wide range of problems. Learning systems have the potential to generalize to new tasks and come up with novel solutions, even those unknown to the system’s creators. Reinforcement learning enables agents to learn from interactions with their environment, building models of the world and selecting actions that maximize rewards. Active learning allows agents to participate in their own learning by choosing actions that drive their experience and shape their future learning opportunities.
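To make the agent-environment loop concrete, the following is a minimal, illustrative sketch in Python. It is not DeepMind’s code: the one-dimensional GridWorld environment and the tabular Q-learning update are simplified stand-ins for the deep reinforcement learning systems described in the talk.

```python
# Minimal tabular Q-learning sketch: an agent acts, observes a reward, and
# updates its action-value estimates. Illustrative only -- a simplified
# stand-in for the deep reinforcement learning agents discussed in the talk.
import random
from collections import defaultdict

class GridWorld:
    """A one-dimensional corridor: start at 0, reach `goal` for a reward of +1."""
    def __init__(self, goal=5):
        self.goal = goal
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):  # action: 0 = move left, 1 = move right
        self.state = max(0, min(self.goal, self.state + (1 if action == 1 else -1)))
        done = self.state == self.goal
        return self.state, (1.0 if done else 0.0), done

def greedy(values):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(values)
    return random.choice([a for a, v in enumerate(values) if v == best])

def train(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    env = GridWorld()
    q = defaultdict(lambda: [0.0, 0.0])  # state -> estimated value of each action
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy: the agent's own choices determine what it experiences.
            action = random.randrange(2) if random.random() < epsilon else greedy(q[state])
            next_state, reward, done = env.step(action)
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            target = reward + gamma * max(q[next_state]) * (not done)
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

if __name__ == "__main__":
    q_values = train()
    print({s: [round(v, 2) for v in vals] for s, vals in sorted(q_values.items())})
```

The key point is the structure of the loop: the agent’s own, partly exploratory action choices determine which states and rewards it experiences, which in turn shape what it can learn next.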
00:10:15 Evolution of AlphaGo to AlphaZero in Game Playing
Deep RL and Atari Games: DeepMind combined deep learning with reinforcement learning (Deep RL) to train a neural network on Atari games using raw pixel inputs and no prior knowledge of rules or controls. The system learned to maximize the score and outperformed human players, demonstrating end-to-end learning from raw pixels to actions and decision-making.
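The pipeline from raw pixels to action values can be sketched as below. This is a hedged illustration, not DeepMind’s code: PyTorch is an assumed framework, the convolutional layout is a common choice for Atari-scale inputs, and a single temporal-difference update is run on random tensors rather than real emulator frames. A full agent would also use an experience replay buffer and a separate target network, omitted here.

```python
# Sketch of a deep Q-network: a convolutional network maps a stack of raw game
# frames to one value per action, and a single temporal-difference (TD) update
# is applied to dummy data. Framework and layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, num_actions: int, frames: int = 4):
        super().__init__()
        self.features = nn.Sequential(  # convolutions over 84x84 frame stacks
            nn.Conv2d(frames, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
                                  nn.Linear(512, num_actions))

    def forward(self, frames):  # frames: (batch, 4, 84, 84), pixel values in [0, 1]
        return self.head(self.features(frames))  # -> (batch, num_actions) Q-values

num_actions = 6
q_net = QNetwork(num_actions)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)

# Random tensors stand in for emulator screens and observed transitions.
state = torch.rand(32, 4, 84, 84)        # current frame stacks
next_state = torch.rand(32, 4, 84, 84)   # frame stacks after acting
action = torch.randint(num_actions, (32,))
reward = torch.rand(32)                  # e.g. the change in game score
gamma = 0.99

with torch.no_grad():  # bootstrap a TD target from the next state's best Q-value
    target = reward + gamma * q_net(next_state).max(dim=1).values
prediction = q_net(state).gather(1, action.unsqueeze(1)).squeeze(1)
loss = nn.functional.smooth_l1_loss(prediction, target)

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"TD loss: {loss.item():.4f}")
```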
AlphaGo: AlphaGo used Deep RL to master the game of Go, a game so complex that it has more possible board configurations than there are atoms in the observable universe. AlphaGo defeated world champion Lee Sedol in 2016, showcasing the system’s ability to learn and play complex games with minimal human input.
AlphaGo Zero: AlphaGo Zero was designed to eliminate reliance on human knowledge and bootstrapping from human games. Starting from random play, it learned to play Go solely through self-play and achieved a higher level of performance than the original AlphaGo.
AlphaZero: AlphaZero generalized AlphaGo’s approach to any two-player perfect information game. It learned to play chess, shogi, and Go from scratch, starting from random moves and improving through self-play. AlphaZero surpassed human experts in all three games, demonstrating its ability to generalize learning across different games.
Chess and Deep Blue: Demis Hassabis highlights the limitations of expert systems like IBM’s Deep Blue, which defeated Garry Kasparov in 1997. While Deep Blue was a technical achievement, it lacked the general intelligence to perform tasks beyond chess, such as understanding language or tying shoelaces.
Essence of Intelligence: Demis Hassabis emphasizes the need for AI systems to possess general intelligence and not be limited to specific tasks. AlphaZero’s ability to generalize learning across different games suggests progress toward developing AI systems with a broader range of capabilities.
00:18:29 Self-play Reinforcement Learning and the Success of AlphaZero
AlphaZero’s General Approach: Unlike Deep Blue, AlphaZero eschews handcrafted heuristics, replacing them with self-play reinforcement learning and Monte Carlo tree search. This simpler approach yields a more elegant, streamlined algorithm.
Neural Network Structure and Training: AlphaZero employs a single neural network with two outputs: move probabilities (a policy head) and a position evaluation (a value head). The system plays against itself, generating a vast dataset for training, and through iterative updates the network continuously improves its decision-making.
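As a rough illustration of the single-network, two-output idea, and of how a search can use those outputs, here is a hedged sketch. The flat board encoding, layer sizes, and the c_puct constant are placeholder assumptions, and only PUCT-style action selection at a single node is shown, not the full Monte Carlo tree search or the self-play training loop. The real system uses a deep residual network over board feature planes and derives its policy and value targets from search visit counts and game outcomes.

```python
# Sketch of an AlphaZero-style network with a policy head and a value head,
# plus the PUCT-style formula a search can use to pick which move to explore.
# Layer sizes, the flat board encoding, and c_puct are illustrative assumptions.
import math
import torch
import torch.nn as nn

class PolicyValueNet(nn.Module):
    def __init__(self, board_size: int = 19 * 19, num_moves: int = 19 * 19 + 1):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(board_size, 256), nn.ReLU(),
                                   nn.Linear(256, 256), nn.ReLU())
        self.policy_head = nn.Linear(256, num_moves)   # prior probability per move
        self.value_head = nn.Linear(256, 1)            # expected outcome in [-1, 1]

    def forward(self, board):
        h = self.trunk(board)
        return torch.softmax(self.policy_head(h), dim=-1), torch.tanh(self.value_head(h))

def select_move(priors, visit_counts, total_values, c_puct=1.5):
    """PUCT selection: balance the average value seen so far against the network's
    prior and an exploration bonus that shrinks as a move gets visited more often."""
    total_visits = sum(visit_counts) + 1
    def score(a):
        q = total_values[a] / visit_counts[a] if visit_counts[a] else 0.0
        u = c_puct * priors[a] * math.sqrt(total_visits) / (1 + visit_counts[a])
        return q + u
    return max(range(len(priors)), key=score)

# One illustrative call on a random position.
net = PolicyValueNet()
board = torch.rand(1, 19 * 19)  # stand-in for a real board encoding
priors, value = net(board)
priors = priors.squeeze(0).tolist()
n = len(priors)
move = select_move(priors, visit_counts=[0] * n, total_values=[0.0] * n)
print(f"network value estimate: {value.item():+.3f}, first move to explore: {move}")
```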
Evaluation against Traditional Chess Engines: AlphaZero faced Stockfish 8, then the world-champion traditional chess engine. In a 1,000-game match, AlphaZero won 155 games and lost only 6, with the remaining games drawn, a decisive demonstration of its strength.
Rapid Learning and Generalization: AlphaZero reached superhuman chess strength after just four hours of training, and its learning curve made the optimization process visible almost in real time. Its ability to generalize was then tested and confirmed on shogi, a Japanese chess variant.
Conclusion: AlphaZero’s success in multiple strategic games highlights its potential to excel in any two-player perfect information game, given sufficient training through self-play. This breakthrough represents a significant step toward developing general artificial intelligence capable of mastering complex tasks with minimal human intervention.
Key Differences Between AlphaZero and Traditional Chess Engines: Compared with traditional engines, AlphaZero balances the many factors in a decision more flexibly, evaluating positions dynamically in their current context rather than through fixed handcrafted weights. It favors piece mobility and activity over material, preferring dynamic, attacking chess, and this style challenges assumptions and heuristics ingrained in traditional engines.
AlphaZero’s Strengths and Advantages: AlphaZero’s evaluation function is far stronger than those of traditional chess engines, and its ability to generalize knowledge to new positions allows creative play. Because its evaluation is so much better, its search is far more selective: it examines orders of magnitude fewer positions per second than a traditional engine yet still finds stronger moves.
AlphaZero’s Style and Influence on Chess: AlphaZero’s play is characterized by dynamic, attacking chess that values mobility over material. This style has influenced top grandmasters, including Magnus Carlsen, and Garry Kasparov has described AlphaZero’s play as reflecting a deeper truth of the game.
AlphaZero’s Expansion to Real-Time Strategy Games: The principles behind AlphaZero have been extended to real-time strategy games like StarCraft II. AlphaStar, a related system built on deep reinforcement learning, has reached top professional performance in StarCraft II, showing that this family of systems can handle hidden information and very large action spaces.
Conclusion: AlphaZero’s innovative approach to chess has led to a new style of play, challenging traditional assumptions and pushing the boundaries of machine intelligence. Its success in real-time strategy games further highlights the potential of these systems to excel in complex environments with hidden information and vast action spaces.
00:33:15 AI-Accelerated Scientific Discovery: Protein Folding with AlphaFold
Key Challenges in AI: Unsolved problems such as unsupervised learning, memory, abstract concepts, symbolic reasoning, and language understanding remain. AI systems are not yet capable of general cognition.
Applying AI to Real-World Problems: AI algorithms have matured enough to address real-world issues. AI has commercial applications in healthcare, energy optimization, and voice synthesis. DeepMind focuses on using AI to accelerate scientific discovery.
Criteria for AI Applications: A good target problem involves a search through a massive combinatorial space, a clear objective function or metric to optimize, and abundant ground-truth data or an accurate simulator.
Multidisciplinary Science Team at DeepMind: Collaboration with academics in various fields to address impactful problems. Projects in quantum chemistry, protein folding, retinal disease detection, mathematics, and theorem proving.
AlphaFold: Protein Folding System: Proteins are the molecular machines essential to life. The protein folding problem is to predict a protein’s 3D structure from its amino acid sequence, and solving it matters for understanding disease, drug discovery, and protein design.
Training AlphaFold: A neural network is trained by supervised learning on known protein structures from a public database, with the training data augmented with additional generated sequences. The network outputs probability distributions over backbone torsion angles and over pairwise distances between amino acids.
Predicting Protein Structures: Given a new amino acid sequence, the system scores candidate structures against these predicted distributions and uses numerical optimization (gradient descent) to find the 3D structure that fits them best.
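To illustrate the predict-then-optimize idea, here is a heavily simplified sketch. It treats the structure as a chain of 2-D points, substitutes synthetic target distances for the neural network’s predicted distance distributions, and runs gradient descent on the mismatch; the real system works with torsion angles, full 3-D coordinates, and a learned statistical potential.

```python
# Simplified sketch of structure prediction by numerical optimization:
# given predicted pairwise distances between residues, adjust coordinates by
# gradient descent until the structure's distances match the predictions.
# The "predicted" distances here are synthetic stand-ins for the distributions
# a trained network would output.
import math
import torch

num_residues = 20
# Synthetic target: residues arranged on a circle, standing in for network predictions.
angles = torch.linspace(0, 2 * math.pi, num_residues + 1)[:-1]
true_coords = torch.stack([torch.cos(angles), torch.sin(angles)], dim=1) * 5.0
predicted_distances = torch.cdist(true_coords, true_coords)  # (N, N) target distances

# Start from random coordinates and optimize them to match the predicted distances.
coords = torch.randn(num_residues, 2, requires_grad=True)
optimizer = torch.optim.Adam([coords], lr=0.05)

for step in range(2000):
    current = torch.cdist(coords, coords)
    loss = ((current - predicted_distances) ** 2).mean()  # distance-mismatch "potential"
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final mean squared distance error: {loss.item():.4f}")
```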
00:40:15 AI-Assisted Science: Unlocking Insights from Big Data
Introduction of DeepMind’s Protein Folding Algorithm: DeepMind created an algorithm that predicts the 3D structure of a protein from its amino acid sequence. It uses a neural network to estimate distributions over angles and distances, and a scoring function to assess how well a candidate structure fits those predictions. Trained on a dataset of known protein structures, the algorithm proved highly accurate at predicting new structures.
DeepMind’s Victory in CASP: The algorithm competed in CASP13, the international blind assessment of protein structure prediction. It produced the most accurate prediction for 25 of the 43 proteins in the hardest, free-modelling category, the best result of any team, demonstrating its ability to predict protein structures accurately.
Challenges in Protein Folding: Despite this success, the system does not yet reach the accuracy required for many biological applications, and it continues to be refined with the goal of more accurate predictions.
AI as a Solution to Data Deluge and Complexity: DeepMind’s co-founder, Demis Hassabis, views AI as a potential solution to the challenge of data overload and increasingly complex systems. He believes AI can help extract insights and structure from vast amounts of data.
AI as an Automated Process for Converting Information into Knowledge: Hassabis sees intelligence as a process that converts unstructured information into useful knowledge. He believes that AI has the potential to automate this process and make it more efficient.
The Promise and Responsibility of AI: Hassabis acknowledges the enormous promise of AI but also emphasizes the need for responsible and ethical use of this technology. He believes that AI should benefit everyone in society and not just a few individuals or companies.
AI and the Mysteries of the Mind: Hassabis suggests that by studying AI and intelligence, we may gain a better understanding of the human mind and its unique properties. He believes that comparing AI with the human mind can help us unravel the mysteries of consciousness, creativity, and dreams.
Abstract
Harnessing the Power of Self-Learning: Demis Hassabis and the Future of AI
The Vision of Self-Learning Systems in AI
In the field of artificial intelligence, few have charted as visionary a course as Demis Hassabis, co-founder of DeepMind. His presentation at the Institute for Advanced Study articulated a bold and transformative perspective: the future of AI lies in self-learning systems. Hassabis contends that understanding and artificially recreating intelligence is crucial to solving a myriad of complex problems. The shift from traditional expert systems, confined by hard-coded knowledge, to learning systems capable of adapting to unanticipated scenarios marks a significant turning point in AI’s evolution.
Deep Learning and Reinforcement Learning: A New Era
DeepMind’s approach, anchored in deep learning and reinforcement learning, represents a sophisticated form of trial-and-error learning. Here, an AI agent learns to interact with its environment, drawing conclusions and refining strategies based on rewards or punishments. This form of active learning, where agents’ actions directly influence their learning experiences, paves the way for AI systems to not only adapt to new tasks but also to devise innovative solutions, potentially surpassing human capabilities.
AlphaGo and AlphaZero: Pioneering Achievements
AlphaGo’s groundbreaking achievement in learning to play Go, a feat that culminated in its victory over human experts, exemplifies the prowess of self-learning systems. Its successor, AlphaZero, took this a step further by not only mastering Go but also excelling in other two-player perfect information games like chess and Shogi. Unlike IBM’s Deep Blue, which defeated Garry Kasparov in chess but lacked adaptability, AlphaZero’s versatility and general problem-solving capabilities signify a move towards more human-like AI systems.
The Training and Impact of AlphaZero
The training loop of AlphaZero, which combines self-play reinforcement learning with Monte Carlo tree search, highlights a significant departure from traditional methods. It employs a single neural network for both move prediction and game state evaluation, enhancing efficiency and enabling the system to achieve superhuman performance rapidly. Notably, AlphaZero’s victory over the then-world champion chess engine Stockfish 8, after just four hours of training, underscores its exceptional efficiency and adaptability.
The Influence and Potential of AlphaZero
Beyond its technical achievements, AlphaZero has left a mark on human endeavors. Its influence on top grandmasters, including Magnus Carlsen, and its recognition by Garry Kasparov for reflecting the “truth” of chess, underscore its impact on human understanding of these games. Furthermore, the application of AlphaZero’s principles to complex, real-time strategy games like StarCraft II suggests its potential to handle more intricate and less predictable scenarios.
Challenges and Opportunities in AI’s Future
Despite these advancements, AI faces significant challenges, including unsupervised learning, memory, abstract concept learning, symbolic reasoning, and language understanding. DeepMind’s application of AI to scientific discovery, as demonstrated by AlphaFold’s breakthrough in protein folding prediction, reveals the immense potential of AI in fields requiring the processing of vast data and the unraveling of complex systems. This capability positions AI as a meta-solution, poised to transform unstructured data into actionable knowledge, thereby accelerating scientific discoveries and societal progress.
DeepMind’s Ethical and Collaborative Approach
In navigating the future of AI, DeepMind underscores the importance of responsible and ethical development. By fostering collaboration among scientists, society, and stakeholders, DeepMind aims to ensure that AI serves as a beneficial tool for humanity. Exploring the human mind through AI, and with it the intricacies of consciousness, creativity, and dreams, offers a unique window into the distinctive properties of human intelligence.
Conclusion
Demis Hassabis’ vision for AI, with its emphasis on self-learning systems, represents a transformative shift in the field. The successes of AlphaGo and AlphaZero, demonstrating unparalleled adaptability and efficiency, have not only revolutionized AI but also greatly influenced human understanding in diverse domains. As AI continues to evolve, addressing key challenges and harnessing its potential for scientific discovery and societal benefit will be crucial. DeepMind’s commitment to ethical development and collaboration sets a standard for navigating the future of AI, ensuring that it remains a force for positive change and understanding in our increasingly complex world.
Incorporated Supplemental Updates
DeepMind has combined deep learning and reinforcement learning to train a neural network on Atari games. This system, using raw pixel inputs and without prior knowledge of rules or controls, learned to maximize scores and outperform human players. This achievement showcases the system’s ability for end-to-end learning from raw pixels to actions and decision-making. Furthermore, AlphaGo, using Deep RL, mastered the game of Go, a complex game with more possible board configurations than atoms in the universe. In 2016, AlphaGo defeated the world champion Lee Sedol, showcasing its ability to learn and play complex games with minimal human input. AlphaGo Zero, designed to eliminate reliance on human knowledge, started from random play and learned to play Go solely through self-play, achieving a higher performance level than the original AlphaGo.
AlphaZero generalized AlphaGo’s approach to any two-player perfect information game. It learned to play chess, shogi, and Go from scratch, starting from random moves and improving through self-play, and surpassed human experts in all three games, demonstrating its ability to generalize learning across different games. Unlike Deep Blue, AlphaZero replaces handcrafted heuristics with self-play reinforcement learning and Monte Carlo tree search, yielding a more elegant and streamlined algorithm. It employs a single neural network that outputs both move probabilities and position evaluations; self-play generates a vast training dataset, and iterative updates continuously improve its decision-making. In a 1,000-game match against Stockfish 8, then the world-champion traditional chess engine, AlphaZero won 155 games and lost only 6, with the remaining games drawn, a decisive demonstration of its dominance.
AlphaZero’s novel chess style and features include a flexible balance of factors in decision-making, dynamic evaluation of positions, a preference for mobility over material, and a willingness to challenge assumptions ingrained in traditional chess engines. Its strengths include a superior evaluation function and the ability to generalize knowledge to new positions, allowing creative play and a far more selective, efficient search than traditional engines. Its style, characterized by dynamic, attacking chess that values mobility over material, has influenced top grandmasters, including Magnus Carlsen. Its principles have been extended to real-time strategy games like StarCraft II, where AlphaStar has achieved top professional performance, demonstrating the potential of such systems to handle hidden information and very large action spaces.
AI faces key challenges like unsupervised learning, memory, abstract concepts, symbolic reasoning, and language understanding. However, AI algorithms have matured enough to address real-world issues, with applications in healthcare, energy optimization, and voice synthesis. DeepMind focuses on using AI to accelerate scientific discovery: its multidisciplinary science team collaborates with academics in various fields, working on projects in quantum chemistry, protein folding, retinal disease detection, mathematics, and theorem proving. AlphaFold, its protein folding system, uses supervised learning with a neural network to predict the 3D structure of a protein from its amino acid sequence, a significant advance with implications for understanding disease, drug discovery, and protein design.
DeepMind’s protein folding algorithm, trained on a dataset of known protein structures, proved highly accurate at predicting new structures. Although it does not yet reach the accuracy required for many biological applications, it took first place in the CASP13 assessment, an international blind competition for protein structure prediction, producing the most accurate predictions for 25 of the 43 proteins in the hardest category.
AI is seen as a potential solution to the challenge of data overload and complex systems. Hassabis views intelligence as a process that converts unstructured information into useful knowledge and believes AI can automate and make this process more efficient. While acknowledging AI’s enormous promise, he emphasizes the need for responsible and ethical use of this technology, advocating for AI to benefit everyone in society. Hassabis also suggests that studying AI and intelligence may offer insights into the human mind and its properties, helping unravel the mysteries of consciousness, creativity, and dreams.
In summary, Demis Hassabis’ vision for AI and DeepMind’s innovative developments like AlphaGo, AlphaZero, and AlphaFold, alongside their ethical approach to AI development, highlight the transformative potential of self-learning systems in AI. These advancements not only underscore AI’s capabilities in game-playing and scientific discovery but also point to a future where AI could contribute significantly to understanding and solving complex problems in various domains.