Demis Hassabis (DeepMind Co-founder) – The Power of Self-Learning Systems | Institute for Advanced Study (May 2019)


Chapters

00:00:13 A New Approach to Artificial Intelligence: Self-Learning Systems
00:10:15 Evolution of AlphaGo to AlphaZero in Game Playing
00:18:29 Self-play Reinforcement Learning and the Success of AlphaZero
00:26:15 Properties and Advantages of AlphaZero
00:33:15 AI-Accelerated Scientific Discovery: Protein Folding with AlphaFold
00:40:15 AI-Assisted Science: Unlocking Insights from Big Data

Abstract

Harnessing the Power of Self-Learning: Demis Hassabis and the Future of AI

The Vision of Self-Learning Systems in AI

In the field of artificial intelligence, few have charted as visionary a course as Demis Hassabis, co-founder of DeepMind. His presentation at the Institute for Advanced Study articulated a bold perspective: the future of AI lies in self-learning systems. Hassabis contends that understanding and artificially recreating intelligence is crucial to solving many complex problems. The shift from traditional expert systems, which are confined to hard-coded knowledge, to learning systems that can adapt to and handle unanticipated scenarios marks a significant turning point in AI’s evolution.

Deep Learning and Reinforcement Learning: A New Era

DeepMind’s approach combines deep learning with reinforcement learning, a form of trial-and-error learning. An AI agent interacts with its environment and refines its strategy based on the rewards or penalties it receives. Because the agent’s own actions determine what it experiences next, this is a form of active learning. It allows AI systems not only to adapt to new tasks but also to devise novel solutions, potentially surpassing human capabilities.
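The reward-driven loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning on a hypothetical five-cell corridor, which is a deliberate simplification and not DeepMind's system: deep reinforcement learning replaces the table with a neural network. The environment, the function names `train_q_learning` and `greedy_policy`, and all parameter values are assumptions chosen for illustration.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. The agent starts at state 0 and receives a
    reward of +1 only when it reaches the rightmost state.
    Actions: 0 = step left, 1 = step right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore occasionally (or when estimates are tied),
            # otherwise exploit the current value estimates.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The goal is terminal, so it contributes no future value.
            future = 0.0 if next_state == goal else max(q[next_state])
            # Move the estimate toward reward + discounted future value.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]
```

Calling `greedy_policy(train_q_learning())` yields the learned action for each non-terminal cell. The agent is never told that moving right is correct; it discovers this from the reward signal alone.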

AlphaGo and AlphaZero: Pioneering Achievements

AlphaGo’s groundbreaking achievement in learning to play Go, which culminated in its victory over top human players, exemplifies the power of self-learning systems. Its successor, AlphaZero, went a step further by not only mastering Go but also excelling in other two-player perfect-information games such as chess and shogi. Unlike IBM’s Deep Blue, which defeated Garry Kasparov at chess but could not adapt to other tasks, AlphaZero’s versatility and general problem-solving capabilities signify a move towards more human-like AI systems.

The Training and Impact of AlphaZero

The training loop of AlphaZero, which combines self-play reinforcement learning with Monte Carlo tree search, highlights a significant departure from traditional methods. It employs a single neural network for both move prediction and game state evaluation, enhancing efficiency and enabling the system to achieve superhuman performance rapidly. Notably, AlphaZero’s victory over the then-world champion chess engine Stockfish 8, after just four hours of training, underscores its exceptional efficiency and adaptability.
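One way to picture how the network steers the tree search is the selection rule below. It is a sketch modeled on the PUCT-style rule from the published AlphaGo Zero and AlphaZero descriptions, not something shown in the talk. The function name `select_child`, the dictionary layout, and the constant `c_puct=1.5` are assumptions for illustration.

```python
import math


def select_child(children, c_puct=1.5):
    """Pick the move with the highest Q + U score.

    `children` maps a move label to a dict with:
      "prior"     - probability the network's policy output gives the move
      "visits"    - how many times the search has tried the move
      "value_sum" - sum of value estimates backed up through the move
    Q is the average value seen so far; U is an exploration bonus that is
    large for moves with a high prior and few visits.
    """
    total_visits = sum(child["visits"] for child in children.values())

    def score(child):
        q = child["value_sum"] / child["visits"] if child["visits"] else 0.0
        u = c_puct * child["prior"] * math.sqrt(total_visits) / (1 + child["visits"])
        return q + u

    return max(children, key=lambda move: score(children[move]))
```

For example, with `{"a": {"prior": 0.6, "visits": 10, "value_sum": 2.0}, "b": {"prior": 0.4, "visits": 0, "value_sum": 0.0}}` the rule picks `"b"`: the unvisited move's exploration bonus outweighs the modest average value of `"a"`. If `"a"` had averaged 0.9 and `"b"` had a prior of only 0.05, it would pick `"a"`.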

The Influence and Potential of AlphaZero

Beyond its technical achievements, AlphaZero has left a mark on human endeavors. Its influence on top grandmasters, including Magnus Carlsen, and its recognition by Garry Kasparov for reflecting the “truth” of chess, underscore its impact on human understanding of these games. Furthermore, the application of AlphaZero’s principles to complex, real-time strategy games like StarCraft II suggests its potential to handle more intricate and less predictable scenarios.

Challenges and Opportunities in AI’s Future

Despite these advancements, AI faces significant challenges, including unsupervised learning, memory, abstract concept learning, symbolic reasoning, and language understanding. DeepMind’s application of AI to scientific discovery, as demonstrated by AlphaFold’s breakthrough in protein folding prediction, reveals the immense potential of AI in fields requiring the processing of vast data and the unraveling of complex systems. This capability positions AI as a meta-solution, poised to transform unstructured data into actionable knowledge, thereby accelerating scientific discoveries and societal progress.

DeepMind’s Ethical and Collaborative Approach

In navigating the future of AI, DeepMind underscores the importance of responsible and ethical development. By fostering collaboration among scientists, society, and stakeholders, DeepMind aims to ensure that AI serves as a beneficial tool for humanity. The exploration of the human mind through AI, delving into the intricacies of consciousness, creativity, and dreams, offers a window into the unique properties of human intelligence.

Conclusion

Demis Hassabis’ vision for AI, with its emphasis on self-learning systems, represents a transformative shift in the field. The successes of AlphaGo and AlphaZero, demonstrating unparalleled adaptability and efficiency, have not only revolutionized AI but also greatly influenced human understanding in diverse domains. As AI continues to evolve, addressing key challenges and harnessing its potential for scientific discovery and societal benefit will be crucial. DeepMind’s commitment to ethical development and collaboration sets a standard for navigating the future of AI, ensuring that it remains a force for positive change and understanding in our increasingly complex world.

Incorporated Supplemental Updates

DeepMind has combined deep learning and reinforcement learning to train a neural network on Atari games. Using only raw pixel inputs, and without prior knowledge of rules or controls, the system learned to maximize scores and outperform human players. This achievement demonstrates end-to-end learning, from raw pixels through to actions and decisions. Furthermore, AlphaGo, using deep RL, mastered Go, a complex game with more possible board configurations than there are atoms in the universe. In 2016, AlphaGo defeated the world champion Lee Sedol, showing that it could learn and play complex games with minimal human input. AlphaGo Zero, designed to eliminate reliance on human knowledge, started from random play and learned Go solely through self-play, reaching a higher level of performance than the original AlphaGo.

AlphaZero generalized AlphaGo’s approach to any two-player perfect-information game. It learned to play chess, shogi, and Go from scratch, starting from random moves and improving through self-play. The system surpassed human experts in all three games, demonstrating that the same learning method generalizes across different games. Unlike Deep Blue, AlphaZero replaces handcrafted heuristics with self-play reinforcement learning and Monte Carlo tree search, leading to a more elegant and streamlined algorithm. AlphaZero employs a single neural network that outputs both move probabilities and a position evaluation. Through self-play, it generates a vast dataset for training that network, and iterative updates continuously improve its decision-making. In a 1,000-game match against Stockfish 8, the reigning world champion among traditional chess engines, AlphaZero won 155 games and lost 6, with the remaining games drawn.
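To make the idea of one network with two outputs concrete, the sketch below computes a combined training loss for a single self-play position. It follows the general form given in the published AlphaGo Zero and AlphaZero descriptions (squared error on the game outcome plus cross-entropy against the search's move distribution) and omits the weight-regularization term for brevity. The function name `alphazero_style_loss` and its argument names are assumptions for illustration.

```python
import math


def alphazero_style_loss(policy_pred, value_pred, search_policy, outcome):
    """Combined loss for one self-play training example.

    policy_pred   - move probabilities predicted by the network
    value_pred    - predicted game result for the position, in [-1, 1]
    search_policy - move distribution produced by the tree search (target)
    outcome       - actual game result from this player's view: -1, 0 or +1
    """
    # Value term: how far the evaluation was from the real outcome.
    value_loss = (outcome - value_pred) ** 2
    # Policy term: cross-entropy between the search target and prediction.
    policy_loss = -sum(
        target * math.log(pred)
        for target, pred in zip(search_policy, policy_pred)
        if target > 0
    )
    return value_loss + policy_loss
```

Because both terms depend on the same network, a single gradient step would improve move prediction and position evaluation together, which is the efficiency gain of sharing one network.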

AlphaZero’s novel chess style balances multiple factors in decision-making, evaluates positions dynamically, favors mobility over material, and challenges assumptions ingrained in traditional chess engines. Its strengths include a superior evaluation function and the ability to generalize knowledge to new positions, which allow creative play and significantly more efficient search than traditional chess engines. This dynamic, attacking style has influenced top grandmasters, including Magnus Carlsen. Its principles have also been extended to real-time strategy games: AlphaStar achieved top professional-level performance in StarCraft II, demonstrating the potential for AlphaZero-like systems to handle hidden information and complex action spaces.

AI faces key challenges like unsupervised learning, memory, abstract concepts, symbolic reasoning, and language understanding. However, AI algorithms have matured enough to address real-world issues, with applications in healthcare, energy optimization, and voice synthesis. DeepMind focuses on using AI to accelerate scientific discovery. DeepMind’s multidisciplinary science team collaborates with academics in various fields, working on projects in quantum chemistry, protein folding, retinal disease detection, mathematics, and theorem proving. AlphaFold, a protein folding system, uses supervised learning with a neural network and has significantly impacted the understanding of diseases, drug discovery, and protein design. It predicts the 3D structure of proteins from amino acid sequences, a significant advance in biological sciences.

DeepMind’s protein folding algorithm, trained on a dataset of known protein structures, is highly accurate in predicting new structures. Although the accuracy needed for biological applications remains a challenge, the algorithm won the CASP competition, an international competition for protein structure prediction. Its accurate predictions for 25 of the 43 proteins in the hardest category demonstrated its strength in this field.

AI is seen as a potential solution to the challenge of data overload and complex systems. Hassabis views intelligence as a process that converts unstructured information into useful knowledge and believes AI can automate and make this process more efficient. While acknowledging AI’s enormous promise, he emphasizes the need for responsible and ethical use of this technology, advocating for AI to benefit everyone in society. Hassabis also suggests that studying AI and intelligence may offer insights into the human mind and its properties, helping unravel the mysteries of consciousness, creativity, and dreams.

In summary, Demis Hassabis’ vision for AI and DeepMind’s innovative developments like AlphaGo, AlphaZero, and AlphaFold, alongside their ethical approach to AI development, highlight the transformative potential of self-learning systems in AI. These advancements not only underscore AI’s capabilities in game-playing and scientific discovery but also point to a future where AI could contribute significantly to understanding and solving complex problems in various domains.


Notes by: Flaneur