John Hennessy (Alphabet Chairman) – Language Consortium Keynote (May 2019)


Chapters

00:01:39 RISC Architecture: A Revolution in Computer Design
00:06:25 Computing Technology Evolution and Its Impact
00:11:35 The End of Moore's Law and the Rise of Energy Efficiency
00:15:49 Limits of Speculation and Instruction-Level Parallelism
00:19:17 Limits of Conventional Multicore Processors
00:22:08 Challenges and Opportunities in Hardware and Software Efficiency
00:26:25 Novel Approaches to Processor Design for Domain-Specific Languages
00:38:06 Rethinking Processor Architectures for Specialized Domains
00:43:58 Emerging Techniques in Domain-Specific Architectures
00:48:22 Networking for Artificial Intelligence and Beyond
00:51:03 Machine Learning Challenges and Opportunities
00:54:56 Architectural Innovation in Computing

Abstract

Exploring the Evolution of Computing: From RISC Innovation to Domain-Specific Architectures, Quantum Computing, and Artificial Intelligence




In the rapidly evolving landscape of computing, the journey from the pioneering days of Reduced Instruction Set Computing (RISC) to the contemporary focus on Domain-Specific Architectures (DSAs), quantum computing, and artificial intelligence (AI) represents a monumental shift. Central figures like John Hennessy and David Patterson laid the groundwork with RISC, influencing today’s processors, DSPs, GPUs, and TPUs. Their work, culminating in the Turing Award, revolutionized computer architecture. Concurrently, challenges like the slowdown of Moore’s Law and the end of Dennard Scaling have propelled the shift towards energy efficiency and specialized architectures. This article explores the trajectory of these developments, highlighting the critical transition from general-purpose computing to a future dominated by DSAs, quantum computing, and the ever-growing importance of energy efficiency and approximation techniques in processing.



John Hennessy’s Leadership and Contributions to RISC Architecture

John Hennessy’s contributions to computing span leadership, research, and education. As president of Stanford University, he oversaw the expansion of the engineering quad and established the Knight-Hennessy Scholars Program, the largest fully endowed graduate-level scholarship program in the world. His interest in leadership extends to his book, Leading Matters, which explores the nature of leadership and whether it can be taught.

Hennessy’s influence on computer architecture began in the early 1980s with the MIPS project at Stanford and the company he co-founded to commercialize it, pioneering the RISC approach. RISC simplified the instruction set so the CPU could be implemented as a simple, fast pipeline, and it inspired later general-purpose processors as well as domain-specific ones such as DSPs, GPUs, and TPUs. The same lineage led to the PISA (Protocol Independent Switch Architecture) design used by the Tofino switch chip. His two bestselling textbooks on computer architecture, co-authored with David Patterson, have been in use for over 30 years, and the pair received the 2017 Turing Award for their contributions to RISC architecture and their influential textbooks.

The Dilemma of Moore’s Law and Dennard Scaling

While Moore’s Law has guided the semiconductor industry for decades, predicting a doubling of transistor counts roughly every two years, it has always been more an aspiration than a physical law, and it is now slowing. The end of Dennard Scaling, the observation that power density stayed roughly constant as transistors shrank, marks an even more critical turning point. Together with the shift of workloads from desktops to mobile and cloud computing, this has made energy efficiency the dominant design constraint. Processors now routinely run into their thermal power limit, forcing reduced clock speeds and core shutdowns.
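
A quick way to see why Dennard Scaling mattered, using the standard CMOS dynamic-power relation (the algebra here is mine, not from the talk): dynamic power is roughly

$$P \approx \alpha \, C \, V^2 f$$

Under classical scaling by a factor $k$, capacitance $C$ and voltage $V$ both shrink by $1/k$ while frequency $f$ rises by $k$, so power per transistor falls by $1/k^2$, exactly offsetting the $k^2$ increase in transistors per unit area: power density stays constant. Once $V$ could no longer scale down (because of leakage current), every additional transistor added net power, and the thermal limits described above took over.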

Addressing Performance and Energy Efficiency Crises

The slowdown in single-core performance growth and in DRAM scaling has prompted a reevaluation of processor design. Energy efficiency has become paramount, especially in cloud computing, where the capital cost of cooling and power-delivery infrastructure is comparable to that of the servers themselves, and thermal power limits cap what any single machine can deliver.

The Paradigm Shift to Multicore Processors and DSAs

Diminishing returns from instruction-level parallelism (ILP), together with Amdahl’s Law, which states that the speedup from parallelization is limited by the fraction of the program that cannot be parallelized, drove the industry toward multicore processors. Multicore chips execute multiple threads or programs in parallel, improving throughput, but they bring their own thermal-dissipation and energy-consumption challenges. RISC’s focus on instruction-set efficiency has become crucial for power-sensitive devices, and the same logic leads to domain-specific architectures tailored to particular applications, such as GPUs for graphics and TPUs for deep learning.
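
To make Amdahl’s Law concrete (the numbers here are illustrative, not from the talk): if a fraction $p$ of a program parallelizes perfectly across $n$ cores,

$$\text{Speedup}(n) = \frac{1}{(1 - p) + p/n}$$

With $p = 0.9$ and $n = 16$, the speedup is $1/(0.1 + 0.9/16) = 6.4$, and even as $n \to \infty$ it can never exceed $1/(1-p) = 10$. This is why adding cores alone cannot rescue single-program performance.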

The Role of Domain-Specific Architectures

DSAs, exemplified by GPUs and Google’s TPUs, deliver substantial performance gains by tailoring the architecture to a specific class of applications. They often employ approximation techniques such as reduced numerical precision, which pays off in scenarios ranging from deep learning to weather prediction. Integrating DSAs across the application, DSL, compiler, and architecture levels is pivotal for overcoming the limits imposed by traditional computing paradigms.
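
A concrete instance of reduced precision is bfloat16, the 16-bit floating-point format TPUs use: it keeps float32’s sign and all 8 exponent bits but only 7 mantissa bits. A minimal C sketch of the idea (truncation only; real hardware also rounds, so treat this as illustrative):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Truncate an IEEE-754 float to bfloat16 and widen it back: keep the
   sign, the 8 exponent bits, and the top 7 mantissa bits. Hardware
   applies round-to-nearest-even; plain truncation keeps this short. */
static float to_bfloat16(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret without UB */
    bits &= 0xFFFF0000u;              /* drop the low 16 mantissa bits */
    memcpy(&x, &bits, sizeof x);
    return x;
}

int main(void) {
    float x = 3.14159265f;
    /* Range is preserved; precision drops to ~2-3 decimal digits. */
    printf("fp32: %.8f  bf16: %.8f\n", x, to_bfloat16(x));
    return 0;
}
```

The payoff is that a multiplier for a 7-bit mantissa is far smaller and cheaper in energy than one for float32’s 24-bit significand, and halving operand width doubles the number of values moved per unit of memory bandwidth.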

Quantum Computing and Future Trends

Quantum computing emerges as a frontier in the computing landscape, albeit with significant practical difficulties. Maintaining coherent system states and building large, low-error quantum machines pose substantial challenges. Meanwhile, advancements in material science, like carbon nanofiber and 3D stacking, hold promise for improved energy efficiency and performance in traditional computing.

AI and Networking: A Look at the Future

– AI networks may primarily interconnect systems engaged in learning and training processes.

– Current top-of-rack switches have enough bandwidth to carry Netflix’s peak global traffic at any given moment. Against that baseline, the volume of data moved for ML training versus video streaming is significant.

– The discussion shifts to integrating ML and reduction functions into the switches themselves. The open question is whether networking experts or ML specialists will lead this integration, or whether the two communities will merge.

– The technology space is fragmenting, leading to variation in hardware and packaging. Different devices may need specialized ML processors, for example for camera image processing or voice recognition.

– The discussion moves beyond first-generation ML, acknowledging that this field is still in its early stages. The future of AI is uncertain, but it is expected to evolve and potentially move beyond supervised learning.

Challenges and Potential Solutions in Computer Architecture

Thermal Limitations and Instruction Set Efficiency: Thermal design power limits how many cores in a processor can be active at once. That power cap compounds the effect of Amdahl’s Law, further reducing achievable parallel performance. Instruction-set efficiency therefore becomes a key driver for power-constrained devices.

The Shortcoming of Modern Programming Languages: Modern programming languages prioritize programmer productivity over execution efficiency. Python, for example, can run orders of magnitude slower than equivalent C code.

Hardware-Centric Approach and Domain-Specific Ideas: General-purpose processors have hit a dead end in performance improvement, and domain-specific architectures are the only viable path forward. Domain-specific languages will be crucial for programming these architectures.

Efficiency Gains through Optimization: A study by MIT researchers demonstrated how much performance is left on the table in matrix multiplication: rewriting in C, parallelizing loops, blocking for the memory hierarchy, and using vector instructions produced a roughly 65,000-fold speedup over the naive baseline. Similar headroom exists across many workloads.
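
As an illustration of two of those steps, cache blocking and loop-level parallelism, here is a C sketch (my own illustrative code, not the study’s):

```c
/* Cache-blocked, parallel matrix multiply: C += A * B for n x n
 * row-major matrices, with C zeroed by the caller.
 * Blocking keeps a BS x BS working set resident in cache; the OpenMP
 * pragma spreads independent output blocks across cores.
 * Build: gcc -O3 -march=native -fopenmp matmul.c
 */
#define BS 64

void matmul_blocked(int n, const double *A, const double *B, double *C) {
    #pragma omp parallel for collapse(2)
    for (int ii = 0; ii < n; ii += BS)
        for (int jj = 0; jj < n; jj += BS)
            for (int kk = 0; kk < n; kk += BS)
                for (int i = ii; i < ii + BS && i < n; i++)
                    for (int k = kk; k < kk + BS && k < n; k++) {
                        double a = A[i * n + k];
                        /* innermost loop is unit-stride, so -O3 can
                           vectorize it with SIMD instructions */
                        for (int j = jj; j < jj + BS && j < n; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```

Each (ii, jj) output block is updated by exactly one thread, so the parallel loops are race-free, and the unit-stride inner loop gives the compiler a clean target for the vector instructions the study also exploited.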

Tailoring Architectures to Application Needs: Domain-specific architectures aim to achieve performance by closely aligning with application requirements. Examples include GPUs for graphics, network processors, and deep learning accelerators.

Emerging Trends in Computer Architecture

Energy Efficiency in Computing: Research is actively exploring ways to reduce energy consumption in computing, including optimizing memory usage, minimizing control overhead, and utilizing systolic arrays.
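
Of those ideas, systolic arrays, the structure at the heart of the TPU’s matrix unit, are the easiest to show concretely. Below is a small C simulation of an output-stationary systolic array (a toy model written for illustration, not TPU code): A values flow right, B values flow down, and each processing element (PE) multiply-accumulates into its own output:

```c
#include <stdio.h>

#define N 4
#define CYCLES (3 * N - 2)  /* the last useful term arrives at t = 3N-3 */

int main(void) {
    int A[N][N], B[N][N], acc[N][N] = {0};
    int a_reg[N][N] = {0}, b_reg[N][N] = {0};

    for (int i = 0; i < N; i++)          /* arbitrary test matrices */
        for (int j = 0; j < N; j++) {
            A[i][j] = i + j + 1;
            B[i][j] = i - j + 2;
        }

    for (int t = 0; t < CYCLES; t++) {
        /* shift: A values move one PE right, B values one PE down */
        for (int i = 0; i < N; i++)
            for (int j = N - 1; j > 0; j--) a_reg[i][j] = a_reg[i][j - 1];
        for (int j = 0; j < N; j++)
            for (int i = N - 1; i > 0; i--) b_reg[i][j] = b_reg[i - 1][j];
        /* inject skewed inputs at the edges so that PE(i,j) sees
           A[i][k] and B[k][j] together at cycle t = i + j + k */
        for (int i = 0; i < N; i++)
            a_reg[i][0] = (t >= i && t - i < N) ? A[i][t - i] : 0;
        for (int j = 0; j < N; j++)
            b_reg[0][j] = (t >= j && t - j < N) ? B[t - j][j] : 0;
        /* every PE does one multiply-accumulate per cycle */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                acc[i][j] += a_reg[i][j] * b_reg[i][j];
    }

    /* check against a direct triple-loop product */
    int ok = 1;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            int c = 0;
            for (int k = 0; k < N; k++) c += A[i][k] * B[k][j];
            if (c != acc[i][j]) ok = 0;
        }
    printf("systolic result %s\n", ok ? "matches" : "differs");
    return 0;
}
```

The energy win is that each input value is fetched from memory once and then reused N times as it marches across the array; only the edge PEs ever touch memory, which cuts the data movement that dominates energy cost.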

Domain-Specific Architectures (DSAs): DSAs have gained traction due to their ability to deliver superior performance and energy efficiency for specific applications. They offer advantages such as simpler parallelism models, improved memory hierarchy usage, and tailored programming models.

Performance and Energy Efficiency of DSAs: DSAs can achieve significantly better performance per watt than general-purpose processors. Roofline models capture how arithmetic intensity, memory bandwidth, and peak arithmetic throughput jointly determine the performance a kernel can attain.
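
In its standard form (the example numbers here are mine), the roofline bound says a kernel with arithmetic intensity $I$ (FLOPs per byte moved) on a machine with peak throughput $P_{\text{peak}}$ and memory bandwidth $B$ can attain at most

$$P_{\text{attainable}} = \min\bigl(P_{\text{peak}},\; I \times B\bigr)$$

For example, with $P_{\text{peak}} = 1$ TFLOP/s and $B = 100$ GB/s, a kernel with $I = 2$ FLOPs/byte is memory-bound at 200 GFLOP/s; only kernels with $I \ge 10$ (the ridge point) can reach the compute roof. This is why DSAs work hard to raise arithmetic intensity, for instance with systolic arrays that reuse each operand many times.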

Demand for DSA Performance: The demand for high-performance computing in domains like deep learning is rapidly growing, driving the need for specialized hardware with massive compute capabilities.

Rethinking Architecture and Software Models: DSAs challenge conventional notions of architecture design, requiring a rethinking of the interface between software models and hardware. This opens new possibilities for optimizing performance and energy efficiency.

Tight Integration for Success: Successful implementation of DSAs requires tight integration across multiple levels of the stack, from applications down to the underlying architecture. This involves understanding application characteristics, compiler optimization techniques, domain-specific languages, and the appropriate hardware architecture.

Conclusion

The computing world is at a pivotal juncture, transitioning from general-purpose processors to specialized architectures and quantum computing. This shift necessitates a rethinking of architectural and interface designs, emphasizing energy efficiency, parallelism, and domain-specific solutions. The future of computing, influenced by lessons from the past and innovations in the present, looks towards a landscape where domain-specific architectures and quantum computing redefine what’s possible in processing power and efficiency.


Notes by: crash_function