Jeff Dean (Google Senior Fellow) – Google I/O 2008 – Underneath the Covers at Google (Jun 2008)


Chapters

00:00:15 Building a Reliable and Scalable Infrastructure at Google
00:08:38 Building a Scalable Storage and Processing Infrastructure
00:14:14 Understanding Data Processing Through MapReduce
00:19:28 MapReduce and Bigtable: Distributed Computing at Google
00:24:24 Bigtable: A Distributed Multi-Dimensional Sparse Map
00:28:49 Bigtable: Scalable Wide-Column Database for Big Data Processing
00:35:00 Automating Data Processing and Storage Across Data Centers
00:38:46 Google's Architecture for Search, Machine Translation, and Code Management
00:49:12 Distributed Systems and Software Engineering at Google
00:58:40 Onboarding New Developers at Google

Abstract

Navigating Google’s Technological Landscape: Innovations, Challenges, and Practices

Google’s technological infrastructure, based on commodity PCs and innovative systems like MapReduce and Bigtable, is a cornerstone of large-scale computing. This comprehensive analysis delves into the various facets of Google’s approach, from the use of standard hardware to sophisticated software engineering practices and data handling techniques. The talk highlights the unique challenges posed by this setup, the efficiency of Google’s distributed computing models, and the company’s emphasis on collaborative software development and training programs for new recruits.

Main Ideas and Details

Google’s Hardware Platform and Challenges

Google leverages low-end commodity PCs to maximize cost-effectiveness. The company initially ran on a heterogeneous collection of machines inherited from other research groups, but later switched to a homogeneous setup built from commodity motherboards, disks, and power supplies. Machines are packed densely to utilize space efficiently and improve cooling; current designs emphasize airflow, neat cable management, and front-facing connectors. The densely packed machines run Linux and in-house software tailored to tolerate failures such as entire racks going offline and corrupted network traffic.

Infrastructure and Services

Google focuses on building infrastructure that transforms commodity machines into a coherent system. The talk covers how queries are served on Google.com, delves into the machine translation system, and offers insights into interesting data, engineering style, and approaches to managing the source code base.

Google File System (GFS) and MapReduce

GFS and MapReduce are central to Google’s data handling, offering reliable storage and efficient large-scale data processing. GFS stores data as chunks replicated across many machines (chunkservers), coordinated by a single centralized master that tracks chunk locations, while MapReduce simplifies programming for large-scale data processing through user-supplied map and reduce functions.
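The chunk-based design described above can be sketched as a toy in-memory model. This is an illustrative sketch only, not Google’s implementation: the class names, the replication scheme, and the tiny 64-byte chunk size are assumptions made for brevity (real GFS chunks were 64 MB). The key idea it demonstrates is that clients ask the master only for chunk *locations*, then fetch the data itself from a chunkserver.

```python
# Toy model of GFS-style chunked storage. A single "master" maps
# (filename, chunk index) to the servers holding that chunk, while
# chunkservers hold the actual bytes.

CHUNK_SIZE = 64  # bytes; tiny for illustration (real GFS: 64 MB)

class Master:
    def __init__(self):
        self.locations = {}  # (filename, chunk_index) -> list of chunkservers

    def register(self, filename, chunk_index, servers):
        self.locations[(filename, chunk_index)] = servers

    def lookup(self, filename, offset):
        """Translate a byte offset into (chunk index, replica servers)."""
        idx = offset // CHUNK_SIZE
        return idx, self.locations[(filename, idx)]

class ChunkServer:
    def __init__(self):
        self.chunks = {}  # (filename, chunk_index) -> bytes

def write_file(master, servers, filename, data):
    """Split data into chunks and replicate each chunk on every server."""
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        ci = start // CHUNK_SIZE
        for s in servers:
            s.chunks[(filename, ci)] = chunk
        master.register(filename, ci, servers)

def read_at(master, filename, offset, length):
    """Ask the master where the chunk lives, then read from a replica.
    For simplicity, the read must not cross a chunk boundary."""
    idx, replicas = master.lookup(filename, offset)
    chunk = replicas[0].chunks[(filename, idx)]
    start = offset % CHUNK_SIZE
    return chunk[start:start + length]
```

Because every chunk is replicated on multiple servers, a read can be served by any surviving replica; only the lightweight location metadata lives on the master.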

Advantages of MapReduce

MapReduce is a programming framework for processing large datasets in parallel across a distributed cluster of machines. A job consists of map tasks, which process input data in parallel, and reduce tasks, which combine the map outputs. A master node assigns tasks to worker nodes and collects the results. The framework is fault-tolerant: if a worker node fails, the master reassigns that node’s tasks to other workers, so the job continues to run even when machines die mid-execution. Centralized status reporting and monitoring let users track the progress of their jobs and identify any problems that occur. MapReduce runs a wide variety of jobs at Google, from building the web-search index to machine learning, and the number of MapReduce jobs run per day has grown steadily over time. It is tightly integrated with Google’s infrastructure, allowing users to easily access and process data stored in GFS.
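The map/shuffle/reduce pipeline described above can be captured in a minimal single-process sketch, using the canonical word-count example. The `map_fn` and `reduce_fn` functions stand in for the user-supplied pieces; the framework’s distributed shuffle is modeled here by simply grouping intermediate values by key in a dictionary. This is a sketch of the programming model, not of Google’s distributed implementation.

```python
from collections import defaultdict

def map_fn(document):
    """Map: emit (word, 1) for every word in the input record."""
    for word in document.split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: combine all values emitted for one key."""
    return word, sum(counts)

def mapreduce(documents):
    # Map phase: run map_fn over every input record.
    intermediate = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            intermediate[key].append(value)  # shuffle: group values by key
    # Reduce phase: run reduce_fn once per distinct key.
    return dict(reduce_fn(k, v) for k, v in intermediate.items())
```

In the real system the map tasks run on many workers over GFS chunks, the grouped keys are partitioned among reduce workers, and the master handles scheduling and retries; the user writes only the two functions above.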

Transition to Bigtable

Bigtable is a distributed storage system that provides a higher-level view of data than a raw file system. It stores and processes large amounts of structured data and is designed to be scalable, reliable, and fault-tolerant. Conceptually, Bigtable is a distributed, multi-dimensional, sparse map: it maps a (row key, column name, timestamp) triple to cell contents, which are treated as an uninterpreted string of bytes. Bigtable is used for a range of applications, including satellite imagery processing, Orkut, and batch-style data processing. It can handle petabytes of data and manages load balancing and machine failures automatically.
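The sparse-map data model can be sketched directly. The class below is an illustrative single-machine model of the abstraction only (the real system splits row ranges into tablets served across many machines); the class and method names are assumptions, but the keying by (row, column, timestamp) and the versioned cells mirror the model described above.

```python
# Sketch of Bigtable's data model: a sparse map from
# (row, column, timestamp) to an uninterpreted byte string.
# Only cells that have been written consume any space.

class SparseTable:
    def __init__(self):
        self.cells = {}  # (row, column) -> {timestamp: bytes}

    def put(self, row, column, timestamp, value):
        """Write one versioned cell; absent cells cost nothing."""
        self.cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the cell at the given timestamp, or the newest
        version if no timestamp is specified; None if the cell is absent."""
        versions = self.cells.get((row, column), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)
```

Because the map is sparse, a table with millions of possible columns stores only the cells actually written, and the timestamp dimension keeps multiple versions of each cell side by side.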

Software Engineering at Google

Google’s engineering culture is defined by a single shared code base, promoting collaboration and continuous integration. Practices include code reviews, automated testing, and a focus on learning and adapting to Google’s unique systems.

Challenges and Training for New Recruits

New recruits at Google face a steep learning curve due to unique systems and technologies, addressed through extensive training programs. Training includes half-day courses on essential systems and a hands-on approach to learning. Jeff Dean, a prominent figure at Google, emphasizes the importance of newcomers being prepared for a lot of learning and having a strong desire to learn. He encourages them to ask questions when needed and acknowledges that there is still much to learn when joining Google.

Conclusion

In conclusion, Google’s approach to technology is defined by its strategic use of commodity hardware, innovative software solutions, and a collaborative engineering culture. The company’s systems like MapReduce and Bigtable showcase its prowess in handling large-scale data processing and storage, while its software engineering practices ensure robust, efficient, and collaborative development. Google’s focus on training and adapting new recruits to its unique technological environment further underlines its commitment to maintaining a cutting-edge and cohesive technological ecosystem.


Notes by: ChannelCapacity999