Hal Varian (Google Chief Economist) – Royal Statistical Society | Statistics at Google (Jan 2013)


Chapters

00:00:06 Statistics and Data Science at Google
00:07:01 Data Analysis at Google
00:10:25 The Role of a Chief Economist at Google
00:13:28 Projects and Roles of Statisticians at Google
00:17:23 Understanding Google's Advertising Auction Model
00:26:44 Machine Learning Innovations at Google
00:30:28 Empirical Bayes for Publisher Quality Assessment
00:35:56 Measuring Incrementality of Ad Clicks and Mobile Queries
00:40:59 Google Query Data as Predictors of Economic Indices
00:43:30 Google Consumer Surveys: A Novel Method for Gathering User Feedback
00:45:40 Data Analytics Hiring Strategies in the Tech Industry

Abstract

Hal Varian and the Evolution of Google’s Data-Driven Decision Making

Leveraging Big Data for Business Insights: The Story of Hal Varian at Google

In the rapidly evolving world of technology, integrating data analysis into business strategy has become essential. This article examines the contributions of Hal Varian, a key figure in Google’s data-driven transformation, and the broader implications of his work for the tech industry.

The Multifaceted Career of Hal Varian

Hal Varian’s training in mathematics, economics, and statistics laid the groundwork for a varied career. His time on the faculty at MIT, where he focused on statistics and economic theory, was a prelude to his later role in the tech industry. In 1995 he became the founding dean of UC Berkeley’s School of Information Management and Systems. His move to a full-time role at Google in 2007 marked a significant shift: his initial focus on the ad auction system broadened to areas such as query growth forecasting, advertiser churn analysis, and lifetime value estimation. Throughout, Varian has emphasized extracting meaningful insights from data, which requires capable analytical tools in a data-rich environment.

Background of Hal Varian

Hal Varian, known for his expertise in economics and statistics, worked in academia as a professor and dean at several universities before joining Google full-time in 2007. As Chief Economist at Google, Varian underscores the rising importance of statistics, particularly in Silicon Valley, and argues that companies should build strong data analysis capabilities in order to make informed decisions. His education in mathematics and economics at UC Berkeley laid the foundation for his later work. Teaching statistics to economics majors at MIT, together with his research in economic theory and modeling, gave him a distinctive perspective on data analysis. Varian’s interest in the economic implications of the Internet emerged early and led to his role as dean of the School of Information at UC Berkeley, where he co-authored “Information Rules,” a significant work in the field. He began working with Google in 2002 as a consultant and joined full-time in 2007. His work there, including the ad auction system and a range of analytical methods, shows how statistical knowledge can be applied in practical, impactful ways.

The Birth and Growth of AdStats

Under Varian’s guidance, the formation of Google’s AdStats team marked a pivotal point in the company’s approach to data analysis. This team, composed of statisticians, computer engineers, and data specialists, played a crucial role in optimizing Google’s primary revenue source, the ad system. The establishment of this team was the beginning of a more data-centric culture within Google, leading to an expansion of data analysis services across various departments. Google proactively responded to the growing need for data analysis expertise by hiring analysts with strong quantitative backgrounds, enhancing both system performance and decision-making processes.

The integration of statistical analysis at Google took several forms. The AdStats team delved into the intricacies of Google’s ad system, a key revenue generator. The demand for statistical analysis grew, leading teams to hire their own analysts from fields such as statistics, mathematics, operations research, and finance. These analysts worked in tandem with engineers and management to optimize Google’s systems and understand the environment in which they operated. However, this decentralized model had its challenges, including duplicated efforts and limited knowledge sharing. To address these issues, Varian established a central organization for analysts to foster interaction and information exchange. This initiative included a monthly newsletter showcasing analysis work from different teams and biannual meetings for knowledge exchange, featuring external speakers and presentations. An internal statistics mailing list further facilitated query resolution and discussion among over 650 subscribers, indicating a robust and active community of analysts at Google.

Innovations in Google’s Advertising Model

Google’s advertising model is characterized by its unique auction system, where advertisers bid on keywords to secure ad positions. This system, supported by tools like the bid simulator and logistic regression models, is crucial for optimizing ad performance and enhancing user experience. The model’s sophistication extends to website optimization techniques such as sequential testing and the multi-armed bandit approach, significantly improving efficiency and effectiveness.
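
The auction mechanics described above can be illustrated with a small sketch. The notes only say that advertisers bid on keywords for ad positions and that logistic regression models are involved; the ranking rule (bid times predicted click-through rate) and the pricing rule (each winner pays the minimum bid needed to keep its position) are assumptions taken from the standard textbook description of a generalized second-price auction, and the names `Bid`, `predicted_ctr`, and `run_auction` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    bid: float            # maximum cost per click the advertiser is willing to pay
    predicted_ctr: float  # assumed click-probability estimate (e.g. from a logistic regression); must be > 0


def run_auction(bids, num_slots):
    """Rank ads by bid * predicted CTR and price each slot by the next-ranked ad.

    This is a textbook generalized second-price rule, not a confirmed
    description of Google's production system.
    """
    ranked = sorted(bids, key=lambda b: b.bid * b.predicted_ctr, reverse=True)
    results = []
    for position, winner in enumerate(ranked[:num_slots]):
        if position + 1 < len(ranked):
            runner_up = ranked[position + 1]
            # Smallest bid that still keeps the winner ahead of the runner-up.
            price = runner_up.bid * runner_up.predicted_ctr / winner.predicted_ctr
        else:
            price = 0.0  # no competitor below; a real system would apply a reserve price
        results.append((position + 1, winner.advertiser, round(price, 4)))
    return results


# Hypothetical bids for a single keyword.
bids = [Bid("A", 2.00, 0.05), Bid("B", 3.00, 0.02), Bid("C", 1.00, 0.08)]
results = run_auction(bids, num_slots=2)
for position, advertiser, price in results:
    print(position, advertiser, price)
```

In this toy example advertiser B has the highest bid but the lowest predicted click-through rate, so it loses both slots. That is the point of weighting bids by expected clicks.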

Empirical Models and Longitudinal Analysis in Google’s Strategy

Varian’s team at Google employs advanced statistical methods like empirical Bayes models for publisher quality scoring and longitudinal models for correlating Google’s revenue with economic indicators. These methods allow for more precise forecasting and scenario planning, essential in a dynamic economic landscape. Publisher quality scores are determined based on the observed performance over time, using an empirical Bayes model that assigns initial scores based on the distribution of scores of existing publishers in the same country. These scores are then updated with more data, incorporating additional predictors like country, language, and vertical. The model also accounts for survivorship bias by correcting for differences between publishers that continue and those that exit. An asymmetric loss function is used to account for the varying consequences of misclassifying publishers, ensuring a more balanced and accurate assessment.
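
The empirical Bayes idea of starting a new publisher at a score drawn from existing publishers and then updating it with data can be sketched with a beta-binomial model. This is a minimal illustration under stated assumptions, not the model described in the talk: it assumes quality can be expressed as a fraction of "good" events, fits the prior by the method of moments, and omits the extra predictors, the survivorship correction, and the asymmetric loss. All names and numbers are hypothetical.

```python
from statistics import mean, pvariance


def fit_beta_prior(existing_scores):
    """Fit a Beta(alpha, beta) prior to existing publishers' scores by method of moments."""
    m = mean(existing_scores)
    v = pvariance(existing_scores)
    if not 0 < v < m * (1 - m):
        raise ValueError("scores must vary and be compatible with a Beta distribution")
    strength = m * (1 - m) / v - 1  # equals alpha + beta
    return m * strength, (1 - m) * strength


def posterior_score(alpha, beta, good_events, total_events):
    """Posterior mean quality after observing a publisher's own data."""
    return (alpha + good_events) / (alpha + beta + total_events)


# Hypothetical quality scores of established publishers in one country.
prior_alpha, prior_beta = fit_beta_prior([0.6, 0.7, 0.8, 0.9])

# A brand-new publisher starts at the prior mean.
initial_score = posterior_score(prior_alpha, prior_beta, 0, 0)

# After 10 observed events, only 2 of them good, the score is pulled down,
# but less than the raw rate of 0.2 would suggest.
updated_score = posterior_score(prior_alpha, prior_beta, 2, 10)
print(initial_score, updated_score)
```

With few observations the score stays close to the country-level prior. As evidence accumulates, the publisher's own record dominates.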

Exploring Query Incrementality and Search Insights

Google’s analysis efforts extend to understanding the incrementality of ad clicks and mobile queries, that is, how many clicks or queries would not have occurred otherwise. Difference-in-differences analysis is used to gauge the impact of technological changes on user behavior. Tools like Google Correlate and Insights for Search reveal search query patterns, aiding in predicting economic indicators and enhancing the search experience. For example, state-level GDP betas are used to estimate Google revenue at the state level and correlate it with economic indicators using a longitudinal model, facilitating scenario planning for economic recovery. The incrementality of ad clicks is estimated using statistical models that compare actual clicks with counterfactual scenarios, while mobile query incrementality is analyzed with a difference-in-differences design that includes user-specific and seasonal fixed effects. This comprehensive approach allows Google to refine its understanding of user behavior and the impact of new technologies.
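
A minimal sketch of the difference-in-differences logic follows. It uses the simplest two-group, two-period form on made-up numbers. The analysis described above adds user-specific and seasonal fixed effects, which would normally be estimated with a regression rather than raw means. The function name and data are hypothetical.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical weekly query counts per user.
# "Treated" users started using a mobile device between the two periods.
treated_before = [10, 12, 14]
treated_after = [15, 17, 19]
control_before = [9, 11, 13]
control_after = [10, 12, 14]

incremental_queries = diff_in_diff(
    treated_before, treated_after, control_before, control_after
)
print(incremental_queries)
```

Subtracting the control group's change removes growth that would have happened anyway (seasonality, overall query growth), so the remainder is attributed to the change in device use. That attribution holds only if both groups would otherwise have followed parallel trends.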

The Future of Data Analysis at Google and Beyond

Google’s commitment to advancing data analysis is evident in its use of Google search queries to predict economic indicators. Google Correlate allows users to explore relationships between search queries and real-world events, uncovering correlations that may not be immediately apparent. Studies using Google Correlate have shown strong correlations between specific search queries and unemployment trends, for instance. Google’s researchers have developed models that use query data to predict economic time series like unemployment, inflation, and retail sales, utilizing statistical techniques such as Kalman filters and Bayesian variable selection. The aim is to automate the identification of relevant predictors in Google query data for various economic series, streamlining the forecasting process.
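
The idea of searching query data for predictors of an economic series can be sketched as a simple correlation ranking. This is a deliberately simpler stand-in for the Kalman filter and Bayesian variable selection methods mentioned above, and it does not use any Google Correlate API. The series, query names, and function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_queries(target, query_series):
    """Order candidate queries by absolute correlation with the target series."""
    scored = [(name, pearson(target, series)) for name, series in query_series.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Made-up monthly unemployment rate and query-volume indices.
unemployment = [5.0, 5.5, 6.1, 6.8, 7.0]
queries = {
    "unemployment benefits": [20, 23, 27, 31, 33],
    "pizza recipe": [50, 48, 51, 49, 50],
}

ranking = rank_queries(unemployment, queries)
for name, corr in ranking:
    print(name, round(corr, 3))
```

A ranking like this only surfaces candidates. With many thousands of queries, some will correlate by chance, which is one reason a variable selection step is needed before any of them is used for forecasting.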

The Chief Economist’s role at Google encompasses a wide range of responsibilities, including revenue analysis, program evaluation, predictive modeling, experimentation, auction design, policy advising, and interpreting macroeconomic issues. The computational infrastructure also benefits from the contributions of economists. The use of consumer surveys and collaborative systems for experimentation demonstrates the breadth of Google’s data-driven approaches. Google’s hiring practices reflect the importance of statistical analysis, machine learning, and data interpretation skills, indicative of a broader industry trend.

In conclusion, Hal Varian’s contributions at Google underscore the transformative power of data analysis in business strategy and decision-making. His work and that of his teams emphasize the increasing relevance of statistical modeling and data-driven insights in the tech industry and beyond, leading to innovative solutions in a data-centric world.

Google Consumer Surveys

Hal Varian introduced Google Consumer Surveys as a new method for gathering consumer insights. The system presents users with brief survey questions, on a wide range of topics, in exchange for access to gated content on publisher websites. Response rates as high as 35-40%, combined with demographics inferred from web browsing behavior, make the results useful to businesses.

Insights into Data Analytics, Experimentation, and Hiring Practices at Google

Google’s experimentation in data analytics includes various collaborative systems to conduct experiments on webpages, search results, and ads, emphasizing the importance of establishing causal relationships. The hiring process at Google focuses on candidates with broad statistical knowledge, practical experience in coding languages like Python, and skills in machine learning, visualization, and communication. Analysts must quickly understand problems, use appropriate tools, and complete tasks efficiently. The growing demand for individuals capable of interpreting and communicating insights from data is not limited to Google, as seen in companies like Intuit, Visa, MasterCard, Walmart, and Safeway. At the time of the talk, Google was recruiting statisticians in the UK, reflecting the company’s ongoing commitment to expanding its data analytics capabilities.


Notes by: OracleOfEntropy