Hal Varian (Google Chief Economist) – NBER Economics of AI Conference (Sep 2017)
Chapters
00:00:02 Machine Learning: From Theory to Business Applications
Definition: Machine learning aims to predict labels (outcomes) based on input features.
Traditional Approach: Classical methods utilized numerical features and rules defined by human programmers.
Deep Learning: The modern approach, deep learning, extracts features directly from raw data rather than relying on hand-engineered features.
Labeled Data: Labeled images, such as those in the Open Images dataset with 9.5 million labeled images, are crucial for training machine learning models.
Hardware and Software Advancements: Significant improvements in hardware (GPUs and TPUs) increased computational power. Open-source software packages like TensorFlow simplified machine learning development.
Real-World Applications: Kaggle competitions highlight practical applications in diverse fields including home prices (Zillow), Wikipedia traffic prediction, personalized medicine, and taxi trip duration estimation.
Economic Impact: Machine learning services complement and substitute for human activities, driving economic changes.
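The prediction task described above (predict a label from input features) can be sketched concretely. Below is a toy 1-nearest-neighbor classifier in Python; the feature vectors and labels are invented for illustration and are not from the talk:

```python
# Toy illustration of "predict a label from input features":
# a 1-nearest-neighbor classifier over invented (features, label) pairs.

def predict(train, x):
    """Return the label of the training example closest to x."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = min(train, key=lambda pair: dist(pair[0], x))
    return nearest[1]

# Hypothetical labeled data: (feature vector, label)
train = [((1.0, 1.0), "cat"), ((1.2, 0.9), "cat"),
         ((5.0, 5.0), "dog"), ((4.8, 5.2), "dog")]

print(predict(train, (1.1, 1.0)))  # -> cat
print(predict(train, (5.1, 4.9)))  # -> dog
```

The example also makes the role of labeled data visible: without the labeled training pairs, there is nothing for the predictor to generalize from.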
00:03:04 Machine Learning: Data Scarcity, Ownership, and Competitive Advantage
Resources for Machine Learning: Machine learning requires several resources; cloud computing services from Amazon, Google, and Microsoft provide affordable data infrastructure. However, integrating with cloud service providers remains challenging and often requires the expertise of system integrators.
Scarcity and Growth of Expertise: Expertise in machine learning is currently scarce but rapidly increasing with the growth of computer science enrollments and specialization in machine learning.
Obtaining Labeled Data: Labeled data, specific to a problem or firm, is crucial for machine learning. It can be acquired through several methods:
Digital Exhaust: Offering services like Google 411 can generate labeled data through user interactions, such as voice recognition systems learning from user input.
Flickr Example: Services like Flickr generate labeled data through user activity: by tagging or selecting relevant images, such as pictures of horses, users contribute to the pool of labeled data.
Data Acquisition Methods: Hiring humans to label data, purchasing data from providers, data sharing through organizations or government mandates, and accessing data from consortiums are other options for obtaining labeled data.
Data Control, not Ownership: Varian emphasizes the significance of data control rather than ownership, as data is a non-rival good accessible to multiple parties simultaneously. Control is achieved through contractual agreements that enforce data usage and access.
Competitive Advantage through Data: Opinions vary on whether data access grants incumbents a significant competitive advantage. While incumbents may have an advantage, Varian stresses the importance of expertise, knowledge, information, and data acquisition methods to compete effectively.
Data Sharing Regulations: Carl Shapiro notes the limited application of antitrust laws, especially in the United States, for mandating data access, unlike in Europe. Specific regulations or legislation may be needed to address data sharing for public policy reasons or in areas like health and safety.
Future of Driverless Cars: Varian suggests that driverless cars are likely to be regulated, leading to mandated data interchange protocols for public policy reasons. He mentions the potential for driverless cars to coordinate data exchange, improving efficiency and safety on the roads.
Decreasing Returns to Data: Varian challenges the notion of increasing returns to data, stating that empirical evidence suggests strong decreasing returns in various applications. He presents charts demonstrating this phenomenon in the context of dog breed recognition and boson detection.
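Varian's decreasing-returns point can be illustrated with a toy learning curve. Assuming, purely for illustration, that accuracy follows a saturating power law in the number of labeled examples, each doubling of the data buys a smaller accuracy gain (the parameters below are invented, not taken from his charts):

```python
# Toy saturating learning curve: accuracy(n) = 1 - a * n**(-b).
# Parameters a and b are invented for illustration only.

def accuracy(n, a=0.5, b=0.3):
    return 1.0 - a * n ** (-b)

prev = accuracy(1_000)
for n in (2_000, 4_000, 8_000, 16_000):
    cur = accuracy(n)
    print(f"n={n:>6}: accuracy={cur:.4f}, gain from doubling={cur - prev:.4f}")
    prev = cur
```

Under any curve of this shape the marginal value of additional data falls steadily, which is the empirical pattern Varian describes for applications like dog breed recognition.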
00:10:17 Economics of Artificial Intelligence Adoption and Integration
Distribution of AI Benefits and Concerns: AI’s effects on income distribution and labor markets may be a larger worry than monopoly power over AI itself. Bottlenecks and monopoly control could slow AI adoption, which should be encouraged because faster adoption raises overall welfare.
Heterogeneity in AI Adoption: Adoption will vary across firms and industries due to differences in efficiency, productivity, and absorptive ability. Early adopters may gain a competitive advantage, but imitation and barriers to entry will influence their long-term dominance. Geographic patterns and spillovers may also emerge.
Examples of AI Adoption in Different Markets: General Electric: Using AI to improve gas turbine operation and maintenance in a concentrated market. Facebook: Benefiting from AI and strong network effects; copying competitors’ innovations sustains its dominance. Uber: A disruptive entrant built on machine learning and related technologies, showing that new players can still emerge. Health Insurance Markets: UnitedHealth leverages AI but faces organizational challenges and data issues that limit new entrants.
McKinsey Global Institute Survey on AI Adoption: Surveys indicate that many large companies are adopting AI, with a higher adoption rate in tech industries and lower in fragmented sectors. Organizational inertia is a significant barrier to adoption.
Vertical Integration in AI: Vertical integration will affect potential bottlenecks and market power. Data and tools may be combined within or across corporate boundaries. Make versus buy decisions will influence industry effects.
One-Stop Shopping and Data Availability: Cloud-based services offer one-stop shopping for AI capabilities. Data specific to an application is harder to purchase, potentially affecting vertical disintegration. Privacy rules may limit data sharing and buying/selling, but mandated data sharing could change this. Firms with data may develop in-house capabilities while purchasing other services from the cloud or elsewhere.
00:17:41 Machine Learning Services: Pricing, Data, and Competition
Bertrand Competition: Because machine learning services have very low marginal costs, providers face Bertrand-style competition, with prices driven down toward marginal cost.
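The Bertrand logic can be made concrete with a stylized simulation: two sellers with identical marginal cost alternately undercut each other until price reaches marginal cost. Prices are in integer cents and all numbers are invented:

```python
# Stylized Bertrand undercutting: the higher-priced seller repeatedly
# undercuts the rival by one price step until price hits marginal cost.

def bertrand(p_start=100, mc=5, step=1):  # prices and costs in cents
    prices = [p_start, p_start]
    while min(prices) - step >= mc:
        loser = prices.index(max(prices))   # higher-priced seller responds
        prices[loser] = min(prices) - step  # ...by undercutting the rival
    return min(prices)

print(bertrand())  # -> 5: price is driven down to marginal cost
```

The equilibrium price equals marginal cost regardless of the starting price, which is why near-zero marginal costs make pricing so aggressive in these markets.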
Data’s Role in Image Recognition: In the ImageNet competition, improvements in algorithms, expertise, and hardware were key factors, while data size remained constant.
Pricing of Services: The fixed costs for creating machine learning environments are high, while marginal costs are very low. Providers compete intensely due to low marginal costs, leading to competitive pricing structures.
Learning by Doing: Image recognition costs are very low, around a tenth of a cent per image, across multiple providers, leading to intense competition.
Effects on Minimum Efficient Scale: Machine learning may increase minimum efficient scale by turning variable costs like labeling into fixed costs. However, if machine learning is inexpensive, it could lower the barrier to entry for startups.
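The point about converting variable labeling costs into fixed costs can be illustrated with a back-of-the-envelope average-cost comparison (all numbers below are hypothetical):

```python
# Toy average-cost comparison: paying to label every unit (variable cost)
# versus labeling a training set once (fixed cost) with near-zero marginal
# cost thereafter. Hypothetical numbers, in dollars per unit of output.

def avg_cost_variable(q, c_var=0.10):
    # human labeling priced per unit: average cost is flat in quantity
    return c_var

def avg_cost_fixed(q, fixed=50_000.0, c_mc=0.001):
    # one-time labeling/training cost spread over output, plus tiny mc
    return fixed / q + c_mc

for q in (100_000, 500_000, 1_000_000):
    print(f"q={q}: variable={avg_cost_variable(q):.3f}, "
          f"fixed={avg_cost_fixed(q):.3f}")
```

At small scale the fixed-cost technology is more expensive per unit; at large scale it is far cheaper, which is exactly how minimum efficient scale rises.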
Pricing and Price Discrimination: Machine learning may enable more sophisticated price discrimination, extending existing trends in information technology.
Algorithmic Collusion: Pricing algorithms competing against one another could learn to collude illegally, raising open questions about legal liability and enforcement.
00:23:02 Antitrust Issues and Policies for Machine Learning and Artificial Intelligence
Major Themes: Owners of scarce skills and assets in AI technology may have the potential to dominate markets and raise antitrust concerns. Machine learning may affect vertical market structures, potentially increasing or decreasing returns to scale. Machine learning can be used for price discrimination or algorithmic collusion.
Static vs. Dynamic Efficiency: The focus should be on antitrust issues that may slow the adoption of AI technologies and prevent consumers from reaping the benefits.
Data as an Essential Facility: Essential facilities doctrine may not be the most appropriate framework for addressing data sharing issues. Government funding could play a role in creating publicly available data sets with public good characteristics.
Consumer Information and Price Discrimination: Advances in image recognition and search technology may improve consumers’ ability to compare products, mitigating some price discrimination concerns.
Algorithmic Collusion: Algorithmic collusion has existed before, but now machines themselves can learn to collude. It is unclear whether computers would collude more or less than humans due to factors such as backward induction and the credibility of punishment regimes.
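The role of punishment credibility can be made precise with the textbook grim-trigger condition for a repeated Bertrand duopoly: collusion at shared monopoly profit is sustainable only if firms are patient enough. A minimal sketch, using standard textbook assumptions rather than anything from the discussion:

```python
# Grim-trigger collusion check in a repeated Bertrand duopoly.
# While colluding, each firm earns half the monopoly profit every period.
# A deviator grabs the whole monopoly profit once, after which punishment
# (pricing at marginal cost forever) drives profits to zero.

def collusion_sustainable(delta, pi_monopoly=1.0):
    collude_value = (pi_monopoly / 2) / (1 - delta)  # share forever
    deviate_value = pi_monopoly                      # grab once, then zero
    return collude_value >= deviate_value

print(collusion_sustainable(0.6))  # True: patient firms sustain collusion
print(collusion_sustainable(0.4))  # False: impatient firms defect
```

Whether algorithms end up more or less collusive than humans turns on exactly these ingredients: how they discount the future and how credibly they can punish.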
Data Ownership and Scarcity: Comparison to the mobile phone number portability debate. Potential roles for public policy to ensure data sharing.
Chad’s Perspective: IO studies of AI should involve more than antitrust. Efficiency isn’t the sole consideration; distributional effects and how things are made also matter. One-stop shopping may impact entry costs, but the effects may not yet be visible. Data shows entry rates have been falling across sectors for 30 years, for reasons that remain unexplained.
Ben’s Take: Extending Chad’s point to consider political economy implications. Inequality and market power dynamics, such as Amazon’s advantages in retail. Data on entry rates and dynamics across sectors should be further investigated.
00:36:47 Data Sharing: Balancing Benefits and Risks in the Digital Economy
Impact of Data Sharing: Data sharing raises concerns about potential public good features and whether government subsidies are necessary. There is an argument that data sharing may already be excessive due to negative consequences like identity theft, which are not easily incorporated into data prices.
Autonomy and the Gig Economy: The inability to be one’s own boss may contribute to the anger observed among lower-skilled workers. The loss of autonomy and the desire for more hands-on work may have implications for general social well-being.
Ownership and Control of Data: Ownership and control of data are distinct from patents, as data is generally not patentable. Proprietary information and the secrecy surrounding it are key considerations.
Algorithmic Collusion: Algorithmic collusion is a topic of interest for economists, but its impact may be exaggerated. The sophistication of players is less of an issue compared to the noise in the information channel. Determining whether tacit collusion is occurring can be challenging.
Responses to Points Raised: Carl Shapiro agrees with Chad that adoption and heterogeneity are crucial factors. He also acknowledges that declining business formation may be due to factors beyond market power. Hal Varian shares an anecdote about the number portability debate, highlighting the complex dynamics involved.
Abstract
Unveiling the Complex World of AI: From Data’s Role to Market Dynamics
In the rapidly evolving landscape of artificial intelligence (AI) and machine learning, several key themes emerge as pivotal in shaping the future of these technologies. Central to this development is the role of data, specifically labeled data, in training sophisticated machine learning models. Advances in hardware, such as GPUs and TPUs, have significantly enhanced performance, while platforms like TensorFlow and competitions hosted by Kaggle are democratizing AI development. This article delves into the multifaceted aspects of AI, including data infrastructure, expertise, competitive dynamics, adoption patterns across industries, and the implications for antitrust and market structure.
Data’s Pivotal Role in Machine Learning
Machine learning’s ability to predict and learn from raw data hinges on the availability of labeled data. This necessity is highlighted in various applications, from home price forecasting to personalized medicine. The importance of data infrastructure, provided by cloud computing giants like Amazon, Google, and Microsoft, cannot be overstated, offering cost-effective solutions for data storage and manipulation.
Machine Learning and Its Industrial Applications:
Machine learning aims to predict outcomes based on input features, transforming raw data into useful information. Traditional methods relied on numerical features and rules defined by programmers, but modern deep learning approaches analyze raw data directly for feature extraction. Labeled data, such as images with predefined categories, is essential for training machine learning models. Hardware advancements like GPUs and TPUs, along with software platforms like TensorFlow, have accelerated machine learning development and real-world applications.
The Growing Expertise and Competitive Landscape
The rapid growth in machine learning expertise, fueled by increasing enrollment in specialized computer science programs, is transforming the competitive landscape. While data ownership remains a secondary concern, data control emerges as a crucial factor, with contracts regulating access and usage. Interestingly, despite the competitive advantage held by incumbents due to data access, new entrants can still penetrate the market, thanks to low barriers to entry and various data acquisition methods.
Data in Machine Learning: Resource, Accessibility, and Control:
Cloud computing services offer affordable data infrastructure, enabling machine learning, but challenges in integrating with these services persist. Expertise in machine learning is growing rapidly, addressing the scarcity of skilled professionals. Labeled data, specific to a problem or firm, is crucial for machine learning. Various methods exist to acquire labeled data, including digital exhaust, user-generated content, hiring humans, data sharing, and consortiums. Data control, rather than ownership, is essential as data is a non-rival good. Contractual agreements enforce data usage and access. Incumbents with data access may have an advantage, but expertise, knowledge, and data acquisition methods are also critical for competition. Data sharing regulations vary across jurisdictions, with some advocating for mandated access for public policy reasons or in specific sectors like health and safety.
AI Adoption and Industry Dynamics
The adoption of AI technology varies significantly across firms and industries, influenced by factors like efficiency, productivity, and absorptive capacity. Early adopters like General Electric, Facebook, Uber, and UnitedHealth are reshaping their respective domains, though challenges like data issues and organizational hurdles persist. Vertical integration decisions, especially in the cloud computing sphere, are shaping market power dynamics, with public policy considerations around privacy and data sharing becoming increasingly relevant.
I.O. Perspectives on AI Adoption, Competition, and Policy:
AI’s impact on income distribution and labor markets may be more significant than concerns about monopoly power. Bottlenecks and monopoly control could slow down AI adoption, which is overall beneficial for welfare. AI adoption varies across firms and industries due to differences in efficiency, productivity, and absorptive ability. Early adopters may gain an advantage, but imitation and barriers to entry influence long-term dominance. Geographic patterns and spillovers may also emerge. Vertical integration affects market power and data availability. Cloud-based services offer one-stop shopping for AI capabilities, but specific application data may be harder to purchase. Privacy rules and mandated data sharing could influence data sharing and purchasing/selling patterns.
Antitrust Considerations and Economic Implications
The intersection of AI with antitrust raises complex questions about market structure and competition. The role of machine learning in price discrimination and the potential for algorithmic collusion are areas of concern, prompting discussions on the balance between consumer welfare and market competition. The need to distinguish between static and dynamic efficiency is emphasized, with a focus on promoting technologies that benefit consumers by reducing production costs.
Broader Societal Impacts and Policy Challenges
AI’s implications extend beyond market dynamics, touching upon broader societal issues like inequality and well-being. The tension between data ownership and control, the role of antitrust in income distribution, and the challenges of personalized pricing and algorithmic collusion are key topics. Additionally, the debate on whether data sharing is excessive underscores the complex interplay between technological advancement and societal impact.
Conclusion
The adoption of AI and machine learning technologies is a multifaceted phenomenon, influenced by a myriad of factors ranging from data infrastructure to public policy. Understanding these elements is crucial for comprehending AI’s impact on competition, innovation, and economic welfare. As the landscape continues to evolve, staying abreast of these developments becomes increasingly important for businesses, policymakers, and society at large.