Andy Bechtolsheim (Arista Co-Founder) – Open Storage Townhall (Jul 2013)


Chapters

00:00:17 Open Storage Revolution: Transforming the Landscape of Storage
00:02:18 Journey of Open Storage: From ZFS to Data Integrity
00:06:50 OpenSolaris: A Thriving Community Driving Storage Innovation
00:10:33 Advanced Snapshot and Replication Technologies in OpenSolaris
00:14:12 Technological Advancements Transforming Storage Solutions
00:21:30 Open Storage Community and Technology Overview
00:32:12 Open Storage Platform: Scalability, Flexibility, and Community
00:42:52 Storage Solutions for High-Growth Markets
00:45:10 Open Storage and Its Applications
00:47:55 Open Storage Platform: A New Era in Storage Technology

Abstract

Revolutionizing Data Storage: The Emergence and Impact of Open Storage and ZFS

This article explores the transformative journey of data storage, highlighting the challenges of proprietary systems and the revolutionary impact of open storage solutions, with a particular focus on the development and capabilities of the ZFS file system. By exploring the origins, technological advancements, community involvement, and future prospects of open storage, the article provides a comprehensive understanding of how these innovations are reshaping the landscape of data storage and management.



Sun Microsystems and the Open Storage Revolution

John Fowler, Executive Vice President at Sun Microsystems, expresses his excitement about discussing open storage, a major revolution in computing over the past 20 years. Fowler recalls the transition in the server marketplace from proprietary technology to open source technologies, leading to improved price performance and a thriving development community. He highlights the efforts of Sun and a large community in bringing the same revolution to storage, aiming to transform storage price performance, cost basis, and community involvement. Fowler emphasizes the challenges faced by individuals and organizations, including performance issues and the proprietary nature of software interactions. He explains Sun’s long-term efforts in building a technology foundation and programs that integrate developers and communities to change the storage landscape. Fowler welcomes Jeff Bonwick, a Sun Fellow and creator of ZFS, and Matthew Beyer, the leader of the open storage community group.

Problematic Legacy Systems

Historically, data storage has been hampered by proprietary systems that stifle innovation and impose high costs. These systems, often limited in their community support, restricted advancements in storage technology, making scalability and performance improvements a challenging endeavor.

Open Storage: A Revolutionary Solution

The open storage revolution, spearheaded by Sun, sought to dismantle these barriers. By creating an open technology base and nurturing a community around it, Sun enabled a significant leap in storage technology, making advanced features accessible to a broader audience. This movement has its roots in the success of open-source software in the server domain, demonstrating the potential of community-driven development.

ZFS and Data Integrity

Data integrity is crucial in storage systems, but people often overlook its importance until it’s too late. Disk drives have a published error rate that works out to roughly one bad sector for every 10 terabytes read, which is significant at the rates at which data is processed today. Modern disk drives have complex firmware to handle data integrity, but firmware bugs can still cause errors. Traditional storage systems also rely on proprietary firmware for data integrity, and that firmware is not transparent to users.
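The "10 terabytes" figure can be checked with simple arithmetic. The sketch below assumes a data-sheet rate of one unrecoverable read error per 10^14 bits, a value commonly quoted for disk drives but not stated in the talk, and a hypothetical aggregate read rate; both numbers are illustrative.

```python
# Assumption: a data-sheet rate of 1 unrecoverable read error per 1e14 bits.
# The talk only says "one bad sector every 10 terabytes".
BITS_PER_ERROR = 1e14


def terabytes_per_error(bits_per_error: float = BITS_PER_ERROR) -> float:
    """Average data read, in decimal terabytes, per expected error."""
    bytes_per_error = bits_per_error / 8
    return bytes_per_error / 1e12


def hours_to_one_error(throughput_mb_per_s: float,
                       bits_per_error: float = BITS_PER_ERROR) -> float:
    """Hours of continuous reading before one error is expected."""
    bytes_per_error = bits_per_error / 8
    seconds = bytes_per_error / (throughput_mb_per_s * 1e6)
    return seconds / 3600


print(terabytes_per_error())       # terabytes read per expected error
print(hours_to_one_error(1000.0))  # hours at a hypothetical 1 GB/s read rate
```

Under these assumptions the error interval is on the order of 10 terabytes, and a system that streams data at a high aggregate rate reaches it within hours rather than years.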

ZFS employs end-to-end data integrity, incorporating technology previously available only in proprietary firmware. Because ZFS is open source, users can inspect and verify the data integrity mechanisms rather than take them on trust. ZFS has been adopted by various companies, has gained significant popularity in the open-source community, and is a core component of the OpenStorage software stack in OpenSolaris. It lets users achieve data integrity and advanced storage capabilities without expensive proprietary hardware, and its openness encourages community involvement, innovation, and cost-effective solutions.
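One way to picture end-to-end integrity is a store that keeps each block's checksum with the pointer to the block rather than next to the data, so that a block that is silently corrupted on disk cannot vouch for itself. The following is a toy model only: the class and function names are hypothetical, SHA-256 is chosen for convenience, and nothing here reflects the actual on-disk format.

```python
import hashlib


class ChecksumError(Exception):
    """Raised when no copy of a block matches its expected checksum."""


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class MirroredStore:
    """Toy two-way mirror; each 'disk' is a dict of address -> bytes."""

    def __init__(self):
        self.disks = [{}, {}]
        self.next_addr = 0

    def write(self, data: bytes):
        addr = self.next_addr
        self.next_addr += 1
        for disk in self.disks:
            disk[addr] = data
        # The checksum travels with the pointer, not with the data block.
        return (addr, checksum(data))

    def read(self, pointer):
        addr, expected = pointer
        for disk in self.disks:
            data = disk[addr]
            if checksum(data) == expected:
                # Repair any copy that differs from the verified data.
                for other in self.disks:
                    if other[addr] != data:
                        other[addr] = data
                return data
        raise ChecksumError(f"all copies of block {addr} are corrupt")


store = MirroredStore()
ptr = store.write(b"payroll records")
store.disks[0][ptr[0]] = b"payroll recorts"  # simulate silent corruption
print(store.read(ptr))                       # the verified copy is returned
```

In this sketch the corrupted copy is detected on read, the good copy from the other side of the mirror is returned, and the bad copy is overwritten with verified data.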

Genesis of ZFS

ZFS’s inception was born out of frustration with the complexities inherent in traditional storage systems, particularly when managing disk upgrades. Jeff Bonwick and Tim Marsland, the architects behind ZFS, were inspired by the simplicity of memory management, leading to the concept of virtual storage. This approach sought to emulate the success of virtual memory systems in simplifying and optimizing storage management.

ZFS: An Active Community, Breakthrough Economics, and Data Management

ZFS is the most active community within OpenSolaris. It has over 1,300 subscribers and generates significant mail traffic. Developers continuously innovate, attracting users to the site for code and solutions. Nexenta, a partner in OpenSolaris, built their own NAS software appliance using ZFS, offering robustness and cost-effectiveness.

ZFS removes the need for expensive proprietary storage by providing data integrity and data management in software. It radically improves the economics: traditional high-end arrays cost around $10 to $20 per gigabyte, while ZFS-based storage offers similar or better performance at less than $1 per gigabyte. Organizations like Neotactics have used ZFS for over a year and appreciate its enterprise-level functionality at affordable costs.
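To make the per-gigabyte figures concrete, the short calculation below applies them to a hypothetical 100 TB deployment. The capacity is an assumption chosen for illustration; only the per-gigabyte prices come from the talk.

```python
def cost_usd(capacity_gb: float, usd_per_gb: float) -> float:
    """Total cost of a deployment at a flat price per gigabyte."""
    return capacity_gb * usd_per_gb


CAPACITY_GB = 100_000  # hypothetical 100 TB deployment (decimal units)

open_storage = cost_usd(CAPACITY_GB, 1.0)  # upper bound quoted for ZFS
array_low = cost_usd(CAPACITY_GB, 10.0)    # low end quoted for arrays
array_high = cost_usd(CAPACITY_GB, 20.0)   # high end quoted for arrays

print(f"open storage: at most ${open_storage:,.0f}")
print(f"high-end array: ${array_low:,.0f} to ${array_high:,.0f}")
print(f"ratio: {array_low / open_storage:.0f}x to "
      f"{array_high / open_storage:.0f}x")
```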

Data Integrity: A Core Principle of ZFS

At the heart of ZFS lies a commitment to end-to-end data integrity. This is crucial in an era of increasing error rates in high-capacity drives. While traditional systems rely on complex and potentially bug-prone firmware for error correction, ZFS democratizes data integrity technology, previously exclusive to proprietary systems, by incorporating it into an open-source framework.

ZFS in the Open Source Arena

ZFS’s availability as an open-source solution has led to its widespread adoption across various communities. This adoption is not limited to enthusiast circles but extends to companies recognizing the system’s ability to deliver enterprise-grade features on cost-effective hardware. The open-source nature of ZFS spurs ongoing innovation and collaboration, contributing to its continuous enhancement and success.

Snapshot and Rollback: Enhancing Data Management

A notable feature of ZFS is its support for unlimited snapshots. This functionality enables a range of applications, from template cloning for zone deployments to robust data backup and recovery strategies.
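The reason snapshots can be cheap and plentiful in a copy-on-write design can be sketched in a few lines. This is a toy model, not ZFS code: the class and method names are hypothetical, and the toy copies the whole state on every write where a real copy-on-write tree would copy only the changed path.

```python
class ToyFilesystem:
    """Copy-on-write toy: the live state is replaced, never modified."""

    def __init__(self):
        self._live = {}        # name -> contents; treated as immutable
        self._snapshots = {}   # snapshot name -> reference to an old state

    def write(self, name, contents):
        # Toy copy of the whole state; old states remain untouched.
        new_state = dict(self._live)
        new_state[name] = contents
        self._live = new_state

    def read(self, name):
        return self._live[name]

    def snapshot(self, snap_name):
        # Constant time: only a reference to the current state is saved.
        self._snapshots[snap_name] = self._live

    def rollback(self, snap_name):
        self._live = self._snapshots[snap_name]

    def clone(self, snap_name):
        # A clone starts from the snapshot and diverges on its first write.
        other = ToyFilesystem()
        other._live = self._snapshots[snap_name]
        return other


fs = ToyFilesystem()
fs.write("app.conf", "v1")
fs.snapshot("template")
fs.write("app.conf", "v2")

zone = fs.clone("template")   # e.g. a new zone built from the template
print(zone.read("app.conf"))  # contents as of the snapshot
fs.rollback("template")
print(fs.read("app.conf"))    # live file system restored to the snapshot
```

Because old states are never modified in place, taking a snapshot costs one saved reference, and any number of clones can share the same starting state.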

Geographic Replication and CIFS Integration

ZFS stands out for its ability to provide live remote replication, facilitating data synchronization across continents. Additionally, OpenSolaris’s integrated CIFS support, including native handling of Windows security IDs, enhances ZFS’s utility in diverse environments.

Comprehensive Solutions and Future Technologies

Sun’s storage platform, enriched by ZFS, offers a comprehensive suite of solutions, ranging from file systems and Windows integration to security measures and various replication strategies. The platform’s versatility is evident in its support for multiple storage exports, including file systems, block devices, and iSCSI targets.

As the industry evolves, technologies like flash memory and solid-state disks are gaining prominence, offering exceptional performance compared to traditional drives. ZFS uses flash as a secondary cache, enhancing data throughput and reliability.

Challenges and Opportunities in Global Adoption

While the open storage concept garners interest worldwide, particularly in rapidly growing economies, challenges persist. Concerns about stability and readiness for large-scale deployments, along with the need for cost-effective scalability, shape the landscape of open storage adoption. Sun’s strategy to cater to these markets involves not only technological innovation but also community engagement and education.

The Future of Open Storage

The trajectory of open storage, exemplified by the success of ZFS, indicates a promising future. The blend of cost efficiency, scalability, and technological sophistication positions open storage as a viable alternative to traditional proprietary systems, especially in the burgeoning fields of internet infrastructure and high-performance computing. As the community around open storage continues to grow and innovate, the potential for its application in diverse environments, from streaming media to large-scale database management, becomes increasingly evident. Open storage, with its roots in community collaboration and technological ingenuity, stands as a cornerstone of modern data management, poised to shape the future of how we store, access, and leverage our ever-expanding digital universe.



Supplemental Updates

OpenSolaris’s Advanced Storage Features

_ZFS Boot Support and Package Management_

OpenSolaris now includes ZFS boot support, allowing users to boot directly from a ZFS file system. The newly developed package system takes advantage of ZFS’s snapshotting capabilities, creating a snapshot before installing a package, enabling easy rollback in case of issues.

_Unlimited Snapshots and Fast Replication_

ZFS allows for an unlimited number of snapshots, unlike traditional systems with fixed limits or intervals. Snapshots can be created in constant time, enabling various use cases like creating template snapshots for easy cloning of multiple zones. Geographic replication is supported, allowing users to replicate data across continents efficiently.
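Snapshots also make replication efficient, because only the difference between two snapshots has to cross the wire. The sketch below illustrates that idea at file granularity with plain dictionaries; the function names are hypothetical, and the real mechanism operates on blocks rather than on whole files.

```python
def incremental_stream(old_snap: dict, new_snap: dict) -> dict:
    """Describe what changed between two snapshots of one file system."""
    changed = {name: data for name, data in new_snap.items()
               if name not in old_snap or old_snap[name] != data}
    deleted = [name for name in old_snap if name not in new_snap]
    return {"changed": changed, "deleted": deleted}


def apply_stream(replica: dict, stream: dict) -> dict:
    """Bring a replica holding the old snapshot up to the new one."""
    result = dict(replica)
    result.update(stream["changed"])
    for name in stream["deleted"]:
        result.pop(name, None)
    return result


monday = {"a.txt": "1", "b.txt": "2", "c.txt": "3"}
tuesday = {"a.txt": "1", "b.txt": "20", "d.txt": "4"}

stream = incremental_stream(monday, tuesday)  # only the delta is sent
remote = apply_stream(monday, stream)         # remote site already had monday
print(stream)
print(remote == tuesday)
```

The unchanged file never appears in the stream, which is what keeps replication over long-distance links practical.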

_Native CIFS Support_

OpenSolaris offers improved CIFS support, translating incoming CIFS requests directly without relying on POSIX translations. It incorporates Windows security IDs, providing a more robust security model and better naming scheme for Windows integration.

OpenStorage: Unlocking Scalability and Performance with Commodity Hardware

_Overview_

OpenStorage, built on OpenSolaris, revolutionizes the storage industry by delivering a complete software platform that integrates with Windows, supports protocols like NDMP, and offers advanced features like geographic replication, snapshots, thin provisioning, and block storage. This platform enables businesses to unlock scalability and performance at cost-effective prices.

_Hardware Advancements Driving Cost-Effective Performance_

1. _Moore’s Law and Open Source Software Stack:_ CPU performance doubles roughly every 18 months, leading to improved software performance. Solaris ZFS leverages multi-core architecture for optimal performance.

2. _Serial Attached SCSI (SAS):_ SAS changes the economics of disk attachment. It is faster and more cost-effective than Fibre Channel attachment, and it enables the attachment of thousands of disks directly to powerful controllers.

_OpenSolaris Enables Scalability and Flexibility_

OpenStorage built on OpenSolaris runs on Intel and AMD architectures. Customers can build and use any server at any scale, eliminating proprietary hardware costs. It addresses the needs of large-scale customers, like government agencies, who require petabyte-scale storage solutions.

_Flash Memory and Solid-State Disks_

1. _Performance Advantages:_ SSD devices offer outstanding performance, delivering 10,000 IOPS per device. They outperform conventional disk drives in terms of IOPS per dollar.

2. _Cost Considerations:_ SSDs are more expensive on a dollar-per-gigabyte basis compared to conventional disks. Flash technology is expected to become more cost-effective over time.

3. _Integration with ZFS:_ ZFS understands the concept of a memory cache and can extend it to flash memory. Flash can be managed as a cache, accelerating transaction-type workloads. ZFS automatically handles the combination of flash and conventional hard disks for optimal performance.

OpenStorage: A Community-Driven Approach to File System Innovation

_ZFS and Flash Memory for High-Speed Data Access_

ZFS uses flash memory in two ways. It provides low-latency storage for synchronous write requests, and it acts as a buffer for file system data evicted from the main memory cache, which effectively expands the working set and reduces latency for commonly accessed data.
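The second-level cache idea can be sketched as two tiers in front of a slow backing store, where blocks pushed out of RAM land on flash instead of being dropped. This toy uses plain least-recently-used eviction as a stand-in, which is simpler than the adaptive policy a real cache would use, and all names and sizes are hypothetical.

```python
from collections import OrderedDict


class TwoTierCache:
    """RAM tier backed by a flash tier backed by a slow 'disk' dict."""

    def __init__(self, ram_slots, flash_slots, backing):
        self.ram = OrderedDict()
        self.flash = OrderedDict()
        self.ram_slots = ram_slots
        self.flash_slots = flash_slots
        self.backing = backing  # dict standing in for the disk pool
        self.hits = {"ram": 0, "flash": 0, "disk": 0}

    def _insert_ram(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        if len(self.ram) > self.ram_slots:
            old_key, old_value = self.ram.popitem(last=False)
            # Evicted from RAM: keep it on flash rather than dropping it.
            self.flash[old_key] = old_value
            self.flash.move_to_end(old_key)
            if len(self.flash) > self.flash_slots:
                self.flash.popitem(last=False)

    def read(self, key):
        if key in self.ram:
            self.hits["ram"] += 1
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.flash:
            self.hits["flash"] += 1
            value = self.flash.pop(key)  # promote back into RAM
        else:
            self.hits["disk"] += 1
            value = self.backing[key]
        self._insert_ram(key, value)
        return value


cache = TwoTierCache(ram_slots=2, flash_slots=2,
                     backing={"a": 1, "b": 2, "c": 3})
for key in ["a", "b", "c", "a"]:
    cache.read(key)
print(cache.hits)
```

In this run the second read of "a" is served from the flash tier instead of going back to disk, which is the sense in which flash expands the working set.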

_ZFS Community Growth and Enthusiasm_

The ZFS community encompasses a diverse group of individuals, organizations, and community leaders who actively contribute to the project. The community has grown significantly, with a notable increase in global participation and expertise.

_Community Contributions and Knowledge Sharing_

The ZFS community is characterized by a collaborative spirit. Sun engineers actively assist community members in contributing to the project, fostering knowledge sharing and innovation beyond Sun’s own resources.

_Professional Services and Training for ZFS Implementation_

Sun offers professional services and training to help customers implement ZFS. This support enables organizations to leverage ZFS effectively, providing guidance and resources for successful deployment.

_Types of Flash Memory and Their Usage in ZFS_

ZFS employs single-level cell (SLC) flash memory for enterprise storage due to its high write endurance, enabling hundreds of thousands of write cycles per block. SLC flash offers superior durability compared to multi-level cell (MLC) flash, which is commonly found in consumer devices and has limited write cycles.

_Transactional Object Store: The Core Abstraction of ZFS_

The Data Management Unit (DMU) serves as the core abstraction in ZFS, providing a transactional object store. This allows for the creation of atomic transactions, ensuring that multiple operations on a set of objects are executed either completely or not at all.
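The all-or-nothing behavior described here can be illustrated with a small in-memory object store. This is a sketch of the concept only: the class names are hypothetical and the interface is not the DMU's.

```python
class ToyObjectStore:
    """In-memory object store whose updates go through transactions."""

    def __init__(self):
        self._objects = {}

    def get(self, key):
        return self._objects.get(key)

    def transaction(self):
        return _Transaction(self)


class _Transaction:
    def __init__(self, store):
        self._store = store
        self._pending = {}

    def put(self, key, value):
        # Changes are staged and stay invisible until commit.
        self._pending[key] = value

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: publish every staged change in a single step.
            new_state = dict(self._store._objects)
            new_state.update(self._pending)
            self._store._objects = new_state
        # On error the staged changes are simply dropped.
        return False


store = ToyObjectStore()
with store.transaction() as tx:
    tx.put("account:alice", 50)
    tx.put("account:bob", 150)

try:
    with store.transaction() as tx:
        tx.put("account:alice", 0)
        raise RuntimeError("simulated failure mid-transaction")
except RuntimeError:
    pass

print(store.get("account:alice"), store.get("account:bob"))
```

The failed transaction leaves no trace: neither of its updates is visible, while both updates from the successful transaction are.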

_Leveraging DMU for File Systems and Volumes_

The DMU provides a flexible foundation for building file systems and volumes. It enables the implementation of a POSIX file system on top of the DMU, demonstrating its versatility in supporting different storage needs.

_DMU as a Potential Plug-in for MySQL_

The DMU’s transactional capabilities make it a potential storage-engine plug-in for MySQL, providing the storage layer beneath the database.

_Lustre Integration with ZFS_

ZFS and Lustre, a high-performance computing file system, share a common architectural goal of providing a transactional object store. Lustre is transitioning from ext3 to ZFS, recognizing the benefits of ZFS’s transactional nature for high-performance data storage.

_InfiniBand for Lustre and Open Storage_

InfiniBand is a preferred choice for high-performance computing environments, including Lustre. Its credit-based flow control and RDMA support enable efficient data movement with minimal packet drops.
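Credit-based flow control avoids drops by letting the sender transmit only when the receiver has advertised free buffer space. The following toy model shows the mechanism in isolation; the class names and buffer sizes are hypothetical, and it does not model any actual InfiniBand wire format.

```python
from collections import deque


class Receiver:
    def __init__(self, buffer_slots):
        self.buffer = deque()
        self.buffer_slots = buffer_slots

    def initial_credits(self):
        return self.buffer_slots

    def accept(self, packet):
        # A sender that respects its credits can never overflow the buffer.
        assert len(self.buffer) < self.buffer_slots, "buffer overflow"
        self.buffer.append(packet)

    def drain(self, count):
        """Consume up to `count` packets; return the credits freed."""
        freed = 0
        while self.buffer and freed < count:
            self.buffer.popleft()
            freed += 1
        return freed


class Sender:
    def __init__(self, receiver):
        self.receiver = receiver
        self.credits = receiver.initial_credits()
        self.queue = deque()
        self.sent = 0

    def submit(self, packet):
        self.queue.append(packet)

    def pump(self):
        """Send queued packets while credits remain; the rest wait."""
        while self.queue and self.credits > 0:
            self.receiver.accept(self.queue.popleft())
            self.credits -= 1
            self.sent += 1

    def grant(self, credits):
        self.credits += credits


rx = Receiver(buffer_slots=2)
tx = Sender(rx)
for i in range(5):
    tx.submit(f"packet-{i}")

tx.pump()                 # sends until the credits run out
tx.grant(rx.drain(1))     # receiver frees one slot and returns one credit
tx.pump()                 # exactly one more packet goes out
print(tx.sent, len(tx.queue))
```

Packets that cannot be sent wait in the sender's queue instead of being dropped at a full receiver, which is the property the notes attribute to InfiniBand.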

_InfiniBand’s Potential in Commercial Settings_

While InfiniBand has not yet gained significant traction in traditional enterprise environments, its cost-performance advantages and transparent block-level access through Fibre Channel over InfiniBand (FCoIB) suggest potential for wider adoption.


Notes by: Random Access