For designers of enterprise systems, ensuring that hardware performance keeps pace with application demands is a mind-boggling exercise. The most troubling performance challenge is storage I/O. Spinning media, while exceptional in scaling areal density, will unfortunately never keep pace with I/O requirements. The most cost-effective way to break through these storage I/O limitations is by incorporating high-performance SSDs (solid-state drives) into the systems.
While we often read in the press that SSDs will soon banish HDD (hard-disk drive) technology to the realm of tape storage, the fact is that SSD technology has only recently become ready for the enterprise. Not all SSDs are alike, and very few are appropriate for use as primary storage devices in enterprise computing systems. Using flash storage in media players is fundamentally different from deploying the technology in 24/7 mission-critical operations.
With the advent of enterprise-class solid-state devices, the potential for using SSDs in enterprise systems has become a reality, with profound implications for system performance. At the same time, leveraging the power of SSDs is difficult, and even identifying true enterprise-class SSDs is a major challenge. With these challenges in mind, we develop in this article a framework that can be used to assess SSD technology and determine its enterprise-readiness.
The very first enterprise-class SSD was introduced in 2007. One of the key architectural features of the product was that it combined the best attributes of two memory technologies: flash and DRAM. That combination, coupled with complex controller technology, results in an entirely new class of SSDs for markets where performance is the key reason customers would use SSDs instead of HDDs.
The primary applications that are now benefiting from this technology are those that are heavily dependent upon the drive I/O performance—for example, in enterprise storage and server applications where the I/O performance of the drives has a direct impact on the overall system performance and where cost is measured in performance and not just in capacity (cost/performance). In such applications, increased I/O is equal to increased revenue for the end user.
Two examples of such applications are:
Another advantage of enterprise-class SSDs is drastically lower latency. These SSDs provide access times measured in microseconds, much closer to those of DRAM than to those of HDDs. With response times of this magnitude, SSDs behave more like main memory, yet possess all the “comforts” of stable disk storage in terms of persistence, communication protocols, form factor, etc. Enterprise SSDs provide native support for long-block data transfers (e.g., 520-, 524-, and 528-byte sectors) that are obligatory for data-integrity and compatibility reasons in enterprise systems. In addition, because SSDs have no moving parts, the mechanics of a host system do not induce performance penalties as they do with HDDs.
The key to achieving the right performance, capacity, and cost profile is to use NAND flash as the media in these drives. DRAM is too costly and is not a persistent storage medium (it loses data when power is removed). Emerging storage technologies are promoted as alternatives for SSDs, but they are not available today, have no clear path to scaling in capacity, and are too far from viability to be considered for SSD design for many years and possibly decades. NAND is the ultimate media choice, but it comes with its own series of design challenges.
NAND flash uses floating-gate technology and comes in two varieties: SLC (single-level cell, meaning the technology can store a single bit per memory cell) and MLC (multilevel cell, meaning the technology can store multiple bits per cell). SLC NAND flash costs twice as much as MLC flash; however, it is much more reliable and has much better endurance (write/erase cycles). It is also much faster (read/write speed) than MLC flash.
It is extremely important to note that not all NAND sources are the same. Although the theoretical benefits of SLC NAND and the specifications of all SLC devices suggest they are superior to MLC NAND, the reality is that not all SLC is as specified. An important take-away is that one must diligently test and screen the NAND in its fully packaged form, as well as within the SSD as part of the standard manufacturing test to ensure that the devices are appropriate for enterprise duty cycles.
In an overview of the common SLC NAND flash characteristics, the technology can be characterized as follows:
These divergent characteristics create quite the conundrum in terms of media management.
With these NAND flash characteristics, the first generation of low-end SSDs has the following performance characteristics:
Purely from a performance standpoint, the most pressing issue is achieving adequate write speeds, particularly for random writes and for workloads that mix random reads and writes. The fundamentals of NAND programming introduce significant delays in the write performance of drives. At the heart of this phenomenon is the need to erase a block before writing to it. This process introduces latency and is the reason notebook SSDs are so slow in writes, particularly random writes and particularly when the host sends mixed reads and writes. Thus, the drive access patterns typical of the enterprise render these one-dimensional notebook SSDs worthless as a result of flaws in architecture and design.
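The cost of erasing before writing can be made concrete with a minimal Python sketch. The block geometry and the timings below are assumed placeholder values chosen only for illustration; they are not taken from this article or from any datasheet, and the function names are hypothetical.

```python
# Assumed, illustrative parameters (not from any datasheet).
PAGES_PER_BLOCK = 64
READ_US = 25        # assumed time to read one page, in microseconds
PROGRAM_US = 200    # assumed time to program one page
ERASE_US = 1500     # assumed time to erase one block


def naive_overwrite_cost_us(pages_changed: int) -> int:
    """Time to update pages in place inside a block that is already full.

    A programmed NAND page cannot be overwritten, so a naive controller
    reads the whole block, erases it, and programs every page back.
    The cost is the same whether one page or all pages changed.
    """
    if not 1 <= pages_changed <= PAGES_PER_BLOCK:
        raise ValueError("pages_changed must fit within one block")
    read_all = PAGES_PER_BLOCK * READ_US
    program_all = PAGES_PER_BLOCK * PROGRAM_US
    return read_all + ERASE_US + program_all


def fresh_write_cost_us(pages: int) -> int:
    """Time to program pages into a block that was erased ahead of time."""
    return pages * PROGRAM_US


print(naive_overwrite_cost_us(1))  # one small random write, done in place
print(fresh_write_cost_us(1))      # the same write into a pre-erased block
```

Under these assumed numbers a single-page update in place costs far more than the same write into a block that was erased in advance, which is why a controller that must erase in the foreground struggles with small random writes.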
To achieve these ultimate drive-level performance characteristics, an entirely different product architecture is required from that used in notebook SSDs. An enterprise-class SSD is an optimized memory system with complex, tightly integrated hardware and software. At the heart of such a product is an elaborate chipset that handles all the vital communication protocols, as well as the critical media management. NAND is both a wonderful and a fickle medium. While the mechanical characteristics of NAND-based drives are obviously improved when compared with HDDs, NAND has unique reliability challenges not common to HDDs. It is the SSD chipset and the coordinated manufacturing process that ultimately render an SSD enterprise class.
An enterprise SSD implements high levels of parallel flash access and combines an optimal mix of two memory technologies—DRAM and NAND—to achieve the performance and reliability required by enterprise applications.
DRAM is extremely fast but requires power to maintain the data. DRAM devices are also used on disk drives, generally for cache to enhance the performance, but when used in disk drives they add to the risk of losing the data if power is lost. With enterprise-class SSDs, the DRAM is used for cache, but the drive adds power backup so when power is turned off, the device has enough power left to write the data from DRAM into flash (like the hibernate feature on laptops).
Enterprise SSDs use relatively large DRAM capacities within the drive (in the realm of a gigabyte). The use of DRAM helps the system overcome the biggest shortcomings of NAND, most notably random write performance. It allows the SSD to gather random writes and to exploit the good performance characteristics of NAND (relatively good sequential writes) to write the data at very high speeds. Some may be skeptical of this approach because data gathered this way ends up physically scattered across the flash, so later reads of it become random reads; as explained earlier, however, NAND flash has very good random read characteristics.
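The idea of gathering random writes in DRAM and then writing them to NAND sequentially can be sketched as follows. This is a simplified model under stated assumptions, not the design of any actual drive: the class name `WriteBuffer`, the append-only page list standing in for NAND, and the flush-when-full policy are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


class WriteBuffer:
    """Gathers random host writes in (simulated) DRAM, then programs them
    into consecutive NAND pages in a single sequential pass."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity                  # buffered entries before a flush
        self.pending: Dict[int, bytes] = {}       # logical address -> newest data
        self.mapping: Dict[int, int] = {}         # logical address -> physical page
        self.nand: List[Tuple[int, bytes]] = []   # append-only list of written pages

    def write(self, lba: int, data: bytes) -> None:
        # Rewriting an address that is still buffered replaces the old copy,
        # so only the newest version ever reaches the NAND.
        self.pending[lba] = data
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        """Write all buffered data to the next free pages, in order."""
        for lba, data in self.pending.items():
            self.mapping[lba] = len(self.nand)
            self.nand.append((lba, data))
        self.pending.clear()

    def read(self, lba: int) -> bytes:
        if lba in self.pending:                   # newest copy may still be in DRAM
            return self.pending[lba]
        return self.nand[self.mapping[lba]][1]


buf = WriteBuffer(capacity=4)
for lba, data in [(900, b"a"), (3, b"b"), (512, b"c"), (3, b"B"), (77, b"d")]:
    buf.write(lba, data)
print(buf.mapping)  # scattered logical addresses land on consecutive pages
```

In this model the scattered logical addresses end up on consecutive physical pages, and a later read simply follows the mapping. A drive with power backup would call something like `flush()` when power is lost, so buffered data is not stranded in DRAM.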
By using DRAM in combination with power backup, the drive can effectively create a very fast nonvolatile memory device that can achieve access times and latencies that are 150 times faster than those of mechanical drives. This is one of the unique characteristics of SSDs that cannot be emulated by mechanical disk drives, no matter how many are used in parallel.
Attaining the right levels of reliability for mission-critical applications requires that the drive maintain perfect data integrity, preventing any data loss or metadata corruption that would prevent the drive from rebooting following power removal. To achieve this, an enterprise-class SSD must have full data-path protection within the drive. Although this feature is currently used in enterprise-class HDDs, it is implemented in only one SSD. Figure 2 shows the salient architectural features of an enterprise SSD, including all the fully protected internal data flows.
Beyond the data-path protection, there are three pillars of media management that the drive must implement in a coordinated fashion:
Extensive full data-path error detection/correction. An interesting phenomenon in SSDs is that the incidence of errors increases exponentially with use; this translates into increasing effort from the drive to perform correction as the drive is exercised. Ultimately, all reads from the media will need to be corrected, which requires extensive and deep ECC (error-correcting code) coverage. Performing such ECC well into the latter years of the drive’s life without impacting system-level performance is a significant design challenge and is addressed only by enterprise-class SSDs.
Wear leveling. The SSD controller logic proactively places writes in the optimal physical locations in the NAND array so that no blocks wear unevenly or excessively. This is a delicate balance: the drive needs enough active wear leveling to ensure even wear, but it must not induce so much unnecessary data movement that the NAND is significantly over-exercised through excessive rewriting of data (see the section on write amplification later in this article).
Bad-block management. The process of managing bad blocks involves actively gauging the vitality of each independent block in the entire NAND array to ensure that bad blocks are removed from rotation and replaced with good blocks so that no data goes into corrupted blocks. An enterprise-class SSD implements bad-block management algorithms with multiple screens to determine the health of a block and to optimize the usable life of a block, keeping it in rotation long enough to extract the maximum usable life without jeopardizing reliability or performance.
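Two of the pillars above, wear leveling and bad-block management, can be illustrated together in a small Python sketch. It is a deliberately simplified model and not the algorithm of any particular drive: the `BlockAllocator` class, the "fewest erases first" policy, and the single `max_erases` threshold are assumptions made for illustration, whereas a real controller would use multiple health screens as described above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Block:
    erase_count: int = 0
    is_bad: bool = False
    in_use: bool = False


class BlockAllocator:
    def __init__(self, num_blocks: int, max_erases: int) -> None:
        self.blocks: Dict[int, Block] = {i: Block() for i in range(num_blocks)}
        self.max_erases = max_erases  # assumed endurance limit per block

    def allocate(self) -> Optional[int]:
        """Wear leveling: pick the free, healthy block with the fewest erases."""
        candidates = [
            (blk.erase_count, idx)
            for idx, blk in self.blocks.items()
            if not blk.is_bad and not blk.in_use
        ]
        if not candidates:
            return None  # no healthy free block is left
        _, idx = min(candidates)
        self.blocks[idx].in_use = True
        return idx

    def release(self, idx: int, erase_ok: bool = True) -> None:
        """Erase a block and return it to the pool, or retire it.

        Bad-block management: a block is taken out of rotation if the erase
        failed or if it has reached the assumed endurance limit.
        """
        blk = self.blocks[idx]
        blk.in_use = False
        blk.erase_count += 1
        if not erase_ok or blk.erase_count >= self.max_erases:
            blk.is_bad = True


alloc = BlockAllocator(num_blocks=3, max_erases=2)
first = alloc.allocate()                # block 0: all counts equal, lowest index wins
alloc.release(first)
second = alloc.allocate()               # block 1: block 0 now has one erase
alloc.release(second, erase_ok=False)   # the erase failed, so block 1 is retired
third = alloc.allocate()                # block 2: least worn healthy block
```

The trade-off named in the text shows up even in this toy: always choosing the least-worn block evens out wear, but a policy that also relocated cold data to balance counts would add writes of its own.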
One area of concern with SSDs is the potential for drive corruption while the drive is moving data for housekeeping—the background operations required to manage the use and wear of the NAND blocks, as cited previously. The drive must constantly track and manage the physical utilization of the NAND array to ensure that the host gets maximum vitality from the drive. This process is run by both hardware and firmware, so the efficacy and reliability vary from vendor to vendor.
The two critical areas to assess are:
The final facet of this SSD housekeeping process is that most SSDs in the market that are not optimized for enterprise applications suffer from performance problems, because the background NAND management process ultimately becomes a foreground bottleneck. The notebook class of SSD needs relief (in the form of idle time) from the heavy enterprise duty cycles in order to do the housekeeping of the NAND. Enterprise applications need drives to be poised for high performance 24/7 and thus cannot allow idle time.
Another important technique implemented in enterprise SSDs is the over-provisioning of NAND capacity, which is a vital means of achieving optimal performance and reliability. In the enterprise, there is no tolerance for varying performance in the drive. A drive cannot expect to have idle time available as a convenience to perform critical tasks. Having additional NAND within the drive allows the drive to perform critical housekeeping tasks as background operations and then bring freshly erased and prepared blocks back into service. When implemented properly, this technique significantly reduces write amplification and optimizes performance.
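The quantities involved can be written down as simple arithmetic. The sketch below uses the commonly used definitions of over-provisioning (spare capacity relative to user capacity) and write amplification (bytes written to NAND per byte written by the host). The lifetime estimate assumes perfectly even wear, and all numbers in the example are hypothetical placeholders rather than figures from this article.

```python
def over_provisioning_ratio(physical_gb: float, user_gb: float) -> float:
    """Spare NAND as a fraction of the capacity exposed to the host."""
    if user_gb <= 0 or physical_gb < user_gb:
        raise ValueError("need 0 < user_gb <= physical_gb")
    return (physical_gb - user_gb) / user_gb


def write_amplification(nand_bytes_written: float, host_bytes_written: float) -> float:
    """NAND bytes actually programmed per byte the host asked to write."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


def host_writes_before_wearout_gb(physical_gb: float, pe_cycles: int, wa: float) -> float:
    """Rough upper bound on host writes, assuming perfectly even wear."""
    if wa < 1:
        raise ValueError("write amplification cannot be below 1 in this model")
    return physical_gb * pe_cycles / wa


# Hypothetical drive: 128 GB of NAND exposing 100 GB to the host.
print(over_provisioning_ratio(128, 100))
# Hypothetical workload: 300 GB programmed to NAND for 100 GB of host writes.
print(write_amplification(300, 100))
```

Because the lifetime estimate divides by write amplification, halving write amplification doubles the volume of host data the drive can absorb before wear-out under this model, which is why reducing it matters for both performance and endurance.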
While it is great to think through the sheer performance improvement of one drive technology versus another, let us now focus on the profound impact this has at the system level. Not only does SSD technology dramatically bolster system-level performance, but it also addresses one of the other most pressing issues in the data center: power reduction.
Enterprise-class SSDs present a compelling combination of performance and power savings that makes the technology a vital part of the storage technology spectrum. Figure 3 illustrates this power savings, comparing the power requirements to deliver 135,000 IOPS for an STEC enterprise SSD and a typical enterprise HDD.
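The shape of such a comparison can be sketched as a short calculation: how many drives of a given type are needed to reach a target IOPS figure, and how much power that array draws. Only the 135,000 IOPS target comes from the text above. The per-drive IOPS and wattage values passed in the example are placeholders invented for illustration; they are not measurements and not the figures behind Figure 3.

```python
def drives_needed(target_iops: int, iops_per_drive: int) -> int:
    """Smallest number of drives whose combined IOPS meets the target."""
    if target_iops <= 0 or iops_per_drive <= 0:
        raise ValueError("IOPS values must be positive")
    return (target_iops + iops_per_drive - 1) // iops_per_drive  # integer ceiling


def array_power_watts(target_iops: int, iops_per_drive: int, watts_per_drive: float) -> float:
    """Total power of the array sized to meet the IOPS target."""
    return drives_needed(target_iops, iops_per_drive) * watts_per_drive


TARGET_IOPS = 135_000  # target taken from the text

# Placeholder inputs, NOT measurements: substitute real datasheet values.
hdd_count = drives_needed(TARGET_IOPS, iops_per_drive=300)
hdd_watts = array_power_watts(TARGET_IOPS, iops_per_drive=300, watts_per_drive=15.0)
ssd_count = drives_needed(TARGET_IOPS, iops_per_drive=45_000)
ssd_watts = array_power_watts(TARGET_IOPS, iops_per_drive=45_000, watts_per_drive=8.0)
print(hdd_count, hdd_watts, ssd_count, ssd_watts)
```

The point of the calculation is structural: when a workload is bound by IOPS rather than by capacity, the drive count and the power both scale with the per-drive IOPS figure.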
It is important to note that SSDs have exceptional performance in small random transfers where the performance is optimized on 512-byte, 1-KB, 2-KB, 4-KB, and 8-KB random reads and writes. Once you have identified the appropriate role SSDs will play within the storage hierarchy, there are ways in which the system can tune access patterns to achieve maximum performance and reliability.
One key is to achieve the appropriate alignment of transfer sizes, which varies by product. Thus, you must work closely with the SSD vendor to learn the techniques that deliver the best performance and reliability for a given drive.
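As a minimal illustration of what alignment means on the host side, the Python sketch below checks whether a request falls on the boundaries of a drive's preferred transfer unit and computes how much the drive would have to touch if it does not. The 4,096-byte unit is an assumption chosen for the example; as the text notes, the right value varies by product and should come from the vendor.

```python
from typing import Tuple

UNIT_BYTES = 4096  # assumed preferred transfer unit; the real value is product-specific


def is_aligned(offset: int, length: int, unit: int = UNIT_BYTES) -> bool:
    """True if the request starts and ends on unit boundaries."""
    return offset % unit == 0 and length % unit == 0


def covering_span(offset: int, length: int, unit: int = UNIT_BYTES) -> Tuple[int, int]:
    """Smallest unit-aligned (offset, length) that contains the request."""
    start = (offset // unit) * unit
    end = -(-(offset + length) // unit) * unit  # round the end up to a boundary
    return start, end - start


print(is_aligned(8192, 4096))      # starts and ends on a boundary
print(is_aligned(6144, 4096))      # straddles a boundary
print(covering_span(6144, 4096))   # the misaligned request spans two units
```

A 4-KB request that starts mid-unit spans two units, so the drive does roughly twice the work for the same amount of host data; issuing aligned requests avoids that overhead.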
In terms of compatibility, enterprise SSDs will work seamlessly within systems as drop-in replacements for HDDs. All of the aforementioned features are run entirely within the drive; SSDs are not dependent upon host-side file systems to perform all of the elaborate media management schemes. Various OEMs will develop unique ways to optimize system-level code to extract optimal performance, but there will be no requirement for the host to modify the manner in which the drives are addressed.
To extract maximum system-level performance benefits, the key is to use SSDs as a high-performance storage tier. As tiering of storage technologies proliferates (i.e., utilization of FC HDD as Tier 1, SATA HDD for Tier 2 and lower, and tape for archival), enterprise SSDs deliver an unprecedented performance profile; thus, they provide an entirely new tier of performance. Enterprise SSDs deliver performance more like main memory, so they can be used as a replacement for main memory, as implemented by Sun Microsystems with its ZFS acceleration design. SSDs can also be used to replace multiple high-performance HDDs, as EMC has implemented with its Symmetrix system. The convention is to refer to enterprise SSDs as Tier 0. Figure 4 shows a sample storage architecture, with an enterprise SSD in the Tier 0 position.
Enterprise storage and server OEMs are universally embracing SSDs to achieve the optimal balance of application demands, processor utilization, and cost. Clearly, SSD technology is emerging as the solution of choice for companies that need to improve the delivery of mission-critical applications while controlling costs and simplifying management. Not all SSDs are alike, however: to be truly enterprise class, a drive must be designed with the performance and reliability nuances of flash in mind. Drives that do not reflect these nuances will disappoint, as they will perform poorly and fail early—but those drives that are designed around flash will allow the technology to reach its full, disruptive potential.
MARK MOSHAYEDI is president and CTO of STEC, where he has been for more than 16 years. Prior to that he worked in a variety of roles spanning engineering to sales at various companies including Texas Instruments, Sony, and Fujitsu. Throughout his career, he has specialized in storage and memory technologies. He earned his B.S. in electrical engineering from the University of California at Irvine and his M.B.A. from Pepperdine University.
PAT WILKISON is vice president of marketing and business development for STEC. He is responsible for STEC’s products, spanning definition to introduction to management. He is also responsible for new market development. He earned his B.S. in systems engineering from West Point and his M.B.A. from the University of Southern California.
Originally published in Queue vol. 6, no. 4