A Conversation with Jeff Bonwick and Bill Moore:
The future of file systems
This month ACM Queue speaks with two Sun engineers who are bringing file systems into the 21st century. Jeff Bonwick, CTO for storage at Sun, led development of the ZFS file system, which is now part of Solaris. Bonwick and his co-lead, Sun Distinguished Engineer Bill Moore, developed ZFS to address many of the problems they saw with current file systems in areas such as data integrity, scalability, and administration. In our discussion this month, Bonwick and Moore elaborate on these points and explain what makes ZFS such a big leap forward.
From Here to There, the SOA Way:
SOA is no more a silver bullet than the approaches that preceded it.
Back in ancient times, say, around the mid ’80s when I was a grad student, distributed systems research was in its heyday. Systems like Trellis/Owl and Eden/Emerald were exploring issues in object-oriented language design, persistence, and distributed computing. One of the big themes to come out of that period was location transparency: the idea that the way you access an object should be independent of where it is located. That is, it shouldn’t matter whether an object is in the same process, in a different process on the same machine, or on another machine altogether. Syntactically, the way I interact with that object is the same; I’m just invoking a method on the object.
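As a rough sketch of what location transparency looks like in code (an illustration of the idea only, not anything from Trellis/Owl or Eden/Emerald; the class and method names are hypothetical), the caller invokes the same method whether the object is local or a proxy for a remote one:

```python
# Hypothetical illustration of location transparency: the caller's code is the
# same whether the object lives in-process or behind a network proxy.

class LocalCounter:
    """An ordinary in-process object."""
    def __init__(self):
        self._value = 0

    def increment(self):
        self._value += 1
        return self._value


class RemoteCounterProxy:
    """Stand-in for a counter on another machine; same interface as LocalCounter."""
    def __init__(self, send):
        self._send = send          # e.g., a function that ships a request over the wire

    def increment(self):
        return self._send("increment")   # marshal the call; the caller never notices


def bump(counter):
    # Location-transparent caller: just a method invocation either way.
    return counter.increment()


print(bump(LocalCounter()))                     # local object
print(bump(RemoteCounterProxy(lambda op: 1)))   # fake transport, just for the sketch
```

The well-known catch, of course, is that the remote invocation can stall or fail in ways the local one cannot, which is part of why no such approach has turned out to be a silver bullet.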
Ground Control to Architect Tom...:
Can you hear me?
Project managers love him, recent software engineering graduates bow to him, and he inspires code warriors deep in the development trenches to wonder if a technology time warp may have passed them by. How can it be that no one else has ever proposed software development with the simplicity, innovation, and automation being trumpeted by Architect Tom? His ideas sound so space-age, so futuristic, but why should that be so surprising? After all, Tom is an architecture astronaut! Architecture astronauts such as Tom naturally think at such high levels of innovation because they spend much of their time in high orbit where low oxygen levels counterbalance the technological shackles imposed by reality.
Hard Disk Drives: The Good, the Bad and the Ugly!:
HDDs are like the bread in a peanut butter and jelly sandwich.
HDDs are like the bread in a peanut butter and jelly sandwich—sort of an unexciting piece of hardware necessary to hold the “software.” They are simply a means to an end. HDD reliability, however, has always been a significant weak link, perhaps the weak link, in data storage. In the late 1980s people recognized that HDD reliability was inadequate for large data storage systems, so redundancy was added at the system level with some brilliant software algorithms, and RAID (redundant array of inexpensive disks) became a reality. RAID moved the reliability requirements from the HDD itself to the system of data disks. Commercial implementations of RAID range from simple mirroring to the more common single-parity (n+1) RAID-4 and RAID-5, and recently to RAID-6, an n+2 configuration that increases storage system reliability by using two redundant disks (dual parity). Reliability at the RAID-group level has also benefited from steady improvements in the reliability of the HDDs themselves.
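To make the single-parity idea concrete (a minimal sketch, not drawn from the article; it ignores striping, real block sizes, and controller details), an n+1 group stores the XOR of the data blocks, which is enough to rebuild any one lost disk:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together (the RAID-4/5 parity operation)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Three "data disks" holding one block each (toy sizes).
data = [b"\x10\x20\x30", b"\x0f\x0e\x0d", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)                 # the extra (n+1) redundant block

# Simulate losing disk 1: XOR of the survivors plus parity recovers it.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

RAID-6 adds a second, independently computed redundant block (Reed-Solomon coding is one common choice), which is what lets an n+2 group survive any two simultaneous disk failures.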
Only Code Has Value?:
Even the best-written code can’t reveal why it’s doing what it’s doing.
A recent conversation about development methodologies turned to the relative value of various artifacts produced during the development process, and the person I was talking with said that the code has “always been the only artifact that matters. It’s just that we’re only now coming to recognize that.” My reaction to this, not expressed at the time, was twofold. First, I got quite a sense of déjà vu, since it hearkened back to my time as an undergraduate and memories of many heated discussions about whether code was self-documenting. Second, I thought of several instances from recent experience in which the code alone simply was not enough to understand why the system was architected in a particular way.
Standardizing Storage Clusters:
Will pNFS become the new standard for parallel data access?
Data-intensive applications such as data mining, movie animation, oil and gas exploration, and weather modeling generate and process huge amounts of data, making file-data access throughput critical for good performance. To scale well, these HPC (high-performance computing) applications distribute their computation among numerous client machines. HPC clusters can range from hundreds to thousands of clients, with aggregate I/O demands reaching into the tens of gigabytes per second.
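As a back-of-the-envelope illustration of why parallel data access matters (hypothetical numbers and layout, not taken from the article), striping a file across many data servers lets clients fetch different pieces concurrently, so aggregate throughput can scale with the number of servers:

```python
# Sketch of file striping, the idea behind parallel file access in pNFS-style
# systems: stripe i of a file lives on server (i mod NUM_SERVERS).

NUM_SERVERS = 8
STRIPE_SIZE = 1 << 20          # 1 MiB stripe units (an assumption for the sketch)

def server_for_offset(offset):
    """Map a byte offset in the file to the data server holding that stripe."""
    stripe_index = offset // STRIPE_SIZE
    return stripe_index % NUM_SERVERS

# A client reading 8 consecutive MiB touches all 8 servers at once.
print([server_for_offset(i * STRIPE_SIZE) for i in range(8)])   # [0, 1, ..., 7]

# Rough aggregate-demand arithmetic: 2,000 clients each streaming 10 MB/s
# add up to 20,000 MB/s (20 GB/s) that the storage cluster must supply in parallel.
print(2_000 * 10, "MB/s total")
```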
Storage Virtualization Gets Smart:
The days of overprovisioned, underutilized storage resources might soon become a thing of the past.
Over the past 20 years we have seen the transformation of storage from a dumb resource with fixed reliability, performance, and capacity to a much smarter resource that can actually play a role in how data is managed. In spite of the increasing capabilities of storage systems, however, traditional storage management models have made it hard to leverage these data management capabilities effectively. The net result has been overprovisioning and underutilization. In short, although the promise was that smart shared storage would simplify data management, the reality has been different.
The Code Delusion:
The real, the abstract, and the perceived
No, I’m not cashing in on that titular domino effect that exploits best sellers. The temptations are great, given the rich rewards from a gullible readership, but offset, in the minds of decent writers, by the shame of literary hitchhiking. Thus, guides to the Louvre become The Da Vinci Code Walkthrough for Dummies, milching, as it were, several hot cows on one cover. Similarly, conventional books of recipes are boosted with titles such as The Da Vinci Cookbook—Opus Dei Eating for the Faithful. Dan Brown’s pseudofiction sales stats continue to amaze, cleverly stimulated by accusations of plagiarism and subsequent litigation.
The Next Big Thing:
The future of functional programming and KV’s top five protocol-design tips
Dear KV, I know you wrote a previous article in which you listed some books to read. I would also consider adding How to Design Programs, available free on the Web. This book is great for explaining the process of writing a program. It uses the Scheme language and introduces FP (functional programming). I think FP could be the future of programming. John Backus of the IBM Research Laboratory suggested this in 1977. Even Microsoft has yielded to FP by introducing FP concepts in C# with LINQ. Do you feel FP has a future in software development, or are we stuck with our current model of languages with increasing features?
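As a rough illustration of the kind of FP concepts the letter writer has in mind (a Python analogue of a LINQ-style query, not C# and not from the column itself; the data here is made up), the same filter-and-transform logic can be written declaratively instead of as an explicit loop:

```python
# Imperative version: explicit loop and mutable accumulator.
orders = [("apples", 12), ("bread", 3), ("cheese", 7), ("dates", 15)]
big = []
for name, qty in orders:
    if qty >= 10:
        big.append(name.upper())

# FP-flavored version: a declarative pipeline, roughly what LINQ brought to C#.
big_fp = [name.upper() for name, qty in orders if qty >= 10]

assert big == big_fp == ["APPLES", "DATES"]
```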
Orienting Oracle:
Amlan Debnath on Oracle, SOA, and emerging event-driven architectures.
As vice president of server technologies for Oracle, Amlan Debnath is one of the few people who can synthesize Oracle’s software infrastructure plans. In an interview with ACM Queuecast host Mike Vizard, Debnath provides some insight into how Oracle’s strategy is evolving to embrace service-oriented architectures alongside the demands of new and emerging event-driven architectures.
Instant Legacy:
John Michelsen, CTO of iTKO, on testing tools for SOA applications
Companies building applications in an SOA environment must take care to ensure seamless interaction and make certain that any changes to their applications won’t negatively affect other applications. In an interview with ACM Queuecast host Mike Vizard, John Michelsen, CTO of iTKO, a Dallas-based provider of testing tools for SOA applications, discusses the need for companies to recognize this delicate balance.