“We don't find HFS Plus administration to be complex, and we can't tell you what those other things mean, but they sound really cool, and therefore we want them. On the magic unlocked iPhone. For free.”
Har har har. Wait. Hold on a minute. Why is it suddenly fashionable to bash on ZFS?
Part of it is a backlash to the weird and obviously fake rumor about it becoming the default in Leopard, I guess. (No thanks to Sun's CEO Jonathan Schwartz here, who as far as I know has never publicly said anything about why he either misspoke or misunderstood what was going on back in June.)
But don't do that. Don't be a ZFS hater.
A word about my background
Let's get the credentials out of the way up front. Today I work on a file I/O subsystem for PlayStation 3 games. Before that, I worked in Apple's CoreOS filesystems group. Before that, I worked on DiscRecording.framework, and singlehandedly created the content subframework that streamed out HFS+, ISO-9660, and Joliet filesystems. Before that, I worked on the same thing for Mac OS 9. And before that, I worked on mass storage drivers for external USB/FireWire drives and internal ATA/ATAPI/SCSI drives.
You might say I know a thing or two about filesystems and storage.
What bugged me about the article
ZFS is a fine candidate to replace HFS+ eventually. It's not going to happen overnight, no. And it'll be available as an option for early adopters way before it becomes the default. But several years from now? Absolutely.
The bizarre rants about ZFS wasting processor time and disk space. I'm sorry, I wasn't aware that we were still using 30MHz machines with 1.44MB floppies. ZFS is great specifically because it takes two things that modern computers tend to have a surplus of — CPU time and hard disk space — and borrows a bit of it in the name of data integrity and ease of use. This tradeoff made very little sense in, say, 1992. But here in 2007 it's brilliant.
Sneeringly implying that HFS+ is sufficient. Sure, HFS+ administration is simple, but it's also inflexible. It locks you into what I call the "big floppy" model of storage. This only gets more and more painful as disks get bigger and bigger. Storage management has come a long way since the original HFS was created, and ZFS administration lets you do things that HFS+ can only dream of.
Claiming that RAID-Z is required for checksums to be useful. This is flat-out wrong. Sure, RAID-Z helps a lot by storing an error-correcting code. But even without RAID-Z, simply recognizing that the data is bad gets you well down the road to recovering from an error — depending on the exact nature of the problem, a simple retry loop can in fact get you the right data the second or third time. And as soon as you know there is a problem you can mark the block as bad and aggressively copy it elsewhere to preserve it. I suppose the author would prefer that the filesystem silently returned bad data?
Completely ignoring Moore's Law. How dumb do you need to be to willfully ignore the fact that the things that are bleeding-edge today will be commonplace tomorrow? Twenty gigabyte disks were massive server arrays ten years ago. Today I use a hard drive ten times bigger than that just to watch TV.
Reading this article made me feel like I was back in 1996 listening to people debate cooperative vs preemptive multitasking. In the Mac community at that time there were, I'm ashamed to say, a lot of heated discussions about how preemptive threading was unnecessary. There were some people (like me) who were clamoring for a preemptive scheduler, while others defended the status quo — claiming, among other things, that Mac OS 8's cooperative threads "weren't that bad" and were "fine if you used them correctly". Um, yeah.
Since then we've thoroughly settled that debate, of course. And if you know anything about technology you might be able to understand why there's a difference between "not that bad" and "a completely new paradigm".
ZFS is cool
Let's do a short rundown of reasons why I, a qualified filesystem and storage engineer, think that ZFS is cool. I'll leave out some of the more technical reasons and just try to keep it in plain English, with links for further reading.
Logical Volume Management. Hard disks are no longer big floppies. They are building blocks that you can just drop in to add storage to your system. Partitioning, formatting, migrating data from old small drive to new big drive -- these all go away.
Adaptive Replacement Caching. ZFS uses a smarter cache eviction algorithm than OSX's UBC, which lets it deal well with data that is streamed and only read once. (Sound familiar? It could obsolete the need for F_NOCACHE.)
Snapshots. Think about how drastically the trash can metaphor changed the way people worked with files. Snapshots are the same concept, extended system-wide. They can eliminate entire classes of problems.
I don't know about you, but in the past year I have done all of the following, sometimes more than once. Snapshots would've made these a non-issue:
- installed a software update and then found out it broke something
- held off on installing a software update because I was afraid something might break
- lost some work in between backups
- accidentally deleted an entire directory with a mistyped rm -rf or SCM delete command.
Copy-on-write in the filesystem makes snapshots super-cheap to implement. No, not "free", just "so cheap you wouldn't possibly notice". If you are grousing about wasted disk space, you don't understand how it works. Mac OS X uses copy-on-write extensively in its virtual memory system because it's both cheap and incredibly effective at reducing wasted memory. The same thing applies to the filesystem.
End-to-end data integrity. Journaling is the only thing HFS+ does to prevent data loss. This is hugely important, and big props are due to Dominic Giampaolo for hacking it in. But journaling only protects the write stage. Once the bits are on the disk, HFS+ simply assumes they're correct.
But as disks get larger and cheaper, we're finding that this isn't sufficient any more. The odds of any one bit being wrong are very small. And yet the 200GB hard disk in my laptop has a capacity of about 1.6 trillion bits. The cumulative probability that EVERY SINGLE ONE of those bits are correct is effectively zero. Zero!
Backups are one answer to this problem, but as your data set gets larger they get more and more expensive and slow. (How do you back up a terabyte's worth of data? How long does it take? Worse still, how do you really know that your backup actually worked instead of just appearing to work?) So far, disk capacity has consistently grown faster than disk speed, meaning that backups will only continue to get slower and slower. Boy, wouldn't it be great if the filesystem — which is the natural bottleneck for everything disk-related — helped you out a little more on this? ZFS does.
Combinations of the above. There are some pretty cool results that fall out of having all of these things together in one place. Even if HFS+ supported snapshots, you'd still be limited by the "big floppy" storage model. It really starts to get interesting when you combine snapshots with smart use of logical volume management. And we've already discussed how RAID-Z enhances ZFS's basic built-in end-to-end data integrity by adding stronger error correction. There are other cool combinations too. It all adds up to a whole which is greater than the sum of its parts.
Is any of this stuff new and unique to ZFS? Not really. Bits and pieces of everything I've mentioned above have showed up in many places.
What ZFS brings to the table is that it's the total package — everything all wrapped up in one place, already integrated, and in fact already shipping and working. If you happened to be looking for a next-generation filesystem, and Apple is, you wouldn't need to look much further than ZFS.
Still not convinced?
Okay, here are three further high-level benefits of ZFS over HFS+:
Designed to support Unix. There are a lot of subtleties to supporting a modern Unix system. HFS+ was not really designed for that purpose. Yes, it's been hacked up to support Unix permissions, node locking, lazy zero-fill, symbolic links, hard links, NFS readdir semantics, and more. Some of these were easy. Others were painful and exhibit subtle bugs or performance problems to this day.
Designed to support modern filesystem concepts. Transactions. Write cache safety. Sparse files. Extended metadata attributes. I/O sorting and priority. Multiple prefetch streams. Compression. Encryption. And that's just off the top of my head.
HFS+ is 10-year-old code built on a 20-year-old design. It's been extended to do some of this, and could in theory be extended to do some of the others... but not all of them. You'll just have to trust me that it's getting to the point where some of this stuff is just not worth the engineering cost of hacking it in. HFS+ is great, but it's getting old and creaky.
Actually used by someone besides Apple. Don't underestimate the value of a shared standard. If both Sun and Apple start using the same open-source filesystem, it creates a lot of momentum behind it. Having more OS clients means that you get a lot more eyes on the code, which improves the code via bugfixes and performance enhancements. This makes the code better, which makes ZFS more attractive to new clients, which means more eyes on the code, which means the code gets better.... it's a virtuous circle.
Is ZFS the perfect filesystem? I doubt it. I'm sure it's got its limitations just like any other filesystem. In particular, wrapping a GUI around its administration options and coming up with good default parameters will be an interesting trick, and I look forward to seeing how Apple does it.
But really, seriously, dude. The kid is cool. Don't be like that.
Don't be a ZFS hater.
MWJ's response: You don't have to be a ZFS hater to know it's wrong for you.
My followup: ZFS Hater Redux