Sunday, December 02, 2007

The Case Against Insensitivity  

One of the most controversial parts of my earlier post, Don't Be a ZFS Hater, was when I mentioned off-handedly in the comments that I don't like case-insensitivity in filesystems.

Boy, did that spur a storm of replies.

I resolved to not pollute the ZFS discussion with a discussion of case-insensitivity and promised to make a separate blog post about it. It took a while, but this is that post. I blame a busy work schedule and an even busier travel schedule. (Recently in the span of two weeks I was in California, Ohio, London, Liverpool, London, Bristol, London, Amsterdam, London, then back to Ohio. Phew!)

Here's Why Case-Insensitive Filesystems Are Bad

I've worked in and around filesystems for most of my career; if not in the filesystem itself then usually a layer just above or below it. I'm speaking from experience when I tell you:

Case-insensitivity is a bad idea in filesystems.

And here's why:

  1. It's poorly defined.
  2. Every filesystem does it differently.
  3. Case-insensitivity is a layering violation.
  4. Case-insensitivity forces layering violations upon other code.
  5. Case-insensitivity is contagious.
  6. Case-insensitivity adds complexity and provides no actual benefit.

I'll expand on each of these below.

It's poorly defined

When I say "case-insensitive", what does that mean to you?

If you only speak one language and that language is English, it probably seems perfectly reasonable: map the letter a to A, b to B, and so on through z to Z. There, you're done. What was so hard about that?

But that's ASCII thinking; the world left that behind a long time ago. Modern systems are expected to deal with case differences in all sorts of languages. Instead of a simple 26-letter transformation, "case insensitivity" really means handling all the other alphabets too.
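To make the gap concrete, here is a minimal Python sketch comparing the "26-letter" rule with a Unicode-aware fold. It is not taken from any filesystem: ascii_fold is a made-up helper for illustration, and str.casefold() is simply whatever Unicode case folding ships with the Python you happen to run.

```python
def ascii_fold(name):
    """Fold only A-Z to a-z; every other character passes through untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in name
    )

# Plain English names look fine under the ASCII-only rule.
print(ascii_fold("README.TXT") == ascii_fold("readme.txt"))

# French, Greek and Cyrillic names (escaped so the exact code points are clear).
pairs = [
    ("\u00c9T\u00c9", "\u00e9t\u00e9"),                      # ÉTÉ / été
    ("\u0391\u0398\u0397\u039d\u0391", "\u03b1\u03b8\u03b7\u03bd\u03b1"),  # ΑΘΗΝΑ / αθηνα
    ("\u041c\u041e\u0421\u041a\u0412\u0410", "\u043c\u043e\u0441\u043a\u0432\u0430"),  # МОСКВА / москва
]
for upper, lower in pairs:
    ascii_same = ascii_fold(upper) == ascii_fold(lower)
    unicode_same = upper.casefold() == lower.casefold()
    print(upper, lower, ascii_same, unicode_same)
```

Each of the three pairs matches under the Unicode-aware fold and fails under the ASCII-only one, which is the whole problem with "a to A, b to B" in a nutshell.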

The problem with doing that, however, is that it brings language and orthography into the picture. And human languages are inherently vague, large, messy, and constantly evolving.

Can you make a strict definition of "case insensitivity" without any hand-waving?

One way to do it is with an equivalence table: start listing all the characters that are equal to other characters. We can go through all the variants of Latin alphabets, including a huge list of accents: acute, grave, circumflex, umlaut, tilde, cedilla, macron, breve, dot, ring, ogonek, hacek, and bar. Don't forget to find all the special ligatures and other letters, too, such as Æ vs æ and Ø vs ø.

Okay, our table is pretty big so far. Now let's start adding in other alphabets with case: Greek, Armenian, and the Cyrillic alphabets. And don't forget the more obscure ones, like Coptic. Phew. It's getting pretty big.

Did we miss any? Well, for any given version of the Unicode standard it's always possible to enumerate all letters, so it's certainly possible to do all the legwork and prove that we've got all the case mappings for, say, Unicode 5.0.0 which is the latest at the time of this writing. But Unicode is an evolving standard and new characters are added frequently. Every time a new script with case is added we'll need to update our table.
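If you want a feel for the size of that table, a rough sketch like the following enumerates it for whichever Unicode version your Python was built with. The function name build_fold_table is invented for this example, and the counts you see depend entirely on that version, so I'm deliberately not quoting any numbers.

```python
import sys
import unicodedata

def build_fold_table():
    """Collect every code point whose case-folded form differs from itself."""
    table = {}
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        folded = ch.casefold()
        if folded != ch:
            table[ch] = folded
    return table

table = build_fold_table()
one_to_many = sum(1 for folded in table.values() if len(folded) > 1)

print("Unicode version in this Python:", unicodedata.unidata_version)
print("characters with a case-fold mapping:", len(table))
print("mappings that expand to more than one character:", one_to_many)
```

Run it on two Python releases built against different Unicode versions and the totals will usually differ, which is exactly the maintenance problem described above.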

There are also some other hard questions for case insensitivity:

  • Digraph characters may have three equivalent mappings, depending on how they are being written: all-lowercase, all-uppercase, or title-case. (For example: dz, DZ, or Dz.) But this breaks some case-mapping tables which didn't anticipate the need for an N-way equivalence.

  • The German letter ß is considered equal to lowercase ss. Should "Straße" and "STRASSE" be considered equivalent? They are in German. But this breaks some case-mapping tables which didn't anticipate the need for an N-to-M character translation (1:2, in this case).

  • Capital letters can significantly alter the meaning of a word or phrase. In German, capital letters indicate nouns, so the word Essen means "food", while the word essen means "to eat". We make similar distinctions in English between proper nouns and regular nouns: God vs god, China vs china, Turkey vs turkey, and so on. Should "essen" and "Essen", or "china" and "China" really be considered equivalent?

  • Some Hebrew letters use different forms when at the end of a word, such as פ vs ף, or נ vs ן. Are these equivalent?

  • In Georgian, people recently experimented with using an obsolete alphabet called Asomtavruli to reintroduce capital letters to the written language. What if this had caught on?

  • What about any future characters which are not present in the current version of the Unicode standard?
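A few of these questions can be poked at directly. The snippet below is only an illustration using Python's built-in Unicode case folding (an assumption on my part; no filesystem is obliged to use these rules), with the characters written as escapes so the code points are unambiguous.

```python
# The DZ digraph has three case forms: U+01F1 (DZ), U+01F2 (Dz), U+01F3 (dz).
digraph_forms = ["\u01f1", "\u01f2", "\u01f3"]
digraph_folds = {form.casefold() for form in digraph_forms}
print(len(digraph_folds))          # a 3-way equivalence collapses to one form

# German sharp s: one character folds (and uppercases) to two.
strasse_equal = "Stra\u00dfe".casefold() == "STRASSE".casefold()
sharp_s_upper = "\u00df".upper()
print(strasse_equal, len("\u00df"), len(sharp_s_upper))

# Hebrew pe (U+05E4) and final pe (U+05E3) have no case, so folding
# leaves them as two distinct letters.
hebrew_equal = "\u05e4".casefold() == "\u05e3".casefold()
print(hebrew_equal)
```

So the digraph and the ß cases only work if your table supports N-way and 1:2 mappings, and the Hebrew question isn't answered by case folding at all.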

Case is a concept that is built into written languages. And human language is inherently messy. This means that case-insensitivity is always going to be poorly defined, no matter how hard we try.

Every filesystem does it differently

Unfortunately, filesystems can't engage in hand-waving. Filesystem data must be persistent and forward-compatible. People expect that the data they wrote to a disk last year should still be readable this year, even if they've had an operating system upgrade.

That's a perfectly reasonable expectation. But it means that the on-disk filesystem specification needs to freeze and stop changing when it's released to the world.

Because our notion of what exactly "case-insensitive" means has changed over the past twenty years, however, we've seen a number of different methods of case-insensitivity emerge.

Here are a handful of the most popular case-insensitive filesystems and how they handle case-mapping:

  • FAT-32: Preserves ASCII upper- and lower-case letters, but treats a-z and A-Z as identical. Characters in high ASCII depend on a variable IBM code page.
  • HFS: Preserves ASCII upper- and lower-case letters, but treats a-z and A-Z as identical. Characters in high ASCII depend on a variable Mac encoding.
  • NTFS: Case-insensitive in different ways depending on the version of Windows that created the volume.
  • HFS+: Case-insensitive with a mapping table which was frozen circa 1996, and thus lacks case mappings for any newer characters.

None of these — except for NTFS created by Vista — are actually up-to-date with the current Unicode specification. That's because they all predate it. Similarly, if a new filesystem were to introduce case-insensitivity today, it would be locked into, say, Unicode 5.0.0's case mappings. And that would be all well and good until Unicode 5.1.0 came along.
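Here's a small simulation of what "locked into a table" looks like. Everything in it is hypothetical: the cutoff at U+04FF, the helper names, and the use of Python's str.casefold() as the "current" rules are all assumptions made for the sketch, and it is not the actual HFS+ or NTFS table. The Coptic letters simply stand in for "letters the frozen table has never heard of".

```python
def make_frozen_table(known_chars):
    """Snapshot the fold mapping for only the characters known at release time."""
    return {ch: ch.casefold() for ch in known_chars if ch.casefold() != ch}

def frozen_fold(name, table):
    """Fold a name with the frozen table; unknown characters pass through as-is."""
    return "".join(table.get(ch, ch) for ch in name)

# Hypothetical release-time repertoire: everything up to the end of Cyrillic.
FROZEN = make_frozen_table(chr(cp) for cp in range(0x0500))

# Coptic capital and small ALFA (U+2C80 / U+2C81), outside the frozen range.
upper_name = "\u2c80.txt"
lower_name = "\u2c81.txt"

same_on_disk = frozen_fold(upper_name, FROZEN) == frozen_fold(lower_name, FROZEN)
same_today = upper_name.casefold() == lower_name.casefold()
print("frozen table says equal:", same_on_disk)
print("current rules say equal:", same_today)
```

The two answers disagree, and the on-disk format can't be changed to fix it without breaking every volume already written.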

The history of filesystems is littered with broken historical case mappings like a trail of tears.

Case-insensitivity is a layering violation

When people argue for case-insensitivity in the filesystem, they almost always give user interface reasons for it. (The only other arguments I've seen are based on contagion, which I'll talk about in a moment.) Here is the canonical example:

My Aunt Tillie doesn't know the difference between letter.txt and Letter.txt. The filesystem should help her out.

But in fact this is a UI problem. The problem relates to the display and management of information, not the storage of this information.

Don't believe me?

  • When any application displays items in a window, who sorts them case-insensitively? The filesystem? No! The application does it.

  • When you type-select, typing b-a-b-y to select the folder "Baby Pictures" in an application, who does the case-insensitive mapping of the letters you type to the files you select? The filesystem? No! The application again.

  • When you save or copy files, who does the case-insensitive test to warn you if you're creating "file.txt" when "File.txt" already exists? The filesystem? Yes!

Why does the third question have a different answer than the rest?

And we've already talked about how filesystems are chronically out-of-date with their case mappings. If your aunt is a Turkish Mac user, for example, she's probably going to notice that the behavior of the third one is different for no good reason. Why are you confusing your Aunt Tülay?
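For anyone who hasn't run into the Turkish problem: Turkish has both a dotted and a dotless i, and the pairs are I/ı and İ/i. The sketch below uses Python's locale-independent string methods as a stand-in for "some frozen, locale-blind table" (that stand-in is my assumption; it isn't any particular filesystem's rule). The file name is the Turkish word ışık, written with escapes.

```python
upper_name = "I\u015eIK.txt"             # IŞIK.txt
lower_name = "\u0131\u015f\u0131k.txt"   # ışık.txt

# Rule A: compare by uppercasing both names.
equal_by_upper = upper_name.upper() == lower_name.upper()

# Rule B: compare by lowercasing both names.
equal_by_lower = upper_name.lower() == lower_name.lower()

# Rule C: compare by Unicode default case folding.
equal_by_fold = upper_name.casefold() == lower_name.casefold()

print(equal_by_upper, equal_by_lower, equal_by_fold)
```

Three plausible "case-insensitive" rules, and they don't agree with each other about whether these are the same file. Whichever one the filesystem froze, the applications sitting above it may well be using a different one.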

One last point was summarized nicely by Mike Ash in the comments of Don't Be a ZFS Hater. I'll just quote him wholesale here:

Yes, Aunt Tillie will think that "Muffin Recipe.rtf" and "muffin recipe.rtf" ought to be the same file. But you know what? She'll also think that "Muffin Recipe .rtf" and "Recipe for Muffins.rtf" and "Mufin Recipe.txt" ought to be the same file too.

Users already don't generally understand how the OS decides whether two files are the same or not. Trying to alleviate this problem by mapping names with different case to the same file solves only 1% of the problem and just isn't worth the effort.

I agree completely.

Case-insensitivity forces layering violations upon other code

All too often, pieces of code around the system are required to hard-code knowledge about case-insensitive filesystem behavior. Here are a few examples off the top of my head:

  • Collision prediction. An application may need to know if two files would conflict before it actually writes either of them to disk. If you are writing an application where a user creates a group of documents — a web page editor, perhaps — you may need to know when banana.jpg and BANANA.JPG will conflict.

    The most common way that programmers solve this is by hard-coding some knowledge about the case-insensitivity of the filesystem in their code. That's a classic layering violation.

  • Filename hashing. If you are writing code to hash strings that are filenames, you probably want equivalent paths to generate the same hash. But it's impossible to know which files are equivalent unless you know the filesystem's rules for case-mapping.

    Again, the most common solution is a layering violation. You either hard-code some knowledge about the case-insensitivity tables, or you hard-code some knowledge about your input data. (For example, you may just require that you'll never, never, ever have multiple access paths for the same file in your input data. Like all layering violations, that might work wonderfully for a while ... right up until the day that it fails miserably.)
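Here is roughly what that layering violation looks like in application code. This is a sketch, not a recommendation: fold_key bakes in the assumption that the filesystem's rule is the same as Python's str.casefold(), which is precisely the kind of hard-coded guess I'm complaining about. All of the function names are made up for the example.

```python
import hashlib
from collections import defaultdict

def fold_key(name):
    """ASSUMPTION: the filesystem folds case the way str.casefold() does."""
    return name.casefold()

def predict_collisions(names):
    """Group names that the assumed rule would treat as the same file."""
    groups = defaultdict(list)
    for name in names:
        groups[fold_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]

def path_hash(path):
    """Hash a path so that paths equal under the assumed rule hash alike."""
    return hashlib.sha256(fold_key(path).encode("utf-8")).hexdigest()

collisions = predict_collisions(["banana.jpg", "BANANA.JPG", "index.html"])
print(collisions)
print(path_hash("/site/BANANA.JPG") == path_hash("/site/banana.jpg"))
```

It works for banana.jpg. It will also quietly give wrong answers on any volume whose real mapping table differs from the one the application guessed.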

I'm sure there are more examples out there.

Case-insensitivity is contagious

This is the worst part. It's all too easy to accidentally introduce a dependence on case-insensitivity: just use an incorrect path with bad case.

The moment somebody creates an application or other system that inadvertently depends on case-insensitivity, it forces people to use a case-insensitive filesystem if they want to use that app or system. And that's one of the major reasons why case-insensitivity has stuck around — because it's historically been very difficult to get rid of.

I've seen this happen with:

  • Source code. Some bozo writes #include "utils.h" when the file is named Utils.h. Sounds innocent enough, until you find that it's repeated dozens of times across hundreds of files. Now that project can only ever be compiled on a case-insensitive filesystem.

  • Game assets. A game tries to load lipsync.dat instead of LIPSYNC.DAT. Without knowing it, the artist or developer has accidentally locked that game so that it can only run on a case-insensitive filesystem. (This causes real, constant problems in game pipelines; teams create and test their games on case-insensitive NTFS and don't notice such problems until it's burned to a case-sensitive UDF filesystem on DVD or Blu-Ray.)

  • Application libraries. DLLs and shared library references are sometimes generated by a build script which uses the wrong case. When that happens, the application may simply fail to launch from a case-sensitive filesystem.

  • Miscellaneous data files. Sometimes an application will appear to run on a case-sensitive filesystem but some feature will fail to work because it fails to load a critical data file: the spell-checking dictionary, a required font, a nib, you name it.
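If you're stuck developing on a case-insensitive volume, one cheap defense is to check references against the exact names on disk. The following is a minimal sketch under the assumption that you can collect the referenced names and the real directory listing yourself; find_case_mismatches and the sample names are hypothetical, and it only handles a flat list of names rather than full paths.

```python
def find_case_mismatches(references, actual_names):
    """Return (reference, actual) pairs that match only when case is ignored."""
    exact = set(actual_names)
    by_fold = {}
    for name in actual_names:
        # Keep the first on-disk name seen for each case-folded key.
        by_fold.setdefault(name.casefold(), name)

    mismatches = []
    for ref in references:
        if ref in exact:
            continue  # correct case, nothing to report
        actual = by_fold.get(ref.casefold())
        if actual is not None:
            mismatches.append((ref, actual))
    return mismatches

on_disk = ["Utils.h", "main.c", "LIPSYNC.DAT"]
referenced = ["utils.h", "main.c", "lipsync.dat", "missing.h"]
mismatches = find_case_mismatches(referenced, on_disk)
print(mismatches)
```

References that don't exist at all (missing.h here) are deliberately ignored; the point is to catch only the ones that a case-insensitive filesystem would silently forgive.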

Happily, since Mac OS X shipped in 2001, Apple has been busy solving its own problems with case-insensitivity and encouraging its developers to test with case-sensitive filesystems. Two important initiatives in this direction have been NFS home directories and case-sensitive HFSX.

The upshot of it is that Mac OS X is actually very friendly to case-sensitive disks these days; very little that's bad happens when you use case-sensitive HFSX today.

Case-insensitivity adds complexity with no actual benefit

I'm going to make an assertion here:

ONE HUNDRED PERCENT of the path lookups happening on your Mac right now are made with correct case.

Think about that for a moment.

First off, you may think this contradicts the point I just made in the previous section. Nope; I'm simply rounding. The actual figure is something like 99.999%, and I'd probably get tired of typing 9's before I actually approached the real number. There are infinitesimally few path accesses made with incorrect case compared to the ones that are made with the proper case.

Modern computers make hundreds of filesystem accesses per second. As I type this single sentence in MarsEdit on Mac OS X 10.4.11, my computer has made 3692 filesystem accesses by path. (Yes, really. MarsEdit's "Preview" window is invoking Perl to run Markdown, which loads a handful of modules, and then WebKit re-renders the page. That's a lot of it, but meanwhile there's background activity from Mail, Activity Monitor, iChat, SystemUIServer, iCalAlarmScheduler, AirPort Base Station Agent, Radioshift, NetNewsWire, Twitterrific, and Safari.)

Under Mac OS X you can measure it yourself with this command in Terminal:

  sudo fs_usage -f filesys | grep / > /tmp/accesses.txt

The vast majority of file accesses are made with paths that were returned from the filesystem itself: some bit of code read the contents of a directory, and passed the results on to another bit of code, which eventually decided to access one of those files. So most of the time the filesystem is getting back the paths that it has returned earlier. Very very few accesses are made with paths that come directly from an error-prone human, which is why essentially 100% of filesystem accesses are made with correct case.

But if essentially all filesystem accesses are made with the correct case to begin with, why do we even have case-insensitivity at all?

We've already discussed the problems of contagion, which is a circular justification: we have to do it because someone else did it first. We've also discussed UI decisions being incorrectly implemented in the bottommost layer of the operating system. Other than those two, what good is it?

I don't have an answer to that. For the life of me I can't come up with any reason to justify case-insensitive filesystems from a pure design standpoint. That leads me to my closing argument, which is...

A thought experiment

Suppose case-insensitive filesystems had never been invented. You're the leader of a team of engineers in charge of XYZZYFS, the next big thing in filesystems. One day you tell the other people who work on it:

"Hey! I've got this great idea! It's called case-insensitivity. We'll take every path that comes into the filesystem and compare it against a huge table to create a case-folded version of the path which we'll use for comparisons and sorting. This will add a bunch of complexity to the code, slow down all path lookups, increase our RAM footprint, make it more difficult for users of our filesystem to handle paths, and create a compatibility nightmare for future versions if we ever decide to change the table. But, you see, it'll all be worth it, because... _________________."

Can you fill in the blank?

16 comments:

  • sblowes said...

    Dang, you're smart. I agree, case insensitivity issues should be handled on the App level, and the filesystem level should be as clean as possible.

  • Rosyna said...

    1. Case comparison is well-defined. There's absolutely no need to speculate about some alphabet. Nor is there any reason to use the classic German red herring... it doesn't actually apply in this case since filenames are not poetry.

    2. FAT32 and NTFS don't actually implement case-insensitivity. Not technically, at least. But... saying that something that should be implemented at the FS level is different depending on the FS implementation is odd.... The only real necessity is that the FS implementation agrees with itself.

    Also, it sounds like you're saying, "Because we can only cover 99.999% of all cases and not the last 00.001% of cases, we should just scrap the entire thing and not do it for the 99.999% of cases either." Sadly this reasoning is used way too often by developers.

    And Vista uses the case folding tables from Unicode 4.1.

    3. You want case-insensitivity at the API level? You got it in Windows. And just look at those super cool error dialogs that appear when the user is doing something they think they can do but can't because the API differs.

    And then you get weird-ass UI issues because one application handles the case compare differently than another application. Or worse, data loss bugs. Imagine if an application that did the operation in that screenshot never showed the user a dialog and happily overwrote the file multiple times?

    If you don't put case-insensitivity at the FS level, it becomes a UI problem.

    4. See parts of 3.

    4a. Some sort of collision detection still needs to be done on case-sensitive file systems at the FS level. Otherwise, it is a security issue.

    5. It's only an issue because all the original filesystem designers were lazy and didn't bother to consider that file names were strings, with some kind of meaning. They treated them as a bucket of bits. Evidence of mistakes in designs of the past should not be used to propagate design flaws to the future solely because of legacy. The mistakes become a contagion.

    6. Are they actually being made with paths or is the fs_usage or other API converting them to human readable paths for the user's benefit?

    Paths are evil.

  • Mac Phantom said...

    This comment has been removed by a blog administrator.

  • Drew Thaler said...

    1. Case-comparison is well-defined at any given moment in time. But then the committee meets and we all collectively change our minds.

    2. "The only real necessity is that the FS implementation agrees with itself." You're not thinking cross-platform. The real necessity is that every implementation agrees with every other implementation of the same filesystem. There are literally dozens of completely separate implementations of FAT floating around out there, and they all need to agree with each other. That's a harder problem, and the more complex you make the spec the harder that gets.

    Also, you seem to be arguing here that if Windows has a crappy UI for something, then it must not be possible to create a good UI for it. I disagree.

    4a and 5. Unicode problems that aren't related to case (ignorables, invalid sequences, precomposed vs decomposed issues, etc.) are all easily handled by a single Unicode normalization routine which is far simpler than case-folding. We've got that in Mac OS X: -[NSString fileSystemRepresentation] in userspace, and utf8_normalizestr in the kernel. The difference is that by convention, case must be preserved. All that other junk can be thrown away.

    6. They are actually being made with paths: open(2), stat(2), getattrlist(2), etc. fs_usage will print a number instead of a path if the access is made via an open file descriptor. (btw, I just noticed that with this simple test you may get some false positives from Spotlight writing to /dev/diskN, but for my trace above I looked through the file manually and didn't see any.)

  • Devin said...

    Thanks for the follow-up post, Drew. In my opinion, the most compelling argument you make is that against the ethnocentric assumption that just because "Readme and README" are the same in English (and I don't think anyone will ever convince me otherwise) that they're the same in other languages. Having the user's locale affect whether there are collisions in a filesystem is pretty much untenable since multiple users share a single filesystem and users may switch locales. (Although it seems to me that this makes dealing with the problem at the UI level pretty much impossible as well).

    I'm not so swayed by your technical arguments. It seems to me that you are assuming that the "users" of a filesystem are the programmers who write code that uses it -- the arguments you give are about reducing troubles for application programmers -- while most of us view the human end-user as the "user" of a filesystem. Given that you're a filesystem engineer, this makes sense, but I think it's a mistake.

    Now you might argue (in fact, you have) that the end-user doesn't, or shouldn't, "use" the filesystem, rather he or she uses an application which itself uses the filesystem and the application should concentrate on the needs of the user while the filesystem should concentrate on the needs of the application programmer. But I have to agree with Rosyna that this just doesn't seem to work very well in practice. My example: the Finder and hiding file extensions: no matter how hard the Finder works at making sure the user doesn't need to know about file extensions, something always leaks through and when it does, the UI has to either apply fixup heuristics or bug the user about it, or both.

    I guess my point is that the argument that case-insensitive file systems are an absolute pain in the ass for file system developers is mostly irrelevant (although very interesting). A convincing argument would have to show that case insensitivity with preservation is bad for end-users.

    Thanks again for the fascinating post.

  • Scott said...

    I have learned one thing over the years, and that is that solving problems early usually makes them go away, while leaving problems 'for the application/user/implementor' makes you live with them for years, and live with bad software written by people who did not fully think the issues through.

    Users want and expect case insensitivity. That they expect other things, like misspellings or spacing, to also map to the same thing does not invalidate their desire to find their damn files, and putting abc after XYZ is wrong for virtually every use case I can think of.

    It is a red herring to claim that 'it is the application that does the sorting'. It works that way because we, the designers, built it that way, and we did it because we used separate ASCII codes for a and A, rather than putting on a case bit.

    While there are dozens of FAT implementations, there are thousands of applications, most of which get this wrong. So, we are increasing the odds of the user having a good experience by enforcing case insensitivity prior to the application level.

    Whether it belongs in the file system, or it belongs in the APIs that the OS exposes is unclear. We have currently chosen either 'nowhere' or 'language/library API' rather than 'filesystem', and it is not working, so perhaps moving it to the FS would increase the odds that software would get it right, at additional cost.

    Scott

  • Drew Thaler said...

    @Devin: Technical problems affect users pretty directly, though. If it's difficult for engineers to do something, the cost is passed on to the user in multiple ways. The end result will be at least one, if not all, of these:

    * more expensive
    * have fewer features
    * require more RAM
    * run slower

    Technical problems also hinder innovation. Look at the explosion of filesystems written with FUSE in the past few years. In some ways FUSE doesn't really do much; it's fancy glue code. But it simplifies most of the hardest technical problems and has thus led to a lot of fancy new ideas.

  • Jacob Rus said...

    I'm as concerned with Unicode normalization as I am with case insensitivity, and I don't think you answered the question I asked in your last post.

    What should filesystems do w.r.t. different representations of the same Unicode characters? Should decomposed é be considered to be 2 different characters, allowing two different files called ~/México.text on the same machine?

    It seems to me that your argument extends just as well to Unicode normalization (and once we're doing normalization, the extra cost of doing case mapping is negligible), but the results would be disastrous, as they are in Linux filesystems.

  • Drew Thaler said...

    @Jacob Rus: Yeah, I didn't get into normalization earlier for the same reasons I didn't get into case. I can answer, with the caveat that I've done a lot more work with case than normalization, so I might be missing some subtleties.

    I think normalization is required for filesystems. You can't get away from it: paths are text, and Unicode text has structure and is not a bucket of bytes. The important difference between normalization and case is that you need to preserve case; you shouldn't need to preserve the original input bytes. As long as you translate them to an equivalent normal form it's just fine. So no, you shouldn't allow two files named México.text. When you properly normalize the strings you'll wind up with the same name.

    There are four normalization forms defined; each filesystem that uses Unicode MUST standardize on whatever form it wants. (Or, like HFS+, whatever almost-but-not-quite canonical variant that it wants. That's fine, though. It really doesn't matter as long as you do pick a standard.)

    If you're working with a filesystem spec that uses Unicode but hasn't picked a normalization form yet, it seems to me that in general (a) NFD is the "root" form from which the others can be derived in a lossless fashion, and (b) it's intended by the authors of the standard to be the most forward-compatible. So I'd get the spec for the filesystem changed to standardize on that as quickly as possible.

    The way OSX does it is pretty reasonable: the userspace functions like fileSystemRepresentation return NFD UTF8, and inside the kernel each filesystem renormalizes on its own to its desired format using a set of shared utf8 conversion routines. The people who made this the official kernel filesystem policy at Apple were the same ones who spent years working on the Unicode standard. When you're in doubt about what to do, I think it's usually safe to copy from the person who designed the system. ;-)

  • Mark Munz said...

    The argument that solution A isn't good enough is a weak one. If you're looking for converts, you need to provide that compelling argument that matches with end-users needs.

    Filenames are designed for end-users. Without the end-user, we could just associate a 64-bit number to a file and be done with it. But Uncle Joe and Aunt Tillie are crucial here, as is Business Bob. Do they see "Cup" and "cup" as different? I would argue no, because we do not use case to distinguish words. You will not find "Cup: (n) ..." and then "cup: (n) ..." in the dictionary as two separate entries.

    Case sensitivity seems geared more towards pleasing the purists and less about solving real world problems. Case sensitive filenames are really catering to a tiny percentage of users. Again, what great advantages are offered to me? That I can have "Readme.text" and "ReadMe.text" in the same folder? Oh the joy. What does that mean to me, the user?

    You laugh at Aunt Tillie, but humans have a more difficult time distinguishing between subtle differences like "ReadMe.txt" and "Readme.txt" because of how we naturally read. Yes, case insensitive filenames have some issues, but case sensitive filenames just trade one set of issues for another.

    As a bonus, if it were to actually take place, virtually every end-user app would be totally broken. Wow -- how's that for a benefit. As someone who has gone through the migrations of 68K -> PPC -> Intel -> 32-bit -> 64-bit, I say migrations are the true time sink, not case insensitivity. More expensive, fewer new features, all because I have yet another major migration to undergo. And what do you get for all your pain and trouble? IMHO, not much.

  • John said...

    Drew isn't arguing against case-insensitivity for file names. He's arguing against case-insensitivity being implemented in the file system, and I agree.

  • mikeash said...

    Nice followup. I appreciate the esteem in which you apparently held my old comment.

    To the posters claiming there will be some kind of apocalypse if we switch to case-sensitive filesystems, I ask: why? I'm pretty sure that 99.9% of apps will not care. As for the users, what exactly is the use case which exposes the case sensitivity of the underlying filesystem? Users open files by clicking them. They save files under either completely different names, or under identical names, and the latter is accomplished by telling the app "save over that file", not by painstakingly reproducing the old filename*. I buy the argument that Aunt Tillie expects "foo" and "Foo" to be the same, but under what circumstances would this conception actually be challenged?

    * This brings up an amusing story which is vaguely related. Back in the bad ancient days when everything was run off of floppy disks, I was nearing the end of King's Quest III on my IIGS. In these games it was important to save often, because death was easy and, worse, irrevocable game-losing actions could be taken but not discovered until later. Save game size was also significant. As I neared the end of the game, I moved into a particularly difficult phase in the mountains, and I hit save. In my excitement I accidentally named the save game file something like "in the mountaimns". It so happened that this save took up the very last bit of free space on the disk. It also happened that the GS's standard file save dialog chopped off the display of the name before the typo. Since I didn't want to save over any of my other games, and that dialog missed the nice OS X feature of being able to click on an existing file to get its name, I was stuck trying to exactly reproduce the bizarre typo I had made. As this was a particularly challenging area, I wanted to save after every few inches of progress, and each time I had to type out that exact, partially-hidden filename all the way through those treacherous mountaimns of King's Quest III. The lesson? Your users will be upset if you make them type the name of an existing file no matter what the semantics of the underlying filesystem.

  • Chuck said...

    My objection to case-sensitive filesystems comes from having used them. I have actually come across folders with a Profile.dat, profile.dat and a PROFILE.DAT. Your counterargument seems to be that this can be solved at the application level. But empirically speaking, it isn't. There are case-sensitive filesystems in existence and applications are not solving the problem. Meanwhile, with case-insensitive filesystems, the problem is solved in the vast majority of cases without any need for application developers to think about it. I find this real-world data quite persuasive.

    Your other main argument seems to be that even though it is more user-friendly, there are still cases where it won't be able to help and there are other things that might confuse users. I don't see how this is an argument against case-insensitivity. It's like somebody's proposed a medicine that can instantly cure the common cold 50% of the time, and we're saying, "But what about the other 50%? We can't allow a medicine that won't cure them!" Even if we miss somebody's character set with our case-insensitivity code, they're not worse off than with a case-sensitive filesystem.

    As for taking up more RAM and CPU time -- that's fine by me. RAM and CPU time are getting cheaper every day. As a user, I say: If you can make my life better by using a little more of my RAM and CPU, do it. In fact, many features of ZFS embrace this philosophy.

  • Drew Thaler said...

    @mark munz: Filesystems are not designed for end users. You're confusing the interface to your files (the Finder, etc) with the filesystem itself. In fact, most low-level components of your operating system do not take users into account. The preemptive thread scheduler is designed to make efficient use of the CPU, not user-friendliness. Hardware drivers are designed to maximize throughput to attached devices. Similarly, filesystems are designed for fast and efficient data retrieval. It's not until you get up to the application/UI level that things are designed directly for end users.

    @chuck: Your PROFILE.DAT example is an argument from contagion, which I've already rejected. And I'm not arguing that it's more user-friendly; I'm saying that doing it in the filesystem has zero effect on user-friendliness. If anything, by introducing ambiguity and cross-platform issues it actually has a negative effect on user-friendliness.

    I'm fine with using RAM and CPU time for something useful like data integrity in ZFS; what I find hard to swallow is using more RAM and CPU time for no visible benefit.

  • Martin Kunev said...

    I totally agree with the author on every point.

  • samantha said...

    Excellent. Just what I was looking for to give the insensitive FS fanboys. :)