Friday, May 13, 2005

A brief history of purity at Apple  

I was reading Ben Goodger's take on the recent Safari-KHTML kerfuffle. He's basically right. What I find particularly amusing is that ten years ago Apple would have been on the opposite side of the debate... with Apple fighting for purity and the open-source guys just putting in a quick hack to make it work.

It took Apple quite a while to get over its obsession with purity in software design. Don't get me wrong; purity isn't a bad thing at all. In fact it's an important goal to strive for. But sometimes you have to make compromises to ship a useful product.

Ten years ago, more or less, Mac programmers like me were trying to deal with lovely gems that came out of Apple like these:

  • Apple Events -- in theory a great concept which used an incredibly flexible and extensible tagged-data interface. Problem was that its API and implementation were so annoyingly pure that they required allocation of a lot of temporary objects and a lot of memory got copied around needlessly. In the end it was much slower than any contemporary RPC/IPC call you could imagine.

    (Today: Apple Events have been kept alive, although with the implementation optimized and the API expanded to be faster and less pure. Several even-less-elegant but faster-and-easier methods of RPC/IPC became available with OS X and were quickly adopted by many grateful programmers.)

  • AppleScript -- in theory another great concept, designed around an abstracted open scripting architecture which could support multiple scripting dialects, and which allowed those flexible and extensible bits of tagged data from the Apple Event manager to be passed around. But only one dialect was ever completed, and the architecture wound up making it painful to implement from the developer's point of view. Heck, even the one-and-only dialect was fairly clunky from the user's point of view. This might make you wonder whom exactly it was designed for.

    (Today: AppleScript has stayed alive too, although the way developers and users actually use it in practice frequently disobeys the pure object-verb syntax of the original design. And what AppleScript support is out there generally comes for free from AppKit these days. Do you know anyone who's actually written code to properly parse a 'whose' clause lately?)

  • AOCE, or Apple Open Collaboration Environment -- in theory a great concept, trying to create a system that integrated mail, address books, digital signatures, networking, and more before all of these things were widespread. Its problem was that rather than attacking each problem individually, it tried to do them all at once. And it had a nicely object-oriented design, but rather than allowing direct access to any of the dozens of classes or hundreds of members, it created accessors for every single operation. Ultimately it collapsed under its own weight. It was monolithic, huge, and incredibly difficult to understand. One of the Inside Macintosh: AOCE books was almost 1400 pages long -- big, 8.5" x 11" pages too. My phone book was smaller. And that book was only one of at least three...

    (Today: Deceased, just like all those trees.)

  • Open Transport -- a networking stack and API that had a very theoretically pure design and was supposed to, in theory and if you used it right, deliver rockin' performance. It did deliver better performance than the older MacTCP, but it was generally much too obnoxious for most developers to use directly. The most popular way to use it was via a wrapper library such as GUSI that made it more socket-like.

    (Today: Deceased. On OS X, Apple re-implemented the APIs with glue that calls through to Unix sockets.)

  • Newton -- another very pure concept. A tablet that you can just write on, and it will just recognize your handwriting! Cool! Electronic book! Electronic paper! The problem is that the technology and/or design clearly wasn't quite up to the task. It was big, clunky, expensive, and slow. The handwriting recognition technology was good in theory and notoriously imperfect in practice, but Apple stuck with it anyway.

    (Today: Deceased. A few years later, when Palm did basically the same product, they put in a hack called Graffiti so that they didn't have to solve the problem of generalized handwriting recognition.)

  • OpenDoc -- a very nice idea that was (and probably still is) ahead of its time. Rather than documents being things created by applications, they were just collections of objects which were essentially peers and had defined relationships to each other. If anything, OpenDoc was less a victim of its design and more a victim of its circumstances: it was C++ and required SOM, which was burdensome in terms of both performance and licensing, and it had to run on a rather lackluster and nonstandard OS which limited its portability.

    (Today: Deceased. But interestingly, Apple is just now starting to regain OpenDoc-like functionality with things like KVC, bindings, and CoreData. Thirteen years later.)

I think you can see what I'm getting at here. Apple being chastised for putting in quick hacks to make things work, rather than going for purity? By a bunch of Linux coders? Ah, sweet sweet irony.

Branching and Integration

I definitely feel the pain of the KHTML guys. But in the end what happened was inevitable.

See, one way or another Apple wound up on a branch. I don't know whether Apple deliberately chose to make incompatible changes, or whether KHTML refused to accept some of their changes and forced them out onto a branch. But regardless of your interpretation, that's the way it ended up. And that was the first step towards KHTML's problems.

There should be (but probably isn't) a lesson in Software Engineering 101 about what happens when you have work proceeding on two different branches of a source tree. Integration -- the unpopular gruntwork everyone loves to hate -- starts to rear its ugly head. As long as the teams devote roughly the same amount of time to developing their branches, the branches grow in parallel and each spends a roughly equal amount of time integrating changes back and forth.

But what happens when one of those branches suddenly has a great deal of work invested in it, and the other doesn't? The team maintaining the less-vigorous branch starts spending more and more of their time on integration and less on development. Integration sucks; it's a necessary evil, but nobody likes doing it. Quickly it becomes less like fun and more like work. So the less-vigorous branch is in danger of withering even further.
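
To put rough numbers on that, here's a toy model in Python. Everything in it is invented for illustration: the team sizes, the change counts, and the assumption that merging one change from the other branch costs a fixed quarter of an hour. It isn't measured from Apple, KHTML, or any real project; it's just a sketch of how lopsided the integration burden gets when one branch moves much faster than the other.

```python
def integration_share(own_hours, incoming_changes, hours_per_change):
    """Fraction of a team's available time spent merging the other
    branch's changes (capped at 1.0, i.e. no time left for development)."""
    needed = incoming_changes * hours_per_change
    return min(1.0, needed / own_hours)


# Hypothetical figures per week; none of these come from a real project.
HOURS_PER_CHANGE = 0.25  # assumed cost to integrate one foreign change

full_time_team = {"hours": 400.0, "changes": 100}  # the more-vigorous branch
part_time_team = {"hours": 40.0, "changes": 10}    # the less-vigorous branch

# Each team has to absorb the changes produced by the *other* team.
share_full = integration_share(
    full_time_team["hours"], part_time_team["changes"], HOURS_PER_CHANGE
)
share_part = integration_share(
    part_time_team["hours"], full_time_team["changes"], HOURS_PER_CHANGE
)

# For comparison: two equally sized part-time teams integrating each other.
share_balanced = integration_share(
    part_time_team["hours"], part_time_team["changes"], HOURS_PER_CHANGE
)

print(f"full-time team spends {share_full:.1%} of its time integrating")
print(f"part-time team spends {share_part:.1%} of its time integrating")
print(f"balanced teams spend  {share_balanced:.1%} each")
```

With these made-up numbers the big team barely notices the integration work (well under 1% of its time), two evenly matched teams each lose about 6%, and the small team facing the big one loses more than 60%. And the model is generous: it keeps the cost per change fixed, so it doesn't account for merges getting harder as the branches drift apart.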

It's made exponentially worse if the less-vigorous branch ever refuses some of the changes in the more-vigorous branch, because that causes the source bases to diverge even further. Now not only is there more integration work, but it's harder too. If the team is made up of part-time volunteers it can kill their enthusiasm for the project completely.

In a nutshell, that's pretty much what is happening (or has happened) with Apple's full-time engineering team and KHTML's part-time engineering team. Apple exacerbated the problem with what some are calling "code bombs", i.e., releases of an entire tree at once, but you're fooling yourself if you think the problem would not exist otherwise. The problem is really in the quantity of changes being made. Is it Apple's obligation to go back and do all the work to make their changes work on Konqueror? Not at all.

It kinda sucks for the KHTML guys, but the solution is clear even if they don't want to admit it: transition the upper levels from the less-vigorous branch over to the more-vigorous branch. Rather than always integrating lots of changes from B back into A, start using B and port a few changes from A forward into B. It sounds like this is what Maciej suggested.

Happily, it sounds like they are pursuing something like this even as we speak.