Recording Artist

Wednesday, April 20, 2011

Optimization Goals

In my day job, I've worked with many different teams of programmers. Game engine, game tools, OS, SDK, middleware, test engineers, you name it. Many of the people I work with are the best in the industry, true world-class programmers. And others, well, aren't... and that's okay too. :-)

Personally, I find that I've learned a ton from interacting with all of these different groups.

One of the most important lessons I've learned — and I'm frankly surprised this isn't more widely talked about — is that there are many different and equally valid possible optimization goals in programming.

When I talk to a new team, that's often the first thing I try to find out: what are your primary goals? What are you looking for?

The most common definition of optimization involves improving the runtime speed of the code. For example, game engine programmers are often very focused on this, because they have to cram as much work as they can into one frame — generally either 1/30th of a second (33.3ms) or 1/60th of a second (16.6ms). Naturally, this leads to a lot of counting cycles and downcoding inner loops into assembly.

However, game tool programmers tend to be a little more willing to sacrifice speed in favor of optimizing for other goals: development time, correctness, and ease-of-use. For example, it's often OK to use an unoptimized implementation of some algorithm in a tool, since a high-end PC will run it "fast enough". In other cases a tool might only run infrequently or overnight. Or it might be feasible to simply throw more hardware at the problem.

You also see it among different teams: a studio making a AAA game might be very focused on optimizing runtime speed, while a small indie team might be focused on optimizing for development cost — doing the work as cheaply as possible. These teams will have very different optimization goals, to the point that what the AAA game has downcoded to assembly might be C++ objects and STL in the other.

Then there are the sports games — which live in a fascinating alternate world all their own. These guys have to iterate and ship a new version once a year, every single year! Slipping is simply not an option, so they tend to optimize first for reliability, upgradeability, and ease-of-maintenance.

This got me thinking: what are all the possible optimization goals you might have?

Possible Optimization Goals

Here are some things that you may choose to optimize for in your development. It's impossible to optimize for all of these things simultaneously, but it's common to target more than one.

runtime speed - Run as fast as possible on your target system. Most of the optimization literature focuses on this. It involves choosing algorithms carefully, choosing your data layout carefully, looking for early exit opportunities, hand-optimizing inner loops, and so on.
development time - Complete the engineering work as quickly as possible. In other words, finish the job in a day or a week, rather than a month. Optimizing for development time may mean that you use the highest-level language available that will do the job, along with whatever third-party frameworks you can pull in to solve the problems you need solved. C#, Python, Ruby, etc.
development cost - Complete the engineering work as cheaply as possible. This is closely related to development time, but slightly different! The primary difference is the choice of who is doing the work: it might be cheaper (and a better investment) to have an intern use a task as a learning project and work on it for three months, than to have a senior programmer crank it out in three weeks.
correctness - Be absolutely sure there are no mistakes or bugs. Bug-free code is always a goal, of course, but sometimes it's even more important than usual. There are two common situations I can think of where correctness is your primary goal: (1) when creating a reference implementation of an algorithm, for possible later optimization. (2) when you're making important architectural decisions based on the results of some investigation or other; you'll generally double- and triple-check the investigation, if you're smart.
ease-of-use - Make sure users can grasp your interface quickly and achieve what they need to do. Funnily enough, this applies to both applications (user interfaces) and to middleware (developer interfaces). In both cases, you need to understand how people will expect to use the product. The fastest or easiest way to do things might not be the most understandable.
ease-of-maintenance - Make the code so straightforward a monkey could maintain it. Or you, in a few years. ;-) If you've ever had to go back and bug-fix some code that you (or god help you, someone else) wrote more than five years ago, you'll appreciate the value of this. Code that is well-written, easy to read, well-commented, and shaped well to fit the problem is code that is optimized for ease of maintenance. To some extent, this goal can be a cause of "not-invented-here" syndrome; the preference for in-house code is often born from a perfectly rational and reasonable desire to know how to maintain that code.
reliability - Never fail. Keep on going in the face of problems. At runtime, if the program gets unexpected input, don't crash: flag it, log it, and continue on with a sensible fallback behavior. At development time, always have a reference implementation to fall back to if the optimized version isn't working. In the extreme, reliability might require a reduction (hopefully small) in average-case performance just to ensure that the code can reliably handle uncommon edge cases. (LibC's memset and memcpy are examples of that.)
upgradeability - Make sure every subsystem can be upgraded. This is most commonly achieved with modularity. Good module separation should mean that individual pieces can be swapped out without too much trouble. If you remove modularity in order to run faster with today's situation, then you may be unable to upgrade to handle tomorrow's situation without a full rewrite.
iteration time - Minimize the time difference between when you make a change and when you are able to view and test the result. This covers a lot of ground and may include things like build time reduction, file loading, dynamic loading of updated assets, console boot time, and more. This is surprisingly important: the people who write the best software will almost always tell you that rapid iteration is king.

Did I miss any? What are you optimizing for?

Tuesday, March 16, 2010

Variants of CompareAndSwap

I ran across an interesting issue the other day while trying to make a cross-platform atomics header work on Mac OS X.

Compare-and-Swap (hereafter, CAS) is one of the fundamental atomic primitives: given a working CAS you can implement any other atomic operation.

But there are in fact two common variants of CAS:

    // Returns the old value from *ptr
    int ValueCAS(volatile int *ptr, int compare, int swap);

    // Returns true if a swap occurred
    bool BoolCAS(volatile int *ptr, int compare, int swap);

That's OK, though, because it's theoretically possible to convert either one into the other.

It's trivial to implement BoolCAS in terms of ValueCAS. Here's how you do it:

    // GOOD: this works correctly.
    bool BoolCAS(volatile int *ptr, int compare, int swap) {
        return compare == ValueCAS(ptr,compare,swap);
    }

But for what I was doing, I needed to go the other way around. Mac OS X provides BoolCAS functions in <libkern/OSAtomic.h>, and I needed to convert those to ValueCAS functions.

It turns out to be subtly difficult to go the other way around, and I thought I'd share the reasons why.

First attempt

My first attempt was fairly simple, using just an extra load and branch:

    // BAD: do not use! Broken!
    int ValueCAS(volatile int *ptr, int compare, int swap) {
        if (BoolCAS(ptr,compare,swap))
            return compare;
        else
            return *ptr;
    }

The idea is that if the CAS worked, then the old value of *ptr must have been the same as compare, so we can just return that. That's absolutely correct.

However, if the CAS failed, what was the previous value? I was naïvely re-reading *ptr. But my unit tests (oh yes, this is the kind of thing you unit-test thoroughly) showed that very rarely — once per several million iterations — it would fail to do the right thing.

What was going wrong?

The problem comes down to how callers detect success and failure from ValueCAS. If ValueCAS returns compare, the CAS is assumed to have succeeded. If ValueCAS returns any other value, the CAS is assumed to have failed.

Once we recognize this, we can see that the naïve implementation above has not one, but two failure cases:

Spurious CAS failure. This can happen on nearly any system, for a variety of reasons. I'd rather not get too deep into explanations of CPU behavior here, but the typical example might be if an interrupt occurred at exactly the wrong moment. Even if *ptr == compare when the CAS runs, a spurious failure might exit the CAS without changing the value of *ptr. This would cause ValueCAS to incorrectly report success!
Race between CAS and reload. If more than one thread is atomically modifying *ptr, then there's a window between the CAS and the reload where *ptr could be changed to anything at all — including getting set back to compare. Again, this would cause ValueCAS to incorrectly report success!

Another approach

One tempting way to fix it is to simply lie about the old value of *ptr. I didn't actually try this, but I'll admit I considered it for a moment:

    // BAD: do not use! Semantically incorrect!
    int ValueCAS(volatile int *ptr, int compare, int swap) {
        if (BoolCAS(ptr,compare,swap))
            return compare;
        else
            return (compare-1); // or any value other than compare
    }

But this is wrong! It breaks the semantics of the CAS. CAS is supposed to return the old value. But the value this function returns is totally bogus, and may never have existed at *ptr. To help imagine why this is bad, consider what might happen if the caller is using the value as a bitmask. We might wind up returning an illegal mask with bits that had never been set. This could have unintended, ugly, subtle side-effects.

Third time's the charm

As far as I can tell, the best way to fix it is by adding both an extra load and two branches:

    // GOOD: works correctly
    int ValueCAS(volatile int *ptr, int compare, int swap) {
        int old;
        do {
            if (BoolCAS(ptr,compare,swap))
                return compare;
            old = *ptr;
        } while (old == compare); // never return compare if CAS failed
        return old;
    }

And sure enough, it's a little heavy-handed, but it works. Proper branch hinting will help, but I'm still a bit surprised that it's so expensive (relatively speaking) to do this conversion.

Can anyone come up with a better way?

Saturday, December 19, 2009

A Christmas Present to Myself: My New Office

I've moved my home office from one room of the basement to another.

I know, how boring, but this has given me a much-appreciated chance to completely rip out everything and get rid of the mess and abandoned stuff that I don't really use any more. (Heck, back when I moved into my former office, I was still working for Apple.)

As part of the move, I'm making an effort to make everything as wireless and high-speed as I can.

Upgrades include:

Tons of electrical outlets. Spent several days with an electrician to get this all done. 3 drops, each three feet apart, and each drop has 4 outlets above the desk and 4 outlets below the desk. Plus we put a bonus 4-outlet drop on the side. Grand total: 28. Yes, twenty-eight! I'm incredibly happy with the result... you have no idea how much I hate power strips. :-)
Clean wired network. I still had some older Cat-5 cables which kept me from getting maximum throughput. Now all hubs are 1Gb Ethernet and cables are Cat-6 or Cat-5e. This should scale up to 10Gb Ethernet when the time comes.
Wireless speakers. Specifically, Harman-Kardon SoundSticks II attached to an AirPort Express, driven with iTunes and Airfoil. Yes, I poked fun at AirTunes when it first came out, but it's become surprisingly useful. I do virtually all of my work on a laptop these days, and when I'm in my office I want good sound without having a cable plugged in.
AirPort Base Station with simultaneous dual-band. I'm getting a little tired of upgrading base stations, honestly. But since I have a ton of devices on my wireless network, and use it to access other machines on my local net, having 802.11n actually run at 802.11n speed is worth it.
WiEx zBoost YX510-PCS-CEL cellular booster. Properly installed with an antenna on the roof, this has given me 5 bars — in the basement, with AT&T. Infinity percent better than what I could get before! Love it!
Hawking HBB1 broadband booster. OK, this provides almost no feedback so I'm not actually sure how I'm supposed to measure its effectiveness. But it's cheap, dead simple to set up, and the network feels more responsive with it enabled. With so many wireless devices and no QoS in Apple's AirPort, I figured that having some traffic shaping was better than none.
Wall-mounted shelf to hold all the networking gear (DSL modem, base station, cell booster, broadband booster). I prefer to keep things off the desk since (ideally) there should be no need to touch this stuff once it's running. Individual wall mounts are out since the walls are concrete. A simple black shelf from Home Depot did the trick nicely — now everything is elevated and out of the way.

I'm also asking Santa for these extras:

Atomic wall clock, because I enjoy having an analog clock around. And once you've had a clock that automatically sets itself from the WWVB atomic time broadcast, you can't go back.
Indoor/outdoor thermometer. My new desk is right next to a window, so this is easy. I want one with a wired probe, not wireless, because I prefer it to be reliable and low-maintenance.

(FYI, some of the above are affiliate links. I only ever do that for products that I use myself and that I'm very happy with.)

Removed, and recycled or craigslisted:

Several 40' runs of cable that are no longer needed, since my desk is now much closer to where the services come into the house.
Three power strips and two extension cords. Yay!
Quite a few mystery DC power adapters. Often something like a FireWire hub or USB DVD burner that I used to use and now don't, or a charger for something. I'm not a big gadget freak, but after five years in one place these definitely accumulate. (Seriously, wireless power can't get here fast enough.)
Last year's AirPort 802.11n base station. Sigh.
An old 32-bit Gateway PC that doesn't meet my needs any more.
PowerMac G5 that I no longer use.
2 Cinema Displays with ADC connections. These were really a great investment and they lasted an incredibly long time. But sorry ADC, you're dead, and converting you to DVI is an expensive pain in the ass.
Apple DVI to ADC adapter with broken plastic. Huge by modern standards - about the size of a Mac mini.
Ancient software and unread books that had piled up on my bookshelf. Yeah, I'm pretty sure that copy of DiskWarrior from 2002 isn't going to do me any good any more.
My home server. I used to maintain a web and file server at home. (It was a PowerMac G4, nearly ten years old now.) I didn't derive a huge benefit from it, but most of the time the cost was minimal too — just a constant-but-small amount of maintenance. But over time, I found that periodically it created an enormous amount of trouble and expense when a hard drive failed or I needed to upgrade something. I've realized that there's just no value in my messing around with that stuff any more, so I've switched to offsite hosting and online backup.

I've also ordered some new iMacs for work - to use both as a local distributed build cluster, and for local testing. I'm going to put Windows 7 x64 on one, and the other will have OSX 10.6 running a couple of Linux VMs.

One of the nice side benefits is that moving has really given me a chance to clean up the cabling. Every single cable I've installed so far is neat and tidy, either cut to length or neatly bundled with a cable tie — so everything is tangle-free. I'm sure this will only last right up until I need to reconfigure everything in a hurry for some new task, but it's pretty sweet for now. :-)

Done anything nice for your office lately?

Thursday, December 03, 2009

The Economics of School Supplies

I'm interested in economics for one particular reason: incentives are fascinating.

There's the psychology aspect: how can you influence people's behavior? To really do this well you need to understand how people think.

There's the engineering aspect: human society is an enormously complex system, even more complex than the systems we work with in technology today. We can do things to nudge a complex system in one direction or another, but we can't always predict what will happen to it.

And then there are the results: you provide incentives in order to get people to do what you want. And that's a powerful thing! I don't mean this in a cold-blooded way; everybody does this every day of their lives.

Incentives in the Classroom

My wife is a high school teacher, and we had an interesting talk the other day. It was about how she handles providing a classroom supply of extra paper, extra pens, and all that for kids to use when they forget their own.

In years past, she had a system for this. She kept a supply closet, and early in September would give the kids extra credit for bringing in up to two items for the supply closet. Since she did this at the beginning of the year, kids quickly donated all kinds of stuff to get the easy points, and the closet filled up. Before the year had even started she had a year's worth of supplies.

This year, for the first time, she consciously tried not to do that, because she reasoned that it's not really an academic thing — so it wasn't a good idea to give out points for it. Makes sense, right? And besides, the school provides a certain amount of supplies for the teachers which she might be able to use.

But I'm sure you can guess what happened. With almost 200 kids in and out of her classroom each day, all it takes is a few percent each day to need paper and it'll add up fast. So she very quickly ran through her school-provided allotment, and by November, she was bringing in supplies from home to make up the difference.

And thus the dilemma: should she offer extra credit again? Or stick to her principles and not give out points?

Three approaches

As we talked, we came up with basically three approaches to the problem.

Do nothing. This is what she had been doing, and she wasn't really happy with it. The kids could borrow from each other — and some did. Or she could provide her own supplies — which she had started to do. Or kids would just fail the day's assignment when they didn't have the supplies for it... but that's very hard on a public school teacher who wants every kid to succeed.
Positive incentives. This is the extra-credit approach. She had mixed feelings about this at first, until I pointed out that the extra credit was just a tiny fraction of their grade — something like 1% of a single quarter. That isn't going to turn a failing student into a passing one; it might turn a B into a B+, and so on, but that's all. And then she raised the excellent point that roughly 20% (!!) of their grade is class participation anyway. Providing supplies really helps the community of the classroom, just like class participation does, so that's a fine place to apply the extra credit.
Negative incentives. I suggested that another approach would be to penalize the students for coming to class unprepared. Charge kids 1 point (or even just 1/10th of a point) each day that they borrow paper or a pen. The size of the incentive doesn't matter, it just matters that it's negative.

Now, different people will have different feelings about each of these. And you can do different variations on them — charging a penny instead of a point for paper, etc. But those are the basic choices.

Creating an Atmosphere

In practice, both the positive and negative incentives should work. But I pointed out that the main difference between these two is not the first-order results of the incentive ... the main difference is how she will be perceived after applying each one.

Why? Well, even as you incentivize the students with one of these actions, they in turn give you an incentive back with how they react.

If you're the teacher who is chintzy about paper and deducts fractions of points for every little thing, you'll be a Scrooge and the students won't like you. Negative reinforcement yields negative emotions.

But on the other hand, what if you're the teacher who provides extra-credit and then free paper? Why, then you're the wonderfully generous and nice teacher whom they love. Positive reinforcement yields positive emotions.

It's almost Machiavellian: do you want to be feared or loved?

Knowing my wife as I do, in her classroom she works best when she's loved. And so I told her to go back to what she had been doing — give out extra credit to fill up the supply closet. It's funny that after actually dissecting the problem and analyzing it, we came up with the same solution that she'd done more or less instinctively in the first place.