
August 14, 2006

The Build: You Break It, You Bought It?

Microsoft has gotten more progressive in a lot of ways. We now allow part-time work, we support telecommuting, we don't evaluate testers by the number of bugs they file. But there is one area where we still seem stuck in the 1980s. This is the notion that if a developer breaks the build, they need to be punished for it.

I say "punished" because we're not talking about getting a bad review or receiving a public dressing down by their boss. Those are options for other mistakes. But for breaking a build, and only for breaking a build, there is a feeling that people need to suffer real physical or emotional pain. Typically this means being woken up in the middle of the night, or being forced to do some demeaning job for a while. The nighttime wakeups are to fix the build, not to run laps around the dorm; and the demeaning job is babysitting future builds, not scrubbing latrines; but still it's a weird exception to the normally adult way that employees are treated.

People make all kinds of mistakes at Microsoft: we check in buggy code, we don't review documents properly, we do a sloppy job during an interview, we show up for meetings unprepared. These are all things that hurt both us and our coworkers, but they generally get excused with the notion that our jobs are hard and everybody makes mistakes sometimes. But if you break the build, it's like you're one step away from being a child pornographer.

The typical response to why it's OK to call people at 2 am when the build breaks is that this will encourage people not to break it again (by the way, if you're reading this and you don't know what a build is or what it means to break it, I apologize, but I don't feel like explaining it right now). There are a few problems with this logic: 1) Given that this is not a virus attack or a website down, there's no reason to call someone at 2 am about ANYTHING; 2) You can fix the build break the next day at 10 am, and if your product is due to ship in 3 months, it can probably handle shipping in 3 months and 8 hours; 3) It's often hard to tell who is responsible for a break, so it's silly to try to single out one person for all the blame; 4) Did I mention that it's ridiculous to call someone at 2 am over a build break; and 5) Build breaks sometimes happen due to carelessness, but also due to JANFU-type conflicting checkins, which can be hard to prevent.

Plus, the fact that someone is awake to originate the phone call at 2 am brings up some other questions, like why our builds need human guidance, and if so why they are being done in the middle of the night. If breaking the build is so bad that we want to brand people with a scarlet B for doing it, shouldn't we be taking steps so that if it actually happens, it's easier to fix?

This is not a new complaint for me; I wrote about it in Proudly Serving My Corporate Masters 6 years ago. On p. 328, for those following along at home, I told a story about someone being woken up at 12:30 am to fix a build break, which turned out to be in code he hadn't written. At the time this annoyed me enough to send email to NT management about it. I wrote, "To a man they all replied echoing the company line: if you broke the build you would get called, no matter when. You could feel the smugness oozing out of the email--'We showed 'em what happens when you break the build.'"

These kind of foolishments still exist, even among people who otherwise seem normal. It's like finding out that the nice neighbor who volunteers for the PTSA also practices cannibalism. I suppose one day this anachronistic attitude will die out, but until then it will linger like an unexpected time warp into Microsoft's past.

Posted by AdamBa at August 14, 2006 08:17 PM

Comments

For many groups it's long gone -- you don't submit changes directly, and the system that submits to SD on your behalf first builds the product and runs smoke to verify it isn't broken.

Posted by: Ziv Caspi at August 14, 2006 11:49 PM
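The gated check-in flow described in the comment above can be sketched roughly as follows. This is only an illustration, not the actual system any Microsoft group uses: the names `Change`, `GateResult`, and `gated_submit` are hypothetical, the source depot is modeled as a plain list, and the build and smoke-test steps are stand-in callables.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Change:
    """A pending change (hypothetical model, not a real source-control type)."""
    author: str
    description: str


@dataclass
class GateResult:
    submitted: bool
    reason: str


def gated_submit(
    change: Change,
    build: Callable[[Change], bool],
    smoke_test: Callable[[Change], bool],
    depot: List[Change],
) -> GateResult:
    """Submit `change` to `depot` only if it builds and passes smoke tests.

    The developer never writes to the depot directly; a rejected change
    simply never lands, so the shared build is not affected by it.
    """
    if not build(change):
        return GateResult(False, "build failed")
    if not smoke_test(change):
        return GateResult(False, "smoke test failed")
    depot.append(change)
    return GateResult(True, "submitted")


if __name__ == "__main__":
    depot: List[Change] = []
    good = Change("alice", "fix typo in a comment")
    bad = Change("bob", "delete a header everyone includes")

    # Stand-in checks: pretend only `bad` fails to compile.
    builds_ok = lambda c: c is not bad
    smoke_ok = lambda c: True

    print(gated_submit(good, builds_ok, smoke_ok, depot))
    print(gated_submit(bad, builds_ok, smoke_ok, depot))
    print(len(depot))
```

The point of the design is that a breaking change is turned away at submission time, during the submitter's own working hours, so nobody needs to be phoned afterwards.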

LOL, so you broke the build then eh?

Nice article though :-)

Posted by: at August 15, 2006 02:50 AM

Good point. In our group we have a couple of dedicated build devs and we fix the build during daylight hours. We don't even make the build-breaker wear the I Broke The Build t-shirt anymore:-)

Seriously, on our project and many others the build is now so complex it is a specialist job to maintain it. Our build not only compiles code but also creates CHM docs, runs BVTs, code coverage, FXCop and deploys the build to a test server and runs a whole lot of other custom tasks. If we did call in the dev during the night they might be able to fix it, but the odds are they wouldn't, and they'd still be trying to figure out how to log on to the build server by the time the build manager shows up for work. It's just quicker and more efficient for the build manager to find the fault and if necessary raise a work item for it to be fixed - in the meantime either the breaking change is backed out or if it's an obvious error it's fixed there and then.

Posted by: SteveH at August 15, 2006 07:43 AM

Did you hear about the "that's the way it has always been done here" meme? Yes, maybe one day that'll change and a new, more positive meme, will replace the useless one...

Posted by: DHR at August 15, 2006 09:21 AM

Steve/Ziv, yes that sounds more like it. The checkin/build should be automated such that you really can't "break" it. I'm not crazy about having builders work at night but I guess if they are really needed and it is a job where they can plan for it (and then have the day off) that may be the best we can do.

No, I didn't just break a build...in EEG we don't have a build to break.

- adam

Posted by: at August 15, 2006 02:01 PM

"Foolishments". Nice neologism, and one that I will start using regularly. :)

Posted by: Mat Hall at August 16, 2006 03:57 AM

A Windows build takes over 12 hours to complete. If it breaks, testers don't get a fresh build to test the next day. A lot of fixes on which 100s of other devs could have been waiting get delayed by another day. While I was still there, we had an especially bad phase where the build lab just couldn't get the darned thing to build for several days at a stretch. It was extremely frustrating for everybody. Especially if explorer was crashing in the last successful build. So in such situations it does make sense to be really stringent about build breaks. The fear of that middle of the night call kept all of us devs honest :)

Posted by: Gaurav at August 16, 2006 09:23 AM

I never understood this--getting a build done is tricky, so we do it in the middle of the night? Seems like the worst time to do it if you want people around to fix it.

If there is a serious problem with the build and you can't get one out every day, then figure out what is wrong and fix it. Don't think that the threat of a 2 am phone call will magically fix whatever is wrong.

- adam

Posted by: at August 16, 2006 04:00 PM

Oh yes and as for "foolishments"...like most of my best material, the word comes from "Pogo" (the actual quote I think is "this kind foolishments"). But I notice that the word does appear scattered around the Web (53 times per Google, 17 per MSN Search). Some of these appear to be typos, but some of them seem to be genuine attempts to capture Southern slang (which is what Walt Kelly was doing).

- adam

Posted by: Adam Barr at August 16, 2006 09:01 PM

I've not worked in the kind of environment you describe; the night call sounds pretty insane.

A couple of days ago the build was broken on a project on which I'm working. The coding team is about 8 people fairly widely spread around the world. Chances are if someone called you in the night where they are then it wouldn't be nighttime for you.

But most of our comms are email and irc - that applies to build breakage too. As the project lead pointed out, a broken build can prevent work progressing, a few hours of that is a lost person-day, and we've got very tight schedules.

But the thing is everyone does their utmost not to break the build, not for fear of a night call but because it's adversely affecting the work. There's also the desire not to piss off one's colleagues. Taken together these are much more powerful motivation than any punishment. If there were night calls or other recriminations, a lot of that motivation would be undermined.

So yeah, they're a bad idea ;-)

Posted by: Danny at August 17, 2006 12:32 AM

I'd like to know why the build takes 12 hours.

Are you making a clean build on a single machine?

Posted by: Ivan at August 17, 2006 04:55 PM

A full build of the entire Windows source code takes 12 hours or more on a very high end multi-proc machine. At least it used to take that long in 2004. The Windows build is extremely complicated and does way more than simply compiling the code. I think that is a prime reason why the Windows org is especially nasty in its treatment of build breaks.

Posted by: Gaurav at August 18, 2006 11:10 AM