May 02, 2005

Microsoft and Its Source Code

Three years ago, during my time away from Microsoft, I sent email to Steve Ballmer suggesting that Microsoft should release all of its source code.

To his credit, and despite the fact that it was from some random outsider, Steve did reply to the email a couple of weeks later. His response, in total, was: "Good thoughtful input  will consider  thanks"

The only point I missed was the possibility that people could search the source code to find ways to sue Microsoft for patent infringement.

This is the email:

I am writing to you to discuss the issue of Microsoft and its source code.

I'm a former Microsoft employee who spent ten years working as a developer, most of it on the kernel of Windows NT and its successors. I left the company in April 2000, and have not worked for anyone else since then. I still consider myself to be fundamentally pro-Microsoft, and I own a bundle of the stock.

The goal of this email is to try to convince you that Microsoft should publicly release all the source code to its products.

I am not suggesting that Microsoft move to an "open source" model as the term is currently used. However, Microsoft's current "shared source" program, under which source code is released to some institutions and large companies, with restrictions on its use, is inadequate. In particular, I do not feel that Microsoft's "Trustworthy Computing" initiative has any chance whatsoever of succeeding if the source code is not widely available to the public.

People will say that releasing the code will expose security problems, such as buffer overflows, that others could then exploit silently without reporting them. I've been involved in many code reviews, including the previous security sweep done on the Windows XP code, and I find it highly unlikely that the recent much-touted line-by-line code review of the Windows source code really found all the exploits. It's the same problem that security screeners at an airport face; since the vast majority of the code does not have problems, it makes it very hard to really concentrate enough to spot the code that does. Automated tools can help, but they still require human oversight.

But you can't have it both ways: either the code reviews found every bug, in which case Microsoft has nothing to fear from publicizing its code, or there are still bugs in there. Imagine how the press is going to report the first remote exploit found in code that has been through the review process. Having the code public will mitigate the negative PR from this, and also speed the process of converging Windows to a secure state.

The open source community claims that all the "eyeballs" out there make open source code more secure. Be that as it may, that argument could no longer be used against Microsoft. Releasing the source shows that Microsoft is committed to making its software secure; each bug found is not a black eye for Microsoft, but instead one step towards a more secure system. And keep in mind, a properly-designed security or digital rights management system does not depend on keeping the code hidden.

If you look at the current open source movement, it really consists of two parts: one is releasing the source code, and the other is releasing all intellectual property claims on the code. But the two do not have to be connected. Releasing source code would allow Microsoft to wrest leadership on the source code issue away from the open source movement. Microsoft could redirect the discussion to show that releasing the code gives the majority of the benefits, while maintaining intellectual property rights avoids the majority of the problems.

And, this doesn't have to happen overnight. Microsoft can announce that it is planning to release its source in six months or a year, and then spend the time preparing for that date. The code could be released under a license that Microsoft devises, but the key goal would be that anyone who wanted could see the code.

There is one issue that needs to be addressed, which is patents. Some people have claimed that the various source code releases that Microsoft has done, such as the recent SMB/CIFS implementation, are really "patent traps," meant to lure companies into using ideas over which Microsoft could later sue for patent violations. Microsoft needs to address this concern head-on. First, by definitively asserting that it will only use patents defensively. Second, by making an honest attempt to mark sections of code that are covered by patents (another reason that the source code would not be released immediately). Third, by stating a reasonable and fair policy with regard to inadvertent patent infringement.

So what are the negatives in this plan?

The most obvious one is the fear that it would make it easier to steal Microsoft products. Until Windows Product Activation, of course, users could steal Microsoft products simply by reusing the CD, with no need to access the source code. Now, someone could come up with a version of Windows XP with the activation code removed. They wouldn't be able to sign the new binary with the Microsoft private key, so they would have to take some code signing checks out also.

They can do this now, however. Code signing is only as strong as its weakest link. Right now when XP is booted, the rough sequence is that the BIOS loads the partition boot code, which loads the loader, which loads the kernel, which loads the rest of the system. The BIOS and the boot code don't do signing checks. So a hacker could take the sign verification code out of the kernel, producing a modified and therefore unsigned kernel, but then take the verification code out of the loader also, so the kernel being unsigned didn't matter. The result would be a version of XP that didn't check for binaries being signed. Then they could hack out the activation code (although I have not personally confirmed it, I have it on good authority that this has already been done for XP). This won't change until the BIOS also checks for signed code. And once the BIOS does that, having the code won't help anyone else because only Microsoft can sign code as Microsoft.

Having the code available does make it *easier* to do this. And it makes it much easier for someone to modify Windows in much more malicious ways, for example to act as spyware on a user. But while many people feel comfortable taking a single CD and installing it on several machines, buying a copy of Windows that has been compiled by someone other than Microsoft, and thus is obviously illegal, is another matter. In any case, Microsoft could use the fact that the source was out there to reinforce the need for users to check that they were buying a genuine Microsoft-compiled version of its software, or risk having their personal data compromised.
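As a rough illustration of the weakest-link argument about the boot sequence, here is a minimal Python sketch. It is a toy model, not a description of how Windows actually verifies code: the `Stage` class, the helper names, and the keys are all hypothetical, and an HMAC secret stands in for real asymmetric code signing so that the example needs only the standard library. The BIOS is assumed to be firmware that an attacker cannot modify.

```python
from __future__ import annotations

import hashlib
import hmac
from dataclasses import dataclass, replace

# Stand-ins for signing keys (hypothetical). Real code signing is asymmetric;
# an HMAC secret is used here only to stay within the standard library.
VENDOR_KEY = b"hypothetical-vendor-key"
ATTACKER_KEY = b"hypothetical-attacker-key"


def sign(image: bytes, key: bytes) -> bytes:
    return hmac.new(key, image, hashlib.sha256).digest()


@dataclass(frozen=True)
class Stage:
    name: str
    image: bytes
    signature: bytes
    verifies_next: bool  # does this stage check the next stage's signature?


def vendor_stage(name: str, verifies_next: bool) -> Stage:
    image = name.encode()
    return Stage(name, image, sign(image, VENDOR_KEY), verifies_next)


def patched(stage: Stage) -> Stage:
    """A modified stage: no valid vendor signature, and its own check removed."""
    image = stage.image + b"-patched"
    return replace(stage, image=image,
                   signature=sign(image, ATTACKER_KEY), verifies_next=False)


def patch(stages: list[Stage], names: set[str]) -> list[Stage]:
    return [patched(s) if s.name in names else s for s in stages]


def boots(stages: list[Stage]) -> bool:
    """True if every stage loads; False if some stage rejects the next one."""
    for current, nxt in zip(stages, stages[1:]):
        valid = hmac.compare_digest(nxt.signature, sign(nxt.image, VENDOR_KEY))
        if current.verifies_next and not valid:
            return False
    return True


def make_chain(bios_checks: bool, boot_code_checks: bool) -> list[Stage]:
    # In this model the loader and kernel always check the stage they load.
    return [
        vendor_stage("bios", bios_checks),
        vendor_stage("boot code", boot_code_checks),
        vendor_stage("loader", True),
        vendor_stage("kernel", True),
        vendor_stage("system", False),
    ]
```

With `make_chain(False, False)`, patching only the kernel is caught by the loader, but patching both the loader and the kernel boots, because nothing earlier in the chain checks the loader. With `make_chain(True, True)`, the same pair of patches is rejected, and patching the boot code as well does not help, because the BIOS check fails.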

A second issue with releasing the source code is intellectual property. How would intellectual property rights be protected? The code would still be covered by copyright, and any patents on it would still apply. Nobody outside Microsoft would be allowed to modify the code -- in fact, control of the code by Microsoft is one of the key requirements in getting people to believe in Trustworthy Computing. It is critical both that everybody outside Microsoft can see the code, and that nobody outside Microsoft can modify it.

Although Microsoft often talks about its source code as its crown jewels, and the press plays up this image, I can say that I personally spent many years working with the Windows NT source code, and there is nothing particularly special about it. There's nothing wrong with it either; it's just another way to solve the basic problems involved in an operating system, that have been solved a hundred other ways by a hundred other people. And the code is completely customized to Windows; it is highly unlikely (beyond wholesale theft of large parts, which is a copyright violation) that anyone would glean information from the code which could then be used as a competitive advantage against Microsoft. The real magic in Windows is the public APIs (which are of necessity highly public already) and the data structures and algorithms. The most important of these can be protected by patent, and the rest are not worth protecting. These can't be considered trade secrets since anyone with a debugger can walk through the assembly code and figure them out.

With available code, third-party developers might modify how they call APIs to take advantage of how the system works internally, then assume that such behaviour will continue in future releases. But that is their own risk; and anyway programmers do this already.

Finally, people might look at Microsoft's code and sneer at it. After almost 15 years of development, the NT code base is not the cleanest code around. There are a lot of #ifdefs and other stale code. People might ask, These are the crown jewels of Microsoft? But that is one of the benefits of announcing the public release some period of time before it happens. The code could be cleaned up with an eye towards public release (and it could be run through the preprocessor first to clean up the #ifdefs).
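To give a sense of what cleaning up the #ifdefs might involve, here is a small Python sketch that keeps only the branches of `#ifdef`/`#ifndef`/`#else`/`#endif` blocks that are active for a chosen set of symbols, somewhat in the spirit of the `unifdef` utility. It is an illustration only: the function name is made up, it ignores `#if` expressions, `#elif`, and macro expansion, and it assumes well-formed input. A real C preprocessor does considerably more.

```python
from __future__ import annotations


def strip_ifdefs(source: str, defined: set[str]) -> str:
    """Drop inactive #ifdef/#ifndef branches and the directives themselves."""
    out: list[str] = []
    # One entry per open conditional: True if lines at this depth are kept.
    active = [True]
    for line in source.splitlines():
        parts = line.strip().split()
        directive = parts[0] if parts else ""
        if directive in ("#ifdef", "#ifndef"):
            wanted = parts[1] in defined
            if directive == "#ifndef":
                wanted = not wanted
            # A nested block is only active if its parent is active too.
            active.append(active[-1] and wanted)
        elif directive == "#else":
            first_branch = active.pop()
            active.append(active[-1] and not first_branch)
        elif directive == "#endif":
            active.pop()
        elif active[-1]:
            out.append(line)
    return "\n".join(out)


example = """int x;
#ifdef OLD_PLATFORM
legacy();
#else
modern();
#endif
#ifndef DEBUG
fast();
#endif"""

cleaned = strip_ifdefs(example, set())
```

With no symbols defined, `cleaned` contains the three lines `int x;`, `modern();`, and `fast();`, with the stale `legacy();` branch gone.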

Meanwhile, what are the benefits of Microsoft releasing its code, besides the security issues discussed above?

The first is that it will restore trust in Microsoft, which is key to restoring trust in Microsoft's code. The message will be, Microsoft has nothing to hide. There are no security backdoors or hidden APIs. All the claims of Microsoft's opponents can be conclusively proven false (and if there are security backdoors or hidden APIs -- this will "inspire" Microsoft developers to get rid of them!). Microsoft can graciously acknowledge that it has learned from the open source movement, while making it clear that it is not joining the movement.

The second benefit is that some of the remedies being proposed in the various lawsuits against Microsoft will become non-issues. APIs and communications protocols will implicitly be fully documented by the code. States clamoring for the source code to Internet Explorer can have it. If someone wants to port the .Net Common Language Runtime or Office to another platform, they are free to do so (modulo any patent issues).

Third, it would contribute to keeping Windows at the center of the computing universe. Developers who had questions or concerns about developing for Windows would now have access not only to sample code and API documentation, but real live code and API implementation. Think back to the original IBM PC. The fact that the source code appeared in its entirety in the "Technical Reference" manual -- with IBM still maintaining all intellectual property rights -- was one of the keys to the growth of the PC industry (of course IBM lost control of that industry, but that was due to other mistakes). With the code publicly available, people will spend more time writing applications for Windows, and more time writing code to connect other machines to Windows, and that will generate more sales of Windows.

Finally, Microsoft developers would like having their code released. Not just out of pride of ownership, although since the code would immediately become the most-examined in history, there would be a bit of a "rock star" aspect to it. More importantly, it can be extremely convenient, when debugging a problem at a remote site, to have the code available. Previously this had to be done under strict security. Now anyone can have the code; in fact, companies with the skill can start to debug their own problems, and even submit fixes to Microsoft -- which would have to be carefully examined before they were accepted, of course. It's not just other companies: a lot of groups within Microsoft would like to debug other Microsoft products themselves, and obviously have the skill to do so -- all they lack is the source code.

With this plan in place, Microsoft can maintain its position as a leader and innovator in computing, and set the groundwork for Trustworthy Computing to succeed.

I hope that you will seriously consider this proposal, rather than reject it out of hand. I know the first reaction of most people at Microsoft would be an unequivocal "NO!". Please think about it with an attitude of "How can this be made to work?" rather than "Why this will never work." Now is the time to fire up the troops with some bold leadership.

Thank you.

- Adam Barr

Posted by AdamBa at May 2, 2005 08:35 AM

Comments

Adam, the original intent of copyright was that the protected content was publicly available. MS was one of many companies that supported changes to copyright rules so that for a software copyright only the first and last pages needed to be submitted and published. I think returning software copyright practice to the original method, with full publication of source code filed with the copyright, is the right answer, not just for MS but for the entire industry.

Posted by: Mark Alexander at May 2, 2005 10:03 AM

Interesting post. I wonder if any other industries have done the same thing and had success (that is, published their IP and relied strictly on patent protection).

Posted by: Richard Threlkeld at May 2, 2005 04:00 PM

Richard, lots of industries have done this, because they don't have the source code vs. binary code distinction that exists in software. Every book, blender, shirt, chair, music CD, etc. that is sold contains all the information that anyone needs to make an exact duplicate. So those industries rely on patents and copyrights to protect their work (or they don't bother, and use marketing to distinguish themselves).

- adam

Posted by: Adam Barr at May 2, 2005 05:54 PM

(sorry for the very long post)

There are 3 more benefits in fully opening the source:

1. People will be able to compile their own Windows-es, with optimization for their specific processor.

2. People will be sure that THIS source is really the one used to build their system.

3. Do you remember your remark about the process of finding good developers for FOSS projects? With fully open code, the good patch submitters would be the best candidates for employment. No more subjective job interviews.


And now a few comments on the text of the letter itself:

1. Security. Opening the source will cause an increase in exploits.
Actually, this already happened with the partial WinNT/2k code leak. The good side is that all these exploitable bugs will be fixed. Widely available source would only help them get fixed faster. This model is already proven in Linux/BSD.
As of now, MS cannot keep up with fixing all bugs in a limited time frame. Increasing the number of people who only fix bugs will mean fewer people for writing features. As patches are free, this is a pure loss of money.

2. Piracy. Piracy is an economic phenomenon, so it should be fought with economic tools. For example, Russia has a very good way of keeping pirates under control: the biggest Russian hit movie, "Turkish Gambit", is sold for $5 as a pirate copy and $6 as the original. And no, people will never accept artificially crippled versions like Win Starter Edition.
Selling at a price people can afford is a basic principle of economics. Using a monopoly for money extraction is evil.

3. Stealing code. Users will always find a way to use it for free. Competitors are the ones who could use the code to replicate the program.
I have experience with FOSS programs. The GPL license allows everybody to make a fork and sell it, yet there are not a zillion forks of any one program. Why? It is not just a matter of copy/paste; what really matters is development. People will not follow copycats, as they will always lag behind. People will want the product to be better than the original before they switch. It is no coincidence that this sounds like a free market.

4. You can clean the code before release, but you must use the code you open.
Having multiple forks of one product doesn't do any good. This rule also applies to internal forks, like those used for porting to other platforms.

5. The problem with patents is global. Microsoft has the power to ask for software patentability to be removed.
There are (10-year-old) speeches by Bill Gates where he claims patents are bad things.
Bearing in mind that the biggest software patent holder is IBM, I don't think that MS should support them.
Unfortunately, MS seems to be the biggest supporter of software patents in the EU. The main guess is that MS hopes to fight Linux this way. Being unable to compete on an equal basis and changing the rules to the detriment of a competitor is considered evil :E


As a programmer who works mainly with Free Open Source Software, I can only be happy that Microsoft will never really open its source.
Why do I think so?

1. Palladium. It is (as you proposed) a hardware and BIOS component that ensures that only "trusted" code is executed. Actually, this is only the first level of protection, and the only level that has been announced for Longhorn.

2. It would remove the `weight` from the Windows platform. Much evangelism effort went into creating it, and the monopoly is one of its best results. Could MS give it away? Don't forget that black holes are created from too much weight.


I do think that the company could survive and make some profit with its source code opened. It would generally help development (inside and outside MS) and increase consumer satisfaction.
The problem is that this is not the way to keep a monopoly and increase profit.

This means that this step will not be seriously considered until it is too late.

The only valuable assets Microsoft has are its programmers.
Having the best and most productive programmers means that the company is competitive and can outrun all competition, whatever development model is used.
Looked at from this angle, having most of the programmers' money in company stock means a large risk of a chain reaction if the share price starts dropping.

So the opening should start now. I recommend a test run with products that are no longer officially supported.
Win95 and WinNT4 are good candidates.

Posted by: Ivan at May 4, 2005 09:42 AM

Ivan, I think that releasing the source code to the Win9x family (Win95, Win98, WinMe) would be a great idea.

- adam

Posted by: Adam Barr at May 5, 2005 10:35 PM

The idea of intellectual property and reverse engineering is strange in the computer industry. I used to do computer work for car manufacturers. They routinely go out to the local competing car dealer, buy one of their cars, and take it to bits to see how it works.

One dealer I worked at often had the inner workings of a competitor's car on public display in its offices for all to see. This is accepted in that industry, but the software industry frowns upon it.

It seems though that Microsoft's belief is in keeping everything locked up and secret. They do their best to obfuscate their client server communications and their media formats. Reversing this will take a cultural shift at the top - or a few more court orders.

If Linux can play WMAs and read office documents as well as Windows can, then Microsoft fear more people would move to Linux. If Joe User has a few hundred dollars worth of music that can only play on Windows then he can't move away. That's how Microsoft wish to compete.

Posted by: Richard Corfield at May 13, 2005 03:10 AM