May 02, 2005
Microsoft and Its Source Code
Three years ago, during my time away from Microsoft, I sent email to Steve Ballmer suggesting that Microsoft should release all of its source code.
To his credit, and despite the fact that it was from some random outsider, Steve did reply to the email a couple of weeks later. His response, in total, was: "Good thoughtful input will consider thanks"
The only point I missed was the possibility that people could search the source code to find ways to sue Microsoft for patent infringement.
This is the email:
I am writing to you to discuss the issue of Microsoft and its source code.
I'm a former Microsoft employee who spent ten years working as a developer,
most of it on the kernel of Windows NT and its successors. I left the company
in April 2000, and have not worked for anyone else since then. I still consider
myself to be fundamentally pro-Microsoft, and I own a bundle of the stock.
The goal of this email is to try to convince you that Microsoft should publicly
release all the source code to its products.
I am not suggesting that Microsoft move to an "open source" model as the term is
currently used. However, Microsoft's current "shared source" program, under which
source code is released to some institutions and large companies, with restrictions
on its use, is inadequate. In particular, I do not feel that Microsoft's "Trustworthy
Computing" initiative has any chance whatsoever of succeeding if the source code
is not widely available to the public.
People will say that releasing the code will expose security problems, such as
buffer overflows, that others could then exploit silently without reporting them. I've
been involved in many code reviews, including the previous security sweep done
on the Windows XP code, and I find it highly unlikely that the recent much-touted
line-by-line code review of the Windows source code really found all the exploits.
It's the same problem that security screeners at an airport face; since the vast
majority of the code does not have problems, it makes it very hard to really
concentrate enough to spot the code that does. Automated tools can help, but
they still require human oversight.
But you can't have it both ways: either the code reviews found every bug, in
which case Microsoft has nothing to fear from publicizing its code, or there
are still bugs in there. Imagine how the press is going to report the first
remote exploit found in code that has been through the review process.
Having the code public will mitigate the negative PR from this, and also
speed the process of converging Windows to a secure state.
The open source community claims that all the "eyeballs" out there make open
source code more secure. Be that as it may, that argument could no longer be
used against Microsoft. Releasing the source shows that Microsoft is committed
to making its software secure; each bug found is not a black eye for Microsoft,
but instead one step towards a more secure system. And keep in mind, a
properly-designed security or digital rights management system does not depend
on keeping the code hidden.
If you look at the current open source movement, it really consists of two parts:
one is releasing the source code, the other is releasing all intellectual property
claims on the code. But the two do not have to be connected. Releasing source
code would allow Microsoft to wrest leadership on the source code issue away
from the open source movement. Microsoft could redirect the discussion to show that
releasing the code gives the majority of the benefits, while maintaining intellectual
property rights avoids the majority of the problems.
And, this doesn't have to happen overnight. Microsoft can announce that it is
planning to release its source in six months or a year, and then spend the time
preparing for that date. The code could be released under a license that Microsoft
devises, but the key goal would be that anyone who wanted could see the code.
There is one issue that needs to be addressed, which is patents. Some people
have claimed that the various source code releases that Microsoft has done, such
as the recent SMB/CIFS implementation, are really "patent traps," meant to lure
companies into using ideas that Microsoft could later sue for patent violations over.
Microsoft needs to address this concern head-on. First, by definitively asserting
that it will only use patents defensively. Second, by making an honest attempt to
mark sections of code that are covered by patents (another reason that the
source code would not be released immediately). Third, by stating a reasonable
and fair policy in regard to inadvertent patent infringement.
So what are the negatives in this plan?
The most obvious one is the fear that it would make it easier to steal Microsoft
products. Until Windows Product Activation, of course, users could steal Microsoft
products simply by reusing the CD, with no need to access the source code.
Now, someone could come up with a version of Windows XP with the activation
code removed. They wouldn't be able to sign the new binary with the Microsoft
private key, so they would have to take some code signing checks out also.
They can do this now, however. Code signing is only as strong as its weakest
link. Right now when XP is booted, the rough sequence is that the BIOS loads
the partition boot code, which loads the loader, which loads the kernel, which
loads the rest of the system. The BIOS and the boot code don't do signing
checks. So a hacker could take the sign verification code out of the kernel,
producing a modified and therefore unsigned kernel, but then take the verification
code out of the loader also, so the kernel being unsigned didn't matter. The
result would be a version of XP that didn't check for binaries being signed. Then
they could hack out the activation code (although I have not personally confirmed
it, I have it on good authority that this has already been done for XP). This won't
change until the BIOS also checks for signed code. And once the BIOS does
that, having the code won't help anyone else because only Microsoft can sign
code as Microsoft.
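
The weakest-link argument above can be sketched as a toy model. This is an
illustration only, not actual Windows boot code: the Stage class, the boot()
and make_chain() helpers, and the "signed" / "checks_next" flags are all
hypothetical names, and real signature verification is reduced to a boolean.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    signed: bool       # True if the image still carries a valid vendor signature
    checks_next: bool  # True if this stage verifies the next one before running it


def boot(chain):
    """Return the name of the stage that is refused, or None if boot completes."""
    for current, nxt in zip(chain, chain[1:]):
        if current.checks_next and not nxt.signed:
            return nxt.name
    return None


def make_chain(patched=(), firmware_checks=False):
    """Build the boot sequence described above (assumed, simplified).

    A patched stage loses its signature and has its own checks removed.
    firmware_checks models a BIOS and boot code that also verify signatures.
    """
    default_checks = {
        "BIOS": firmware_checks,
        "boot code": firmware_checks,
        "loader": True,
        "kernel": True,
        "system": False,
    }
    chain = []
    for name, checks in default_checks.items():
        is_patched = name in patched
        chain.append(Stage(name, signed=not is_patched,
                           checks_next=checks and not is_patched))
    return chain


# Patching only the kernel is caught by the loader ...
print(boot(make_chain(patched={"kernel"})))
# ... but patching the loader as well goes unnoticed, because nothing checks the loader.
print(boot(make_chain(patched={"kernel", "loader"})))
# Once the firmware checks too, the patched loader is refused.
print(boot(make_chain(patched={"kernel", "loader"}, firmware_checks=True)))
```

In this model the chain is only as strong as its first unchecked stage:
removing checks works from whichever stage nothing verifies.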
Having the code available does make it *easier* to do this. And, it makes it much
easier for someone to modify Windows in much more malicious ways, to act
as spyware on a user for example. But while many people feel comfortable
taking a single CD and installing it on several machines, buying a copy of Windows
that has been compiled by someone other than Microsoft, and thus is
obviously illegal, is another matter. In any case Microsoft could use the
fact that the source was out there to reinforce the need for users to check
that they were buying a genuine Microsoft-compiled version of its software, or
risk having their personal data compromised.
A second issue with releasing the source code is intellectual property.
How would intellectual property rights be protected? The code would still
be covered by copyright, and any patents on it would still apply. Nobody outside
Microsoft would be allowed to modify the code -- in fact, control of the code by
Microsoft is one of the key requirements in getting people to believe in
Trustworthy Computing. It is critical both that everybody outside Microsoft
can see the code, and that nobody outside Microsoft can modify it.
Although Microsoft often talks about its source code as its crown jewels, and the
press plays up this image, I can say that I personally spent many years working
with the Windows NT source code, and there is nothing particularly special about
it. There's nothing wrong with it either; it's just another way to solve the basic
problems involved in an operating system, that have been solved a hundred
other ways by a hundred other people. And the code is completely customized
to Windows; it is highly unlikely (beyond wholesale theft of large parts, which
is a copyright violation) that anyone would glean information from the code
which could then be used as a competitive advantage against Microsoft. The
real magic in Windows is the public APIs (which are of necessity highly public
already) and the data structures and algorithms. The most important of these
can be protected by patent, and the rest are not worth protecting. These can't
be considered trade secrets since anyone with a debugger can walk through
the assembly code and figure them out.
With available code, third-party developers might modify how they call
APIs to take advantage of how the system works internally, then assume
that such behaviour will continue in future releases. But that is their own
risk; and anyway programmers do this already.
Finally, people might look at Microsoft's code and sneer at it. After almost
15 years of development, the NT code base is not the cleanest code around.
There are a lot of #ifdefs and other stale code. People might ask, These are
the crown jewels of Microsoft? But that is one of the benefits of announcing
the public release some period of time before it happens. The code could be
cleaned up with an eye towards public release (and it could be run through
the preprocessor first to clean up the #ifdefs).
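
As a rough sketch of what that cleanup could look like: the function below
resolves simple #ifdef / #ifndef / #else / #endif blocks for a chosen set of
defined symbols and drops the dead branches. It is a minimal illustration,
not a description of any Microsoft tooling; strip_ifdefs and the sample
symbol OLD_HW are made-up names, and #if expressions, #elif, and macro
expansion are deliberately not handled. (The existing unifdef utility does
this job properly; a full preprocessor run would also expand macros and
includes.)

```python
def strip_ifdefs(source, defined):
    """Keep only the lines that survive the given set of defined symbols."""
    out = []
    stack = []  # one bool per open conditional: is its current branch live?
    for line in source.splitlines():
        words = line.strip().split()
        directive = words[0] if words else ""
        if directive in ("#ifdef", "#ifndef"):
            present = words[1] in defined
            stack.append(present if directive == "#ifdef" else not present)
        elif directive == "#else":
            stack[-1] = not stack[-1]
        elif directive == "#endif":
            stack.pop()
        elif all(stack):
            # Emit the line only if every enclosing branch is live.
            out.append(line)
    return "\n".join(out)


sample = "int a;\n#ifdef OLD_HW\nint legacy;\n#else\nint modern;\n#endif\nint b;"
print(strip_ifdefs(sample, defined=set()))
```

With OLD_HW undefined, the sample keeps the "modern" branch and drops the
"legacy" one, along with the directives themselves.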
Meanwhile, what are the benefits of Microsoft releasing its code, besides
the security issues discussed above?
The first is that it will restore trust in Microsoft, which is key to restoring
trust in Microsoft's code. The message will be, Microsoft has nothing to hide.
There are no security backdoors or hidden APIs. All the claims of Microsoft's
opponents can be conclusively proven false (and if there are security backdoors
or hidden APIs -- this will "inspire" Microsoft developers to get rid of them!). Microsoft
can graciously acknowledge that it has learned from the open source movement,
while making it clear that it is not joining the movement.
The second benefit is that some of the remedies being proposed in the various
lawsuits against Microsoft will become non-issues. APIs and communications
protocols will implicitly be fully documented by the code. States clamoring
for the source code to Internet Explorer can have it. If someone wants to port
the .Net Common Language Runtime or Office to another platform, they are free
to do so (modulo any patent issues).
Third, it would contribute to keeping Windows at the center of the computing
universe. Developers who had questions or concerns about developing for Windows
would now have access not only to sample code and API documentation, but
real live code and API implementation. Think back to the original IBM PC. The
fact that the source code appeared in its entirety in the "Technical Reference"
manual -- with IBM still maintaining all intellectual property rights -- was one
of the keys to the growth of the PC industry (of course IBM lost control of
that industry, but that was due to other mistakes). With the code publicly
available, people will spend more time writing applications for Windows, and more
time writing code to connect other machines to Windows, and that will
generate more sales of Windows.
Finally, Microsoft developers would like having their code released. Not just out of
pride of ownership, although since the code would immediately become the most-
examined in history, there would be a bit of "rock star" aspect to it. More importantly,
it can be extremely convenient, when debugging a problem at a remote site, to have
the code available. Previously this had to be done under strict security. Now anyone
can have the code, in fact companies with the skill can start to debug their own
problems, and even submit fixes to Microsoft -- which would have to be carefully
examined before they were accepted, of course. It's not just other companies:
a lot of groups within Microsoft would like to debug other Microsoft products
themselves, and obviously have the skill to do so -- all they lack is the source code.
With this plan in place, Microsoft can maintain its position as a leader and
innovator in computing, and set the groundwork for Trustworthy Computing to
I hope that you will seriously consider this proposal, rather than reject it out
of hand. I know the first reaction of most people at Microsoft would be an
unequivocal "NO!". Please think about it with an attitude of "How can
this be made to work?" rather than "Why this will never work." Now is the time
to fire up the troops with some bold leadership.
- Adam Barr
Posted by AdamBa at May 2, 2005 08:35 AM
Adam, The original intent of copyright was that the protected content was publicly available. MS was one of many companies that supported changes to copyright rules so that for a software copyright only first and last page needed to be submitted and published. I think returning the practice on software copyright back to the original method, full publication of source code to be filed with copyright is the right answer, not just for MS but for the entire industry.
Posted by: Mark Alexander at May 2, 2005 10:03 AM
Interesting post. I wonder if there have been any other industries which have done the same thing and had success (that is, publish their IP and essentially rely strictly on patent information)
Posted by: Richard Threlkeld at May 2, 2005 04:00 PM
Richard, lots of industries have done this, because they don't have the source code vs. binary code distinction that exists in software. Every book, blender, shirt, chair, music CD, etc. that is sold contains all the information that anyone needs to make an exact duplicate. So those industries rely on patents and copyrights to protect their work (or they don't bother, and use marketing to distinguish themselves).
Posted by: Adam Barr at May 2, 2005 05:54 PM
(sorry for the very long post)
There are 3 more benefits in fully opening the source:
1. People will be able to compile their own Windows-es, with optimization for their specific processor.
2. People will be sure that THIS source is really the one used to build their system.
3. Do you remember your remark about the process of finding good developers for FOSS projects? With fully open code, the good patch submitters would be the best candidates for employment. No more subjective job interviews.
And now a few comments on the letter text itself:
1. Security. Opening source will cause an increase of exploits.
Actually this has already happened with the partial WinNT/2k code leak. The good side is that all these exploitable bugs will be fixed. Widely available source would only help them get fixed faster. This model is already proven in Linux/BSD.
Right now MS cannot keep up with fixing all bugs in a limited time frame. Increasing the number of people who only fix bugs will mean fewer people writing features. As patches are free, this is a pure loss of money.
2. Piracy. Piracy is an economic phenomenon, so it should be fought with economic tools. For example, Russia has a very good way of keeping pirates under control: the biggest Russian hit movie, "Turkish Gambit", is sold for $5 as a pirate copy and $6 as an original. And no, people will never accept artificially crippled versions like Win Starter Edition.
Selling at a price people can afford is a basic principle of economics. Using a monopoly for money extraction is evil.
3. Stealing code. Users will always find a way to use it for free. Competitors are the ones who could use the code to replicate the program.
I have experience with FOSS programs. The GPL license allows everybody to make a fork and sell it, yet there are not a zillion forks of any one program. Why? It is not just a matter of copy/paste; what really matters is development. People will not follow copycats, as they will always lag behind. People will want the product to be better than the original before they make a switch. It is no coincidence that this sounds like a free market.
4. You can clean the code before release, but you must use the code you open.
Having multiple forks of one product doesn't do any good. This rule also applies to internal forks like those used for porting to other platforms.
5. The problem with patents is global. Microsoft has the power to request the removal of software patentability.
There are (10-year-old) speeches by Bill Gates where he claims patents are bad things.
Bearing in mind that the biggest software patent holder is IBM, I don't think that MS should support them.
Unfortunately MS seems to be the biggest supporter of software patents in the EU. The main guess is that MS hopes to fight Linux this way. Being unable to compete on an equal basis and changing the rules to the detriment of a competitor is considered evil :E
As a programmer that works mainly with Free Open Source Software I can only be happy that Microsoft will never really open their source.
Why I think so?
1. Palladium. It is (as you proposed) a hardware and BIOS component that ensures that only "trusted" code is executed. Actually this is only the first level of protection and the only level that is announced for Longhorn.
2. It would remove the `weight` from the Windows platform. The evangelism has put much effort into creating it, and monopoly is one of its best results. Could MS give it away? Don't forget that black holes are created from too much weight.
I do think that the company could survive and make some profit with source code opened. It would generally help development (inside and outside MS) and increase consumer satisfaction.
The problem is that this is not the way to keep monopoly and increase profit.
This means that this step would not be seriously considered until it is too late.
The only valuable assets Microsoft has are the programmers.
Having the best and most productive programmers means that the company is competitive and can outrun all competition, regardless of what development model is used.
Looked at from this angle, having most of the programmers' money in stock means a large risk of causing a chain reaction if the shares start dropping.
So the opening should start now. I recommend a test run with products that are not officially supported anymore.
Win95, WinNT4 are good candidates.
Posted by: Ivan at May 4, 2005 09:42 AM
Ivan, I think that releasing the source code to the Win9x family (Win95, Win98, WinMe) would be a great idea.
Posted by: Adam Barr at May 5, 2005 10:35 PM
The idea of intellectual property and reverse engineering is strange in the computer industry. I used to do computer work for car manufacturers. They routinely go out to the local competing car dealer, buy one of their cars, and take it to bits to see how it works.
One dealer I worked at often had the inner workings of a competitor's car on public display in its offices for all to see. This is accepted in that industry, but the software industry frowns upon it.
It seems though that Microsoft's belief is in keeping everything locked up and secret. They do their best to obfuscate their client server communications and their media formats. Reversing this will take a cultural shift at the top - or a few more court orders.
If Linux can play WMAs and read office documents as well as Windows can, then Microsoft fear more people would move to Linux. If Joe User has a few hundred dollars worth of music that can only play on Windows then he can't move away. That's how Microsoft wish to compete.
Posted by: Richard Corfield at May 13, 2005 03:10 AM