August 21, 2006
Is "Many Eyeballs" the Right Way to Ensure Code Quality?One of the mantras of the open source movement is that the quality of the code is better because there are "many eyeballs" looking at it. That is, anyone can look at and modify the code, and the changes move it inexorably towards higher quality.
Now, certainly if there is a particular crash or exploit being hunted down, this can work well. You have a bunch of highly motivated people (because their own machines are torqued) who can all look at the code and test out fixes. But that's for a specific problem. The idea of having high-quality code is to prevent those bugs from occurring in the first place. How does "many eyeballs" work then?
I was thinking of this because I gave an abbreviated version of my "Parallels Between Environmentalism and Software Quality" talk and someone asked how open source affected code quality. Now there are some ways in which it does help (primarily due to the distributed nature of development), but thinking about it, I believe that "many eyeballs" is an antiquated way to think about code quality.
Sure, open source allows anybody to improve the quality of the code. And people may be motivated to want their code to be high quality because it is public. But what do they do with this ability and motivation?
In the year 2000 Microsoft stopped all development on Windows for some number of weeks in order to do a security review of the source code. This was a great commitment by management to take the time needed to fix problems. Unfortunately, the technique used at the time was a line-by-line code review. Code reviews can be useful, but it is hard to stay focused after a while, and reviewers tend to concentrate on things like code formatting and variable names--useful for maintainability, and possible indicators of high-quality code, but not the final word.
We now know this isn't the best way to spend our time. One thing that IS a great way to spend your time is using the Source Annotation Language (SAL) and the Prefast code analysis tool; both are included in Visual Studio 2005 (Michael Howard has blogged about them here and here). There's a story floating around Microsoft about how a team spent some large number of hours doing a code review of a parser and found 6 bugs; then they spent much less time writing a series of fuzz tests for the parser, and found the original 6 bugs, plus 2 more.
But I don't hear of widespread adoption of those techniques in the open source community. It's not that the technologies are not available, it's that the thinking is still that many eyeballs is the answer. That's what Microsoft was thinking 6 years ago, but in the world of programming, 6 years is a long time.
Posted by AdamBa at August 21, 2006 09:57 PM
It's true that there isn't widespread adoption of verification tools in open source. I think this is primarily because many of these tools are not open source themselves and are very costly. However, at the same time I think there is definitely recognition that these tools really do make things better.
I do know that at least one Linux kernel developer ( http://kernelslacker.livejournal.com/15681.html ) has looked longingly at tools like Prefast and wished for similar tools. On the static checking front, I hear the kernel uses a tool called sparse ( http://www.codemonkey.org.uk/projects/git-snapshots/sparse/ ) to do some basic checks.
Thanks to outside parties ( http://scan.coverity.com/ ) more sophisticated verifiers have been run on certain projects. I guess the fact that these projects are open allowed researchers (here's a paper coauthored by one of your folks - http://www.stanford.edu/~engler/osdi04-fisc.pdf#search=%22filesystem%20fisc ) to run their checkers on real-world code. Anyone can take their checker and run it on an open source project as a sales technique. I know I am certainly impressed if I hear about a checker that is run on a popular project and turns up bugs that are consequently fixed.
I do know work has been done on GCC to put canaries in place to try to catch some subset of buffer overruns. I also hear that the OpenBSD project actually modified the way memory is allocated ( http://kerneltrap.org/node/5584 ), at the expense of performance on the regular system, to try to root out these types of errors.
Finally, tools like valgrind ( http://valgrind.org/ ) have really helped to reduce the instances of common problems in user-land code, and they seem to be used by many large open source projects on a regular basis now.
(I wish I knew how to do paragraphs in this comment)
Posted by: Sitsofe at August 22, 2006 12:27 AM
It's good to hear that these tools are becoming more known, since we're all in the same Internet ecosystem together. Hopefully it will become more widespread and the open source community can brag about how many more people they have doing static code analysis and whatnot.
Posted by: Adam Barr at August 22, 2006 04:04 PM
Please be more careful when you make such statements. They imply that open source software is dangerous, while in reality Windows machines have proven to be more dangerous and harder to secure.
Now, I think you should start asking yourself why open source software is better quality. This is a fact. The many eyeballs is only one of the factors.
I'd give you a few hints.
1. Evolution. Bad projects and buggy code don't survive.
2. Better coders. A project is usually started by one person. If he doesn't write good code, his project won't survive.
3. Higher code requirements. Readability, ease of maintenance, good comments and documentation. When more people work on a project these are mandatory.
4. Founder control. Usually the person who started the project will accept or approve every code inclusion, and require modifications. This means all the code is checked by the better coder (#2). On bigger projects he delegates this work to the most skilled coders available.
5. User feedback. The most important thing. If the project is popular, many people will use it and report their problems. Many of them will use the development version.
6. No deadlines. You can spend as much time looking for an elegant solution as you want.
7. Try-before-you-buy. Finding new developers is easy because you just have to choose the people who send you valuable code. (you know that;)
Well, there are open source projects that don't follow this path, but these are mainly managed by commercial companies that apply already-established methodology to open source.
Now, about preventing bugs from happening. All you have to do is create a language in which you cannot write buggy code, e.g. Java was created to avoid some of the most common C/C++ mistakes. Is C# any better at preventing bugs?
As for the static source code analysis.
It's good to use it, but bad to rely on it.
You know, the halting problem is undecidable.
Posted by: Ivan at August 26, 2006 04:34 PM
Ivan, I disagree with you. First of all, show me the evidence that Windows is worse than OSS. Which has more exploits in recent years, Windows or Linux? IIS or Apache?
Second, most of what you list there is the "old way" of thinking about quality. "Readability, easy to maintain, well commented/documented" -- I'm saying no. Show me the unit tests and the static code analysis. None of those by itself is the answer, but together they make things a lot better than saying "we have better coders", whatever that means.
The point about "interviewing" people using submitted code is a good one, but somewhat separate from what people do once they start work.
Saying "All you have to do is create language where you can not write buggy code" is a vast overstatement. Yes, C# does approach string handling the same way Java does, but that is a far far cry from saying you can't have bugs in C# (or Java).
Posted by: Adam Barr at August 27, 2006 09:17 AM