A recent study published by the Development Testing software vendor Coverity
suggests that open source and proprietary projects have equal code quality,
unless the code base grows beyond 1 million lines of code.
However, dubious data presentation in Coverity’s report (a lack of confidence
intervals) and plausible bias suggest that this conclusion may not hold.
Adding to the inconclusiveness of the report is data from previous years,
both from Coverity and from other sources, showing code quality to be equal
in many cases, and open source projects to have higher quality in certain
cases. This paper will explore and discuss the quality of a variety of open
and closed source projects from the past 10 years, and attempt to give a
non-sensationalist viewpoint on the matter.
2 Concerns over the 2012 Coverity Study
This paper is, in part, a direct response to the aforementioned study
published by Coverity. In 2012, Coverity released its report showing the
data summarized below.
This led to many sensationalist articles
emphasizing that although FOSS and proprietary projects have roughly the same
quality, beyond 1 million lines the proprietary ones win in terms of number
of bugs per 1,000 lines.
Many of these articles, however, do not mention key facts about the
study, or about the projects that participated. One fact in particular, “only
13 projects were over the 1 million mark…”, exposes a serious flaw
in the study. Statistically, it does not make sense to draw
such a broad conclusion (the state of large open source projects) from such
a small sample.
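To make the sample-size concern concrete, here is a minimal sketch of a 95% confidence interval for a mean defect density computed from 13 projects. The density values are invented placeholders, not Coverity's data, and the function name is hypothetical. The only outside fact used is the two-sided 95% Student's t critical value for 12 degrees of freedom, roughly 2.179.

```python
import math
import statistics


def mean_confidence_interval(samples, t_critical):
    """Return (low, high) for the mean using a Student's t critical value."""
    n = len(samples)
    mean = statistics.mean(samples)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    std_err = statistics.stdev(samples) / math.sqrt(n)
    half_width = t_critical * std_err
    return mean - half_width, mean + half_width


# Hypothetical defect densities (defects per 1,000 lines) for 13 projects.
# These numbers are made up for illustration only.
hypothetical_densities = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7,
                          0.8, 0.9, 1.0, 1.1, 1.3, 1.5]

# 2.179 is the two-sided 95% t critical value for 13 - 1 = 12 degrees of freedom.
low, high = mean_confidence_interval(hypothetical_densities, t_critical=2.179)
print(f"mean = {statistics.mean(hypothetical_densities):.2f}, "
      f"95% CI = ({low:.2f}, {high:.2f})")
```

Because the half-width shrinks only with the square root of the number of projects, a bucket of 13 projects gives a much wider interval than a bucket of a few hundred with similar spread. This is the information a bare point estimate leaves out.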
Looking further at the data set, we see that the point estimates jump
erratically between size intervals, with no real pattern. For
proprietary code, a spike is seen in the range between 500,000 and 1 million
lines of code. There appears to be no pattern in this analysis, just
random variation. The “patterns” inferred by these articles do nothing more
than find meaning in random variation, ultimately drawing conclusions
from statistically indistinguishable data.
In fact, the original Coverity report states that because its average analysed
open source code base grew from 425,179 lines in 2008 to
580,000 lines in 2012, the defect density rose to 0.69 from 0.45 in
2011. We see that even the original authors of the study do not state that
open source software is more bug prone, but suggest that their results are due
to a change in the data set from previous years.
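The effect of a changing data set can be sketched with a small calculation. The figures below are hypothetical and not taken from any report, and the sketch assumes, purely for illustration, that density is pooled (total defects over total lines) and that larger projects carry a higher density. Under those assumptions the sample's density rises when its mix shifts toward larger projects, even though no individual project changes.

```python
def defect_density(defects, lines_of_code):
    """Defects per 1,000 lines of code."""
    return defects / lines_of_code * 1000


def pooled_density(projects):
    """Density of a whole sample: total defects over total lines."""
    total_defects = sum(defects for defects, _ in projects)
    total_lines = sum(lines for _, lines in projects)
    return defect_density(total_defects, total_lines)


# Hypothetical (defects, lines_of_code) pairs; not taken from any report.
small_project = (100, 250_000)    # about 0.4 defects per 1,000 lines
large_project = (800, 1_000_000)  # about 0.8 defects per 1,000 lines

# The same two kinds of project appear in both samples; only the mix changes.
earlier_sample = [small_project] * 3 + [large_project]
later_sample = [small_project] + [large_project] * 3

print(f"earlier sample: {pooled_density(earlier_sample):.2f}")
print(f"later sample:   {pooled_density(later_sample):.2f}")
```

A year-over-year rise in the headline number is therefore compatible with every project staying exactly as it was, which is why the composition of the sample has to be reported alongside the density.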
3 Further Study
We can also examine earlier reports by Coverity comparing closed and open
projects. Some previous Coverity reports show no inclination
towards either programming ideology: they show approximately equal
quality across both FOSS and proprietary projects. In other years, however, we
see results from Coverity indicating that “Open Source software quality is
better than proprietary software”. The 2011 Coverity study shows “open
source code has fewer defects per thousand lines of code than proprietary
software code does.” Coverity’s test results
fluctuate between years, never showing a consistent result. It is for
this reason that it does not make sense to claim one ideology’s superiority in
quality when no consistent evidence has been found. These articles should
therefore avoid sensationalism in their titles, and instead
stick to the facts directly presented to them, while also analysing
information from previous years.
Moving away from Coverity (a for-profit company whose business comes from
selling its test software to businesses), we can examine other studies comparing
open source and closed source software quality. One study from 2005, published
in an IEEE journal and titled Open Source Versus Closed Source: Software Quality
in Monopoly and Competitive Markets, shows “no dominant quality advantage of
one method over another under all circumstances”. The study (being a
formal paper) goes on to define mathematical equations representing concepts
such as “value to consumer” and “individual developer quality”. Using these
equations, the authors formalize the conclusion that neither methodology
holds a dominant quality advantage. What is
interesting about this study, however, is that it goes on to propose conditions
under which each method can generate higher quality software.
4 A Final Study
The final study I would like to call attention to is one written by Diomidis
Spinellis titled A Tale of Four Kernels. This paper compares 4
operating system kernels, specifically the FreeBSD, GNU/Linux, Solaris, and
Windows kernels. The study examines “the source code of the four systems by
collecting metrics in the areas of file organization, code structure, code
style, the use of the C preprocessor, and data organization”. Through this
analysis the paper aims to characterize the differences between these 4
substantial pieces of software, and in turn to gain insight not only into which
software is better (through methodology), but also why.
Unlike Coverity, Spinellis’s paper analyzed the internal structure of the
software. For example, a comparison was done of the “File length (in lines) of
C files and headers”, noting that “Overly long files are often problematic,
because they can be difficult to manage, they may create many dependencies, and
they may violate modularity”. The image shown on the right, taken from the
paper, shows this metric for the 4 kernels examined.
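As a rough illustration of what a file-length metric involves, the sketch below counts the lines in every C source and header file under a directory and summarizes them. It is not the tooling used in A Tale of Four Kernels; the function names and the choice of median and maximum as summary statistics are assumptions made for this example.

```python
from pathlib import Path
import statistics


def file_lengths(root, suffixes=(".c", ".h")):
    """Map each matching file under root to its length in lines."""
    lengths = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            # errors="replace" keeps the count going on files with odd encodings.
            with path.open(encoding="utf-8", errors="replace") as handle:
                lengths[path] = sum(1 for _ in handle)
    return lengths


def summarize(lengths):
    """Return the median and maximum file length, or None if no files matched."""
    if not lengths:
        return None
    values = list(lengths.values())
    return {"median": statistics.median(values), "max": max(values)}
```

Calling `summarize(file_lengths(source_root))` on a checked-out source tree, where `source_root` is whatever directory holds the code, gives one number per tree that can be compared across systems. A single long file barely moves the median but shows up in the maximum, so it helps to look at both.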
The conclusion reached by this paper is that across all metrics
(of which I have discussed only one), similar values were observed across the
systems. This suggests that the software engineering process,
while immensely important, produced no drastic
differences across the 4 kernels, even though the process itself was unique to
each system because of the methodology used. Furthermore, Spinellis concluded
that each methodology has benefits and downsides. For example, in his
particular testing environment “Linux [excelled] in various code structure
metrics, but [lagged] in code style”. In the end, the paper reached
the same answer as other similar papers: there is no clear
winner, only shades of grey among the different methodologies.
We ultimately saw that the sensationalist articles this paper primarily
responds to are just that: sensational journalism. A few academic
papers, and even the original study behind the articles themselves, show
that neither methodology can be considered “superior” or “inferior”.
Rather, both have their own costs and benefits, and what we
should strive for as programmers is a methodology of
exceptional code quality.
- Coverity Scan: 2012 Open Source Report. http://wpcme.coverity.com/wp-content/uploads/2012-Coverity-Scan-Report.pdf
Study: Open Source Delivers Superior Quality… Up To A Point
Open Source Is Better Than the Closed Stuff (Until You Hit 1 Million Lines)
Report: Open source software quality is better than proprietary software
Actually, Open Source Code Is Better: Report
no dominant quality advantage of one method over another under all circumstances
A Tale of Four Kernels