28 August 2008

Criteria for Selecting a Technical Debt Calculator

In "Calculating Technical Debt", I proposed a set of criteria for selecting a technical debt calculator. This list was:
  • Plugin architecture
  • Flexible roll-up rules
  • Current perspective
  • Trend perspective
  • Languages supported
  • Environments supported
  • Supports qualitative data
  • Build environments
  • Custom dashboards
  • Aggregates multiple projects
  • Continuous integration servers
  • User community
To this list I added a few more criteria relevant to using the tools:
  • Tool quality (is the quality calculator buggy?)
  • Latest release (is this project alive?)
  • Documentation (can I figure out how to use it?)
In "Characteristics of Technical Debt", I came to the conclusion that the McConnell internal quality characteristics were well suited to calculating technical debt. His criteria belong on the list then:
  • Calculates maintainability
  • Calculates flexibility
  • Calculates portability
  • Calculates reusability
  • Calculates readability
  • Calculates testability
  • Calculates understandability
I make no claim this list is exhaustive, but it covers enough of the issues important to me that I am comfortable moving forward with it.
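
To make the "calculates ..." items a bit more concrete, here is a rough sketch of how a calculator might derive a maintainability score from raw metrics. It uses the classic Maintainability Index formula rather than anything McConnell or any particular tool prescribes, the MaintainabilityIndex class is invented for the example, and the Halstead volume, complexity, and line counts would normally come from the underlying analysis tools rather than being typed in by hand:

public class MaintainabilityIndex {

    /**
     * Unnormalized Maintainability Index (higher is better).
     *
     * @param avgHalsteadVolume       average Halstead volume per module
     * @param avgCyclomaticComplexity average cyclomatic complexity per module
     * @param avgLinesOfCode          average lines of code per module
     */
    public static double compute(double avgHalsteadVolume,
                                 double avgCyclomaticComplexity,
                                 double avgLinesOfCode) {
        return 171.0
                - 5.2 * Math.log(avgHalsteadVolume)
                - 0.23 * avgCyclomaticComplexity
                - 16.2 * Math.log(avgLinesOfCode);
    }

    public static void main(String[] args) {
        // Hypothetical inputs for one release of a project
        System.out.printf("MI = %.1f%n", compute(950.0, 6.5, 120.0));
    }
}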

In making these evaluations, I need some projects to run through the tools to see how they perform. The obvious choice is some open-source projects. These projects should:
  • Be developed in Java
  • Have their development history captured in a version control system
  • Have high development activity
  • Have JUnit tests
  • Include some built with Ant and some with Maven 2
A trip around a few open-source repositories turned up a few likely candidates:

Jena - A framework for building Semantic Web applications. I'll focus on the ARQ component. (Ant)

Jena (ARQ module) has eight releases in its Subversion tags folder:
$> svn ls https://jena.svn.sourceforge.net/svn/root/jena/ARQ/tags

ARQ-2.0/
ARQ-2.0-RC/
ARQ-2.0-beta/
ARQ-2.1/
ARQ-2.1-beta/
ARQ-2.2/
ARQ-2.3/
ARQ-2.4/

Tiles - A Web templating framework. (Maven 2)

Tiles has eight releases in its Subversion tags folder:
$> svn ls http://svn.apache.org/repos/asf/tiles/framework/tags

tiles-2.0.0/
tiles-2.0.1/
tiles-2.0.2/
tiles-2.0.3/
tiles-2.0.4/
tiles-2.0.5/
tiles-2.0.6/
tiles-2.1.0/

Picking out some projects to use as a basis for the tool evaluation would seem a trivial matter. It turns out it is not. I won't bore you with the reasons, but this took WAY longer than I expected.
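
Building the trend perspective means exporting each tagged release and running the calculators over it. Here is a minimal sketch of the export step; the ExportReleases class is purely illustrative, it assumes the svn command-line client is on the PATH, and the repository URL and tag names are simply the ones from the ARQ listing above:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ExportReleases {

    private static final String TAGS_URL =
            "https://jena.svn.sourceforge.net/svn/root/jena/ARQ/tags/";

    private static final String[] TAGS = {
            "ARQ-2.0-beta", "ARQ-2.0-RC", "ARQ-2.0", "ARQ-2.1-beta",
            "ARQ-2.1", "ARQ-2.2", "ARQ-2.3", "ARQ-2.4"
    };

    public static void main(String[] args) throws IOException, InterruptedException {
        for (String tag : TAGS) {
            // One exported tree per release, e.g. releases/ARQ-2.2
            ProcessBuilder pb = new ProcessBuilder(
                    "svn", "export", TAGS_URL + tag, "releases/" + tag);
            pb.redirectErrorStream(true);
            Process svn = pb.start();

            // Drain the output so the process doesn't block on a full buffer
            BufferedReader out = new BufferedReader(
                    new InputStreamReader(svn.getInputStream()));
            while (out.readLine() != null) {
                // discard; svn prints every exported file
            }
            System.out.println(tag + " exited with " + svn.waitFor());
        }
    }
}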

With criteria and test cases in hand, I'll move on to evaluating the options. Sonar seems like a reasonable place to start.

25 August 2008

Characteristics of Technical Debt

I was planning to launch into an evaluation of the tools identified in "Calculating Technical Debt" (please read that first if you have no idea what I am talking about), but in preparing for the evaluation, I discovered there are large bodies of work addressing the characteristics of interest for determining code quality. There is even an ISO standard (9126) defining one such set of quality characteristics.

I guess it was pretty naive of me to think that quantifying software quality was a new undertaking. There are five models that appear repeatedly in the literature:
  • Boehm
  • McCall
  • FURPS
  • ISO 9126
  • Dromey
Each model has a different focus and thus includes different characteristics to assess quality (see page 38 of the reference below). The following table summarizes these five models:

Reference: Ortega, Maryoly; Pérez, María and Rojas, Teresita. Construction of a Systemic Quality Model for evaluating a Software Product. Software Quality Journal, 11:3, July 2003, pp. 219-242. Kluwer Academic Publishers, 2003

In "Quantifying the cost of expediency", I proposed using McConnell's "internal quality characteristics" from Code Complete. Does this still hold up in light of these large bodies of research? Lets see.

In a perfect world, I'd take the time to research all of these models along with the various hybrids that have been devised. Perhaps I'll formulate my own model one day and become rich and famous. My goal for now is not to devise the optimal model, but rather to create a proof of concept for calculating the cost of expedient design and development choices.

From a cursory literature search, it appears that ISO 9126 is the most widely used model. I'll use that as the basis for comparison.

The ISO 9126 standard is summarized by the following graph:

(from http://www.cse.dcu.ie)

As you can see from the graph, the ISO model defines six major characteristics of quality:
  • Functionality
  • Reliability
  • Efficiency
  • Usability
  • Portability
  • Maintainability
Each is further decomposed into sub-characteristics. Comparing ISO 9126 with McConnell, you'll notice that McConnell is missing characteristics like accuracy and efficiency. He actually presented a separate list of "external quality characteristics". These refer to customer-facing issues rather than developer-facing issues, and most of the ISO 9126 characteristics missing from his internal list appear there. Intuitively, external characteristics are fundamental to understanding technical debt. They are also more difficult to determine automatically. I am going to postpone including them in the technical debt evaluation for now - it is the expedient thing to do. ;-)

The characteristics on McConnell's list that don't appear in ISO 9126 are flexibility and reusability. These characteristics seem relevant to the technical debt of a project. Flexibility and reusability do appear in the McCall model (see table). McConnell provides the following definitions (p. 558):
Flexibility - The extent to which you can modify a system for uses or environments other than those for which it was specifically designed.

Reusability - The extent to which and the ease with which you can use parts of a system in other systems.
After all this, I am back to where I started. That is not where I expected to be. The original title of this entry was something like "An ISO model for technical debt". It was not until I started comparing ISO 9126 with McConnell that I came to the conclusion that McConnell better serves my purpose.

In the next entry I'll return to looking at the available tools for calculating technical debt.

20 August 2008

Calculating Technical Debt

In "Quantifying the cost of expediency", I describe the problem IT management face when they trade off shorter development time with robust code. To recap, my conjecture is that a solution to understanding such trade offs can be reached by sub-dividing the problem into two phases and addressing each individually. The phases are:
  1. Evaluate the code base against a set of quality criteria and track trends
  2. Convert these metrics into an easily digestible form
At the AgileNM meeting today, I learned there is a term for the code degradation that often occurs in a software project: technical debt. We had an excellent discussion of qualitative ways attendees assess this cost. These included:
  • Decreases in team velocity over time
  • Developer time to understand unfamiliar code
  • Querying developers
How exciting to have this group interested in the very topic I've been pondering and writing about! In looking for quantitative measures of technical debt, my research turned up several candidates for calculating the current debt and tracking trends. This list is shamelessly lifted from the Sonar Related Tools page:

Open Source
Commercial
The commercial tools don't appear to have evaluation downloads, so I'll focus on open source tools for now. Perhaps I can get evaluation copies in the future.

The open-source tools all leverage other open-source tools that each determine one or more pieces of the technical-debt picture: Checkstyle, PMD, and CPD are static analysis tools, while JUnit and Cobertura are runtime tools; all of them contribute to understanding the debt.
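
As a contrived illustration of what the static tools contribute, consider the snippet below. PMD's EmptyCatchBlock rule and Checkstyle's empty-block check would both report the swallowed exception, and CPD would complain if the same parsing logic were copy-pasted into several classes. The ConfigReader class is made up for the example:

public class ConfigReader {

    public int readPort(String value) {
        int port = 8080; // silently fall back to a default
        try {
            port = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            // empty catch block: the error is swallowed -- a debt the tools can count
        }
        return port;
    }
}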

The open-source tools are all Java-centric. I'll search for similar .NET tools.

Originally, I had planned to jump into Sonar but, given the alternatives, it seemed prudent to consider what features best support calculating technical debt. I haven't looked at any of them closely yet, so hopefully this list isn't overly biased:
  • Plugin architecture for adding new analysis tools
  • Flexible rules for rolling up results into characteristics (a hypothetical sketch of these two features follows this list)
  • Current and trend perspectives
  • Support for multiple languages and environments
  • Able to consider qualitative data in calculating characteristics
  • Support for multiple build tools
  • Flexible dashboard creation to display results
  • Interfaces to continuous integration servers
  • Vigorous user community
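
To give a feel for the first two items, here is a purely hypothetical sketch (not any of these tools' real APIs) of what a plugin and a roll-up rule might look like; the names AnalysisPlugin, RollupRule, and WeightedRollup are invented. Each analysis tool would be wrapped as a plugin that reports raw measures, and a rule would combine those measures into a score for one quality characteristic:

import java.io.File;
import java.util.Map;

interface AnalysisPlugin {
    String name();                                   // e.g. "checkstyle", "cobertura"
    Map<String, Double> analyze(File projectRoot);   // raw measures, e.g. "violations" -> 412.0
}

interface RollupRule {
    String characteristic();                         // e.g. "maintainability"
    double score(Map<String, Double> measures);      // combine raw measures into one score
}

/** A trivial weighted-sum roll-up, purely illustrative. */
class WeightedRollup implements RollupRule {
    private final String characteristic;
    private final Map<String, Double> weights;       // measure name -> weight

    WeightedRollup(String characteristic, Map<String, Double> weights) {
        this.characteristic = characteristic;
        this.weights = weights;
    }

    public String characteristic() { return characteristic; }

    public double score(Map<String, Double> measures) {
        double total = 0.0;
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            Double measure = measures.get(entry.getKey());
            if (measure != null) {
                total += entry.getValue() * measure;  // weighted contribution
            }
        }
        return total;
    }
}
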
In the next installment I'll present my take on how the open source tools stack up against this feature set.

13 August 2008

Quantifying the Cost of Expediency

Software project managers often ask teams to proceed at the greatest possible speed without regard for the long-term consequences to code quality. Let's call such choices "expedient". If expedient choices are made infrequently, the development team can recover by refactoring the tainted code. For many shops, though, expediency becomes the norm rather than the exception. The downside of expedient choices is not initially apparent: there is an uneasy feeling in our stomachs, but nothing tangible to explain why. The upside usually is apparent, since it can (with some accounting magic) be measured in dollars. My contention is that, to make informed decisions, managers need the ability to measure the "expediency cost" in dollars too.

We have some idea of this cost to the industry as a whole. Capers Jones estimated that almost two-thirds of developer time is spent repairing software. In Code Complete, he states:
Projects that aim from the beginning at achieving the shortest possible schedules regardless of quality considerations tend to have fairly high frequencies of both schedule and cost overruns. Software projects that aim initially at achieving the highest possible levels of quality and reliability tend to have the best schedule adherence records, the highest productivity, and even the best marketplace success.
So evidence suggests we're heading down the wrong path when we attempt to be expedient, but how do we quantify this cost? What should high quality code look like?

In Code Complete, McConnell lays out the following internal quality characteristics (descriptions are paraphrased):
  • Maintainability - Can the software be modified?
  • Flexibility - Can the software be repurposed?
  • Portability - Can the software be ported to new environments?
  • Reusability - Can the software be used in other systems?
  • Readability - Can the source code be read?
  • Testability - Can the software be verified correct?
  • Understandability - Can the software be understood at the system-organizational level?
I propose, then, that the task is twofold. First, evaluate the code base against these criteria, tracking metric changes over time. Second, convert these metrics into information management can use to make informed trade-offs between quality and expediency. If this information is ultimately mapped to dollars, a real apples-to-apples comparison of the cost of expediency can be made.
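
As a back-of-the-envelope illustration of that second step, here is a sketch of the dollar conversion. Every number in it (hours per violation, hourly rate, violation count) is a made-up placeholder, and a serious model would need many more inputs, but it shows the shape of the calculation:

public class ExpediencyCost {

    /** Assumed average effort to fix one flagged issue, in hours. */
    private static final double HOURS_PER_VIOLATION = 0.5;

    /** Assumed blended developer rate, in dollars per hour. */
    private static final double DOLLARS_PER_HOUR = 75.0;

    /** Estimated cost to remediate the currently flagged issues. */
    public static double remediationCost(int violations) {
        return violations * HOURS_PER_VIOLATION * DOLLARS_PER_HOUR;
    }

    public static void main(String[] args) {
        // e.g. 412 open violations -> roughly $15,450 of deferred work
        System.out.printf("Estimated technical debt: $%,.0f%n", remediationCost(412));
    }
}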