First
of all, some background: writing the chapter on agile software development for
the next edition of the Computer Science
Handbook
(the third edition will say a lot more about software engineering than the
previous two editions) last summer gave me the opportunity to take another look
at some current trends and initiatives in Agile. There are plenty of them out
there, like Neil Maiden’s work on introducing
creativity techniques in agile development. But one that caught my eye in
particular was a recent uptick in activity around the benchmarking of agile projects
– that is, their comparative productivity.
Agile
software development has been presented by proponents as a more effective
alternative to previous methods, but frankly, the evidence has often
been anecdotal. Even the evidence for comparative effectiveness among the various
agile methods has been largely anecdotal. One early exception, of course, was a series of
empirical investigations of pair programming. And Test-Driven Development has
also been the subject of a number of empirical studies. But these are all narrow
studies of particular techniques, and as useful as they are, they don’t capture
the big picture. And with the rise of mindsets like evidence-based software engineering, the anecdotal
claims for the effectiveness of agile methods have started to wear a bit thin
in the past few years.
So
it was interesting to see a very large initiative underway last year at the Central
Ohio Agile Association where a lot of companies got together with QSMA Associates, the software
benchmarking company, to benchmark their
projects.
Particularly interesting is that, because of the QSMA involvement with their
huge database of something like ten thousand projects stretching back 30 years,
the benchmarking is against all kinds of projects in industry: big, small,
agile, non-agile – in other words, the big picture of how Agile measures up in
the world of software development.
I
found this quite intriguing and talked with the folks at QSMA about it, and
they agreed to participate in an agile benchmarking initiative here in Italy. So
when I gave my talk in November at the Italian Agile
Day
in Milan, I made an appeal to round up some participation from the Italian
industry for an agile benchmarking initiative in 2013. It turned out, however,
that there is still a lot of perplexity in the community here around the idea
of benchmarking agile projects, and there were some questions involving points
of view expressed at various times by software engineering luminaries such as
Martin Fowler, Tom DeMarco, and Joshua Kerievsky.
I
was a bit surprised by this perplexity, because the QSMA folks have been working with the
agile community for several years on productivity measurement and
benchmarking. In fact Michael Mah of QSMA worked directly with Joshua Kerievsky
and Jim Highsmith on what was probably the very first agile benchmarking
project ever, at a Canadian medical systems company.
So
I decided to take a look at where the source of all this perplexity might lie.
Since it was explicitly mentioned in a query I received, let’s start with
something Martin wrote. In a post written in
2003, Martin concerned himself with the problem of measuring productivity in a
software development context. Although it’s camouflaged to some extent, there
are actually two questions considered
in that post:
- Can you measure productivity?
- What is the relationship between productivity and value creation?
The
first question is explicit in the title. The second question comes out in the
text.
But
let’s start with the first question: can you measure productivity in software
development? Right away we encounter the famous “Lines of Code” (LOC) problem,
which has two aspects: First, comparing LOC written in two different
programming languages; second, comparing LOC written by two different people.
And indeed, these issues do exist “in the small”: the same program written in
assembler and APL will have very different sizes; and Tom and Jerry may well
solve the same problem with two programs of different sizes. But all serious
productivity measurement has long since stopped working “in the small,” at the
point-individual level. In the large, in the real world, the picture is much
different. The issues you see in individual cases don’t appear in the large. For
one thing, programming teams today are generally using modern high level languages
of similar complexity. And, as Fred Brooks pointed out in his classic piece No Silver Bullet while
discussing the impact of programming languages on productivity, “to be sure,
the level of our thinking about data structures, data types, and operations is
steadily rising, but at an ever decreasing rate.”
On
a personal note, I got my first inkling of that back when I was a student at
Yale. When discussing learning a new programming language, my advisor Alan
Perlis – winner of the first Turing Award, coiner of famous epigrams on programming,
and as Dave Thomas reminded me last year in Munich, the true ur-father of
design patterns with his programming
idioms
– suddenly blurted out with a wave of his hand, “Oh, they’re all the same
anyway.” And this was the person who had led the team that created Algol60, the
“mother’s milk of us all,” as he put it, and the most famous APL freak in
history. I don’t want to downplay the effect of programming languages too much,
and of course it’s always good to try to use a solid, modern programming
language and the right one for the job. I’m just saying that in the grand
scheme of things, that’s not really where the issues lie in productivity
measurement today. And the QSMA experience over thirty years confirms that,
covering every class of software application.
On
the other hand, the second aspect of the LOC question – different results from
different people – might be cause for some concern. But once again, such
concern comes mostly from thinking “in the small.” Sure, two individual programmers
might code more or less, but this is mostly meaningless in the real world. It’s
a rhetorical argument. What we are observing in actuality is what is
accomplished by teams working “in the
large”, complete with stand-ups, iteration demos, and in many cases, pair-programming
– complete with on-the-spot code reviews. Good agile teams build the right code
– not too little, not too much – to deliver the desired functionality described
by stories and, say, story points. (Rarely are we talking about two programmers
having a competition, like the good old days of APL “one liners” and that sort
of thing.)
Moreover,
there is a corresponding time to
build that system, by a certain-sized
team, with a given amount of effort
and cost, at a given level of quality. Suppose a team of 12 people takes 5 months and
successfully delivers working software for 83 stories, totaling 332 Story
Points. After 10 sprints they finish the system. It comprises Java and
XML totaling 32,468 new and changed lines of code. During QA, 48 bugs are found and
fixed, and the system is put into service. One
could look at research statistics and observe a comparison: the industry average
for this same amount of software is 16 people taking 6 months, with
96 bugs in the code generally found and fixed during QA.
In
short, this hypothetical team has delivered working software with half the
defects, a full month faster than average. Kent Beck would declare
Agile more successful on exactly those grounds. Jim
Highsmith would claim that by focusing on quality (cleaner code), we received
the benefit of delivering faster (less effort, and thus less time, required to test
out bugs at the end).
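Purely as an illustration, the hypothetical comparison above reduces to simple arithmetic on those three numbers. This sketch is not a QSMA formula or any official benchmarking model – it just computes the ratios discussed in the text from the made-up figures:

```python
# Illustrative arithmetic only: these are the hypothetical team and
# industry-average figures from the example above, not real benchmark data.
team = {"people": 12, "months": 5, "qa_bugs": 48}
industry = {"people": 16, "months": 6, "qa_bugs": 96}

size_loc = 32_468  # new and changed lines of Java/XML

for label, d in (("team", team), ("industry avg", industry)):
    effort = d["people"] * d["months"]  # effort in person-months
    print(f'{label}: {effort} person-months, '
          f'{size_loc / effort:.0f} LOC/person-month, '
          f'{d["qa_bugs"]} QA defects')

# The hypothetical team ships a month earlier with half the defects:
assert industry["months"] - team["months"] == 1
assert industry["qa_bugs"] / team["qa_bugs"] == 2
```

The point of laying it out this way is simply that size, time, effort, and defects are all plainly countable, which is what makes the benchmarking comparison possible at all.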
So
in actual practice, these ways of looking at productivity are considered meaningful
and important to Agile leaders, and are absolutely measurable – every project
has those three numbers: size, time, effort. Both Mike Cohn and Jim Highsmith
reference the work of the QSMA folks in their latest books. Sure, you can discuss
the techniques used for measurement, such as story points. Joshua points out in
his blog post that we can now
do better than story points and velocity. No problem – we should always go with
the best techniques available – but he isn’t taking issue with agile measurement
in itself, and neither are the others.
So
where is the problem people are
seeing? This brings us to Martin’s second question: “What is the relationship
between productivity and value creation?” Martin has been very good over the
years at reminding us that simply delivering code isn’t the ultimate goal of
software development. The ultimate goal is delivering business value, and it’s
not necessarily in a strict, lockstep relationship to productivity. As
they say in the oldest discussion in the world, size doesn’t always matter:
Martin immortalized JUnit, that little gem,
with the Churchillian phrase “never in the field of software
development have so many owed so much to so few lines of code,” and offered
similar (deserved) praise to the short No
Silver Bullet essay of Brooks. Time doesn’t always matter either, as Martin
once pointed out: Windows 95 was finished way over schedule and budget, but
ultimately generated enormous value for Microsoft.
The
question of productivity and value pops up in many areas, of course.
Eighty-six-year-old author Harper Lee has written only one book – and that was
over 50 years ago. But To Kill a Mockingbird was
voted in at least one poll as the greatest novel
of all time.
(And while we’re on the subject of business value, it’s also the ninth-best
selling book ever.) On the other hand, a couple of years ago there was a competition in the Washington Post to become their newest
opinion writer, and one of the requirements was to demonstrate the ability to
produce a full-length column, week after week, year after year. Famously, Ernest
Hemingway would spend an entire day suffering over a single sentence; Isaac
Asimov reeled off books effortlessly. Mozart spun out his masterpieces at a
dizzying rate; Beethoven labored slowly to produce his.
So
what’s going on here? Aside from the caveat that you can only take parallels
between art and technology so far, the key to making sense of all this is to
remember an important fact: Value is
created at the level of the business. Now, the agile community talks about
this a lot, but I often get the feeling they’re talking around it rather than about it, so let’s dwell on it for a few
minutes. The Lean folks in particular have known for a while that value creation
is at the level of the business – I remember Mary Poppendieck nodding
vigorously when I said this at my keynote at the XP2005
conference in Sheffield, and they have continued to develop the idea. So this
discussion isn’t that new in the agile community. Yet it doesn’t seem to have
been fully digested yet.
Those
of you who work in the safety critical systems community will have heard people
say that safety is an emergent
property of the system. By this they mean that you can’t tell at the level of
individual parts, no matter how well made, whether a system is safe. Safety can
only be evaluated within the overall context in which the system will be used.
Analogously, you could say that value is emergent at the level of the business
– it is determined by the overall business context in which the product is
embedded. It is not determined at the
level of operations; and software development is at the level of operations. This
can be a hard pill to swallow, especially for us agilists who talk a lot about
delivering business value – and now we hear that we don’t produce it directly. But
it speaks directly to Martin’s point about how productivity won’t automatically be a determinant of
business value, because they’re at two different levels. But it’s a copout to
stop there: just because it isn’t an automatic
determinant doesn’t mean it’s irrelevant – far from it. So let’s continue the analysis.
If
you can’t judge business value at the level of operations (e.g. software
development), then how do you do it at the level of the business? Just as
safety assessors work with a framework for judging safety at system level, you
need a framework for understanding what creates value at the level of the
business. The framework elaborated several years ago by Michael Porter has withstood
the test of time. It’s simple and straightforward. Even in this post-Web 2.0,
networked, social, hyper-ecosystem, time-to-market era, there are still just two
determinants of business value creation: Market Economics and Competitive Position.
Let’s
start with market economics: If you work in “Lucrative market X,” then you’re
simply going to be more likely to create value than if you work in “Struggling
market Y.” This is related to what Tom DeMarco is talking about in his 2009 article in IEEE
Software
that was mentioned in one of the discussions on productivity and measurement.
He talks about choosing a valuable project that amply covers the cost of
whatever resources you put into it, thus needing less control than a project at
risk of not covering its costs. (This is another example of “in the small”
point-comparison. The more general, “in the large” version is working in a more
attractive market.)
This
brings us to the second determinant of business value creation. Within a market, it’s your competitive
position that will determine whether you create business value. (And by the
way, the companies with strong competitive positions even in a weaker market
are still likely to outperform those with weaker positions in more lucrative
markets.) Here again, the framework is surprisingly simple. There are only two
ways to improve your competitive position: successful differentiation and economic
cost position. Differentiation is all about creating that special something
that the customer is willing to pay for. Apple is legendary for this type of
business value creation, and much of the current discussion around innovation
fits in here. Note, by the way, that differentiation isn’t as easy or as common
as it might seem from all the hype. There are even signs that Apple is
faltering in that department, if you’ve been watching the markets recently. Economic
cost position is essentially about lower operating / production costs. Much of
software process and quality improvement (including Agile) is about this, of
course.
Agile
processes do a great job of supporting all of these strategies for business
value creation through improved competitive position – I wrote more about it here – but they only
support them. Consider a phrase like
“ … satisfy the customer through early and continuous delivery of valuable
software.” First of all, think about that expression “valuable software.” Given
that at the operational level you simply can’t determine whether the software
is valuable, the real meaning is more like “the software that the customer is
asking for.”
Considering
that the goal of waterfall processes is also to deliver what the customer is
asking for, the most that agile processes can claim here is that they are more
likely to deliver what the customer is asking for. Fair enough, and I happen to
agree, but that’s not exactly a strong argument for value creation. It must be
the “early and continuous delivery” part, then. And indeed it is: early and
continuous delivery supports both a differentiation and an economic cost
strategy for value creation through competitive position. If it’s
differentiation the customer is after, then delivering features early and
continuously gives the customer the chance to test them to see whether the
customer really will pay for them. (And if it’s innovation you’re after, let me
mention once again Neil Maiden’s work on injecting creativity techniques into
agile processes.) And I think we’re all in agreement about the potential of
agile processes to produce more with lower costs and defects to support an
economic cost position strategy for business value creation. That’s
Productivity with a capital “P”.
The
point is that although high productivity isn’t an automatic, a priori guarantee of value creation in
all cases (it doesn’t entirely capture the essence of Market Economics or some
aspects of differentiation / innovation – which is where many of those
exceptional examples come from, and which has to be analyzed at the business level),
it is absolutely an important factor in the operational support of the most frequent
and important business value creation strategies based on Competitive Position
in one’s chosen market. Generally, and in the large, if your team productivity
is higher, in real-world practice you are almost certainly making a strong and
direct contribution to the creation of business value through a strengthened
competitive position.
In
conclusion I’d like to mention one last software engineering luminary. Philippe
Kruchten, who is particularly well-known in the architecture community but also
in the agile community (he recently co-edited a special issue of
IEEE Software on Agility and Architecture), attended the ten-year agile manifesto anniversary
reunion
and listed some observations about “elephants in the
agile room”
in his blog afterward. Note that one of his biggest elephants (undiscussable
topics) is resistance in the agile community to gathering objective evidence.
Philippe
suggests that this resistance in the agile community to gathering objective evidence
is part of a “latent fear that potential buyers would be detracted by any hint
of negativity.” But aside from considering that buyers will also be deterred
by a reluctance to back up claims with real evidence, there is a much more
positive way to look at it all: buyers and managers who see real objective
evidence of superior productivity with Agile will be a lot more willing to
invest their money in it. And they’ll even be happy to see objective evidence
of where problems lie, because they’ll know they can efficiently spend the
money exactly where it’s needed to fix those problems.
And
finally … without metrics, you’re just someone with a different opinion, as
Steven Leschka of HP once said. Are we really going to just give up on gathering
this objective evidence and leave the field open to Agile’s detractors and
their different opinions? So far, the results of the Ohio agile benchmarking
study have been very positive – certainly no reason for latent fear and concern
– and I hear that an agile benchmarking initiative is starting up in Germany
with several companies already onboard. I hope the Italian Agile Benchmarking
Initiative gets similar participation.