
Lies, Damned Lies, and Agile Metrics

Why do we measure things, or in this case, agile teams?  We might want to know what is likely to happen in the future (prediction), whether we're getting any better (progress), or how much value we're getting out of one team versus another (productivity).  To make sound business decisions, we need to know about all of these things.  But too often we forget that measurements of human teams building new software are not exact.  Not only that, they sometimes conflict with the qualitative results we see in reality.  As an agile leader, it's important to counterbalance graphs and calculations with quite a bit of Management By Walking Around, a healthy dose of Gut Feel, and a heaping portion of Actually Using The Software Yourself.

Please realize that I’m not arguing against metrics.  They’re a crucial part of any agile endeavor and the lean enterprise.  But let me tell you a story about measuring teams.

Which Teams Are Giving The Most Value?

There once was a product with teams working in two countries.  Several of those teams were in the United States, where tech salaries tend to be quite high.  Others were in a different country – let's call it Freedonia as a Marx Brothers tribute – where team cost was significantly lower (60-80% of the cost of the US teams).  All teams estimated stories using Fibonacci-series points (1, 2, 3, 5, 8, 13 …), and management tracked each team's velocity over time.

After about a year of this, management looked at the velocity charts, and it was very apparent that the Freedonia teams were consistently delivering more story points per sprint.  When they factored in the lower cost of the Freedonia teams, the difference was even more stark.  Conclusion?  Invest in expanding the Freedonia teams!
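Management's naive math looked something like the sketch below.  All of the figures here are invented for illustration; the story above only says that the Freedonia teams cost 60-80% as much and reported higher velocities.

```python
# Naive cost-per-story-point comparison, using made-up numbers.
us_cost_per_sprint = 100_000        # hypothetical fully loaded US team cost per sprint
freedonia_cost_per_sprint = 70_000  # ~70% of US cost, consistent with the 60-80% range
us_velocity = 35                    # story points delivered per sprint
freedonia_velocity = 80

us_cost_per_point = us_cost_per_sprint / us_velocity
freedonia_cost_per_point = freedonia_cost_per_sprint / freedonia_velocity

print(f"US: ${us_cost_per_point:,.0f} per point")
print(f"Freedonia: ${freedonia_cost_per_point:,.0f} per point")
```

On these numbers, Freedonia looks more than three times cheaper per point.  That is exactly the trap, as the problems below show: the points themselves were not comparable.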

Hm, you think.  He wouldn’t be telling this story if that was the end of it.  And you’re right.  Here are just a few of the problems with the conclusion that was drawn.

Problem 1 – Lack of Normalization – This one's the easiest to spot and fix.  The teams in the two countries were not estimating relative to the same baseline; the Freedonia teams tended to assign much larger point values to comparable stories.  So the US teams would consistently estimate and complete 35 points per sprint while the Freedonia teams sometimes did up to 80!
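One common fix is to have all teams estimate the same small set of reference stories and use the ratio of their totals as a rough conversion factor between point scales.  A minimal sketch, with invented estimates (the only numbers taken from the story are the 35- and 80-point velocities):

```python
# Normalizing velocities against a shared baseline (hypothetical estimates).
# Both groups estimate the same four reference stories in their own scale.
us_reference_estimates = [3, 5, 8, 2]
freedonia_reference_estimates = [8, 13, 13, 5]

# Ratio of totals approximates how the scales relate.
factor = sum(freedonia_reference_estimates) / sum(us_reference_estimates)

def normalize(velocity, conversion_factor):
    """Express a team's velocity in the baseline (US) point scale."""
    return velocity / conversion_factor

us_velocity = 35
freedonia_velocity = 80

print(f"Conversion factor: {factor:.2f}")
print(f"Freedonia velocity in US-scale points: {normalize(freedonia_velocity, factor):.1f}")
```

With a conversion factor applied, the Freedonia teams' 80 points land in the same ballpark as the US teams' 35, and most of the apparent productivity gap evaporates.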

Problem 2 – Externalities – If you’ve studied economics, you know that externalities are (roughly) costs or benefits that don’t get quantified as part of an economic transaction.  In this case there were some major unbalancing issues in play.  Most notably, the US teams and their associated business colleagues spent a very large amount of time attending to the technical and requirements needs of the Freedonia teams, because of their overall lower level of familiarity with the product.  That is, the Freedonia teams were creating some drag on the US teams (through no fault of their own).  Also, due to time zone, language, and product owner location issues, the cost to transfer information to the Freedonia teams was much higher.

Problem 3 – Customer Value – Remember, story points and SAFe's 1-10 business value scale don't equal customer value.  Customer value equals customer value.  At some point, someone decided to step way back and look at a list of the features delivered by each set of teams over a period of about 12 months.  It looked something like this:

US Teams:

    • Internationalization of product
    • Metrics dashboard and widgets
    • Task management
    • Custom fields
    • HTML5 screen template builder
    • Complex international accounting
    • Third party accounting software sync
    • Roles and permissions system
    • Product tiering

Freedonia Teams:

    • Accounting sync enhancements
    • Bulk document upload
    • Task management (later redone by US teams)
    • Page-specific permissions
    • Spike - doc management (not implemented)
    • Spike - additional accounting sync

Without knowing too much about the specific product, it's pretty obvious that the US teams delivered many more meaty, releasable features, while the Freedonia teams took more of an assisting role, extending existing features. Several spikes became sunk costs because they never shipped as released features, and most notably, one feature had so many design flaws and bugs that it had to be rewritten by a US team.

This doesn’t imply that the US teams were “better”; but they did have more domain knowledge and the advantage of being collocated with the Product Management, architecture, and UX teams. At the very least, deciding to expand in one location over the other should have taken into account the qualitative track record and the other situational advantages of the US teams.

Change Is Gonna Do a Number on Your Metrics


Change is constantly happening and affects the viability of your metrics.

My background is in science, where experiments are tightly controlled and the goal is, when possible, to change one variable at a time in order to determine its effects on a system. It’s important to realize that although we talk about agile experiments, they are not and will never be scientific experiments. You’re not building the same feature multiple times (I hope), and products, teams, and environments change so frequently that most metrics need to be taken with a shaker of salt.

Off the top of my head, here are a few things that can affect team velocity (and delivered business value), which is one of the most important metrics in agile teams:

    • someone new joined the team
    • someone left the team
    • the Definition of Done was made more stringent
    • production issues or other operational distractions
    • new Product Owner or ScrumMaster
    • vacation or illness of key personnel
    • larger organizational changes
    • moving offices
    • variable sprint length (yes, people do this)
    • quality of stories and requirements for a specific feature
    • new technologies being introduced

I’m sure there are others, but you get the point. You really aren’t ever comparing apples to apples when you’re measuring your teams.

Baby, Bathwater, Etc.

Having said all this, I love me some metrics. I love pretty graphs and charts, especially when I can draw a trend line through them that shows teams getting faster and better. But I try to be aware of their limitations in describing what’s really happening in my teams and why. Most importantly, I balance them with cultivating a deep knowledge of my teams and products as they exist in the real, non-mathematical world. I talk to the developers and testers. I play with the software and read users’ anecdotal feedback.

I’ll leave you with a short checklist you can use to evaluate your metrics and be appropriately careful about the conclusions you draw from them:

    1. Certainty – What is the level of certainty of this metric, and do all parties understand the error margin? Or are some parties taking SWAGs (Sophisticated Wild-Ass Guesses) as hard truth?
    2. Hackability – How likely is it that people are deliberately or subconsciously gaming the metric to please management?
    3. Variability – What changes to the teams or system are occurring over time that might affect the metric?
    4. Predictiveness – If you step back and look at the team or system in a qualitative, not quantitative, way, does the metric square with what you’re seeing on the ground?

Please think about this and balance metrics with other observations, and do your best to ensure that more perspectives than just raw statistics make their way up and down the leadership chain.  Good luck and happy measuring!

The Product Backlog: A Journey In The Mist

I've been reviewing an upcoming agile book, and there's a nice discussion in it about the various metaphors people use to describe product backlogs in Scrum.  They all get the point across, but I thought I'd try to come up with an alternate visualization.  And then I had a dream about it (really)!

Something’s Bugging Me About Icebergs

One of the most common backlog metaphors is the iceberg, where a small percentage of the backlog is very well defined and visible at the top (this math says it's about 13% for actual icebergs) and the rest is progressively more coarse-grained and murky as you go down, until at the bottom you've got Big Ideas For The Future that aren't clear at all.  The idea is that this is a good thing, because Lean principles tell us not to spend too much time worrying about low-priority features we may never build.  However …

Let’s play a quick word association game.  ICEBERG!!

I bet you thought “Titanic”.  So did I.  It turns out that the 87% of the iceberg below the water is a big problem if you're trying to navigate.  With an actual iceberg, it's *not* ok that we don't know much about what's down there, whereas with software development we've got to accept that and deal with it.  Also, we can't see the underwater part at all without special sensing equipment – versus the sprint/current release/future release gradation of a backlog – so the analogy kind of breaks down.

Stories In The Mist

So, here’s my metaphor for the backlog:  You’re walking through a misty landscape where the fog obscures your view beyond a few dozen yards.  You need to get to your destination – a small rural town which you know lies to the north – but you can’t see it.  You do have a compass to point the way, though.  You know you’ll have to cross rivers, rocky areas, and other obstacles but again, most of them are hidden from view.  Luckily, the colors and details of closer features stand out crisply against the white background.  You see some rocks and roots in the path and an incline to climb, so you note them as the most important and move forward.

As you reach the top of the small slope, more features emerge from the mist, and now you can see some vague larger shapes in the distance, including what looks like a foot bridge over a river.  You pick that as your goal and as you keep going, more trees and rocks reveal themselves with enough fidelity that you can navigate past them.

You doggedly hike on in the right general direction, picking short and medium distance goals and stopping every now and then to rest and check your compass.  Eventually you begin to see lights piercing through the mist in the distance and this gives you energy to quicken your pace, knowing that you’re almost there.  You hear the sounds of the town emerge: car tires on pavement, a dog barking, and you step out of the fog and the woods onto the main road leading into the town proper.  You’re just a mile away from a comfortable bed and a satisfying meal.  You did it!

What I Like About This

I like this metaphor because it emphasizes that while it’s important to have long-term goals (heading north to the town), most of the effort is in assessing the immediate and imminent surroundings to make good tactical decisions:  step over these roots, head for that bridge, etc.  It also adds a sense of movement.  You’re constantly traveling forward as you achieve small milestones, and that movement itself is what reveals and clarifies the next steps.  Even if you have some prior knowledge of the landscape, there are going to be surprises and that’s ok.  A bridge is out; a path is washed out by a storm or blocked by a tree.  There are no guarantees that the set of actions you end up taking to meet the goal are the same ones you expected to take.

What do you think?  Is this a useful way to think about the backlog?  Leave me a comment and let me know!
