So busy, so much to do, and nothing gets done
In my last post I talked about value streams and how we can use this concept to change how we think about building software.
In this article I want to talk more about the most common challenge I have seen to achieving high flow through a value stream: we have too much to do, we’re all super busy, but it takes forever to get anything done.
Here are some common ways I hear this problem manifest:
- All we do is work on features, we never have any time to clean up tech debt
- I’m in meetings all day and have no time to get any focused work done
- Standups are useless — everybody is working on a different feature
- We spend way too long in planning, trying to get everyone aligned
- Everything is blocked and only half-done because we are waiting for other teams to deliver what we need from them
There are two core causes of this. They are somewhat interrelated, and are direct consequences of the behavior of queues and distributed systems.
More work in the queue increases wait time nonlinearly
In manufacturing, we can push a lot of work through the system quickly because the rate at which work arrives and the time it takes to get a job done are tightly controlled.
But with software it’s just not like that. We really aren’t able to control the rate at which new work comes in, and we have very little control over how long it takes to get something done.
In queuing theory, this type of system has been well characterized and studied. As I mentioned in my last post, the math shows that when arrival times and work times vary, the time it takes to get something done (what we nerds call cycle time) climbs steeply as the system approaches full utilization, and becomes effectively infinite when you pile on too much work.
You can try to address this by adding more resources, but that takes time and often isn’t possible, as many of us well know.
A more effective approach is to set some boundaries, stop saying yes to everything, and put a limit on how many things you commit to. This is often called “limiting work in progress.”
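To make that concrete, here is a minimal sketch in Python. It assumes the simplest textbook model, a single-server M/M/1 queue (random arrivals, random work times, one team pulling from one backlog). That model is my simplification for illustration, not a description of any real team, and the function name and the two-day work time are made up for the example.

```python
def mean_cycle_time(work_days: float, utilization: float) -> float:
    """Average time in the system (waiting plus work) for an M/M/1 queue.

    work_days:   average hands-on time per item (hypothetical input)
    utilization: fraction of capacity that is already busy, from 0 up to 1
    """
    if not 0 <= utilization < 1:
        # At 100% busy or more, the queue never drains in this model.
        raise ValueError("utilization must be at least 0 and below 1")
    return work_days / (1.0 - utilization)


# Same two days of actual work per item, at increasing levels of busyness.
for utilization in (0.5, 0.8, 0.9, 0.95, 0.99):
    days = mean_cycle_time(2.0, utilization)
    print(f"{utilization:.0%} busy -> about {days:.0f} days per item")
```

With two days of real work per item, the cycle time in this model is 4 days at 50% busy, 10 days at 80%, 20 days at 90%, 40 days at 95%, and 200 days at 99%. The work did not get any harder. The extra time is all waiting.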
A great metaphor for this is the metering lights on freeway onramps.
When you are sitting in line at the metering light and need to get to work, you can feel frustrated. But when you remember that this temporary slowdown will actually get you to your destination much faster, you’re more willing to wait.
Cap your committed backlog
I want to emphasize that this is not just about capping the number of things you are actively working on. It’s also about the number of things you have committed to delivering, which I call your committed backlog.
Those committed items are things that your stakeholders believe you will be delivering, and you have given them an estimated date.
The longer the queue, the longer they have to wait for something to get done. Not only that, but the more things you add to the queue, the more pressure a team feels to deliver on its promises. It also becomes significantly harder to accurately predict when something will actually be done.
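A rough way to put a number on this is Little's Law, which says that in a stable system the average time an item spends in the queue equals the number of items in the queue divided by the rate at which items get finished. The sketch below is only that back-of-the-envelope calculation. The function name and the sample numbers are hypothetical, and real teams are rarely this steady.

```python
def expected_wait_weeks(committed_items: int, finished_per_week: float) -> float:
    """Little's Law: average time in queue = items in queue / throughput."""
    if finished_per_week <= 0:
        raise ValueError("finished_per_week must be positive")
    return committed_items / finished_per_week


# A team that finishes about two items a week, with two backlog sizes.
for backlog in (6, 30):
    weeks = expected_wait_weeks(backlog, 2.0)
    print(f"{backlog} committed items -> about {weeks:.0f} weeks of waiting")
```

For a team finishing two items a week, a committed backlog of 6 items means an average wait of about 3 weeks, while a backlog of 30 means about 15 weeks. Nothing about the team changed. The only difference is how much was promised.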
To ensure your committed work gets done on time, you also need to ensure all your dependencies are lined up, so this increases the need for coordination and managing dependencies (more on this below).
For these reasons it’s better to just not commit. Be clear on what you’re going to focus on, and the rest is a no until you finish what you’re currently focused on.
Outcomes, not outputs
One great way to keep your committed backlog low is to commit to an outcome or a goal rather than a set of outputs.
So, rather than saying “we’re going to ship widget search redesign in Q1, add video support in Q2, and enable user comments in Q3” it’s much better to say “our goal is to increase conversion rate by 5% by the end of the year.”
Then when you report status, rather than a red/yellow/green dashboard tracking against milestones and deadlines, you talk about what you’ve done, how you’re tracking to your goal, what you’ve learned, and what your next steps are.
Coordination causes significant delays
Those of us who work on large distributed systems have learned the hard way that at a certain scale, the only way you can maintain both performance and reliability is to move from blocking to non-blocking architectures. From the CAP theorem to optimistic concurrency to eventual consistency to event-driven architectures, we are constantly finding ways to intelligently let go of guaranteed correctness to achieve scale and availability.
These same exact rules apply to the large distributed system that is software development.
My experience is that blocking happens in two major ways.
Handoffs between teams
You can’t deliver value until every step of the value stream is completed. And if those steps are owned by multiple teams, each with different backlogs and priorities, you will be overwhelmed by overhead (planning and meetings) and frustrated by delays waiting for someone else to get their piece done.
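Here is a small sketch of why handoffs hurt so much. It adds up the hands-on time and the waiting time for each step of a value stream, then reports what fraction of the total was actual work (often called flow efficiency). Every name and number here is invented for illustration, not measured from a real team.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Step:
    owner: str        # team that owns this step (hypothetical names)
    work_days: float  # hands-on time for the step
    wait_days: float  # time spent sitting in that team's backlog first


def lead_time(steps: Iterable[Step]) -> float:
    """Total elapsed days: every step's waiting plus its work."""
    return sum(s.wait_days + s.work_days for s in steps)


def flow_efficiency(steps: Iterable[Step]) -> float:
    """Fraction of the lead time that was spent actually working."""
    steps = list(steps)
    return sum(s.work_days for s in steps) / lead_time(steps)


# The same six days of work, split across three teams with separate backlogs...
across_teams = [
    Step("design", work_days=2, wait_days=10),
    Step("engineering", work_days=3, wait_days=15),
    Step("operations", work_days=1, wait_days=5),
]
# ...versus one team working from a single shared backlog.
one_team = [
    Step("stream team", work_days=2, wait_days=1),
    Step("stream team", work_days=3, wait_days=1),
    Step("stream team", work_days=1, wait_days=1),
]

for name, steps in (("across teams", across_teams), ("one team", one_team)):
    print(f"{name}: {lead_time(steps):.0f} days, "
          f"{flow_efficiency(steps):.0%} of it spent working")
```

With these made-up numbers, the same six days of work take 36 days when they pass through three separate backlogs and 9 days when one team owns all three steps.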
The best way to address this is to strive to organize your teams to align with your value streams.
This includes all the steps and people involved in the development value stream (strategy, marketing, product, design, engineering, experimentation, IT, operations, support, legal, compliance, security, etc.), across all the steps of the operational value stream.
When you do this, you are able to eliminate handoffs and reduce overhead. You have one team, or a set of teams, with a shared leadership, working towards the same goals and deliverables, with the same priorities, against the same backlog.
Value streams are hierarchical and there are cases where you need to make use of commodity or specialized tools and services, so there is some nuance here. But the guiding principle is clear: organize to optimize for flow and to eliminate handoffs and coordination.
For more details on organizing for flow, I highly recommend learning about Team Topologies.
Waiting for signoffs, approvals, and validations
When we try to guarantee correctness by getting manual signoffs, approvals, and validations, we lose out on flow. Research from the Accelerate team actually indicates that heavyweight change approval processes slow delivery down without improving stability.
The significant delays introduced by our attempts to guarantee correctness have a major cost to the business — unrecognized revenue, increased competitive risk, lost opportunities, increased overhead costs, and higher employee turnover to name a few.
A great image that helps me think about the impact of requiring centralized approval and control is a squad in a battle. You may help ensure they make the right decision by requiring someone in the squad to go up the hill to the general’s tent every time they need to make a decision, but it’s a great way to lose the battle.
We can trade off our attempts to guarantee correctness and improve flow through the value stream by pushing decision making down to the value stream teams.
Give them the necessary business and architectural context and guidance, and set them free to make the day-to-day decisions. A great video that really captures the essence of this practice is this one by David Marquet.
Bring about big change through small steps
I have shared some of the systemic causes, principles, and high-level approaches to addressing the problem of being super busy but nothing gets done.
But what this looks like for you depends very much on your situation. There is no playbook that you can just apply. You need to find your own way, one that works for your culture and your particular set of problems and goals.
You also can’t just wave a magic wand. You will need to convince your leadership to invest in the change by demonstrating success. The best way to do this is to start small, think big, and learn fast.
In particular, an approach I have found works really well is to find a team that has the appetite and interest to try something new. Then you can all work together to apply the scientific method: define your hypothesis, identify an experiment to prove or disprove it, and evaluate the outcome. Evaluate, assess, and repeat.
As you start to see results, share your learnings with others. Demonstrate the impact of the changes you have made. Nothing helps reduce fear and anxiety more than a set of peers sharing their journey and answering your questions. If you’re finding real results, it won’t be long before others join in, and you’ll find it has a snowball effect.