The first essay in this series on knowledge work discussed the importance of understanding who the customer is and of creating feedback loops to determine what’s valuable about the work. We defined value as “everything the customer is willing to pay for” and everything else as waste. The obvious next question is how to eliminate that waste. That sounds simple, but sadly the majority of knowledge workers and managers get it wrong. At best, their improvement projects are a waste of time; very often they actually make things worse! That may sound hard to believe, but just as in the first essay, the underlying problem is a lack of understanding of how systems work. To address that, we’ll construct a very simple example system and look at effective and useless ways to improve it.

   Let’s pretend we’re looking at software engineers again and imagine that the production of valuable working code is the output of a linear, three-step production process. This grossly misrepresents software engineering, which is a very creative process with intricate workflows and nothing like an assembly line, but it keeps the example manageable. Here is what our straightforward process looks like:

   Three teams A, B and C work in a sequential process. At the start of each day, Team A pulls in 10 units of work. Let’s say Team A are business analysts defining product features, Team B consists of coders and Team C are the testers. At the end of each working day, each team passes its work-in-progress on to the next team in the chain, for processing the next day. As you can see, Teams B and C have nothing to work on yet during the first day. But Team B can start coding on the second day, Team C begins testing on the third day, and as of the fourth day finished production code can be shipped to the happy customer.
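   Before we change anything, here is a minimal Python sketch of this toy pipeline (the function and its representation are ours, purely for illustration): each team has a daily capacity, unfinished work queues up in front of it, and completed work moves one stage downstream per day.

```python
def simulate(days, capacities):
    """Toy model of a linear pipeline: one daily capacity per team, in order.
    Returns total units shipped and the work-in-progress left at each team."""
    wip = [0] * len(capacities)       # units waiting in front of each team
    shipped = 0
    for _ in range(days):
        wip[0] += capacities[0]       # Team A pulls in as much as it can handle
        # Process the last team first, so work advances only one stage per day
        for i in reversed(range(len(capacities))):
            done = min(wip[i], capacities[i])
            wip[i] -= done
            if i + 1 < len(capacities):
                wip[i + 1] += done    # hand the day's output to the next team
            else:
                shipped += done       # the last team releases finished work
    return shipped, wip
```

Run with capacities of [10, 10, 10], the first finished units come out of Team C at the end of day three, ready for the customer on day four, exactly as described above.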

   One day Team A has great news: they have found a more productive way of working! For a small one-time investment worth 5 units, they will be able to pull in and handle 20 units of work every day! They talk to their unit leader and financial controller, who agree the proposal is a no-brainer: the payback time is only half a day (a one-time cost of 5 units against 10 extra units per day). Team A enthusiastically goes to work and starts working at a rate of 20 units per day. Teams B and C continue working at the same capacity of 10 units per day. After a couple of days we begin to see the effect…

   Team A delivers on their proposal, but work-in-progress starts to pile up in front of Team B, which has become the bottleneck in the process. Team C doesn’t notice any difference: work still keeps coming in at the usual pace of 10 units per day. There is also no change in the output rate of finished product. The value delivered to the customer is exactly the same as in the original setup, which means Team A’s well-intended productivity improvement is an illusion. To borrow a phrase from the first essay: Team A celebrating this as an efficiency gain is a form of grading their own homework. Nevertheless the majority of financial controllers would sign off on the “business case” of the proposed scenario. I know because they do when I ask them the question in trainings, even when they (rightfully) suspect there’s a catch. In real knowledge work, the flows aren’t as simple and linear as our example: they are typically longer and more complicated, and the execution of tasks by an individual or team typically runs in parallel to other workstreams. As a result it’s a lot harder to spot the bottleneck effect, but the concept of local changes in one part of the system creating unintended consequences elsewhere is every bit as real.
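   The sketch above makes the illusion easy to demonstrate: doubling Team A’s capacity leaves the shipped total untouched and only grows the pile in front of Team B.

```python
# Baseline: every team handles 10 units per day
print(simulate(days=10, capacities=[10, 10, 10]))
# -> (80, [0, 10, 10]): once the pipeline is full, 10 units ship per day

# After the "improvement": Team A at 20 units/day, Teams B and C still at 10
print(simulate(days=10, capacities=[20, 10, 10]))
# -> (80, [0, 110, 10]): identical output, but 110 units of work-in-progress
#    have piled up in front of Team B after ten days
```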

   Our example is a greatly simplified software development flow, but it may actually strike you more as an assembly-line-style physical production process. Managers of physical processes have understood the bottleneck effect far better, and for far longer, than knowledge workers. The problem is literally visible when work-in-progress inventories pile up at a certain workstation in the production chain. These inventories also come with a working-capital cost, putting them on the radar of finance managers. In manufacturing, the bottleneck effect and its remedies are widely known as Eliyahu Goldratt’s “Theory of Constraints”. In knowledge work, on the other hand, few people are aware of these concepts – with the exception of software engineers.

   We’ve already touched on a couple of reasons why detecting and managing bottlenecks is harder in knowledge work than in manufacturing, but there are a few more. The workflows are not as linear and fixed as a standardized manufacturing process: in software, a team of coders is working not only on new features but also on a variable and unpredictable stream of bug fixes. In many knowledge work activities, the standard workflow or process is not even well understood or documented in the first place. And even for a well-understood workflow, the subprocesses may still be variable or unpredictable. From one project to another, a specific team may have more or less work to do, and this is usually not known upfront, as detailed scope or problems only get revealed while the project is already running. Individual knowledge workers usually combine different tasks and projects and will always have a full “To Do” list, making it hard to see what work in progress piling up looks like. To generalize, physical manufacturing is more standardized into repeatable, stable and predictable production steps, while knowledge work processes still have a higher percentage of non-standardized work, which makes the bottleneck a slipperier concept.

   It can be tempting to try to organize knowledge work processes in a more factory-like, standardized fashion. For highly repeatable processes, where many identical “output units” need to be produced, there is no reason this wouldn’t work, and the Theory of Constraints methods can be directly reapplied. On the other end of the spectrum, there’s work that results in a completely unique or bespoke product each time. Bottlenecks aren’t a big problem there: where there’s no repetition, there’s no bottleneck. The hardest cases are in the murky middle, where the processes mix repetition and uniqueness. Certain elements can be standardized, but overdoing it will squeeze out all the creativity – which is often the main source of value creation. Let’s also keep in mind that knowledge workers aren’t immovable machines but living and breathing human beings who join or leave, train and learn, get sick, make mistakes, motivate each other, try new things, innovate… It doesn’t take much for the cure to be worse than the disease.

   So what can we do to deal with the murky middle? First, we have to correctly define the problem we are trying to solve. If you look back at our simplified example, the goal is to produce as much value as possible per unit of time. That translates to minimizing the amount of time the average unit spends in the production system, which gives us a first practical indication of where to locate the bottleneck: it is the place in the system where the average unit loses the most time. It is crucial to realize that a bottleneck is a problem at the level of the entire system, not at the level of an individual local component. This insight cannot be overstated, yet it is almost always misunderstood in real life. In our example, it will look to most observers like Team B has a problem, but that is not the case. Team B does exactly the same before and after the intervention, which is the clearest possible proof that the problem is not theirs. Another proof point is that we can’t resolve the system issue at the Team B level: if we somehow make Team B capable of processing 20 units per day, we still don’t increase the total output of the system. The bottleneck just shifts to Team C. But this puts us on the right track to make real improvements to “murky middle” work processes.
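   The earlier sketch confirms this shifting-bottleneck effect: give Team B a capacity of 20 as well and the system still ships the same total, with the queue simply moving one stage downstream.

```python
# Teams A and B both at 20 units/day, Team C still at 10
print(simulate(days=10, capacities=[20, 20, 10]))
# -> (80, [0, 20, 100]): output is unchanged; the work-in-progress pile
#    has simply moved from Team B to Team C
```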

   It’s pretty abstract, but we now have a recipe for true improvements to opaque knowledge work processes. The first thing to do is measure how much time a unit of work spends at each step in the process; the step where the average unit loses the most time is the bottleneck. Any improvement idea that doesn’t address this bottleneck is wasted effort, as it won’t do anything to speed up the throughput of the system. But if we do plan an improvement at the bottleneck, we first need to consider what the new production system in its entirety will look like: it is possible we need to fix a few other things before we get the result we want. All of this is easier said than done. It is not always easy to define a reasonable “unit of work”, and measuring the time spent at each process step is also difficult: quite often a lot of it is waiting time and only a small fraction is actual processing time. But at least we now have the right framework to think about real improvements. Astonishingly enough, that is not yet true for the majority of knowledge workers and their managers.
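   As a minimal illustration of that measuring step, here is how it might look in practice, assuming a hypothetical event log that records when each unit of work enters and leaves a stage (the stage names, tickets and numbers below are invented):

```python
from collections import defaultdict

# Hypothetical event log: (ticket, stage, day entered, day left)
events = [
    ("T1", "analysis", 0, 1), ("T1", "coding", 1, 5), ("T1", "testing", 5, 6),
    ("T2", "analysis", 1, 2), ("T2", "coding", 2, 8), ("T2", "testing", 8, 9),
]

# Average dwell time (waiting plus actual processing) per stage
dwell = defaultdict(list)
for _, stage, entered, left in events:
    dwell[stage].append(left - entered)

for stage, times in dwell.items():
    print(f"{stage}: {sum(times) / len(times):.1f} days per unit on average")
# "coding" dominates at 5.0 days versus 1.0 for the other stages, so that
# is where to look for the system-level bottleneck
```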

(First published: April 2022)
