Matter of scale

1. Wrong is expensive

There is no right or wrong in software, only cheap and expensive.

It may look like a strange statement. Artificial languages seem to be the epitome of logic: if a conditional expression is false when it is supposed to be true, then it is wrong, isn't it?

However, because the meaning of software is in its usage, it cannot be right or wrong in itself. If the truth about it exists at all, it has to be outside of software, since whether software is right or wrong is decided by its usage. In this sense, it is like a hammer, or a gun, or any other tool. Moreover, a software library often is actually used by another library or software system, the usage of which lies yet further outside, and so on: usage upon usage upon usage. The same happens to the design and development process, the structure of which, as I tried to show in the previous chapter, mirrors the structure of the code: the layers of performative negotiation have two sides, expressive (negotiation) and pragmatic (performative). Thus, the development process cannot intrinsically be "right" or "wrong" either. That is decided outside of it.

This may look like abstract theorizing, but it may explain, on the one hand, why in the engineering community there are endless religious wars simmering about the most trivial issues, like using upper and lower case; and on the other, why there are so many speculative and highly inefficient project management practices around.

The latter happens because it is too tempting to mistakenly substitute assumptions, vested interests, personal convictions, ego, or well-meaning but abstract schemes for "logical truth", which is somehow expected to exist.

Pattern languages reinterpret the problem at hand by replacing "finding the right solution" with "balancing forces" [Alexander]. The pattern and its resolution essentially are molded by the forces, and a variation of the forces may give you a solution of quite a different shape.

If the project or team lead does not carefully watch the interaction of forces on a daily basis, and moreover does not devise a method for staying in touch with those forces, it will cause huge damage to the project, not because the project would take "a wrong turn", but because things will rapidly grow expensive: doing too much too early, starting to press things too late, doing the same job many times, etc. The insidious aspect of this damage is that its scale is non-linear: small actions produce quickly magnified adverse consequences, but being small, they are easily overlooked.

Many low-overhead methods to avoid the problem are well-established and work wonders: Scrum, Kanban, or 5-why analysis, to name a few. However, due to their intended simplicity, they tend to remain an empty shell when applied by practitioners who learned them from a 2-day certification course.

In this chapter, I discuss a number of things that constantly have to be on the watch list. The list most certainly is not exhaustive, not only because of lack of space or of my knowledge, but also because all the described techniques are patterns and best practices rather than a unified theory.

2. Expensive things

Coupling, staying out of touch, and speculation are not intrinsically evil (e.g., a quick proof-of-concept prototype may be highly coupled), but all of them have a very high price tag. However, since their price is the time wasted in the [near] future, just as with credit cards, many of us tend to ignore the inevitable pain of payback.

It is especially so in the software business, where the costs often are less material: time, human energy, or opportunities missed. Additionally, the people who bear the brunt of the costs typically are not the ones who incurred them. The cognitive fallacy here is that if something is not measurable, it must be small. Since software to a large extent is not restricted by material constraints, the wasted time or missed opportunities can be monumental.

I am not going to offer metrics for such pitfalls. Instead, I will try to demonstrate that the cost is so high that it is always worth avoiding them, especially because fixing them is inexpensive, even if not trivial.

2.1 Coupling. The rule of the second use case

Coupling, especially semantic coupling, is one of the most expensive things in software development.

Writing decoupled code, i.e. code with no unnecessary dependencies, does not take much longer, but it requires the additional effort of constant concentration not only on the problem domain, but also on the semantic quality of one's libraries or applications.

Since the early success of a project often is measured in terms of speedy "modeling" of the problem domain rather than sound codebase, and also due to the implementation bias, the engineers tend to go slack on the latter.

However, the coupled code scales very poorly: the initially upbeat speed of development inevitably gets reduced exponentially almost to a standstill: the coupled code allows less and less change or natural growth, but only hacks; it contains more and more murky corners; the team gets stuck in permanent firefighting and reverse engineering of each other's code. Then the management intervenes with measures rarely targeting the right problem, things somewhat improve, and the project keeps limping from crisis to crisis.

A trivial, but common example: protocol packet parsing gets coupled with the transport protocol, for example in a single function. Both may have bugs. In case of a failure, we cannot tell straight away what caused it: parsing or packet acquisition. To get a rough gauge: suppose there are 3 bugs in the parsing and 5 in the transport. The bugs will interfere with each other and thus the number of failure modes will multiply (up to 3 x 5 = 15 combinations to tell apart, rather than 3 + 5 = 8 bugs to find in isolation), as will the effort of their localization. Due to this multiplication effect, the complexity of a system will grow exponentially, and so will the maintenance time, unless the system is designed in a decoupled way where each element can be tested, debugged, and refactored independently. The price of coupling (e.g. in terms of time and effort) is exponential.

Thus, relentless decoupling. Engineers often stop at the point when the components are "simple enough", rather than going on a rigorous quest for the minimum semantically cohesive vocabulary or toolkit, orthogonal in itself and to other problem domains.
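
A minimal Python sketch of the decoupled alternative. The packet layout (a 2-byte big-endian length followed by the payload) and the names parse_packet and read_packet are hypothetical, chosen only for illustration. The parser is a pure function over bytes and the transport only moves bytes, so each can be tested without the other:

```python
import io
import struct


def parse_packet(data: bytes) -> bytes:
    """Parse one packet: 2-byte big-endian length, then the payload.

    Knows nothing about where the bytes came from.
    """
    if len(data) < 2:
        raise ValueError("packet too short for a length header")
    (length,) = struct.unpack(">H", data[:2])
    payload = data[2:]
    if len(payload) != length:
        raise ValueError(f"expected {length} payload bytes, got {len(payload)}")
    return payload


def read_packet(stream) -> bytes:
    """Acquire the raw bytes of one packet from any object with read().

    Knows nothing about what the payload means.
    """
    header = stream.read(2)
    if len(header) < 2:
        raise EOFError("stream ended before a full header was read")
    (length,) = struct.unpack(">H", header)
    body = stream.read(length)
    if len(body) < length:
        raise EOFError("stream ended in the middle of a packet")
    return header + body


# The parser is exercised without any transport...
assert parse_packet(b"\x00\x03abc") == b"abc"
# ...and the transport is exercised with an in-memory stream.
raw = read_packet(io.BytesIO(b"\x00\x03abcEXTRA"))
assert parse_packet(raw) == b"abc"
```

In this layout a ValueError points at the packet content and an EOFError at the acquisition, so a failure is localized before any debugging starts.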

Any amount of coupling, even in the obvious things, introduces a multiplicative component in the price tag of a class or utility. Thus, points of coupling piling on top of each other add an exponential component to the price tag of the whole system.

A while ago, I wrote a library in which one class had a boolean member. True was the natural default value for it, but I set it to false by default, since that looked more convenient for the system I was working on. The library was a success and found broad usage across a number of projects. However, as its usage spread, it soon exposed an annoying dent: most of the time, the user had to construct an instance of the class and then set that member to true by hand. If she forgot to do that, it would produce strange behaviour, and every time it would take fifteen minutes to realise what was going on. I had coupled two design considerations, one of which was overfit to the problem at hand; dozens of applications used the class, so changing the default to true could potentially break lots of things, and we had to live with the inconvenience. It has been truly a software weed.

The most dramatic decoupling of the code happens with the second use case. The second use case introduces a new force, a semantic tension in the class usage, which most of the time calls for more decoupling in it.

Fitting the code to a second use case is different from generalization. Generalization may follow from the second use case, but it is not the same. Generalization is finding abstract features that are common: say, generalizing a collection of countable objects as a templated vector. The second use case emphasizes not what is in common, but what is different. It introduces two aspects or dimensions along which the very same code should make sense. Therefore, to say it again, it is not generalization as finding what is common in the two use cases, but identifying two different dimensions along which the code has to scale.

As a simple example, assume we have a map of city roads, which we load from a file and then perform some search on. Should our roadmap class load from file on construction? Or have a method load()? Suppose we have a second use case where the roadmap is the result of cutting out a map region, i.e. loading does not semantically belong to it. This suggests that, for better decoupling, load() perhaps should be a standalone deserialization function.

In the absence of anything else, testing pretty much becomes such a second use case, since the automated test suite imposes quite a different usage onto the software artifact under test: for the practical usage of the class, the mainstream scenarios are the most important, whereas a good test suite focuses much more on overall test coverage, corner cases, and ease of writing heaps of tests. Thus, testing also forces us to decouple the cumbersome construction of a class from its use, to avoid involved mocking, etc: the class or library should offer itself for testing. Whenever it is hard to tell from the test code what is being tested, or the test does not demonstrate the artifact usage, or mocking in it is not brought to the minimum (preferably to zero), it indicates lack of design in the artifact and introduces coupling between the artifact and its test. Whenever an engineer, asked why her code is not well-tested, complains that the functionality she implemented "is inherently hard to test", you almost certainly will find a lot of coupling in her code.
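
A minimal sketch of that roadmap design in Python. The class, the function names, and the one-road-per-line text format are all hypothetical, made up for illustration; search is omitted. The point is only that Roadmap does not know how it was obtained, so both use cases (loading and cutting out a region) produce the same kind of object:

```python
import io
from dataclasses import dataclass, field


@dataclass
class Roadmap:
    """A set of roads; it does not know how it was obtained."""

    # Each road is a pair of (x, y) intersections.
    roads: list = field(default_factory=list)

    def crop(self, xmin, ymin, xmax, ymax):
        """Second use case: a roadmap produced by cutting out a region."""
        def inside(p):
            return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax
        return Roadmap([r for r in self.roads if inside(r[0]) and inside(r[1])])


def load_roadmap(stream) -> Roadmap:
    """Standalone deserialization: one road per line as 'x1 y1 x2 y2'."""
    roads = []
    for line in stream:
        if line.strip():
            x1, y1, x2, y2 = map(float, line.split())
            roads.append(((x1, y1), (x2, y2)))
    return Roadmap(roads)


# First use case: load from any text stream (a file would work the same way).
city = load_roadmap(io.StringIO("0 0 1 1\n1 1 5 5\n"))
# Second use case: no loading involved at all.
centre = city.crop(0, 0, 2, 2)
assert len(city.roads) == 2 and len(centre.roads) == 1
```

Because construction takes plain data, a test can also build a Roadmap directly from a literal list of roads, with no file and no mocking.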

The second use makes the well-decoupled code shine. But when the second use is resolved by even more coupling, it locks the coupled code: once two semantically different usages pile on top of it, it is much more difficult to refactor.

2.2 Staying out of touch

Efficient communication is one of the main premises in agile - and one of the most violated ones. Whenever there is a lack of direct communication between the stakeholders - no direct touch - the engineer most certainly will make wrong design and coding decisions, since he is forced to do too much too early, to assume, guess, and pre-empt, rather than wait for the right information (and also is likely to succumb to the implementation bias). There is nothing awfully bad about it, but it just is very expensive: any wrong decision based on lack of information means branching in the development tree. Any branching introduces an exponential component into the development time: we fork our development flow twice, then four times, then eight, etc. Even if the degree of branching is very low, i.e. speculative design decisions are made only once in a while, the time is still a slow exponent, which eventually grows much greater than linear time. This puts a quantitative estimate on the agile communication tenet: whenever communication is broken, the price rockets exponentially.

Staying out of touch has been plaguing the absolute majority of the projects I have seen or heard of: the project leader would not have time to communicate with his team, the team would not be in contact with the clients, the engineers would not have the faintest idea about each other's work, etc.
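
A back-of-the-envelope sketch of the "slow exponent", in Python. It rests on an assumed toy model that is not from the text: each design decision is speculative with some fixed probability, and a speculative decision forks every current branch of the work in two. Under that assumption, the expected number of branches grows as (1 + p)^n rather than linearly in n. The function name and the numbers are illustrative only:

```python
def expected_branches(steps: int, fork_probability: float) -> float:
    """Expected number of branches after `steps` decisions, when each
    decision forks every branch in two with the given probability."""
    return (1.0 + fork_probability) ** steps


# Every decision is a guess: 2, 4, 8, ...
assert [expected_branches(n, 1.0) for n in (1, 2, 3)] == [2.0, 4.0, 8.0]

# Guessing only once in twenty decisions still compounds.
for steps in (20, 100, 200):
    print(steps, round(expected_branches(steps, 0.05), 1))
```

The printed values start out modest and then run away, which is the sense in which a low degree of branching is still exponential.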

"Staying in touch" expresses an immediate connection when someone keeps a tactile contact with something else of different nature. It is not meddling or intrusion, but a contact by touch only - meaning that there should not be any separating distance either. Such a notion of touch is not fuzzy or subjective. It has to be stringent to work.

For example, in physics it is causal relations. The things that don't produce any effects are out of touch with our world. In mathematics it is exactness. Mathematics rests not so much on its philosophical foundations, which have always been notoriously wobbly and disputed (e.g. due to the existential status of mathematical objects or their truth value). The exactitude of mathematical touch is not about the truth of its foundations, but about having no gaps in each step of the proof. For arts and crafts the method of refined touch is quite literal (see e.g. [Yanagi]). The famous saying of William Morris "You can't have art without resistance in the material" means that craft exists only at the point of touch.

This might be another reason why writing software is so often compared to arts and crafts, which, I think, is not a romantic metaphor, but a reference to a method. As for the commonality between software and mathematics, it lies to a large extent in the rigour of leaving no gaps - but not in the same way, see the chapter on performative negotiation.

2.3 Speculation is cheap

Not at all, since it is a special case of staying out of touch - but one that seems to be abused most often.

A typical situation: a design discussion that drags forever because of all the "maybe" and "should". Each "maybe" or "if" is a point of branching in the discussion introducing an exponential component into the time spent. Speculative reasoning may look like a cheap way of logically exploring all the possibilities before doing actual work, but when the reasoning tries to reach too far, the branching discussions start seriously eating into the development time.

In the same manner, sometimes during the planning people suggest, say: let us try those three libraries and then choose, after all it takes just two days for each. Trying this and that: sometimes it has to be done, but in general, when you decide to try several cheap things, it means that your development tree forks, introducing an exponential component into the development time.

Discussions consume expensive resources. 15 minutes of 4 engineers talking equals 1 man-hour that could have been spent on writing code. I am not saying that discussions are useless; I simply say that they are too expensive to be wasted.

The other common part of speculative meetings is pursuing social or political purposes, or asserting someone's ego. I feel it is extremely damaging for the software process, but maybe those things are important for social bonding, establishing common values, or selling ideas. The analysis of these aspects requires a different method and I will not go there. In my experience, it is important to spot them and separate them from the technical, purpose-driven conversations.

"Technical" does not necessarily mean "design". It can be about planning, setting up a collaborative environment, a strategy of interviewing job candidates - anything with a clear outcome or problem to solve. And if things are fuzzy, then we need to make an effort to distinguish between two sources of vagueness: lack of decisive facts to inform the discussion on the one hand, and "politics" on the other.

Mostly, we cannot obtain the missing decisive facts just by talking, but only by taking the pragmatic step of doing something. After a couple of "maybe" or "if", it makes sense to pause and instead figure out what would reduce uncertainty before getting back to the discussion: instead of speculating on both possible outcomes, find what action would rule out one of the branches. E.g. we could try to build a proof-of-concept prototype, run a decisive test, contact the stakeholder who knows, etc.

Moreover, although it may sound counterintuitive, even if people argue about a specific design decision, as healthy a discussion as it may seem, in all my experience the best course of action is to stop the dispute, write down the assumptions of both parties, and come up with an experiment or proof of concept that would resolve the argument. This is similar to the validated learning in Eric Ries' Lean Startup [Ries]. Most of the time, it is not about logical argumentation meant to convince, but about forces to balance (as in pattern languages): unlike logical arguments, which need to be refuted, the forces do not need to be canceled, but just balanced. Therefore, a good decisive experiment does not so much negate one argument and validate the other as prototype a design balancing the forces. (In such a game of proofs, it is easy to get carried away, though. Therefore, at any moment I keep asking: "What is the capability we are trying to achieve?")

As for the political part, my pragmatic step is branching it into another meeting, whenever possible. For example, it is common during a design talk that people dig their heels in on a general topic like language choice, parallelism, software process, etc. As fun as it may be, it is rarely productive. If my position permits, I always suggest as early as I can: "Yes, it is a big important thing, which deserves a separate conversation." I urge the most vehement party to choose a time and an agenda later and to invite everyone. Interestingly, in my experience, those follow-up meetings never happen.

2.4 Minor decisions. Butterfly effect. Principle of disposable software

Minor decisions are easy to make. What should be a variable name? Should a parameter have a default value? Should we write a brief e-mail or set up a page on the wiki? What kind of coffee machine do we get?

It is much harder to evaluate whether those decisions are right or wrong. People tend to see them as splitting hairs: 'minor' sounds like 'insignificant' and therefore anything goes. However, a large part of software process and development is making many small decisions. However little each of them is, if many are not quite right or just arbitrary, it affects the project direction and momentum: either this or that unfortunate design decision constantly gets in the way, or minor things need to be redone many times, or they get coupled with each other.

Reworking lots of minor things is similar to the speculation described above: there are multiple branching points in the development flow and therefore it adds an exponential component, however mild, to the development time. Not taking care of minor decisions has an exponential price tag.

I have seen a number of projects that just do not progress. Instead of following a linear trajectory to the project goal, the engineers are caught up in constantly working around inconveniences, jumping through hoops, pulling back and forth the handkerchief-sized blankets of oddly-shaped classes or interfaces, etc. It resembles Brownian motion - and it is: the project has lost its direction.

Interestingly, sometimes those projects succeed, at the price of large cash injections, endless restructuring, and broken deadlines. How do they reach the finish line at all? Perhaps in the same way as a multidimensional random walk has a chance to reach a specific point, except that the travel time - and thus the project price tag - will be highly non-linear.

An even more common pattern is the almost-done syndrome. Perhaps it is one of the main underlying conditions of the Brownian motion in a project. An engineer says the class he or she has been implementing is 99% done. And so he or she says on the next day, and the day after, and so on. Very often that last percent takes longer than the first 99. Obviously, the remaining design and implementation decisions and actions look minor to the engineer, yet day after day the small steps taken cannot close the remaining gap. More often than not, each of the team members is trapped in their own almost-done cycle, which leaves the whole team in a murky, unknown almost-done state. The job of the team lead is to:

How to master specifically small decision-making, though? A large project eventually is composed of myriads of small decisions. There are two forces pulling in opposite directions: On the one hand, however small, the decision should provide as much leverage as possible, otherwise it is not worth spending time on. On the other, it is important to make sure a decision is as small as it looks, which means that its implications do not produce a butterfly effect in the future, getting in the way and stifling development.

Big decisions are hard to make, that's why they look big. Typically, there is a disconnect between the problem and available capabilities.

However, the problem becomes small when all the capabilities already are there and all we need is to deploy them.

How easy is it to revert the decision, so that we don't need to be dead sure that what we are about to do is right?


3. Sense of time


3.1 Uncertain time estimates and normal distribution

If you are a project or team lead, chances are the clients or your bosses will come after you asking for Gantt charts or for your gut feel of how long a project would take, i.e. either an upfront day-by-day work breakdown or no breakdown at all. What is in common, though, is that they want a Number. When you tell them it is too hard to estimate upfront whether the project would take a year or two, they will hold you at knifepoint, and once you cough up a Number, they will later hold you responsible for it down to one day. The numbers that are written down exercise an inexplicable power over humans. (It is quite unbelievable, but even highly qualified people who actually helped to produce a rough number and know how imprecise it is will hold on to it just a couple of days later.)

The external and internal forces in software development have such a large intrinsic uncertainty that it is not possible - and does not make sense - to reasonably estimate something at a scale beyond a month or two. That's why the recommended length of the Scrum sprint is around two to four weeks. (It is the immateriality of language, its lack of material constraints, that makes software development so volatile: firstly, most of the physical limiting factors that would play a role, say, in a construction project are suddenly taken away; and secondly, size or measurement do not mean much anymore.)

What can be estimated fairly precisely is a functionally or semantically cohesive line item on the Scrum board. From the timing point of view, the Scrum planning is complete, when one can tell with confidence about each item: it should take not more than 3-4 days.

The main purpose of it is not to come up with a very accurate estimate. It is even better to give the Scrum items estimates as ranges, e.g. 1-3 days. Since it is possible to put together a mental picture of what needs to be done for the small artefacts, it is relatively easy to convince the engineers to give such rough numbers: they just need to play back the design and implementation process in their heads to see that if all goes smoothly, one day is enough, but for contingencies 3 days should be sufficient. Empirically - and that may very well be a topic for psychological research - once a time estimate is greater than 3 days, it becomes very hard to tell whether it would take a week or a month. (It also almost certainly means that the functionality of the item is not sufficiently broken down. Every Scrum item should represent a single user/implementation story as well, but the latter is a matter for other chapters.)

However, that's not what the upper management or clients often crave. As they sum up the lower estimates and the upper ones, say, for half a year or a year, they will get ranges like 6-15 months, which sends your bosses into overdrive, because the magic power of Numbers tells them that their costs may vary by a few hundred per cent.

Often, the project or team lead is left with three political options for giving a single number. To secure the project, you can gamble and give the optimistic estimate, which you know will be blown, but you also know that the estimates of other departments will be blown, too, and so the cycle of self-deception goes, but at least the project happens. The second option is to cover yourself and give the most pessimistic estimate, which perhaps exaggerates the costs and lessens the chances of getting the project at all. The third option is giving the average time, perhaps multiplied by the empirical coefficient of your team. If you have not built up a team yet, 1.7 seems to be an empirical number across the industry, although it might look unacceptably pessimistic to stakeholders.

Unfortunately, politically, the upper management, marketing, or clients often refuse to hear about uncertainty, especially because at the level of sprint items the uncertainty often is 100-200%. I do not know how to solve the problem politically. I consider it the duty of an engineer to understand uncertainty and convey it at least to her or his immediate management.

For the sake of argument, let us make a crude assumption that a project with a team of 5 engineers has roughly 500 user stories mapped into sprint line items, i.e. about 100 items per engineer. Each story is estimated to take 1-3 days to complete. Therefore, if everything goes absolutely smoothly, the project may take 100 working days, or about 5 months, and if everything goes badly, it may take 300 days, or about 15 months. However, what is the chance of either outcome? The intuition is that, of course, some items will end up on the optimistic end and some on the pessimistic end, and therefore offset each other, ending up closer to the average. But can we quantify the uncertainty better than just a 5-15 month range?

As a rough model, which could be refined by empirical research, we could assume that the time that a line item takes follows the normal distribution or is close enough to it. Also, if we have a large number of items that behave similarly and are independent enough - in our case sprint items - their sum is approximately normally distributed. (In reality, the items may not be totally independent. Finding the actual distribution would be a matter for empirical research.)

For example, if you are confident all line items could be done in 1-3 days, then for normally distributed values the uncertainty of the whole project (200 days on average for 100 items per engineer) will be not plus-minus 100 days, but proportional to the square root of 100, i.e. on the order of magnitude of a few weeks (depending on how confident you are about the 1-3 day range). A range of something like 180-220 days looks a much more reasonable figure.
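
The arithmetic behind this range can be sketched in a few lines of Python. The sketch assumes, beyond what is stated above, that the 5 engineers work in parallel on about 100 items each, that the items are independent, and that the 1-3 day range is read as the mean plus or minus a chosen number of standard deviations; the function and parameter names are illustrative only:

```python
import math


def project_range(items_per_engineer, low, high, k_item=1.0, k_project=2.0):
    """Return the (low, high) project duration in days.

    The per-item range low..high is read as mean +/- k_item standard
    deviations; the result is the project mean +/- k_project standard
    deviations. For independent items the variances add, so the standard
    deviation of the sum grows as sqrt(number of items).
    """
    mean = (low + high) / 2.0
    sigma = (high - low) / 2.0 / k_item
    total_mean = items_per_engineer * mean
    total_sigma = sigma * math.sqrt(items_per_engineer)
    return (total_mean - k_project * total_sigma,
            total_mean + k_project * total_sigma)


# 100 items per engineer, each 1-3 days read as mean +/- 1 sigma:
print(project_range(100, 1, 3))              # (180.0, 220.0)
# The same range read more confidently, as mean +/- 2 sigma:
print(project_range(100, 1, 3, k_item=2.0))  # (190.0, 210.0)
```

Either reading gives a spread of a few weeks around 200 days, instead of the naive 100-300 days obtained by summing the extremes.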

This rough gauge is useful to explain to the stakeholders that the wild uncertainty at the sprint level is reduced to acceptable levels at the project level. Of course, if your clients or marketing are prepared to believe in mathematics.

In reality, it is meaningless to do an upfront detailed user story breakdown even for 1 year. Suppose you are asked for a detailed breakdown for a team of 10. If a year has around 200 business days and each story takes 1-3 days, you will need to think through and document some 1000 user stories upfront. Not only is it extremely tedious, it also creates a development funnel, where the development phase of a project takes forever to kick off. It also is not a very useful exercise, because sprint after sprint things will change, and even for an established project detailed planning beyond one or two sprints involves too much speculation and therefore is exponentially wasteful.

Thus, the most successful software projects are consciously staged in deliverable installments of 3-6 months. Ries' Minimum Viable Product [Ries] could be one of the possible best practices. If nevertheless the clients, upper management, or marketing refuse to accept it and insist upon accurate 3-4 year projections, it is guaranteed that reality will correct them down the road, mostly in a less pleasant way: either the project will fail or it will stage itself. Ducunt volentem fata, nolentem trahunt.





[Alexander] Christopher Alexander, A Pattern Language, 1977

[Ries] E. Ries, The Lean Startup.

[Yanagi] Yanagi Soetsu, The Unknown Craftsman.


Meaningful software. Notes on software design and process

Vsevolod Vlaskine
