Architectureless design

1. Complexity of system architecture

When one designs software, what comes out of it? An architecture, isn't it? Or is it? The word "architecture" applied to software drags in so many distorting connotations that I am consciously trying to expel it from my vocabulary.

The architectural approach assumes a vision of the system where its components connect, interact, and build on top of each other to form the whole edifice functioning as one organism. Such a vision may be important in a project, but only as, well, a vision rather than the master blueprint from which the system eventually materializes.

Once we admit that we are building "a software system", the next inevitable design step would be to split it into parts and establish their relationships as subordination, aggregation, connections, interfaces, data flows, etc. Then we further analyse the parts in the same vein, and so on. Even when we look at the system's aspects or views (functional aspect, security aspect, deployment aspect, testing aspect, etc), we consider those in their turn as systems, just like the cardiovascular, nervous, endocrine, etc systems of the human body.

This approach seems to have served us well for a few centuries of industrial manufacturing. It has been used in software for decades, too: in the software that landed man on the Moon, flies guided missiles, and drives life-critical medical devices. Is there a problem at all?

It is a commonplace that once an upfront architectural blueprint is eventually implemented, the result often bears little resemblance to the original design, due to all the implementation details, adjustments, special cases, undocumented but necessary features, etc.

The agile approach deals with this inconsistency by suggesting that we elaborate and beef up the software architecture in incremental steps, adapting it to the feedback from the on-going implementation and to the changing requirements, so that architectural design and implementation happen in parallel in a tight incremental loop. This makes the system design more adequate, but also very tedious to maintain and overburdened with details. Frankly, from my observations, even those engineers who complain about the lack of documentation are slack at doing their own bit when it comes to updating the design diagrams.

Thus, since the primacy of the architectural approach does not seem to match our daily practice, the question to ask is:

When we design software, do we really work on some sort of architecture first, or are we essentially doing something else, with the architecture coming only as a convenient afterthought or supporting act?

To get closer to the answer, let us ask ourselves why in the first place we even need to design a "system architecture".

Because it represents the system that we are trying to build? But from all the experience, it does not. The note Code as two texts [1] and the section Software design, visuality, and space below deal with some of the conceptual reasons for it; and as we discussed just now, practice shows that most of the time the upfront or even incremental architectural design reflects the final implementation only vaguely and inadequately.

It seems that the motivation of the software design is not so much accuracy, but simplicity, the need to grasp a complex system in simple terms, before it is even built, so that the customers, management, and project team could understand and share various aspects of the product as a whole in the course of the project. Therefore, the simplicity of expression is perhaps the primary goal of the architectural design.

Some of the tools to achieve the simplifying breakdown are various types of UML diagrams, state machine diagrams, database diagrams, etc. Their usage is demonstrated with elegant pictures in textbooks. However, in most real-life designs, the number of states, tables, or connections renders the diagram an illegible dog's breakfast, especially if you would like to maintain the consistency of the documentation and the code. The alternative is to leave less essential connections out of the picture, but that usually makes the diagram too generic to communicate much beyond creating in the upper management the impression that the system is "designed" and "planned".

Therefore, we need simplicity to be able to work on a software system, but representing it as an "architecture" does not quite scale: it either has too many entities and connections there, or, when some of them are hidden, it does not tell you much.

Let us re-formulate what we are trying to achieve: we need a simple adequate way to work on a software system.

Programming is essentially a pragmatic language activity: as I tried to show in the previous chapter, every statement we write in C++ or Python, being executed by the computer, does something. Thus, tracing how the design methods plug into language (both natural and artificial) may help us to understand how to achieve a simpler, more structured, and more adequate view of the software system we are trying to build.

2. Simplicity of language. Laplace and the Centipede

Laplace's daemon [2] is a 19th century hypothesis: if an omniscient daemon knew the positions and speeds of all the particles in the universe at a given moment, he would be able to calculate the exact state of the universe at any time in the past and future.

This hypothesis does not hold because of the laws of thermodynamics and quantum mechanics.

However, the software version of Laplace's daemon described in the previous chapter still persists: if we have the full source code of a software system, it seems that we should be able to tell exactly what the system does. Without going into theory, in practice we see most of the time how projects that started with an architectural design end up with a convoluted code base that requires a score of engineers - a bunch of full-time Laplacian daemons - and even they are not able to say how exactly the product would behave in any given situation. There also is no way to thoroughly document its endless state space.

The software version of Laplace's daemon fails due to the fallacy that a system can be accurately represented by its description. The full documentation trivially - and uselessly - equates to the full code. It happens because the language elements just do not behave like physical objects (see the previous chapter). The complexity that we are trying to limit in our design effort is not the complexity of things (e.g. of the car or human body), but the complexity of language.

Unlike with physical objects, breaking things down into parts does not necessarily simplify them in language. It is the Centipede's predicament: when asked how she walks with so many legs, the centipede got so confused while reflecting upon her movement that she could not walk anymore. It is a well-known psychological effect [3], but it is equally true for the software projects that get stifled by the analysis paralysis of top-down design.

Without the risk of being misunderstood, we say "walk" instead of "contract such and such muscles in such and such sequence". And we say "go to the pharmacy" instead of "walk 100 metres, turn left, walk another 200 metres, etc...". The language actually is very simple: with just a few words it helps us act efficiently in immensely complex contexts.

3. Capability deployment. Brandom

For a good design, we need to be able to speak about the system in simple meaningful terms.

What we need is the right vocabulary. The difference between a system architecture and a vocabulary is that the architecture essentially starts from capturing parts of the system wired up by the relations between them, whereas the vocabulary is like a toolbox: there is no need to draw relationships or connections between a hammer, chisel, and saw, but when used together they let one do all types of carpentry.

At the design level, the vocabularies should be sufficient for the client to tell what she wants to do with the system, i.e. tell user stories. The design task is to express what needs to be done to fulfil a collection of user stories.

If we are asked to implement an online clothing store, perhaps with a virtual fitting room, we will need a web interface, some image processing algorithms, a database, a payment system, etc. We would go through each user story, making sure these items would certainly cover the required functionality, once they are cobbled together somehow. Rather than drawing a system diagram, it is sufficient for us to know that we can compose those items with each other in various ways to cover user stories.

Now, we have an adequate bullet-point list, the implementation vocabulary (website, database, etc) for the user story vocabulary (find, display, fit, purchase, etc).

It is important that there is not a single user vocabulary, but a number of them. In our example, the payment-related user stories perhaps will have nothing to do with the fitting-room user stories. Those two will form separate vocabularies or mini-languages. In the same way, the implementation vocabularies also are many: choice of the front end is pretty much decoupled from the database, etc.

As the next step of working on our product, we write the user stories about the website, database, etc that are necessary to accomplish the online-store user stories. What was an "implementation" vocabulary becomes a "user" vocabulary.

Say, for the user story "fit a shirt", the "implementation" stories would be: "let user choose her/his figure type", "upload user photo", "morph the shirt image", etc. These "implementation" stories become user stories of the next level. We need to devise a list of things that would be sufficient to satisfy them. This list will become the next "implementation" vocabulary, etc, until at a certain step we realize that we can express our vocabularies as library entries or collections of applications, rather than todo lists in the natural language. Importantly, it happens not when the vocabularies and stories become "simple enough" so that we can jump to their implementation without further analysis, but when the stories simply can be written down in the programming language rather than in the natural language (the earlier we are able to start coding, the better).

This design process is two-way: it works not only top-down, departing from client user stories, but also from the point of view of building clear-cut capabilities (of the organization, the team, and the codebase). Our main goal may be keeping full focus on the specific project, but in doing so, it would be wasteful not to take into account building capabilities - libraries, utilities, frameworks, teams - that can be reused in the following projects and provide business continuity.
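As a rough illustration of a story "written down in the programming language", here is a minimal Python sketch. All the names in it (choose_figure_type, upload_photo, morph_shirt_image, fit_shirt) are hypothetical and their bodies are placeholders, not a real image-processing library. The only point is that the "fit a shirt" story reads as a sentence in the implementation vocabulary, without any diagram of relationships between the words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    """A user photo; a real one would carry pixels, here it is just a label."""
    owner: str


@dataclass(frozen=True)
class FittedImage:
    """Result of the 'fit a shirt' story."""
    description: str


# Implementation vocabulary (hypothetical placeholders for illustration).
def choose_figure_type(user: str) -> str:
    # A real version would ask the user; we assume a fixed answer.
    return "regular"


def upload_photo(user: str) -> Photo:
    return Photo(owner=user)


def morph_shirt_image(shirt: str, photo: Photo, figure: str) -> FittedImage:
    return FittedImage(f"{shirt} on {photo.owner} ({figure} figure)")


# The user story, expressed in terms of the vocabulary above.
def fit_shirt(user: str, shirt: str) -> FittedImage:
    figure = choose_figure_type(user)
    photo = upload_photo(user)
    return morph_shirt_image(shirt, photo, figure)


if __name__ == "__main__":
    print(fit_shirt("alice", "blue shirt").description)
```

Each of the three lower-level words could in its turn become a user story of the next level, to be expressed in yet another vocabulary.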

Robert Brandom in his book "Between Saying and Doing" [4] makes this process more formal. Although his book is generally about language and meaning, it has direct relation to software engineering.

One of the main concerns of the language theory addressed in Brandom's book is the question of meaning: how do words mean anything? Following Wittgenstein, Quine, and Sellars, Brandom says that semantics, the meaning of the words in a vocabulary, has to be established through their pragmatics, i.e. how those words are actually used in the language, "the practices of deploying various vocabularies rather than the meanings they express" [4, p.3].

Since software engineering is a practical thing, it is simpler and more radical, in a way. It is simpler, because, unlike a theoretical discipline, whatever works better provides a sufficient validation of its method. It is more radical, since "use" in software means not just "use in language": all statements and expressions in a programming language do something (they are performative in Austin's sense [5], [1]). Pragmatics of programming is much stronger than that of generic phrases like "it rains", since in software development everything needs to be effectual to be meaningful.

Further in his book, Brandom explores the relationships between the meanings of various vocabularies. He maintains that meaning is not self-contained in language, but lies in its usage, its pragmatics, and shows that various vocabularies are pragmatically mediated, calling it "pragmatic expressive bootstrapping". As I wrote in the previous chapter, the code tells us something and, being executable, also does something. I called these two sides "expressive" and "performative", which is practically synonymous with Brandom's terms.

If we have a vocabulary that represents certain capabilities (e.g. various software concepts: database, website, algorithms, etc) and want to express a new vocabulary of meanings (e.g. elements of our online store), we need to deploy the former vocabulary, i.e. in software terms, simply use the former to implement the elements of the latter. As we saw above, the former vocabulary can in its turn be implemented by deploying yet another implementation vocabulary, etc.

While not so obvious in the philosophy of language, this step seems almost trivial in the context of software engineering.

4. Some conclusions

Since the isomorphism between the relationship of a software feature to its implementation and the meaning-making structure is so trivial, the design and implementation of software systems conform with the structure of meaning-making in language. This once again confirms that software design is essentially a language activity and therefore linguistic laws and concepts are highly applicable here.

The purpose of the software design, rather than drawing "relationships" between the entities like database, website, etc, is deploying a collection of vocabularies or mini-languages on top of each other that eventually would be sufficient to build a vocabulary fulfilling the product user stories. Libraries and collections of utilities are good examples of such mini-languages: software design is not about building system architecture, but about defining a collection of mini-languages. Those mini-languages (closely related to Pattern Languages) express not only the user features of a software product, but all its implementation details.

One mini-language is expressed through deploying another one. However, it is not their hierarchy that matters, but the ability to produce new statements. The ability to produce new statements is essentially code reuse, which builds the organization's capabilities beyond the current project. But most importantly, when in the current project we capture a limited number of relations and connections in a system architecture, any of those relations established too early almost inevitably makes the two interrelated entities semantically contaminate each other, producing a point of coupling and unnecessary rigidity.

Instead, a good designer keeps the number of predefined relationships in the system to the absolute minimum and deploys decoupled, semantically sufficient capabilities - libraries, concepts, naming conventions, domains of natural language, etc - as performative phrases, statements, usage scenarios, or user stories. We need not the "design" of a system, but the ability to speak in terms of the system.

This allows us to reformulate how the design adapts to changing requirements. If we have built a "system" that corresponds to the "requirements" and then the requirements drift, we have to change the system. If in response to the requirements we have crafted a vocabulary of capabilities and the requirements change, very often we do not need to change our vocabulary, but just express the change in its terms, which involves much less effort and stress.
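A small Python sketch may make this less abstract. It assumes a hypothetical payment vocabulary of three words (price_of, apply_discount, charge); none of them comes from a real library, and the sale scenario is invented for illustration. When the requirements drift from "purchase" to "purchase during a sale", the vocabulary stays as it is and only a new statement is written in it.

```python
# Hypothetical payment vocabulary: small, decoupled words.
def price_of(items: dict) -> float:
    """Total price of a basket given as {item name: price}."""
    return sum(items.values())


def apply_discount(amount: float, percent: float) -> float:
    """Reduce an amount by the given percentage."""
    return round(amount * (1 - percent / 100), 2)


def charge(account: dict, amount: float) -> dict:
    """Return a copy of the account with the amount deducted."""
    if amount > account["balance"]:
        raise ValueError("insufficient funds")
    return {**account, "balance": round(account["balance"] - amount, 2)}


# The original user story.
def purchase(account: dict, items: dict) -> dict:
    return charge(account, price_of(items))


# The drifted requirement: a new statement in the same vocabulary.
# None of the three words above had to change.
def purchase_on_sale(account: dict, items: dict, percent: float) -> dict:
    return charge(account, apply_discount(price_of(items), percent))


if __name__ == "__main__":
    alice = {"owner": "alice", "balance": 100.0}
    basket = {"shirt": 40.0, "socks": 10.0}
    print(purchase(alice, basket))
    print(purchase_on_sale(alice, basket, 20))
```

Of course, the sketch assumes that a word like apply_discount was already in the vocabulary; when it is not, the change amounts to adding one word rather than rewiring a system.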

(Another interesting implication is that the whole idea of ontologies as the formal representation of "knowledge as a set of concepts within a domain, and the relationships between those concepts" [6] very likely may be a dead end.)

Lots of things in this chapter may look trivial, but over the years I have noticed that far too many software engineers and managers seem to think that if something is trivial, it is not worth doing (because after all they are there to excel in the complex stuff), whereas to me, one of the main virtues in software design is the ability to speak simply about complex things.

5. Software design, visuality, and space

I tried to demonstrate that plain-English documentation and more formalized means like UML are less adequate ways to produce and communicate design than actual programming languages. However, surely there is one thing where design diagrams beat artificial languages: the diagrams can show things.

Being a very visual person, I am not trying to say there is no merit in drawing diagrams as a part of software design and documentation. However, I find really weird the relatively common assumption that starting software design with drawing pictures is universally a best practice, to the extent that a large part of the corporate world makes the graphic representation the centrepiece of software design, with a whole cottage industry built around it.

As an example, for many years, UML has been promoted as the Unified Modeling Language, suggesting that it has all that you could possibly need for software design.

Perhaps it happens partly because software documentation without pictures is a notoriously strong remedy for insomnia. Also, the upper management and marketing love the look of charts and the drama of colour.

More importantly, the technical blueprint as a hallmark of the industrial era is expected to perform just as well in the information technology.

The problem is that the picture essentially is a spatial representation. It portrays entities in space and therefore imposes a very particular view of the world (and ways of modelling it) as well as limits on what can be expressed in such a graphic language.

It starts with the suggestion to split the system into its major components, just like a car that consists of the frame, engine, wheels, etc. Then each component can be decomposed into smaller modules, etc. Industrial technical drawings historically fit this purpose very well, since they are dealing with material objects.

Then, the types of diagrams start to proliferate to assign visual symbols to various kinds of semantic relationships, trying to mimic non-material things as material objects. The picture suggests the design where there are entities and relationships. Entities are represented by squares, matchstick men, time axis, etc. They roughly refer to a 'place' in the visual space. The relationships between the entities roughly correspond to 'distances' and 'directions' between 'places'. (There already is a problem with such a break-down. The inventor of Pattern Languages Ch. Alexander demonstrated [7] that the top-down analysis cannot adequately represent complex heterogeneous systems for mathematical reasons.)

Many highly efficient concepts are hard - but more importantly unnatural - to express in the world of entities and their relationships. Various transient objects, like scoped locks and transactions, may be a good example, because from the pragmatic point of view we would not even think about them as objects, but rather as qualities, interactions, semantic transformations and flows. Of course, we probably could invent diagrams to depict them, but why do it, if they find an essentially better expression in human or programming language?
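To show what such a transient thing looks like in a programming language, here is a minimal Python sketch of a scoped transaction. The transaction helper and the stock dictionary are hypothetical and deliberately naive (a shallow snapshot of a flat dictionary, nothing like a real database transaction). The "object" never appears in the text as an entity with relationships: it reads as a quality of the indented block, namely that the block either takes effect as a whole or not at all.

```python
from contextlib import contextmanager


@contextmanager
def transaction(store: dict):
    """Hypothetical scoped transaction over a flat dictionary.

    Changes made inside the 'with' block are kept if the block
    finishes normally and rolled back if it raises.
    """
    snapshot = dict(store)  # shallow copy is enough for flat values
    try:
        yield store
    except Exception:
        store.clear()
        store.update(snapshot)
        raise


stock = {"shirts": 3}

# This block succeeds, so its change is kept.
with transaction(stock) as s:
    s["shirts"] -= 1

# This block fails half-way, so its change is rolled back.
try:
    with transaction(stock) as s:
        s["shirts"] -= 5
        if s["shirts"] < 0:
            raise ValueError("not enough shirts")
except ValueError:
    pass

print(stock)
```

Python's own threading.Lock can be used with the same "with" statement, which is the scoped lock mentioned above.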

In Jack W. Reeves' words [9], believing that there would be a suitable visual representation for any aspect of software modelling is like being "convinced that 'different language' really just means 'different dialect of English'".

As a luckily somewhat outdated example, UML comes from the early object-oriented paradigm in which the world is represented as a collection of types, instances and relationships. The limits of such an approach have been demonstrated by the semantic shift in the modern languages. Take a look at the seminal C++ books by Meyers, Alexandrescu, Sutter, or the online Python documentation: there are not many pictures in them. Ask yourself why, or try to draw diagrams that would adequately represent the concepts of modern C++ or Python.

I wonder whether choosing UML as the preferred language for Software Patterns has made the books on them less read and the patterns often misunderstood, being considered not as a semiotic system, but as a list of illustrated designs and recipes. Even Christopher Alexander's book A Pattern Language [8], a dictionary of architectural patterns (what can be more visual?), does not contain too many illustrations. One reason for it is that a "design diagram" is a symbolic representation of a particular static system (see Chapter 1 for the distinction between symbolic and semiotic systems), while the description of a design pattern essentially is about its variability and degrees of freedom.

When the translation between ideas and language gets funneled through the unified graphic representation, things get lost. Contrary to the conventional: "a picture is worth a thousand words", the right word can awaken a thousand pictures. While the picture merely explains, the word creates and transforms.

The visual in the Western tradition has been a vehicle of power and control. The tension the visual representation brings in is the one between visibility and transparency. These two words do not presume each other. Visibility is about total exposure, surveillance, fixed schematics for the purposes of accountability, unified control, a panopticon. Transparency is rather about being permissive, unobtrusive, non-intrusive, not obstructing the flows. Flows mean not only efficient operation, but also the ability to morph and change.

In a strange twist, 'visible' and 'transparent' often become synonyms in the process management, while in the vernacular 'transparent' means exactly the opposite: 'invisible'. When the visibility gets mistaken for transparency in the software team, the design and documentation are required upfront, get divorced from the code, and very quickly lose their consistency with the functionality. With the latter drifting away from the document, 'visible' returns to its original connotation of 'non-transparent', 'opaque': the pictures and documentation do not reflect what the system does. They block the view and become noise that drowns something that might be otherwise a useful visual signal.

One could say, an organizational process is transparent, if nothing urges you to know its details, nothing makes you suspicious. Then, the next desired quality of a transparent process would be: as long as it flows smoothly, it is invisible, unnecessary to see. To achieve it, we need to use pictures extremely parsimoniously, remove them as a source of noise, and instead focus on a good guarantee that, only if something goes wrong, we have enough visual capabilities to expose the wrong part.

Test-driven development could be an example: we express the proper functioning and semantics of the system not through diagrams, but through test suites, so that while they pass, we do not need to worry about them. But when a test fails, we want a graphic indication of it.
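A minimal Python sketch of this "invisible while it flows" quality, using the standard unittest module, could look as follows. The function total_price, the test class, and the helper run_quietly are hypothetical names made up for the illustration; the "graphic indication" is reduced here to a printed report, where a real project would rather have a red mark in its build dashboard.

```python
import unittest


def total_price(prices, discount_percent=0):
    """Hypothetical function under test: basket total after a discount."""
    return round(sum(prices) * (100 - discount_percent) / 100, 2)


class TotalPriceStories(unittest.TestCase):
    """The semantics of total_price expressed as tests, not as a diagram."""

    def test_sums_all_items(self):
        self.assertEqual(total_price([40.0, 10.0]), 50.0)

    def test_applies_discount(self):
        self.assertEqual(total_price([40.0, 10.0], discount_percent=20), 40.0)


def run_quietly() -> bool:
    """Run the stories; stay silent on success, expose only what failed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TotalPriceStories)
    result = unittest.TestResult()
    suite.run(result)
    for test, trace in result.failures + result.errors:
        print("FAILED:", test.id())
        print(trace)
    return result.wasSuccessful()


if __name__ == "__main__":
    run_quietly()
```

As long as both stories hold, running the file prints nothing; only a broken story makes itself visible.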

Similarly, in Scrum, as long as the burndown chart moves down smoothly, there is not much to talk about besides the brief scrum updates, but once the team observes a growing hump on it, it clearly indicates which part of the sprint has become a blockage. The low organizational and documentation overhead in Scrum not only saves time, but reduces noise in the system.

Approaching it from the other end: whatever needs to be visible or visualized should be suspicious and should not be taken on board without a good reason. Things are visible because they are opaque, which means that they represent a solid structure capable of blocking flows and blocking view.

For example, the stakeholders tend to perceive Gantt charts as the true image of the resource allocation, which is another version of Laplace's daemon in software (see Chapter 1). However, any non-trivial project has too much uncertainty and simply does not flow exactly according to the plan. Gantt charts misrepresent the project, add massive maintenance overhead, and introduce milestones or deliveries at wrong times.

Over-engineered applications or over-structured teams require visualizations exactly because they are opaque, blocking flows and lines of sight too easily. (One almost could consider the visual design as a kind of negative space, a system of points of blockage and failure - and effectively that is what test-driven development does.)

The software artifacts and design process are functional, semantics- and flow-oriented. Therefore, non-obstructed flow (in any sense: code flow, data flow, flow of design activities, human communication flow) is more essential than structural visibility. Also, they have a close affinity to change and time, not only its linear axis, but also its qualitative, transformative structure, which is not necessarily linear. Take "software life-cycle" as a colloquial example of cyclic time structures in software design; or "time-to-market", expressing not just linear time, but a finite, strongly structured segment of it. Spatial representation of qualitative aspects of time is doubtful, while languages (natural or artificial) have been developing exactly to reflect and express such temporal qualities.

Of course, everything changes, when pictures and diagrams are used as transient, disposable, molecular, optional elements of collaboration, much like words of the spoken, written, or programming language.

 

2013

References

[1] Code as two texts

[2] http://en.wikipedia.org/wiki/Laplace's_demon

[3] http://en.wikipedia.org/wiki/The_Centipede's_Dilemma

[4] R. Brandom, Between Saying and Doing: Towards an Analytic Pragmatism.

[5] J. L. Austin, How to Do Things with Words.

[6] http://en.wikipedia.org/wiki/Ontology_(information_science)

[7] Christopher Alexander, A City is Not a Tree, 1965

[8] Christopher Alexander, A Pattern Language, 1977

[9] Jack W. Reeves, Code as Design, 1993-2005

 

Meaningful software. Notes on software design and process

Vsevolod Vlaskine
