Software Architecture Rules of Thumb
From time to time, folks have asked me for guidance or pointers on software architecture.
It gives me pause, because it’s such a vast subject. But I’ve been asked enough times that I guess I should take a go at it.
First of all, a key thing I have learned is that principles, or rules of thumb, have more leverage than specific solutions. When faced with a new problem, you may not be able to find a specific solution that’s already there for you. But if you have a good tool belt of principles at your disposal, you can apply them to the problem and they often very quickly guide you to a reasonable solution.
In the spirit of delivering value early and often (well, hopefully it’s value), I’ll share this as a series of posts.
I don’t know how many I’ll end up writing. But I’ll keep each one short so it’s easier to write and easier to read.
The first one is one that came to me early in my career, and has been useful over and over again: shared nothing.
[And no, that doesn’t mean keep all information to yourself and share nothing. Although that can be valuable too depending on what kind of environment you’re in.]
Back in the mid-nineties I was working at Sybase on a highly parallel database project. In his design overview, the architect talked about how the only way you can have horizontal scaling is with a shared-nothing architecture.
It came with a picture, and that picture stuck in my mind (I’ll have a separate article on the power of pictures). It looked something like this:
It bothered me a little bit, being a literal-minded engineer, because you’re still sharing the network. But I got the point.
It was compelling, very compelling. When you have shared state, you have contention, and when you have contention, you can’t scale.
Another picture related to this that has always stuck in my mind is this one:
The little box saying “parallel portion” defines the amount of processing that can happen without synchronization. Notice that at some point, no matter how much metal you throw at your problem, you don’t get any faster. It just locks up. This is Amdahl’s Law: even at 95% parallel processing, that 5% synchronization caps the speedup at 20x, and it will ultimately kill you.
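To put rough numbers on that plateau, here is a minimal sketch of the standard Amdahl's Law formula, speedup = 1 / (serial + parallel / workers). The function name `amdahl_speedup` and the worker counts are my own choices for illustration, not something taken from the original picture.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup under Amdahl's Law.

    parallel_fraction: share of the work that needs no synchronization (0..1).
    workers: number of processors (or machines) thrown at the problem.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part never shrinks; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative worker counts: the speedup creeps toward a ceiling of
# 1 / serial_fraction (20x for 95% parallel work) and never passes it.
for workers in (1, 8, 64, 1024, 65536):
    print(f"{workers:>6} workers: {amdahl_speedup(0.95, workers):.2f}x")
```

Going from 1,024 to 65,536 workers buys almost nothing here, because the 5% that has to be synchronized dominates the total time.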
This is such a core principle to understand, and it applies to any system, not just software systems.
The desire for command and control
The natural tendency when you’re building systems is that you want control, and that means everybody has to check in and get approval on a regular basis. I have seen it in tons of software systems, but it is even more prevalent in organizational systems.
I read a book on creativity where the author posed the question: why is it that as cities get bigger, they become more vibrant and creative, whereas as companies get bigger, they become more and more deadened and dull and lack innovation? You know, “enterprisey.”
His answer was that with cities, there is no hierarchy of control that is directing everyone’s focus and actions. In a city, people are allowed to interact with each other at will and find their own ways to meet their goals. Whereas with a company, with its structure of command and control, you hit Amdahl’s Law and lock up.
For both organizations and software, what this means is you have to let go of that desire for control, for making sure absolutely everything is going according to a centralized plan. You have to let go of two-phase commit, locking, change approval boards and architectural review boards.
One significant aspect of the agile movement is to create teams that are fully empowered to deliver value independent of external approval or dependencies. That is essentially a definition of the shared-nothing architecture. And its impact is significant. Even the military is looking at moving away from command-and-control to agile platoons.
Using the shared-nothing rule of thumb
Keeping this principle in mind, try to catch yourself when you are putting together a design that requires some form of check-in, synchronization, locking or serialization. If this design needs to scale, you’re asking for trouble. Think about which constraints you assume you need but can actually let go of while still delivering a successful solution.
Here are some examples of how you reduce or eliminate the need for synchronization or serialization:
- Rather than calling another service, post data on a queue
- Each service has its own database
- Generate a unique id on the client rather than use the database to generate ids
- Evolve interfaces in a backward-compatible way rather than requiring everyone to upgrade to a new version in lock-step
- Use optimistic concurrency to detect conflicts
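Two of these items, client-generated ids and optimistic concurrency, fit in one small sketch. Everything here is hypothetical and only for illustration: `Store` is an in-memory stand-in for a database table, and `Record` and `ConflictError` are names I made up. In a real database the version check and the write would need to happen as a single atomic statement (for example, an update conditioned on the version column), which this toy does not attempt.

```python
import uuid
from dataclasses import dataclass


class ConflictError(Exception):
    """Raised when a write is based on a stale read or a duplicate id."""


@dataclass(frozen=True)
class Record:
    value: str
    version: int


class Store:
    """Hypothetical in-memory stand-in for a database table (not thread-safe)."""

    def __init__(self) -> None:
        self._rows: dict[str, Record] = {}

    def insert(self, record_id: str, value: str) -> Record:
        if record_id in self._rows:
            raise ConflictError(f"id {record_id} already exists")
        record = Record(value=value, version=1)
        self._rows[record_id] = record
        return record

    def read(self, record_id: str) -> Record:
        return self._rows[record_id]

    def update(self, record_id: str, value: str, expected_version: int) -> Record:
        current = self._rows[record_id]
        # No lock is held between read and update. We only check, at write
        # time, that nobody else changed the record in the meantime.
        if current.version != expected_version:
            raise ConflictError(
                f"expected version {expected_version}, found {current.version}"
            )
        updated = Record(value=value, version=current.version + 1)
        self._rows[record_id] = updated
        return updated


store = Store()

# The client generates the id itself; no round trip to a central sequence.
order_id = str(uuid.uuid4())
store.insert(order_id, "draft")

# Two writers read the same version, then both try to write.
seen_by_a = store.read(order_id)
seen_by_b = store.read(order_id)
store.update(order_id, "submitted", seen_by_a.version)

try:
    store.update(order_id, "cancelled", seen_by_b.version)
    conflict_detected = False
except ConflictError:
    # The second writer finds out it lost and can re-read and retry.
    conflict_detected = True

print("conflict detected:", conflict_detected)
```

The trade-off is that the losing writer has to handle the conflict, usually by re-reading and retrying. In exchange, nobody waits on a lock in the common case where there is no conflict at all.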
And also notice when your organization is getting locked down by command and control, and find ways to enroll your management into a more shared-nothing approach. Some examples:
- Train technical leaders on each team so they can review designs without centralized approval
- Allow each component to be deployed independently
- Don’t create agile teams that span locations or continents
- Remote teams have their own product owner and can make their own decisions for their features
- Distribute expertise across teams rather than have centralized teams (e.g. for UX, UI, data management, ops and so on)
You get the idea. It’s an amazingly powerful principle, and one which can be uncomfortable and get a lot of resistance, particularly from the business. But when you do it, boy does it work.