Over the years I have seen, and been responsible for, the design of many enterprise systems, from media encoding servers to financial and residual payments. Some were good and some, let’s say, “required more thought to begin with”. The price of a system that wasn’t designed correctly is high. It usually starts with throwing more hardware at the problem. Then, when bugs and patches arrive, it becomes clear that the system is difficult to maintain. By then the original team has moved on and a new set of developers, usually less experienced, is hired to maintain the system. Everybody in the business can’t wait for version 2.0, which promises to be a departure from version 1.0, and the code quickly becomes legacy code. In my experience the lifespan of a good system can reach five years, while a poor system will generally be replaced after two. That is, if it reaches production at all.
Here are ten tips learned after years of architecting.
Write multi-tiered code
Your system will be deployed to many servers and will have many different functions. It’s important to avoid code duplication by having a well organized and loosely coupled code base. The classic tiers are the database, the data layer, the business layer and the UI.
Separate your systems into thin slices
Processing power is important for a system that has many functions within the business. If your code is multi-tiered you can think of it as a seven-layer dip. Now it needs to be carried by an array of tortilla chips (anybody else hungry?…lol). Each enterprise slice is responsible for a unit of work and each can be hosted on a separate server or server farm.
Limit communication between system slices
Enterprise slices need not and should not communicate with other slices unless absolutely necessary. Communication should be limited to one way and be asynchronous as much as possible, so that a slice is not left blocked when other parts of the system are down. If a Web Service slice needs to communicate with a Windows service slice, it is always better to communicate through, let’s say, MS Queue than to open a WCF endpoint (even if we use netMsmqBinding).
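To make the one-way, asynchronous idea concrete, here is a minimal sketch in Python. It uses the standard library's in-process queue as a stand-in for a durable queue such as MS Queue, and the function and message names are made up for illustration. The point is only that the web slice drops a message and moves on, while the service slice picks it up whenever it happens to be running.

```python
import json
import queue

# Stand-in for a durable message queue (assumption: in-process only, not durable).
work_queue = queue.Queue()


def web_slice_submit(order_id):
    """Web slice: enqueue a message and return at once; no reply is expected."""
    work_queue.put(json.dumps({"type": "send_invoice", "order_id": order_id}))


def service_slice_drain():
    """Service slice: handle whatever is waiting and report what was handled."""
    handled = []
    while True:
        try:
            raw = work_queue.get_nowait()
        except queue.Empty:
            return handled
        message = json.loads(raw)
        handled.append(message["order_id"])  # the real work would happen here
```

Neither function knows whether the other side is up. If the service slice is down, messages simply wait in the queue until it comes back.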
Limit system slices’ awareness
Since slices are in charge of executing a unit of work, a slice should not be aware of what other slices are doing and should not count on other slices’ work being done. If there is a system dependency it should be on the result of work, not on the work itself.
Pull systems are better than push systems
Push systems use a dispatcher to push work to the different parts of the system. This introduces a weak link in the chain. It’s a central brain, and if the brain shuts down the entire system shuts down. It also creates an extra part in the system to maintain, communicate with and so on. A system that queries a Journal table in the database or a Queue for work will be much simpler and often faster. The individual parts are better at governing how much work they can execute, and they can safely self terminate without fooling the brain into thinking they are still operational and distributing more work that may never be executed.
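A pull worker against a Journal table can be sketched roughly like this. The example uses Python's built-in sqlite3 module so it runs anywhere; the table layout, column names and status values are my own assumptions for illustration, not a prescribed schema.

```python
import sqlite3


def create_journal(conn):
    """Create a hypothetical journal table that holds pending units of work."""
    conn.execute(
        "CREATE TABLE journal ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " worker TEXT)"
    )


def claim_next(conn, worker):
    """Claim the oldest pending row for this worker, or return None."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT id, payload FROM journal"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing to do right now
        claimed = conn.execute(
            "UPDATE journal SET status = 'running', worker = ?"
            " WHERE id = ? AND status = 'pending'",
            (worker, row[0]),
        )
        if claimed.rowcount != 1:
            return None  # another worker claimed it first
        return row


def complete(conn, job_id):
    """Mark a claimed row as done."""
    with conn:
        conn.execute("UPDATE journal SET status = 'done' WHERE id = ?", (job_id,))
```

Each worker calls claim_next only when it has capacity, so a worker that has shut down simply stops asking and no dispatcher needs to know about it.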
Your web server is for serving web pages only
I have said it before and I’ll say it again. If code is not performing simple CRUD operations to display results or receiving input from a user, it has no business living inside your web server. Web servers are for serving web pages only. If you are processing media, sending emails or invoking reporting from your web page or service, consider slicing the system thinner and moving that code to other slices.
Communication should be reliable
If you must communicate between slices, make sure communication is reliable. It won’t hurt to follow up on it if you can, just to make sure your message arrived safely. Most communication technologies can verify this for you.
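When the transport does not confirm delivery for you, following up can be as simple as waiting for an acknowledgement and retrying. The sketch below is one way to do that in Python; send_with_retry and its parameters are hypothetical names, and the send callable is assumed to return True once the receiver has acknowledged the message.

```python
import time


def send_with_retry(send, message, attempts=3, delay_seconds=0.0):
    """Call send(message) until it is acknowledged or the attempts run out.

    Returns the number of tries it took. Raises RuntimeError if no
    acknowledgement was received.
    """
    for attempt in range(1, attempts + 1):
        try:
            if send(message):
                return attempt
        except ConnectionError:
            pass  # treat a transport failure like a missing acknowledgement
        if attempt < attempts:
            time.sleep(delay_seconds * attempt)  # simple linear back-off
    raise RuntimeError(f"no acknowledgement after {attempts} attempts")
```

Keep in mind that retrying means a message can arrive more than once, so the receiving slice should be able to handle duplicates.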
Cache whenever you can
Yes, you are accessing the database or the result of a web service that has been highly optimized for performance by the greatest minds in your organization, probably you. Still, you have no idea who else is accessing the same data or how frequently it’s being accessed. It is often better to fetch more data than you need in one query and load it into memory. You can then query the memory structure within the boundary of your own slice. This will probably be much faster than constant database round trips. You might also use a distributed cache system like Velocity to cache data in a central location, which can take the load off other systems.
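Here is a small sketch of the "fetch more than you need, then read from memory" idea in Python. BulkCache, load_all and the time-to-live value are illustrative assumptions; in a real slice load_all would be the single query or web service call that returns the whole lookup set.

```python
import time


class BulkCache:
    """Loads a whole lookup set in one call and serves reads from memory."""

    def __init__(self, load_all, ttl_seconds=60.0, clock=time.monotonic):
        self._load_all = load_all  # callable returning a {key: value} mapping
        self._ttl = ttl_seconds    # how long the in-memory copy is trusted
        self._clock = clock        # injectable so expiry can be tested
        self._data = {}
        self._loaded_at = None

    def get(self, key, default=None):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._data = dict(self._load_all())  # one round trip for everything
            self._loaded_at = now
        return self._data.get(key, default)
```

The time-to-live is the trade-off to tune: a longer value means fewer round trips but staler data.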
Implement security in every layer of your code
Don’t leave security checks to the UI. You don’t know how many UIs your system can end up with. Consider performing checks in every layer of your code.
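As one possible illustration, a business layer function can enforce its own permission check so that it stays protected no matter which UI calls it. The decorator, the role name and the shape of the user dictionary below are all assumptions made for this sketch.

```python
import functools


class PermissionDenied(Exception):
    """Raised when the caller lacks the role a business operation requires."""


def require_role(role):
    """Wrap a business function so it checks the caller's roles first."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            if role not in user.get("roles", ()):
                raise PermissionDenied(f"{func.__name__} requires role {role!r}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@require_role("payments")
def approve_payment(user, payment_id):
    """Business layer operation; the check above runs for every caller."""
    return f"payment {payment_id} approved by {user['name']}"
```

A web page, a desktop client and a batch job can all call approve_payment, and none of them can skip the check.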
Measure your bottlenecks
It’s a big part of the art of performance tuning, and defining it in two or three lines here would simply not do it justice, but I will go ahead and try nevertheless. Measure every communication channel and every slice in your system for bottlenecks. If you find them after deploying to production it’s already too late. Your system only gets one chance at a first impression. If the user’s impression of the system is that it’s slow and laggy, it won’t change much even if you tune it afterwards. For the user it will always be a slow system.
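A very small way to start measuring is to wrap each channel or slice call in a timer and look at where the time goes. The sketch below does this in Python with a context manager; the names measured, timings and slowest are hypothetical, and a real system would more likely feed these numbers into its existing monitoring.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Elapsed seconds recorded per named channel or slice.
timings = defaultdict(list)


@contextmanager
def measured(name):
    """Record how long the wrapped block takes under the given name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name].append(time.perf_counter() - start)


def slowest(n=3):
    """Return the n names with the highest total elapsed time."""
    totals = {name: sum(samples) for name, samples in timings.items()}
    return sorted(totals, key=totals.get, reverse=True)[:n]
```

Wrapping a call looks like `with measured("database"): run_query()`, where run_query stands for whatever the slice actually does, and slowest() then points at the first place to look.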
Have a real pro set up your hardware
Yes, I know that the title of this blog post is the ten basic rules of enterprise systems design, so what is number eleven doing here? It’s not a tip about software at all! Well, the truth is that software developers don’t like to admit it, but we depend on hardware to make our code shine. We all have top of the line workstations with Windows Server 2008 64-bit running on 12GB of RAM with solid state drives, but production systems are usually out of our control. We don’t even have access to them, and the design of our SAN is usually left to IT. No matter how good and stable your code is, it needs a stable and fast environment to run in and do its magic. Building hardware and data centers to run your system is just as hard and complicated as coding it. The best thing you can do is hire the best professional you can afford. A poorly designed data center will forever stain the reputation of your company with downtime and slowness and will end up requiring expensive professionals to maintain. Plan it right the first time and you won’t be sorry.
Designing a fast and reliable enterprise system isn’t easy. It’s an art that is acquired over years of experience. I hope this will give you a good jump start in understanding what goes into designing such systems.
21 Oct 2009 6:28 AM