After working with several companies trying to deploy their solutions to Windows Azure, I have found that their architectures often lack two common patterns, which prevents them from taking full advantage of the extra IT capacity that the cloud offers. To be honest, these are good practices that apply not only to Windows Azure solutions, but to any application that requires a distributed approach.
Before I dive into the specifics, I would like to mention that both of them are driven by a simple feature that virtualization frameworks brought to the table: the almost unlimited capability (bounded only by the underlying hardware) of creating and starting new servers. In most cases, these virtual machines have basic specs, but adding more of them is a lot simpler than adding more physical resources to a single computer when the number of users increases. In other words, we went from a traditional model of scaling up when we needed more capacity, to scaling out, or distributing the load among multiple machines.
Enter the two patterns that I mentioned before…
1. Externalize Storage
Having multiple machines running as a single unit of work (or a cluster) means that architects and developers now have to create solutions assuming that ANY of these computers could be down at any given moment. In other words, each machine has to be completely stateless so that the remaining ones can pick up the work. If you think about it, this is not too difficult to achieve, as long as the storage mechanism is external to the cluster.
This pattern is easy to find in applications accessing a relational DB, since by default, multiple users from multiple clients could be reading, removing, modifying, or inserting data. The database engine is usually hosted on a separate machine, and concurrency and transactions are managed by the application, the data layer, or the database engine itself. However, when it comes to non-relational data, like BLOBs (simple files like videos, images, or documents), or objects that are traditionally stored in memory (like a web application's session variables), the answer is not that simple.
The good news is that multiple storage services are now available in the cloud, facilitating this transition. The Windows Azure Platform, for example, includes a component that provides different storage mechanisms with a RESTful interface on top of them, as well as a Caching framework for solutions that require improved performance. Externalizing storage is, in many cases, as simple as modifying the Session provider in the web.config file for an ASP.NET application: http://msdn.microsoft.com/en-us/library/gg278339.aspx.
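To make the idea concrete, here is a minimal Python sketch of two stateless servers sharing one external session store. Everything in it is hypothetical and only for illustration: ExternalSessionStore is an in-memory stand-in for whatever storage or caching service sits outside the cluster, and WebNode, add_to_cart, and the session id are made-up names, not part of any Windows Azure API.

```python
class ExternalSessionStore:
    """Stand-in for a storage service that lives outside the cluster (assumption)."""

    def __init__(self) -> None:
        self._data = {}

    def load(self, session_id: str) -> dict:
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def save(self, session_id: str, state: dict) -> None:
        self._data[session_id] = dict(state)


class WebNode:
    """A stateless server: it keeps no session data between requests."""

    def __init__(self, name: str, store: ExternalSessionStore) -> None:
        self.name = name
        self.store = store

    def add_to_cart(self, session_id: str, item: str) -> list:
        # Load state, change it, and write it back on every request.
        state = self.store.load(session_id)
        cart = state.get("cart", []) + [item]
        state["cart"] = cart
        self.store.save(session_id, state)
        return cart


store = ExternalSessionStore()
node_a = WebNode("node-a", store)
node_b = WebNode("node-b", store)

node_a.add_to_cart("session-1", "book")
# node_a could go down at this point; node_b picks up the same session.
cart = node_b.add_to_cart("session-1", "pen")
print(cart)
```

Because neither node holds the cart in its own memory, any of them can serve the next request for the same session.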
2. Decouple your application layers
The other situation that I usually find is that many solutions are properly architected from a layering point of view, effectively separating responsibilities for the presentation, business, and data layers… but the issue is that these tiers are tightly coupled, as shown in the following diagram:
There are multiple problems with this approach (the difficulty of supporting different UI technologies, among them), but from a cloud perspective, the most important one is that all the layers would act as a single cluster. If a bottleneck were identified, let's say in the business layer, the whole cluster would have to be scaled out. By decoupling the solution, each layer can be scaled independently, so overall performance is easier to tune and optimize. The most common pattern to achieve this is the façade (or service) pattern. After applying it to the previous architecture, it would look like this:
Notice the introduction of a Service tier, along with a different set of data objects (called DTOs, or Data Transfer Objects) for moving information between the Presentation and the Service layers. Also, between the Data Layer and the Storage itself it is very common to find an ORM (Object Relational Mapping) component, which facilitates interaction with the Database Engine (although it is not required from a decoupling perspective). If you are using a Microsoft technology stack, this could be achieved by using WCF (Windows Communication Foundation) in combination with the Entity Framework. More information can be found in this article. If deployed to the Windows Azure Platform, this specific solution could look like this:
In addition to the Façade pattern, Windows Azure Storage Queues can be used for decoupling purposes. This affordable and easy-to-implement option has encouraged new architectural patterns, like CQRS (Command Query Responsibility Segregation). I will be blogging about this specific pattern in the near future.
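As a rough illustration of the Service tier and its DTOs, here is a small Python sketch. It is not WCF or Entity Framework code; the class names (CustomerEntity, CustomerDTO, CustomerRepository, CustomerService) and the sample data are hypothetical and exist only to show the shape of the pattern.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerEntity:
    """Data-layer object; includes fields the UI should never see."""
    id: int
    name: str
    password_hash: str


@dataclass(frozen=True)
class CustomerDTO:
    """What crosses the boundary between the Service and Presentation layers."""
    id: int
    name: str


class CustomerRepository:
    """Data layer (an ORM would typically sit behind this)."""

    def __init__(self) -> None:
        self._rows = {1: CustomerEntity(1, "Ada", "x9f3")}

    def find(self, customer_id: int) -> Optional[CustomerEntity]:
        return self._rows.get(customer_id)


class CustomerService:
    """Facade: the only entry point the presentation layer depends on."""

    def __init__(self, repository: CustomerRepository) -> None:
        self._repository = repository

    def get_customer(self, customer_id: int) -> Optional[CustomerDTO]:
        entity = self._repository.find(customer_id)
        if entity is None:
            return None
        # Map the entity to a DTO so data-layer details stay behind the facade.
        return CustomerDTO(id=entity.id, name=entity.name)


service = CustomerService(CustomerRepository())
dto = service.get_customer(1)
print(dto)
```

The presentation layer only ever sees CustomerService and CustomerDTO, so the data layer behind the façade can be changed, hosted, or scaled separately.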
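The queue-based variant can be sketched in the same spirit. The snippet below uses Python's standard queue module as an in-process stand-in for an external queue service; it does not call the Windows Azure Storage Queues API, and submit_order, process_pending, and the message format are assumptions made for the example.

```python
import json
import queue

# In-process stand-in for an external queue service (an assumption for this sketch).
orders = queue.Queue()


def submit_order(order_id: int, total: float) -> None:
    """Front-end tier: enqueue a message and return immediately."""
    orders.put(json.dumps({"order_id": order_id, "total": total}))


def process_pending(processed: list) -> None:
    """Worker tier: drain whatever is waiting, at its own pace."""
    while True:
        try:
            message = orders.get_nowait()
        except queue.Empty:
            break
        processed.append(json.loads(message)["order_id"])


submit_order(1, 19.99)
submit_order(2, 5.00)

processed = []
process_pending(processed)
print(processed)
```

The front end never calls the worker directly; the two tiers only agree on the message format, so each one can be scaled out on its own.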
In summary, when designing applications for the cloud or moving them to it, there are two common patterns to keep in mind:
1. Externalize Storage in order to create stateless clusters of virtual machines.
2. Decouple the different layers in your solution by using the Façade/Service pattern, or Windows Azure Storage Queues.
As always, I look forward to your comments and/or questions…