Wednesday, January 11, 2012

2 cups of messaging, 1 tablespoon of async and a whole lot of context

Just got some very interesting input from someone who had the patience to watch through my Monospace talk. First off, I was gutted after watching it myself: I did not manage to communicate the content anywhere close to the way I had intended. As painful as it was to watch, I now know what to work on. I'll practice harder and do a better job next time :) Let me try to make a bit more sense here.

Øyvind Teig left this comment on one of my previous blog posts. After reading through some of his material, I decided to reply in a new post. And for the record, his papers are well worth reading!
I just saw the Monospace talk, and read some of your blog notes. Just as I have been stating that "synchronous" programming is what we should do, and you say about "asynchronous" the same, we both perhaps, might be easily misunderstood. The problems you told about in the lecture (shared state and ordering etc.) and the fact that messaging is naturally distributable, and that “the world is asynchronous”, could in fact advocate any way. Synchronous and asynchronous are tools in the toolbox. I work in embedded safety-critical systems, where sending pointers internally is bad (if they are still in scope after sending), message buffer overflow is bad (if it restarts the system and not handled at application level), and WYSIWYG-semantics is good. Not trusting 50% of the code, as an argument for doing things asynchronous is ok. But it has to do with layering, not paradigm. If the spec says you must wait, you wait. If not, you don’t. Also, do observe that asynchronous messages (and not rendezvous as in Ada or channels as in Go) makes the messages / sequence diagrams tilting – good for some, not wanted in other situations. What happens after you “send and forget”? Is this what I want to do, always? Of course not. And “waiting” does not mean “don’t do anything”. It has to do with design and also “parallel slackness”. I have blogged some about this, see http://oyvteig.blogspot.com/, and published some, see http://www.teigfam.net/oyvind/pub/pub.html. 
Øyvind Teig, Norway
As Øyvind points out, interpreting a statement through your own knowledge can easily lead to different outcomes depending on the person. From his perspective (and please correct me if I'm interpreting you wrong), asynchronous functionality should be boxed in and modelled, for example through what he calls the "Office Mapping Factor". And it makes perfect sense!
The basic idea behind this type of modelling is CSP (Communicating Sequential Processes). CSP defines separate processes/threads running in parallel, while communication between the processes/threads happens synchronously. The receiver of a message has to acknowledge that it can/wants to receive the message, as opposed to the fire-and-forget / queuing approach.
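
To make the contrast concrete, here is a minimal sketch of what rendezvous-style, CSP-like communication could look like in C#. The RendezvousChannel class and its members are made up for illustration (this is not Go's channels or any existing library), and it assumes a single sender and a single receiver.

    using System.Threading;

    // Illustrative rendezvous channel: Send blocks until a receiver has taken the message,
    // mirroring CSP-style synchronous communication. Assumes one sender and one receiver.
    public class RendezvousChannel<T>
    {
        private readonly SemaphoreSlim _itemAvailable = new SemaphoreSlim(0, 1);
        private readonly SemaphoreSlim _itemTaken = new SemaphoreSlim(0, 1);
        private T _item;

        public void Send(T item)
        {
            _item = item;
            _itemAvailable.Release(); // hand the item over
            _itemTaken.Wait();        // block until the receiver has accepted it
        }

        public T Receive()
        {
            _itemAvailable.Wait();    // block until a sender shows up
            var item = _item;
            _itemTaken.Release();     // unblock the sender
            return item;
        }
    }

The point is that Send does not return until Receive has run; there is no queue for messages to pile up in.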

In my talk and blog posts I tend to talk more about fire-and-forget messaging: systems that run asynchronously without the various parts of the system being aware of the asynchronous behavior happening around them. Each part, or rather block of functionality, in the system works synchronously while message consumption happens asynchronously. Quite the opposite of what CSP prescribes.

And again, the differences are all about context. Personally, I write code almost exclusively for servers and desktop computers that have as much memory, as many cores and as much disk as I could ask for. Øyvind, on the other hand, has 128KB of program memory and 32KB of external memory at his disposal, meaning all my assumptions of how stuff works go right out the window. Like he explains: yes, you can use a queue, but what happens when the queue is full and you get queue overflow (a scenario that had not even crossed my mind)? With 32KB of memory that is quite likely. The system would crash or halt, wouldn't it? How about spawning processes? Same thing. In this context the environment plays a huge role and the architecture needs to reflect it. How you spend your resources is critical.

Establishing a context around the problem you are trying to solve is crucial to how you end up implementing it. I quite enjoyed reading about how Øyvind reflects on the Office Mapping Factor. So how come I tend to approach asynchronous programming so differently from Øyvind?

Let's set the context for the kind of system that is usually on my mind when I'm talking about asynchronous systems: everyday applications. That is the easiest way I can put it. Your everyday Order/CRM/whatever application. Not an application that needs to process 10,000+ messages a second. Not an environment with a restricted amount of resources, nor one where the application consumes unnaturally huge amounts of them.

I guess my approach can be split into two parts: the first being messaging and the second being asynchrony. Usually messaging itself is the reason why I use event/message based architectures, not a desire to make asynchronous systems. However, making a message based system asynchronous is trivial.


Why a message based system?

Abstraction, Coupling
High coupling is like the perfect fertilizer for code rot. A change to any piece of code can break everything, because everything is coupled to everything else. How do you prevent this from happening? You introduce abstraction. There are multiple mechanisms that can decouple systems, such as injection of abstract types (interfaces, abstract classes, ...), function injection (delegates) and messaging. Messaging takes abstraction farther than the other mechanisms by adding a dispatcher that routes the message to the consumer. Because of this, the sender depends only on the message dispatcher and not on the message handler. The message might be dispatched to multiple consumers without the producer of the message knowing about it. Instead of calling a method handed to you through a contract (interface, abstract class, delegate), you produce an output others can consume.
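
As a rough sketch of what that looks like in code, here is a minimal in-memory dispatcher. The IMessageDispatcher/MessageDispatcher names are mine, not any particular framework; the point is only that producers depend on the dispatcher, never on the consumers.

    using System;
    using System.Collections.Generic;

    // Minimal in-memory dispatcher sketch. Producers only ever see IMessageDispatcher;
    // consumers are registered per message type and invoked when a matching message is published.
    public interface IMessageDispatcher
    {
        void Publish<T>(T message);
        void Subscribe<T>(Action<T> consumer);
    }

    public class MessageDispatcher : IMessageDispatcher
    {
        private readonly Dictionary<Type, List<Delegate>> _consumers = new Dictionary<Type, List<Delegate>>();

        public void Subscribe<T>(Action<T> consumer)
        {
            if (!_consumers.TryGetValue(typeof(T), out var list))
                _consumers[typeof(T)] = list = new List<Delegate>();
            list.Add(consumer);
        }

        public void Publish<T>(T message)
        {
            if (!_consumers.TryGetValue(typeof(T), out var list)) return;
            foreach (Action<T> consumer in list)
                consumer(message); // the producer never knows who, or how many, consumed the message
        }
    }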

Producing real output
By making sure that a feature produces its output in the shape of a message, you have made that feature a stand-alone piece of code. It now makes sense beyond the contract that would otherwise have been passed to it. Being a stand-alone piece of code, it becomes a natural place for integration: integration that happens by consuming a message, or by producing a message that other participants can consume through the message dispatcher.
Open/Closed principle: software entities should be open for extension but closed for modification. Another principle whose violation easily leads to code rot. Since messages (real output) flow through the dispatcher, we can extend message handling functionality without modifying existing code, as the snippet below illustrates.
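
Building on the dispatcher sketch above, extending the system becomes a matter of registering another consumer. The OrderPlaced message and the handlers here are hypothetical.

    using System;

    // Hypothetical message type used for illustration.
    public class OrderPlaced { public string OrderId { get; set; } }

    public static class Wiring
    {
        public static void Configure(IMessageDispatcher dispatcher)
        {
            dispatcher.Subscribe<OrderPlaced>(m => Console.WriteLine("Picking items for order " + m.OrderId));

            // Later requirement: audit every placed order. We extend by adding a subscription;
            // the producer of OrderPlaced and the existing handler are left untouched.
            dispatcher.Subscribe<OrderPlaced>(m => Console.WriteLine("Audit: order " + m.OrderId + " placed"));
        }
    }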

Testing
Since our small software blocks now produce real output, they are very testable. We can test purely through input and output, focusing on: given the circumstances, what is the expected result (output)? Create a fake message dispatcher for your tests, publish a message for your feature to consume, and verify that the message produced by your feature is what you expect.
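
Here is a sketch of what such a test could look like, reusing the IMessageDispatcher and OrderPlaced types from the sketches above. The fake dispatcher, the ItemsPicked message and the PickItemsHandler are all hypothetical, and the assertion is kept framework-free to stay self-contained.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Fake dispatcher for tests: it simply records everything that gets published.
    public class FakeDispatcher : IMessageDispatcher
    {
        public List<object> Published { get; } = new List<object>();
        public void Publish<T>(T message) => Published.Add(message);
        public void Subscribe<T>(Action<T> consumer) { /* not needed in this test */ }
    }

    public class ItemsPicked { public string OrderId { get; set; } }

    // Hypothetical feature under test: consumes OrderPlaced and produces ItemsPicked.
    public class PickItemsHandler
    {
        private readonly IMessageDispatcher _dispatcher;
        public PickItemsHandler(IMessageDispatcher dispatcher) { _dispatcher = dispatcher; }

        public void Consume(OrderPlaced message)
        {
            _dispatcher.Publish(new ItemsPicked { OrderId = message.OrderId });
        }
    }

    public class PickItemsHandlerTests
    {
        public void Produces_ItemsPicked_for_the_same_order()
        {
            var dispatcher = new FakeDispatcher();
            var handler = new PickItemsHandler(dispatcher);

            handler.Consume(new OrderPlaced { OrderId = "42" });

            var output = dispatcher.Published.OfType<ItemsPicked>().Single();
            if (output.OrderId != "42") throw new Exception("Expected ItemsPicked for order 42");
        }
    }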




Now to the asynchronous part


A message based system consists of participants, a message dispatcher and messages. The participants send messages through the dispatcher and/or receive messages from the dispatcher. Whether your message based system is asynchronous or not can come down to something as trivial as the dispatcher calling _consumer.Consume(message); versus ThreadPool.QueueUserWorkItem((m) => _consumer.Consume(m), message);. That little detail is the difference between multiple features running in parallel and features running sequentially.
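
A sketch of that difference, assuming the IMessageDispatcher interface from the earlier sketch: the asynchronous variant below differs from the synchronous one only in how Publish invokes the consumers.

    using System;
    using System.Collections.Generic;
    using System.Threading;

    // The earlier dispatcher made asynchronous: the only change is that Publish hands each
    // consumer call to the thread pool instead of invoking it on the caller's thread.
    // (Subscriptions are assumed to be wired up before publishing starts.)
    public class AsyncMessageDispatcher : IMessageDispatcher
    {
        private readonly Dictionary<Type, List<Delegate>> _consumers = new Dictionary<Type, List<Delegate>>();

        public void Subscribe<T>(Action<T> consumer)
        {
            if (!_consumers.TryGetValue(typeof(T), out var list))
                _consumers[typeof(T)] = list = new List<Delegate>();
            list.Add(consumer);
        }

        public void Publish<T>(T message)
        {
            if (!_consumers.TryGetValue(typeof(T), out var list)) return;
            foreach (Action<T> consumer in list)
                ThreadPool.QueueUserWorkItem(_ => consumer(message)); // consumer now runs in parallel with the caller
        }
    }

Everything else (the participants, the messages, the subscriptions) stays exactly the same.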

Even though the code needed to make a system asynchronous is written in a couple of seconds, the impact of that change is enormous, both in what you can do and in what you need to make sure you don't do. Deciding to go down the road of asynchronously executing features requires that you carefully model how you want to handle state.

First, let's discuss what writing a message based system really means. So far we have looked at how a single handler produces and consumes messages. However, message based systems are often about message chains. As mentioned under Abstraction, Coupling, messages split the larger system features into smaller software entities that produce and consume the various messages in a chain. A chain could consist of, for instance, RequestDelivery->PickItemsFromStock->CheckoutItems->PickTerminal->DeliverToTerminal, or DomainLogic->...->...->DbPersistence. Each step in such a chain can both produce and consume messages.
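
A chain like that can be expressed as a series of small handlers wired through the dispatcher from the earlier sketch. The message types below are hypothetical stand-ins for the steps named above.

    // Each step consumes one message type and publishes the next, so the chain is just
    // a set of subscriptions. Message types are illustrative.
    public class DeliveryRequested    { public string OrderId { get; set; } }
    public class ItemsPickedFromStock { public string OrderId { get; set; } }
    public class ItemsCheckedOut      { public string OrderId { get; set; } }

    public static class DeliveryChain
    {
        public static void Wire(IMessageDispatcher dispatcher)
        {
            dispatcher.Subscribe<DeliveryRequested>(m =>
                dispatcher.Publish(new ItemsPickedFromStock { OrderId = m.OrderId }));

            dispatcher.Subscribe<ItemsPickedFromStock>(m =>
                dispatcher.Publish(new ItemsCheckedOut { OrderId = m.OrderId }));

            // ...and so on for PickTerminal and DeliverToTerminal.
        }
    }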

When writing asynchronous message based systems, I tend to divide my message handlers into a few categories based on how they relate to state. Keep in mind that each message handler might handle a message in parallel with other handlers.

Stateless Message Handlers
This is my first choice of handler as it has no side effects. It only transforms the input into an output message without writing to any publicly exposed state. In practice (if it is a class) this means creating a new instance of the message handler, passing it the message and, when done, disposing of the handler.
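
A sketch of that lifetime, using the PickItemsHandler and OrderPlaced types from the testing sketch above:

    public static class StatelessWiring
    {
        public static void Wire(IMessageDispatcher dispatcher)
        {
            dispatcher.Subscribe<OrderPlaced>(message =>
            {
                // A fresh instance per message: no shared state, no side effects beyond the output message.
                var handler = new PickItemsHandler(dispatcher);
                handler.Consume(message);
                // Nothing to dispose here, but an IDisposable handler would be disposed at this point.
            });
        }
    }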

State Message Handlers
These types of handlers deal with either accumulated internal state or external state. Either way, they need to take the right precautions (locking etc.). If a handler only deals with external state, it can work in the same instantiate, handle and dispose manner as the stateless message handlers. If it deals with internal state, it needs to be a running "engine" that lives for as long as the scope of its contained state. Either way it deals with state that can be corrupted at any time, so I model these handlers CAREFULLY.
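
As a sketch of the internal-state case, here is a hypothetical handler that accumulates state and therefore guards it with a lock, since it may be handling messages in parallel (ItemsPickedFromStock is the illustrative message type from the chain sketch above):

    using System.Collections.Generic;

    // Hypothetical handler with accumulated internal state. Because messages may arrive
    // in parallel, every access to that state is wrapped in a lock.
    public class PickedItemsTracker
    {
        private readonly object _sync = new object();
        private readonly Dictionary<string, int> _picksPerOrder = new Dictionary<string, int>();

        public void Consume(ItemsPickedFromStock message)
        {
            lock (_sync)
            {
                _picksPerOrder.TryGetValue(message.OrderId, out var count);
                _picksPerOrder[message.OrderId] = count + 1;
            }
        }
    }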

Queuing Handlers
As the name implies, these do not dispatch the message onwards like the others, but rather store it in some predetermined queue(s). Queuing handlers are usually what I go for whenever I want a persisted step in the message chain, a critical running "engine" that needs to pick up work based on its own task scheduler, or when I need to go cross-system with some sense of reliability.
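
A sketch of an in-memory queuing handler (a real persisted or cross-system step would store to durable storage or a message broker instead; the types are again illustrative):

    using System.Collections.Concurrent;

    // Instead of dispatching further, the handler parks the message on a queue that an
    // "engine" drains on its own schedule.
    public class TerminalDeliveryQueue
    {
        private readonly BlockingCollection<ItemsCheckedOut> _queue = new BlockingCollection<ItemsCheckedOut>();

        public void Consume(ItemsCheckedOut message)
        {
            _queue.Add(message); // store, don't dispatch
        }

        // The engine picks up work when its own scheduling says so, independent of message arrival.
        public void RunEngine()
        {
            foreach (var message in _queue.GetConsumingEnumerable())
            {
                // Deliver to a terminal, write to disk, call another system, etc.
            }
        }
    }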

At this point it is pretty obvious that how you choose to write asynchronous code depends deeply on the environment it is written for and the context of the problem being solved. My context, as mentioned, tends to lean towards normal business applications with moderate hardware requirements. Had the point of the system been to process a large number of messages per second, that would have affected the system's architecture. Compromises are often made between resources and maintainability: when we can focus less on maximum performance, we can focus more on writing the system in the most maintainable way.