Wednesday, 11 January 2012

2 cups of messaging, 1 tablespoon of async and a whole lot of context

Just got some very interesting input from someone who had the patience to watch my Monospace talk all the way through. First off, I was gutted after watching it myself: I did not manage to communicate the content anywhere close to the way I had intended. Painful as it was to watch, I now know what to work on. I'll practice harder and do a better job next time :) Let me try to make a bit more sense here.

Øyvind Teig left this comment on one of my previous blog posts. After reading through some of his material I decided to reply in a new post. And for the record, his papers are well worth reading!
I just saw the Monospace talk, and read some of your blog notes. Just as I have been stating that "synchronous" programming is what we should do, and you say about "asynchronous" the same, we both perhaps, might be easily misunderstood. The problems you told about in the lecture (shared state and ordering etc.) and the fact that messaging is naturally distributable, and that “the world is asynchronous”, could in fact advocate any way. Synchronous and asynchronous are tools in the toolbox. I work in embedded safety-critical systems, where sending pointers internally is bad (if they are still in scope after sending), message buffer overflow is bad (if it restarts the system and not handled at application level), and WYSIWYG-semantics is good. Not trusting 50% of the code, as an argument for doing things asynchronous is ok. But it has to do with layering, not paradigm. If the spec says you must wait, you wait. If not, you don’t. Also, do observe that asynchronous messages (and not rendezvous as in Ada or channels as in Go) makes the messages / sequence diagrams tilting – good for some, not wanted in other situations. What happens after you “send and forget”? Is this what I want to do, always? Of course not. And “waiting” does not mean “don’t do anything”. It has to do with design and also “parallel slackness”. I have blogged some about this, see http://oyvteig.blogspot.com/, and published some, see http://www.teigfam.net/oyvind/pub/pub.html. 
Øyvind Teig, Norway
As Øyvind points out, interpreting a statement through your own knowledge can easily lead to different outcomes depending on the person. From his perspective (and please correct me if I'm misinterpreting you), async functionality should be boxed in and modelled through what he calls the "Office Mapping Factor", for example. And it makes perfect sense!
The basic idea behind this type of modelling is CSP (Communicating Sequential Processes). CSP defines separate processes/threads running in parallel, while communication between the processes/threads happens synchronously. The receiver of the message has to acknowledge that it can/wants to receive the message, rather than the fire-and-forget / queuing approach.
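A zero-capacity queue gives a feel for this rendezvous style. This is only my sketch, not code from the post or from Øyvind: Java's SynchronousQueue from the standard library has exactly the property that a send blocks until a receiver is ready, and all the other names here are made up.

```java
import java.util.concurrent.SynchronousQueue;

public class Rendezvous {
    // A SynchronousQueue has no capacity: put() blocks until another
    // thread calls take(), so sender and receiver meet in time -- a
    // rendezvous, as opposed to fire-and-forget queuing.
    static String exchange(String message) {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        StringBuilder received = new StringBuilder();
        Thread receiver = new Thread(() -> {
            try {
                // The receiver "acknowledges" the message simply by
                // being ready to take it from the channel.
                received.append(channel.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            receiver.start();
            channel.put(message); // blocks until the receiver takes it
            receiver.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return received.toString();
    }

    public static void main(String[] args) {
        System.out.println("received: " + exchange("hello"));
    }
}
```

If no receiver ever takes from the channel, put() simply blocks; there is no queue to overflow, which is one reason this style appeals in constrained environments.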

In my talk and blog posts I tend to talk more about fire-and-forget messaging: systems that run asynchronously without the various parts of the system being aware of the asynchronous behaviour happening around them. Each part, or rather block of functionality, in the system works synchronously while message consumption happens asynchronously. Quite the opposite of what CSP prescribes.

And again, the differences are all about context. Personally I write code almost exclusively for servers and desktop computers that have as much memory, as many cores and as much disk as desired. Øyvind, on the other hand, has 128KB program memory and 32KB external memory at his disposal, meaning all my assumptions about how stuff works go right out the window. As he explains: yes, you can use a queue, but what happens when the queue is full and it overflows (a scenario that had not even crossed my mind)? With 32KB of memory that is quite likely. The system would crash or halt, wouldn't it? How about spawning processes? Same thing. In this context the environment plays a huge role and the architecture needs to reflect it. How you spend your resources is critical.

Establishing a context around the problem you are trying to solve is crucial to how you end up implementing it. I quite enjoyed reading about how Øyvind reflects on the Office Mapping Factor. So how come I approach asynchronous programming so differently from Øyvind?

Let's set the context for the kind of system that is usually on my mind when I talk about asynchronous systems: everyday applications. That is the easiest way I can put it. Your everyday Order\CRM\Whatever application. Not an application that needs to process 10,000+ messages a second. Not an environment with a restricted amount of resources, nor one where the application consumes unnaturally huge amounts of resources.

I guess my approach can be split into two parts: the first is messaging, the second is asynchrony. Usually messaging is the reason I use event/message based architectures; it is not driven by a desire to build asynchronous systems. However, making a message based system asynchronous is trivial.

Why a message based system?

Abstraction, Coupling
High coupling is the perfect fertilizer for code rot. A change to any piece of code can break everything, because everything is coupled to everything else. How do you prevent this from happening? You introduce abstraction. There are multiple mechanisms that can decouple systems, like injection of abstract types (interfaces, abstract classes...), function injection (delegates) and messaging. Messaging takes abstraction further than the other mechanisms by adding a dispatcher that routes the message to the consumer. Because of this, the sender depends only on the message dispatcher and not on the message handler. The message might be dispatched to multiple consumers without the producer of the message knowing about it. Instead of calling a method handed to you through a contract (interface, abstract class, delegate), you produce an output others can consume.
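The decoupling can be sketched in a few lines. This is my own minimal illustration (the class and method names are mine, not from the post): the producer only ever sees the dispatcher, never the handlers, so consumers can be added without touching the producing code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal in-process dispatcher sketch: the sender depends only on
// publish(); it neither knows nor cares who consumes the message.
public class InProcessDispatcher {
    private final List<Consumer<Object>> handlers = new ArrayList<>();

    public void register(Consumer<Object> handler) {
        handlers.add(handler);
    }

    public void publish(Object message) {
        // One published message may reach several consumers.
        for (Consumer<Object> handler : handlers) {
            handler.accept(message);
        }
    }
}
```

A producer calls `publish("OrderPlaced")` and the message reaches every registered consumer; registering a second consumer requires no change on the producing side.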

Producing real output
By making sure that your code feature produces output in the shape of a message, you have made that feature a stand-alone piece of code. It can now make sense beyond the contract that would otherwise have been passed to it. Being a stand-alone piece of code, it becomes a natural place for integration: integration that happens by consuming a message, or by producing a message that other participants can consume through the message dispatcher.
Open/Closed principle: software entities should be open for extension but closed for modification. Another principle whose violation easily leads to code rot. Since messages (real output) flow through the dispatcher, we can now extend message handling functionality without modifying existing code.

Since our small software blocks now produce real output, they are very testable. We can rely on testing input and output, focusing on: given the circumstances, what is the expected result (output)? Create a fake message dispatcher for your tests, publish a message for your feature to consume, and verify that the message produced by your feature is as expected.
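A sketch of that test setup, with hypothetical names of my own (FakeDispatcher, ConfirmOrderHandler and the message strings are all invented for illustration): the fake dispatcher just records whatever the feature publishes, and the test asserts on it.

```java
import java.util.ArrayList;
import java.util.List;

// A fake dispatcher for tests: it never routes anything, it only
// records what was published so the test can inspect the real output.
class FakeDispatcher {
    final List<Object> published = new ArrayList<>();

    void publish(Object message) {
        published.add(message);
    }
}

// Hypothetical feature under test: consumes an order id and produces
// a confirmation message through the dispatcher.
class ConfirmOrderHandler {
    private final FakeDispatcher dispatcher;

    ConfirmOrderHandler(FakeDispatcher dispatcher) {
        this.dispatcher = dispatcher;
    }

    void consume(String orderId) {
        dispatcher.publish("OrderConfirmed:" + orderId);
    }
}
```

The test then reads: publish an input to the handler, and verify that `published` contains exactly the output message you expected.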

Now to the asynchronous part

A message based system consists of participants, a message dispatcher and messages. The participants send messages through the dispatcher and/or receive messages from the dispatcher. Whether your message based system is asynchronous or not can be as trivial as the dispatcher saying _consumer.Consume(message); vs ThreadPool.QueueUserWorkItem((m) => _consumer.Consume(m), message);. That little detail determines whether multiple features run in parallel or run sequentially.
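The same one-line difference can be sketched outside .NET too. This is my own Java rendering of the idea, with an ExecutorService standing in for the C# thread pool; the class and method names are assumptions of mine, not the post's.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Sync vs async dispatch: the only difference is whether the consumer
// runs on the caller's thread or is handed to a thread pool.
public class DispatchModes {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    void dispatchSync(Consumer<Object> consumer, Object message) {
        consumer.accept(message); // runs inline; features run sequentially
    }

    void dispatchAsync(Consumer<Object> consumer, Object message) {
        pool.submit(() -> consumer.accept(message)); // features run in parallel
    }

    void shutdown() {
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Swapping dispatchSync for dispatchAsync is a one-line change at the dispatch site, which is exactly why the rest of the post is about the consequences rather than the mechanics.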

Even though the code needed to make a system asynchronous is written in a couple of seconds, the impact of that change is enormous, both in what you can do and in what you need to make sure you don't do. Deciding to go down the road of asynchronously executing features requires that you carefully model how you want to handle state.

First, let's discuss what writing a message based system really means. So far we have looked at how a single handler produces and consumes messages. However, message based systems are often about message chains. As mentioned under messaging and abstraction, messages can split the larger system features into smaller software entities that produce and consume the various messages in a chain. A chain can for instance consist of RequestDelivery->PickItemsFromStock->CheckoutItems->PickTerminal->DeliverToTerminal, or DomainLogic->...->...->DbPersistence. Each of the steps in this chain can produce and consume messages.
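The chain idea reduces to each step consuming the previous step's output and producing the next message. A minimal sketch of my own (the helper and step contents are invented; real steps would of course carry domain messages, not strings):

```java
import java.util.List;
import java.util.function.Function;

// A message chain: each step is a consumer of the previous message and
// a producer of the next one. Steps stay small and independently testable.
public class MessageChain {
    static String run(List<Function<String, String>> steps, String message) {
        for (Function<String, String> step : steps) {
            message = step.apply(message);
        }
        return message;
    }
}
```

In a real dispatcher-based system the "chain" is implicit, formed by which messages each handler consumes and produces, rather than an explicit list like this.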

When writing asynchronous message based systems I tend to divide my message handlers into a few categories based on how they relate to state. Keep in mind that each message handler might handle a message in parallel with other handlers.

Stateless Message Handlers
This is my first choice of handler as it has no side effects. It only transforms the input into an output message without writing to any publicly exposed state. In practice (if implemented as a class) this means creating a new instance of the handler, passing it the message and, when done, disposing of the handler.
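As a sketch, a stateless handler has no fields at all; the transformation below (a made-up VAT calculation with invented message shapes) is purely input in, output out:

```java
// Stateless handler sketch: no fields, so a fresh instance can be
// created per message and discarded afterwards, safely in parallel.
public class AddVatHandler {
    String consume(double netPrice) {
        // Pure transformation: no publicly exposed state is written.
        return "GrossPrice:" + (netPrice * 1.25);
    }
}
```

Because nothing is shared, any number of these can run in parallel without precautions.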

State Message Handlers
These types of handlers deal with either accumulated internal state or external state. Either way, they need to take the right precautions (locking etc.). If dealing only with external state, a handler can work in the same instantiate, handle and dispose manner as the stateless message handlers. If dealing with internal state, it needs to be a running "engine" that lives for as long as the scope of its contained state. Either way it deals with state that can be corrupted at any time, so I model these handlers CAREFULLY.
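A sketch of the internal-state case, with a running total as my own invented example: since messages may be handled in parallel, the accumulated state is guarded with a lock.

```java
// State-carrying handler sketch: a long-lived "engine" whose internal
// state must be protected, because consume() may run on many threads.
public class RunningTotalHandler {
    private long total = 0; // accumulated internal state

    void consume(long amount) {
        synchronized (this) { // precaution: handlers can run in parallel
            total += amount;
        }
    }

    synchronized long total() {
        return total;
    }
}
```

Without the synchronized blocks, two parallel consume() calls could interleave the read-modify-write and corrupt the total, which is exactly the corruption risk the paragraph above warns about.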

Queuing Handlers
As the name implies, they do not dispatch the message onward like the others, but rather store it to some predetermined queue(s). I usually reach for queuing handlers whenever I want a persisted step in the message chain, when a critical running "engine" needs to pick work based on its own task scheduler, or when I need to go cross-system with some sense of reliability.
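An in-memory sketch of my own (a persisted queuing handler would write to disk or a broker instead): the handler stores the message on a bounded queue and a worker pulls at its own pace. Bounding the queue also makes the overflow case Øyvind mentions explicit rather than a crash.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Queuing handler sketch: consume() stores rather than dispatches, and
// a worker picks work based on its own schedule via next().
public class QueuingHandler {
    private final BlockingQueue<Object> queue;

    QueuingHandler(int capacity) {
        queue = new LinkedBlockingQueue<>(capacity);
    }

    boolean consume(Object message) {
        return queue.offer(message); // false on overflow: handle at app level
    }

    Object next() {
        return queue.poll(); // the worker's own pace; null if empty
    }
}
```

Returning false on overflow pushes the full-queue decision up to the application, instead of restarting the system the way an unhandled buffer overflow would.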

At this point it is pretty obvious that how you choose to write asynchronous code deeply depends on the environment it is written for and the context of the problem being solved. My context, as mentioned, leans towards normal business applications with moderate hardware requirements. Had the point of the system been to process a large number of messages per second, that would have affected the system's architecture. Compromises are often made between resources and maintainability: when we can focus less on maximum performance, we can focus more on writing the system in a way that is maximally maintainable.

8 comments:

  1. This comment has been removed by the author.

  2. Very interesting. The ideas that you are talking about sound to me like SOA using a service bus (as advocated by Udi Dahan and others), but shrunk down so that the services are in-process. If this is correct, then it occurred to me that it might be interesting to treat the dispatcher as a service bus, perhaps even implementing the IBus interface from, say, NServiceBus. This feels like it would have two benefits: 1) avoid using different terminologies to talk about the same underlying concept, which might help with spreading the ideas; 2) provide a way to scale an app from single process to distributed.

    Apologies if you already covered this in your talk, but the poor video quality during the code sections meant I skipped parts :(

  3. That is very much the thought behind it. IMessageBus was my name for it in the beginning. After some time I realized that its only feature is to dispatch messages, hence IDispatcher. Adding more features to it tended to become a mess.
    But you are absolutely correct. It's like SOA, only in-process. And the benefits are the same, with nicely separated parts of the system and just a fraction of the complexity.
    Totally understand. Hope you got to the part where I cover generating message flow graphs. That's pretty neat :)

    Btw, all the code I go through is attached to the blog post I published after the talk.

  4. Apart from your talk, I haven't found any other information about this online. Is it being discussed anywhere? Are there any open source implementations being used in production?

  5. I usually try to make things as simple as possible. This is my usual messaging implementation https://gist.github.com/1319021. Still haven't found a scenario it can't solve :) I don't know of any groups discussing in process messaging though.

  6. On behalf of aclassifier, as the comment field has been giving him some trouble:

    Svein Arne, thanks for a thorough reply to my comment, and even a new blog note for it! I will just comment a little here, and keep it like this for my part. The CSP/occam community [1] has always said that concurrency is easy. And now Google's Go language [2] seems to make this more viable, even if there are no parallel usage checks.

    > The receiver of the message has to acknowledge that it can/wants to receive the message rather than the fire and forget / queuing approach.

    "Acknowledge", yes: by listening on the channel. No busy-poll. And it may also _not_ listen on the channel, equally important. Subscribe mechanism not needed.

    > Systems that run asynchronously without the various parts of the system being aware of the asynchronous behaviors happening around them.

    Agree. I know my role, you know yours – we know what’s between us, and that’s all I need to know. How what’s between us is communicated or when / if we want to synchronize, is really another matter.

    > So how come I tend to approach asynchronous programming so different from Øyvind?

    And from Ada, and from Go. (99.9% of the ideas I discuss, I have learnt.) But according to Turing any paradigm may be built on the other. But the fact that we have enough memory available makes little difference. That would in that case need verifiable looping, including recursive code that is guaranteed to stop in time etc.

    > ... message dispatcher

    I am always _afraid_ that it might contain too much state, too little WYSIWYG, and too high cohesion with the receivers of the messages. A scheduler that is also partly application thins out the application code proper.

    Spawning a process to send, to make it asynchronous with respect to the father, only works with non-join semantics. In occam all composite processes must terminate before the father continues after the PAR. But the asynchronism may be moved to a child.

    > Stateless: "without writing to any publicly exposed state"

    Stateless to me, in this context means that "state is in the message" (back and forth), not to which degree of transparency there is with shared variables.

    > State Message Handlers: These types of handlers deals either with accumulated internal state or external state. Either way it needs to make sure that it takes the right precautions (locking etc.).

    In a _fully_ message-based architecture you don't need shared variables, and consequently don't need locking. Everything is communicable.

    > If dealing with internal state it needs to be a running "engine" that runs for as long as the scope of its contained state.

    So, join semantics, as occam, I believe.

    > Either way it deals with state that at any time can be corrupted so I model these handlers CAREFULLY

    So, occam usage rules would have helped.

    > Queuing Handlers: ... needing to pick work based on its own task scheduler

    Having a private "task scheduler" _might_ imply that some message has been received while the process was doing other things, and not able to handle it. It had to be set aside. But then came another such message, oops. Not WYSIWYG because I also have to know something about these senders: "you shouldn't have sent that in this state! Can't you just wait a little, it would be easier for me?".

    By the way, here is a web server implemented in occam: kar.kent.ac.uk/13916/ - Barnes, Fred (2003) "occwserv: An occam Web-Server". In: Communicating Process Architectures 2003, SEP 07-10, 2003, Univ. Twente, Enschede, Netherlands.

    --- Refs
    [1] CPA conferences and papers: www.wotug.org/
    [2] Go and CSP: golang.org/doc/go_faq.html#csp

  7. It is pretty obvious that your experience is quite a bit broader than mine. I have little or no knowledge about how CSP and Google Go work. However, I'm planning on adding Google Go support to the development environment I'm working on, so hopefully I'll get some eye-openers there.

    As regards the message dispatcher, I put no logic or state in it other than knowing about the registered handlers, the ability to query each of them about whether or not it wants to handle a message, and handing the message to it. It acts merely as a proxy separating the calling and invoked sides of a void method call. However, it might let one call invoke several handlers. It would look something like this:

    class Dispatcher
    {
        IHandler[] _handlers;

        void Register(IHandler handler) { /* add to _handlers */ }

        void Publish(object message)
        {
            foreach (var handler in _handlers)
                if (handler.Accepts(message))
                    handler.Handle(message);
        }
    }

    Again, the code I have written takes more advantage of pipelines, that being: step1->step2->step3->step4 and so on. We can imagine that, for instance, when calling step3 the dispatcher also dispatches the message off to a step called paralell_step1. That would split it into two pipes. This means that each step (class) lives only for as long as it handles the message. If I understand you correctly, this is a different concept from what you are talking about, where each step is running inside a "lightweight process".

    That being said, this is an expensive form of async programming, as it involves more "uncontrolled" parallel processing: you never know how many messages are being processed at a given time. However, in terms of design it delivers great flexibility and very low coupling.

    Stateless as in "state is in the message" is probably a better definition.

    As for the task scheduler: since I'm handling pipelines there is no process, but rather a thread-safe class handling parallel messages. Now, if it contains state shared between the threads, that would mean locking mechanisms.

  8. I've been thinking some more about these ideas - http://www.itworksonmymachine.co.uk/2012/04/25/honey-i-shrunk-the-soa/. I'd be very interested in your comments.