Thursday, May 13, 2010

Focus on functionality, use technology

As time goes by I find myself more and more intrigued by Domain Driven Design. To me it is a shift from forcefully molding functionality to fit the technology toward elegantly producing functionality with the help of technology. Instead of looking at a piece of code and seeing NHibernate sessions, DataSets and terms like update, delete and insert, we see the true intention of the functionality we're looking at. The following example is code the way I would have written it some time ago.
    public void UpdateCustomerState(int customerID, CustomerState newState)
    {
        DataSet ds = getCustomerDataset(customerID);
        ds.Tables["Customer"]
            .Rows[0]["CustomerState"] = stateToInt(newState);
        
        switch (newState)
        {
            case CustomerState.Blocked:
                // Do various things done to blocked customers
                break;
                // Handle more state variants...
        }
        updateCustomerDataset(ds);
    }
Now, looking at this code, it looks pretty much like any code we would expect to find in any source base, right? It has split some functionality out into separate methods to clean up the code. It reads pretty well, so we can understand what it does: it sets the state of the customer. So what's the problem?
The problem is that it's all done from the tooling perspective and not from the actual intent of the functionality. What happened here was that the developer was told to create the functionality for blocking a customer. The developer did as we usually do: went right into techno mode. "Ok, we have this state field in the customer table. If I just set that field to blocked and make the necessary changes to the linked rows in tables x and y, that should do the trick." And when done, the code looked like the code above. This is very much like the scene in The Hitchhiker's Guide to the Galaxy where Deep Thought reveals that the answer to the Ultimate Question of Life, the Universe, and Everything is 42. As we know, the answer wasn't the problem. The problem was the question. We can say the same thing about this piece of code. What you see is the answer, but nothing is said about the intention behind it. When browsing through the classes you'll only find a method called UpdateCustomerState and nothing about blocking customers.

So what do we do about it? We write the code as dictated by the intent behind the functionality. The developer was told that it should be possible to block a customer. From that we can determine that a customer is an entity and that it needs a behavior: Block. The implementation would look something like this:
    public void BlockCustomer(int customerID)
    {
        var customer = getCustomer(customerID);
        customer.Block();
    }
The first example is the typical approach: classes with mostly properties, plus a set of manager/handler/service classes manipulating those properties to achieve the desired behavior. The second example keeps the customer's state hidden within the customer and only exposes its behaviors.
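To make that possible, the Customer entity itself owns its state and the rules around blocking. A minimal sketch; the idea that blocking also cancels open orders is an assumption for illustration:

```csharp
public class Customer
{
    // The state lives privately inside the entity; nothing outside
    // the customer can set it directly.
    private CustomerState _state;

    public void Block()
    {
        _state = CustomerState.Blocked;

        // Whatever blocking implies happens here, close to the intent.
        // Cancelling open orders is an assumed example.
        CancelOpenOrders();
    }

    private void CancelOpenOrders()
    {
        // ...
    }
}
```

Browsing the classes now reveals a Block behavior on Customer, which is exactly what the business asked for.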

Write functionality with the help of technology!

Thursday, April 29, 2010

Code readability

I had an interesting discussion with a co-worker the other day about code documentation and formatting. Lately I have moved away from my previously traditional views on the matter. Earlier my main focus would have been consistency and similarity: things like having a header on each class, property and function explaining what it is and what it does, using regions within files, and rules for how to order functions, properties, private functions, events and so on within a class.

All this is done in the name of readability, right? At least we think so. We have become so good at managing these things that we completely forget to ask ourselves why the class has grown so obese that we need a set of rules to navigate within it. Maybe we create these rules to enable us to write crap code and still feel good about it? I am not really convinced that setting up a huge ruleset in tools like StyleCop makes your code more readable. Of course it would make it readable in the sense that if someone gave me a handwritten letter and then handed me the same letter written on a computer, both in some language I don't understand, I would probably find it easier to read the words letter by letter from the computer version. Does it really matter? It's not like I would understand any of it anyway. The same goes for code: crap code doesn't get any better just because it's all formatted the same way.

The same goes for code comments. Does the function GetCustomerByName need a comment? If it needs a comment, does that mean the function does more than retrieve a customer by its name? Maybe the name really is GetCustomerByNameOrInSomeCasesCreateOrder. If so, this code doesn't need comments; it needs some hard refactoring. Difficulty choosing a good name for your function is usually a code smell.

My point here is that your code should do the talking. It should express its true intention. Let's look at two pieces of code, the first one being a brute force implementation and the other being a bit more refined.

Implementation 1


Implementation 2

So what's the difference between the two? Basically, the second example has split the functionality up into well named classes, functions and properties. In my opinion this is not a problem unless you're writing performance critical low level stuff, and especially not when using disk or network resources.
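The two listings above were embedded as images and have not survived, but the contrast was along these lines. This stand-in pair is invented for illustration (the file import scenario and every name in it are assumptions):

```csharp
// Implementation 1: brute force, everything inline.
public void Import(string path)
{
    foreach (var line in File.ReadAllLines(path))
    {
        var parts = line.Split(';');
        if (parts.Length == 3 && parts[0].Length > 0)
        {
            // ...store the row somewhere...
        }
    }
}

// Implementation 2: the same flow told through well named steps.
public void ImportCustomers(string customerFilePath)
{
    foreach (var line in ReadCustomerLines(customerFilePath))
    {
        if (IsValidCustomerLine(line))
        {
            SaveCustomer(ParseCustomer(line));
        }
    }
}
```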


What I am trying to say is: focus on making the code readable before you focus on getting it well formatted. Most likely, once the code is readable, the need for standard formatting will be much lower.

Monday, March 8, 2010

Persistence ignorance

This post is very much related to my post about relational databases and OO design, especially the domain model part. A problem with the way data access was presented in that post is that it was drawn as a layer between the domain model and the database. Of course, when writing applications we need a layer between the database and whatever uses it. The thing is that the domain model needs more than simple CRUD. Multiple CRUD operations have to be batched and handled in transactions. This is where the concept of persistence ignorance makes its grand entrance. We need a way to cleanly handle storing/persisting our entities within transactions, so that we can ensure that processes execute successfully or roll back all changes.

I talk about entities here, but of course you can use datasets and other types of data carriers. Since I mainly work in C#, which is an object oriented language, my preference is to work with POCOs.

First off, we need a way to persist an entity somewhere. The reason I use the word persist instead of storing to the database is that the location or type of storage is not relevant to the domain model. The only thing the domain model needs to know is that its entity is persisted somewhere so that it can get to it later.

How do we create this magical piece of code that will handle all persistence and transactions for us? Well, we don't! Unless you feel the need to reinvent the wheel. Most ORM frameworks implement some kind of persistence ignorance: some container that can keep track of changes made to your entities and commit them to storage or roll them back. There are some great frameworks out there that you can use, my personal favorite being NHibernate.

That being said, you can make a mess with ORMs too. Some people talk about creating an application with Entity Framework or NHibernate. This is usually a sign that the source code is full of ORM queries and connection/transaction handling. Again, these are issues the domain model shouldn't have to deal with. It should focus on cleanly implementing its specified functionality, not on these kinds of technical details.

Let's take a minute to look at transactions. Transactions live within what we call a transaction scope. A transaction scope starts when you start the transaction and ends when you commit or roll back. So what would be included in a transaction scope? Let's say we're writing some code that updates some information on a customer and on the customer's address, which is stored in a separate table. Would we want both those updates within a transaction scope? Indeed! Then what about that other function that does various updates and then calls the contact and address update function? Shouldn't we have a transaction scope wrapping all of that too? Well of course, so let's add some transaction handling to this function too and make sure we support nested transactions for the customer and address function. And with that, the whole thing starts giving off an unpleasant smell. We have just started cluttering our code with transactions left, right and center. Now what?

Let's take a look at the model again. We can visualize the domain model as a bounded context. It has its core and outer boundaries. Through its boundaries it talks to other bounded contexts (UI, other services, database...). Take the UI: it calls some method on the domain model's facade and sets off the domain model to do something clever. My point is that the domain model never goes off doing something all of a sudden. Something outside its boundaries always requests or triggers it to do something. These requests and triggers are perfect transaction scopes. They are units of work, and these units of work know exactly what needs to exist within the transactional scope.

Unit of Work is an implementation pattern for persistence ignorance. We can use this pattern to handle persistence and transactions. Let's say that every process or event triggered at the domain model's boundary is a unit of work. This unit of work can be represented by an IUnitOfWork interface. To obtain an IUnitOfWork instance we use an IWorkFactory. By doing this we end up with a transaction handler which we have access to from the second our domain code is invoked until the call completes. How would a class like this look? Well, we need some way to notify it about entities we want it to handle. Let's call the method Attach and give it an entity parameter. Now we can pass every entity object we want to persist to the Attach method of the IUnitOfWork. We also need a way to remove entities from storage; we'll create a Delete method for that. If the current unit of work succeeds, we need a way to let it know that all is good and go ahead and complete the transaction. Let's call this method Commit. This gives us a simple interface for handling persistence.

   public interface IUnitOfWork : IDisposable
   {
      T Attach<T>(T entity);
      void Delete<T>(T entity);
      void Commit();
   }

The code using it would look something like this.

   using (IUnitOfWork work = _workFactory.Start())
   {
      MyEntity entity = new MyEntity();
      work.Attach<MyEntity>(entity);
      work.Commit();
   }

Since we are using something like NHibernate in the background, we have to retrieve entities from storage through NHibernate and then attach them to the IUnitOfWork, which of course uses NHibernate in the background for all persistence. Because of the nature of ORMs like NHibernate, it makes more sense to include entity retrieval through the IUnitOfWork too, since every entity retrieved is automatically change tracked by the NHibernate session. That would also let us abstract NHibernate away from our domain model. Let's add a few functions to the IUnitOfWork interface to accomplish this. We need a GetList function to return a list of entities and a GetSingle function to return a single entity. GetSingle has to be able to retrieve by identity to take advantage of caching within the ORM framework, and also to accept queries, which with NHibernate could be expressed as a DetachedCriteria. If you want complete abstraction you can make your own query builder which converts to NHibernate queries internally. Now the IUnitOfWork interface would look something like this:

   public interface IUnitOfWork : IDisposable
   {
      T GetSingle<T>(object id);
      T GetSingle<T>(DetachedCriteria criteria);
      IList<T> GetList<T>(DetachedCriteria criteria);
      T Attach<T>(T entity);
      void Delete<T>(T entity);
      void Commit();
   }
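Behind this interface, an NHibernate-backed implementation is mostly a thin wrapper around ISession and ITransaction. A sketch, assuming IUnitOfWork extends IDisposable (so the using block shown earlier works), NHibernate's DetachedCriteria as the query type, and no error handling:

```csharp
using System.Collections.Generic;
using NHibernate;
using NHibernate.Criterion;

public class NHibernateUnitOfWork : IUnitOfWork
{
    private readonly ISession _session;
    private readonly ITransaction _transaction;

    public NHibernateUnitOfWork(ISessionFactory sessionFactory)
    {
        _session = sessionFactory.OpenSession();
        _transaction = _session.BeginTransaction();
    }

    public T GetSingle<T>(object id)
    {
        // Retrieval by identity lets NHibernate use its caches.
        return _session.Get<T>(id);
    }

    public T GetSingle<T>(DetachedCriteria criteria)
    {
        return criteria.GetExecutableCriteria(_session).UniqueResult<T>();
    }

    public IList<T> GetList<T>(DetachedCriteria criteria)
    {
        return criteria.GetExecutableCriteria(_session).List<T>();
    }

    public T Attach<T>(T entity)
    {
        _session.SaveOrUpdate(entity);
        return entity;
    }

    public void Delete<T>(T entity)
    {
        _session.Delete(entity);
    }

    public void Commit()
    {
        _transaction.Commit();
    }

    public void Dispose()
    {
        // If Commit was never reached, roll everything back.
        if (_transaction.IsActive)
        {
            _transaction.Rollback();
        }
        _session.Dispose();
    }
}
```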

To obtain the active instance of IUnitOfWork from anywhere in the code, we can create a WorkRepository class. We'll have the IWorkFactory register the unit of work with the WorkRepository using the thread id as key. Doing that enables us to write the following in whatever class needs the current unit of work:

  public void SomeFunction()
  {
    ...
    var unitOfWork = WorkRepository.GetCurrent();
    var customer = unitOfWork.GetSingle<Customer>(customerID);
    ...
  }
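A WorkRepository along those lines can be as simple as a static dictionary keyed by managed thread id. A sketch (the Register/Unregister calls would be made by the IWorkFactory when a unit of work starts and ends):

```csharp
using System.Collections.Generic;
using System.Threading;

public static class WorkRepository
{
    private static readonly object _lock = new object();
    private static readonly Dictionary<int, IUnitOfWork> _active =
        new Dictionary<int, IUnitOfWork>();

    // Called by the factory when a unit of work is started.
    public static void Register(IUnitOfWork work)
    {
        lock (_lock)
        {
            _active[Thread.CurrentThread.ManagedThreadId] = work;
        }
    }

    // Called from anywhere on the same thread to get the active unit of work.
    public static IUnitOfWork GetCurrent()
    {
        lock (_lock)
        {
            return _active[Thread.CurrentThread.ManagedThreadId];
        }
    }

    // Called by the factory when the unit of work is disposed.
    public static void Unregister()
    {
        lock (_lock)
        {
            _active.Remove(Thread.CurrentThread.ManagedThreadId);
        }
    }
}
```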

How is that for a database layer? This is most likely all you need. Smack the repository pattern on top of that and your code will be purely, cleanly written domain functionality. Now go ahead and solve real world problems, ignoring all the complexity that comes with persistence, transactions and retrieving entities.

Saturday, February 27, 2010

DRY and reuse pitfalls

Don't get me wrong here. I'm all about keeping my code reusable and DRY (Don't Repeat Yourself). What I want to pinpoint in this post are common pitfalls when reusing code, more the thought behind the decisions than the principle itself.

First let's talk about our overall mindset when writing code. When developing applications we spend time researching and planning the functionality before we start implementing. The solution is constantly evolving in our heads and being discussed among the project's team members. This thought process continues throughout planning and implementation. Because of human nature we'll always look forward to upcoming needs, like "Maybe we have to support X in the future? I'd better prepare for it now." or "The function I just wrote could support scenario Y if I just make these changes. We'll probably need it in the future, so I'd better do it now." We're making compromises on our existing code based on assumptions. In my mind this is not reuse; it's code pollution. Reuse is something that happens when you have two implementations doing the exact same thing. DRY is not planning for the future. DRY is reusing functionality in your existing codebase.
For this scenario, a good solution is to write your code using the SOLID principles. That way you know your code will be able to evolve with the uncertainties of the future.

Another thing I come across quite often is SRP (Single Responsibility Principle) violations as a result of code reuse. Take the example where our application has a LogWriter handling writing to the error log. The class looks like this:

class LogWriter
{
    private const string LOG_ENTRY_START = "*************************";
    private string _filename;
    
    public LogWriter(string filename)
    {
        _filename = filename;
    }
    
    public void WriteLogEntry(string message, string stackTrace)
    {
        using (var writer = new StreamWriter(_filename, true))
        {
            writer.WriteLine(LOG_ENTRY_START);
            writer.WriteLine(message);
            writer.WriteLine(stackTrace);
        }
    }
}

Time goes by and for some reason a need arises to also support writing log entries to a database. Someone gets the clever idea to create an overload of the WriteLogEntry method that takes an extra boolean writeToDatabase parameter. Cramming two separate behaviors into a single class or function is not reusing code. It might feel like code reuse, since you can use the same class for writing to both logs. The painful reality is that this is code rot, not code reuse.
Again, this is better solved by following the SOLID principles. If everything depended on an abstraction of the LogWriter, such as an ILogWriter interface, we could easily extend our solution with a new DatabaseLogWriter implementing ILogWriter.
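A sketch of that refactoring; the database writer's connection handling and the ErrorLog table layout are assumptions:

```csharp
using System.Data.SqlClient;

public interface ILogWriter
{
    void WriteLogEntry(string message, string stackTrace);
}

// The new requirement becomes a new implementation instead of a
// boolean flag on the old one.
public class DatabaseLogWriter : ILogWriter
{
    private readonly string _connectionString;

    public DatabaseLogWriter(string connectionString)
    {
        _connectionString = connectionString;
    }

    public void WriteLogEntry(string message, string stackTrace)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand(
            "INSERT INTO ErrorLog (Message, StackTrace) " +
            "VALUES (@message, @stackTrace)", connection))
        {
            command.Parameters.AddWithValue("@message", message);
            command.Parameters.AddWithValue("@stackTrace", stackTrace);
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```

The original file based LogWriter simply gets `: ILogWriter` added to its declaration, and callers depend only on the interface.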

The last subject I want to mention here is cross boundary reuse. This is a topic I touched on in an earlier post about OO design and relational databases. The .NET community is jumping straight into using ORMs these days, which I think is fantastic! Whether it's NHibernate, LLBLGen or Entity Framework, we're using entities now, not datasets. I will use entities as an example of cross boundary reuse pitfalls. This leads me back to my previous post, where I argue that the Domain Model/Business Logic and the UI serve two very different needs. Let's say we decide to create a Customer entity in the Domain Model that we also pass beyond the Domain Model's boundaries up to the UI. We probably end up having to clutter our entity with loads of information needed exclusively by the UI, which has needs like showing addresses, customer activities and various readable information. This is a lot like the LogWriter example, only on a higher level. This time we violate SRP to be able to reuse an entity across boundaries. Again this does not lead to greater code reuse but to greater code rot.
In this case I would strongly recommend using DTOs for transferring information across boundaries. These DTOs can be created in a way that perfectly fits the needs of whoever is meant to consume them.
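Such a DTO is nothing more than a bag of exactly the fields one screen needs. For the customer example it could look like this (all property names are invented for illustration):

```csharp
// Shaped for presentation, not for the domain model's rules.
public class CustomerSummaryDto
{
    public int CustomerId { get; set; }
    public string DisplayName { get; set; }
    public string FormattedAddress { get; set; }
    public int OpenOrderCount { get; set; }
}
```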

Friday, January 22, 2010

Solution structuring and TDD in Visual Studio

I'm currently working on solution structuring for a large system using TDD. To be able to work efficiently within the project we have defined a set of solution types with different purposes. This is the setup we ended up with:

Workbench
TDD requires that the solutions you spend most of your time in build as fast as possible. To achieve this you have to keep the number of projects within the solution to a minimum. If you cannot get around having multiple dependencies for every project, binary references would be the way to go. We have chosen a more decoupled approach where every project depends upon abstractions, and interfaces are wired to implementations through a DI container. Contracts and interfaces are separated out into contract projects. This keeps project references down to just the interface/contract projects. The build output from this solution would not be able to run, since the contract implementations are not referenced here. The solution contains all environment independent unit and integration tests for the workbench's projects. Because we practice TDD, it's not important that the solution is able to run, only that the tests pass. A workbench exists for every bounded context and standalone code library.

Test rig
Even though most of our time is spent writing tests and code in the workbench solutions, we sometimes need to debug a running system. These solutions contain all the code needed to run parts of the system, like hosting services, or even complete running systems. The test rig solutions are usually quite large and take time to build, but they're there for us to occasionally test the system locally.

Continuous integration solution
This solution contains all projects and all environment independent unit and integration tests for the complete system. It is part of the continuous integration build performed on check-in. Naturally, CI runs all tests on every build.

System test solution
We need a solution containing environment dependent tests requiring things like database access or a running system. The tests within this solution should run on every deployment to the test environment to make sure the system wiring is intact.

As for deployment, both the test rig and continuous integration solutions contain enough of the system to be able to perform deployment.

It would be really interesting to hear how other people are working on similar projects.

Saturday, January 16, 2010

Object Oriented design and Relational Databases

For as long as I have worked with object oriented languages, working with relational databases has always been a bit awkward. I have grown up in the Microsoft world with Visual FoxPro/VB/.NET/Access/MS SQL Server, and have used ADO, ADO.NET and now ORMs. Seeing Udi Dahan's talk on "Command Query Responsibility Segregation" and reading Eric Evans' book "Domain Driven Design" made me connect some dots, leading me to write this post.

So why do we use relational databases? If the only purpose of the database were to persist the domain model's entities, we would have used an object oriented database, right? No transformation between tables and objects would be needed. Ok, given this scenario we're sitting here with our clean entity objects, formed in a way that perfectly satisfies the needs of rule validation and process execution performed by the domain model. Brilliant, just the way we like it! Enter the UI. Now this is where it gets ugly. The user requires information to be presented in a way that is humanly readable. The domain model is perfectly happy knowing that the customer entity with id 1432 links to the address entity with id 65423. To the person using the application, that is a useless piece of information. The structure of the information needed by the user is often very different from the entities needed by the domain model, especially when the user needs some kind of grouping of information or statistics. These types of complex queries want to gather information spanning multiple entities, joining them in ways unnatural to the domain model. This is where the relational database comes in and performs its magic. With a relational database we can easily perform complex queries joining multiple tables and shaping the information to fit our needs.


Above is the traditional way of looking at layered architecture. I find this way of viewing layered architecture a bit deceiving. So what about the issue described above, with the UI getting into the mix? How do we usually solve this problem? Well, sadly, the domain model often gets to pay the price. Our clean entities are stretched and pulled, and lumps of information are attached to them so that we can pass them on to the UI. These sins are committed in the name of Layered Architecture, though Layered Architecture is not to blame; it's just easy to interpret the picture above that way. Whether they are datasets or object entities, relations and bulks of information get added, complicating both the UI and the domain. We end up having to make compromises constantly because the entity no longer fits either the domain model or the UI in a good way.
There must be a better way! Well, there is. We have already pinpointed two separate needs here: the domain model needs a database to persist its entities to, and the user needs to use the persisted information in a way that makes sense to him/her. Let's create two rules.
  1. Neither the domain model nor the UI should ever be aware of the complex structure of the database.
  2. The domain model should never be aware of the complexity of its clients (the UI in this example).
Ok, that solves two problems. The domain model's entities will no longer be compromised by its consumers, since their complexity can no longer affect it. The domain model will also be unaware of any complexity the database takes on from having to serve multiple needs; our data abstraction layer, using an ORM or any other data access provider, will make sure of that.
Great, now we have a clean, readable and maintainable domain model again. So where does the UI retrieve its information from? From the database, of course. And to hide the database complexity we can use a view or a stored procedure that returns the information the UI needs, formatted exactly as it needs it. How cool is that? We just took advantage of the power of the relational database, which now hides its complexity from its users.

The UI now bypasses the domain model completely when retrieving its information. This means that the way the UI makes changes to the database has changed. Earlier, the UI was provided with the domain model's entities, which it modified and sent back to the domain model for persisting. That is no longer possible, since the domain model doesn't share its entities. What we want now is an abstraction between the domain model and the UI. Call it an abstraction layer or a service layer; naming is not important right now. The UI needs to be able to persist information and execute processes through this abstraction, so we need some defined messages the UI can send to it. For example, the SaveAddress operation in the abstraction needs to take an AddressMessage containing address information. The abstraction then uses the message to persist its information using the domain and its entities. We end up with a design where information flows like in the sketch below.
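The SaveAddress example can be sketched as a small service contract. The message fields and the interface name are assumptions for illustration:

```csharp
// The message the UI sends across the boundary.
public class AddressMessage
{
    public int CustomerId { get; set; }
    public string Street { get; set; }
    public string PostalCode { get; set; }
    public string City { get; set; }
}

// The abstraction the UI talks to; the domain model implements it
// by loading the customer entity and applying the change.
public interface ICustomerService
{
    void SaveAddress(AddressMessage message);
}
```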


When creating services and abstractions it's important to think about responsibility. For instance, the consumer of a service should be responsible for its interfaces and messages, while the service host should be responsible for the implementation of those interfaces. Look at the service layer between the UI and the model: the UI defines how the service methods and messages should look, and the domain model implements the service interface. Likewise, for the database access framework, the UI and the domain model are responsible for defining the interface, while the data abstraction component/layer implements it.

To conclude this post, the main takeaway is that a design like this should be viewed as three separate parts: UI, Domain Model and Data store. The implementations should respect this and make sure that each part focuses on the problem it's trying to solve.
  • Domain Model - Handles the logic, rules and processes the application is supposed to handle according to its specification. This is the heart of the application.
  • UI - Makes sure that the user is able to work with the application's functionality in a way suited to the human mind.
  • Relational Database - Handles persisting the Domain Model's entities and provides the UI with human readable information.

Friday, January 15, 2010

Please stop the madness

Looking at Microsoft's approach to frameworks and libraries lately gives me the creeps. In frameworks like Entity Framework, Workflow Foundation and such, Microsoft relies heavily on graphical tools and generated code. My three main objections to this way of developing are: 1. It complicates the way of working. 2. It complicates maintenance. 3. It gives off the wrong signals to developers.

The whys:

1. It complicates the way of working
As developers, what is our main skill? Writing code, right? And of course, with experience we have learned how to read code, and through reading we learn how to write cleaner, more readable code. Now suddenly we have to relate to the code we write, the designer UI and the code generated by the designer. On top of that, the code generated by the designer is often a messy blob of complex code. By using these tools we have complicated what should have been clean, readable code.
Another thing is writing tests for code that uses generated code. This usually ends up being a nightmare.

2. It complicates maintenance
What happens when requirements change? Well, you have to have the designer regenerate the code, don't you? Something you could have done with refactoring tools you now have to do through the provided UI. You also risk ending up in a scenario where the framework comes in a new and fresh version and upgrade issues corrupt the generated code. Ok, that one was a bit unfair, but I'll still consider it an issue.

3. It gives off the wrong signals to developers
This is probably my biggest issue with the concept. The way I see it, these designers exist either to hide a complex framework or to make a framework friendly to non-developers. First off, hiding a complex framework is treating the symptoms of bad design. I would much rather see the effort put into writing a high quality, usable API. If the reason is that writing code for it is too much to ask of the developer, that's just sad. As stated earlier, one of the greatest skills a developer has is writing code. As for 'non-developers', we're talking about development frameworks, not applications like the Office suite, which rightfully contains UI designers and code generator tools.

I just needed to get this out of my system :p I guess my plea to Microsoft is: please stop the madness and get back to writing good, clean framework APIs that developers can write high quality applications with. The core of .NET proves that you know how.