Saturday, February 23, 2008

Unit size in unit testing

It has been quite a while since my last post here. I was a bit busy with the new article about Tapestry 5 and Facebook application development :).

Recently Howard Lewis Ship, IMHO one of the most knowledgeable guys in software development and the creator of Tapestry and HiveMind, published a very interesting post on his blog concerning unit testing.

In his post Howard discusses whether unit testing at the class level is efficient enough to be worth the time spent, especially when a DI (IoC) container is used and each class/service is relatively small. Very simplistic classes are not good candidates for the "unit" in unit/developer testing.
He concludes that bigger units in unit/developer testing will sometimes lead to better results, and that's exactly what happened in our system.

We had a relatively big service (around 450 LOC) that loaded objects from the database inside our proprietary ORM (why it exists at all is another topic, for a different post). This big service was quite complex, and testing it would have required a lot of setup effort. Actually, the simplest way to test it was to hook it up to a database (production Oracle or in-process HSQLDB). As you can imagine, that's not a very good idea, and it makes test maintenance very complex.

A bit of explanation is required here. We use EMF-generated model code inside our system. It's a very nice approach to system modeling and brings us the benefits of MDA and metaprogramming. Our Loader service is responsible for fetching data from the database by executing JDBC queries and filling in the model. Query construction and caching are delegated to other services. Query execution and mapping are complex and require a fair amount of code. EMF model population is also not simple, especially once we take into account lazy field/collection initialization and complex object relations.

So we decided to split this class (service) into two: instead of one Loader there would be a Loader and an InternalLoader. These two services are mutually dependent (which is no problem at all, even with constructor-based dependency injection). But the most interesting decision for me here was how to split the single class into two. Normally, according to GRASP, we would like to have a very small and concise interface between the split services. That is not only a better design but (as usual) also easier to test.

We decided that InternalLoader would be responsible purely for database operations and that the Loader service would keep the EMF-related functionality. After the split we could test the Loader service injected into our model, which immediately highlighted a significant design problem in our code.
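
A minimal sketch of such a split might look like the following. It is an illustration only, not our real code: the Customer class is a hypothetical plain-Java stand-in for an EMF-generated model class, the method names and the row format are invented, and the back-reference from InternalLoader to Loader is left out to keep the example short. The point is only that the database side hides behind a small interface, so the Loader plus the model can be tested together against an in-memory stub.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Pure database side: returns raw rows and knows nothing about the model.
// (Hypothetical interface; method names are assumptions.)
interface InternalLoader {
    Map<String, Object> fetchCustomerRow(long id);
    List<String> fetchOrderNames(long customerId);
}

// Model side: builds model objects from rows and serves their lazy loads.
class Loader {
    private final InternalLoader internal;

    Loader(InternalLoader internal) {
        this.internal = internal;
    }

    Customer loadCustomer(long id) {
        Map<String, Object> row = internal.fetchCustomerRow(id);
        return new Customer(id, (String) row.get("NAME"), this);
    }

    List<String> loadOrders(long customerId) {
        return new ArrayList<>(internal.fetchOrderNames(customerId));
    }
}

// Hypothetical stand-in for an EMF-generated model class.
class Customer {
    private final long id;
    private final String name;
    private final Loader loader;   // the Loader is injected into the model object
    private List<String> orders;   // stays null until first access

    Customer(long id, String name, Loader loader) {
        this.id = id;
        this.name = name;
        this.loader = loader;
    }

    String getName() {
        return name;
    }

    // Lazy collection initialization: the first call asks the Loader.
    List<String> getOrders() {
        if (orders == null) {
            orders = loader.loadOrders(id);
        }
        return orders;
    }
}

// In-memory replacement for the database side, for tests only.
class StubInternalLoader implements InternalLoader {
    int orderQueries = 0;   // counts how often orders were fetched

    @Override
    public Map<String, Object> fetchCustomerRow(long id) {
        Map<String, Object> row = new HashMap<>();
        row.put("ID", id);
        row.put("NAME", "Alice");
        return row;
    }

    @Override
    public List<String> fetchOrderNames(long customerId) {
        orderQueries++;
        List<String> names = new ArrayList<>();
        names.add("order-1");
        names.add("order-2");
        return names;
    }
}
```

A test would then wire up new Loader(new StubInternalLoader()), load a Customer, and check through the stub's counter that the orders are fetched only on first access and only once, all without Oracle or HSQLDB.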

Now it's evident that the problem we found couldn't have been found by testing each class separately. The interaction between the EMF model and the Loader is very complex and involves our code as well as third-party EMF code. Only the combination of the simplified Loader service with our model produces a well-encapsulated module for testing. For such a unit, testing is efficient and does not require complex setup/maintenance, hence it is worth the effort :)
