Thursday, September 18, 2008

Mapping Enum Types with iBatis

The problem

Most enterprise software needs enum data types in the domain model mapped to databases for storage. Some databases such as MySQL provide a native enum datatype. For example:


CREATE TABLE sizes (
name ENUM('small', 'medium', 'large')
);

Many databases, however, do not support enums as a formal datatype, so it is not uncommon for database frameworks to gloss over the details of mapping enums.

Before proposing a solution, first let's understand some of the less obvious issues:

  1. Enum data types can change over time. For example, you might add a valid enum value or remove one later on.
  2. The refactorings to enum value ranges that are trivial to make in Java code require careful consideration when databases contain those values.
  3. Instead of hard-coding Java to deal with each enum data type, it would be nice to solve this problem once and reuse it over and over for all enum types you have.
  4. Some databases support native enum types, some do not. If your database does support native enums, as MySQL does, it is probably a good idea to use them. You will have less chance of data integrity issues, and it will make your DBAs' and developers' lives much easier when working with SQL.
  5. Most programming languages (e.g. Java) have both an ordinal (numeric) and a string representation for an enum value. The compiler assigns the ordinal value based on the position of the declaration in the Java code. For example, the first enum value is 0, the second is 1, etc.

Danger! Danger! Danger!

I have seen numerous implementations of mapping code that use the ordinal values generated by the compiler to map enums to and from an integer value in the database. Don't do this! Although it works initially, you have to consider what happens from a maintenance perspective. Here are some things that can and do occur:
  • Developers less familiar with the internal mapping code can re-order the enum values in the source file. This causes the ordinal values of the enums to change even though they're representing the same thing. You can imagine what happens to live data in the database when people start using it! If you don't have good backups and timestamps on every affected row, you can possibly render some of your data unusable if you don't catch the problem right away.
  • Developers may need to add or remove valid enum values. As with the above problem, it is easy to accidentally add a new enum value to the middle instead of the end of your enum declaration. Removing an enum value from the middle without causing an ordinal renumbering is even more difficult: you'd have to have an enum deprecation hack to keep the ordinal number slot in place.
Suffice it to say that, for anything other than a prototype, you should not use the compiler's ordinal values to map an enum data type to a database.
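
To make the ordinal hazard concrete, here is a minimal sketch. The two enum "versions" and the class name are hypothetical and exist only for this illustration; they simulate a developer inserting a new value into the middle of a declaration after rows have already been stored by ordinal.

```java
// Hypothetical demo class; none of these names come from a real code base.
class OrdinalPitfallDemo {

    // Version 1 of the enum, as first shipped.
    enum WineTypeV1 { Cabernet, Merlot, Zinfandel }

    // Version 2: a new value was inserted in the middle of the declaration.
    enum WineTypeV2 { Cabernet, Chardonnay, Merlot, Zinfandel }

    // Simulates reading back an ordinal that was stored by version 1,
    // interpreted with the enum as declared in version 2.
    static String readBackWithV2(WineTypeV1 stored) {
        int ordinalInDatabase = stored.ordinal();
        return WineTypeV2.values()[ordinalInDatabase].name();
    }

    public static void main(String[] args) {
        // Merlot was written as ordinal 1, but ordinal 1 now means Chardonnay.
        System.out.println(WineTypeV1.Merlot + " read back as "
                + readBackWithV2(WineTypeV1.Merlot));
    }
}
```

Nothing fails at compile time or at runtime here; the stored data simply changes meaning, which is why this kind of bug tends to be noticed late.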

Many enterprise software systems I've seen use integers to represent enum values in a database. For the reasons stated above, the integer values are best assigned manually rather than taken from the "ordinal" position in an enum declaration. Unfortunately, most persistence-related software maps enum fields to their integer ordinal, although some map the String value to a varchar in the database.

Mapping an enum to a varchar is far more robust to change than mapping to an enum ordinal value. However, this robustness comes at the price of additional storage space in the database. If you have lots of rows, this space adds up quickly, and it is also felt a tiny bit every time you pass data between your application and database.

Without consideration of mapping enum values, a vanilla Java enum declaration would look something like this:

package com.mcgsoftware.myapp.domain;

public enum WineType {
    Cabernet,
    Merlot,
    Zinfandel;
}

Ideally, our infrastructure-related database mapping concerns should impact the domain model (an enum declaration in this case) as little as possible.

If your database supports native enum types, that is the cleanest and most robust mapping solution (although many frameworks aren't that sophisticated). If you must, a non-ordinal mapping of the enum to a small integer in the database works, but it is more brittle and more difficult to deal with when working with SQL.

In the example above, let us presume we need to map the enum values to manually assigned integers. If we have control over the numbers used for mapping enum values, refactoring the enum data type later will be far more robust. In this case:

  • Cabernet = 10
  • Merlot = 15
  • Zinfandel = 20

Solution

Luckily, iBatis lets us add custom type handlers, which we can use for this task. Hibernate has this ability as well.

We need a way to assign the integer values to the enum declaration. The simplest way to do that is via Java annotations. We could alternatively use external XML files, but those are more difficult to maintain and errors would go unnoticed until runtime.

We also could impose a "mapping oriented" interface that all enum types must implement, such as the code below:

public interface McGEnum {
    // Returns the database integer value for the enum.
    public int dbmsValue();
}


public enum WineType implements McGEnum {
    Cabernet { public int dbmsValue() { return 10; }},
    Merlot { public int dbmsValue() { return 15; }},
    Zinfandel { public int dbmsValue() { return 20; }};
}

The disadvantage of the mapping interface is that we'd be hard-wiring orthogonal infrastructure concerns into our enum code as if they were business logic. It works, but it doesn't follow the domain ideology.

Using Java annotations for database mapping is a slightly cleaner separation of concerns and the technique is also more congruent with the style of using JPA annotations for mapping Java to a database.

The solution we'll use here is a three-step process:

  1. Decorate your enum class with @McGEnum annotations.
  2. Create a subclass of AnnotatedEnumTypeHandler in your Repository implementation package.
  3. Add your new custom Type Handler to your iBatis SqlMapConfig.xml file.

@McGEnum Annotation

For enum data types, we'll use a custom Java annotation, @McGEnum, to decorate our enum. It has a single property, "dbmsValue", which is the corresponding database integer value.

Our enum is still declared in our domain model package, but would now contain @McGEnum annotations like this:

package com.mcgsoftware.myapp.domain;
import com.mcgsoftware.newframework.McGEnum;

public enum WineType {
    @McGEnum(dbmsValue=10)
    Cabernet,

    @McGEnum(dbmsValue=15)
    Merlot,

    @McGEnum(dbmsValue=20)
    Zinfandel;
}
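
The declaration of the @McGEnum annotation itself is not shown in this post. A minimal sketch of what it could look like follows, under two assumptions: the annotation is retained at runtime so it can be read reflectively, and it targets fields (which in Java includes enum constants). The helper `dbmsValueOf` and the demo class are hypothetical names added for illustration only.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical demo class showing how the annotation can be read back.
class McGEnumDemo {

    // Reads @McGEnum from the field that backs the given enum constant.
    static int dbmsValueOf(Enum<?> value) {
        try {
            McGEnum mapping = value.getDeclaringClass()
                    .getField(value.name())
                    .getAnnotation(McGEnum.class);
            if (mapping == null) {
                throw new IllegalStateException("Missing @McGEnum on " + value);
            }
            return mapping.dbmsValue();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dbmsValueOf(WineType.Merlot)); // prints 15
    }
}

// Assumed declaration: RUNTIME retention is needed for reflection to see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface McGEnum {
    int dbmsValue();
}

enum WineType {
    @McGEnum(dbmsValue = 10) Cabernet,
    @McGEnum(dbmsValue = 15) Merlot,
    @McGEnum(dbmsValue = 20) Zinfandel
}
```

With the default (class-file) retention the annotation would compile but be invisible to reflection, so the retention policy is the one detail that cannot be left out.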

Custom Type Handler

iBatis requires a custom type handler class for mapping our enum to the database. Mapping an enum is not a domain problem; it is an infrastructure concern. Therefore, we should put the custom type handler class into our "Repository Implementation" package, where iBatis persistence-related code belongs, not into our domain model.

To make things as easy as possible, we create a framework-like abstract superclass (AnnotatedEnumTypeHandler) that you can extend for this.

Your enum custom type handler can be written as follows:

package com.mcgsoftware.myapp.domain.repository.ibatisimpl;

import com.mcgsoftware.newframework.AnnotatedEnumTypeHandler;
import com.mcgsoftware.myapp.domain.WineType;


//
// The iBatis enum type handler for the enum class.
// This is part of the Repository implementation because it
// is iBatis specific infrastructure and does not belong in
// the domain model.
//
public class WineTypeHandler extends AnnotatedEnumTypeHandler {

    @Override
    public Enum[] getEnums() {
        return WineType.values();
    }
}
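
The source of AnnotatedEnumTypeHandler is not included in this post. The sketch below shows only the lookup logic such a superclass would plausibly need: it builds a two-way table between enum constants and their annotated integers from whatever getEnums() returns. All names here (AnnotatedEnumLookup, toDbmsValue, fromDbmsValue) are hypothetical, and the iBatis wiring is deliberately left out so the example runs without the iBatis jar; a real handler would presumably delegate to methods like these from its iBatis type handler callback.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

class AnnotatedEnumLookupDemo {
    public static void main(String[] args) {
        AnnotatedEnumLookup lookup = new WineTypeLookup();
        System.out.println(lookup.toDbmsValue(WineType.Merlot)); // prints 15
        System.out.println(lookup.fromDbmsValue(20));            // prints Zinfandel
    }
}

// Hypothetical stand-in for the lookup half of AnnotatedEnumTypeHandler.
abstract class AnnotatedEnumLookup {
    private final Map<Integer, Enum<?>> byDbmsValue = new HashMap<>();
    private final Map<Enum<?>, Integer> byEnum = new HashMap<>();
    private boolean initialized;

    // Subclasses supply the enum constants, e.g. WineType.values().
    public abstract Enum<?>[] getEnums();

    // Builds both tables on first use; rejects missing or duplicate mappings.
    private synchronized void ensureInitialized() {
        if (initialized) return;
        for (Enum<?> value : getEnums()) {
            int dbmsValue = readDbmsValue(value);
            if (byDbmsValue.put(dbmsValue, value) != null) {
                throw new IllegalStateException("Duplicate dbmsValue " + dbmsValue);
            }
            byEnum.put(value, dbmsValue);
        }
        initialized = true;
    }

    private static int readDbmsValue(Enum<?> value) {
        try {
            McGEnum mapping = value.getDeclaringClass()
                    .getField(value.name()).getAnnotation(McGEnum.class);
            if (mapping == null) {
                throw new IllegalStateException("Missing @McGEnum on " + value);
            }
            return mapping.dbmsValue();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Enum constant -> integer to write to the database.
    public int toDbmsValue(Enum<?> value) {
        ensureInitialized();
        Integer dbmsValue = byEnum.get(value);
        if (dbmsValue == null) {
            throw new IllegalArgumentException("Unmapped enum value: " + value);
        }
        return dbmsValue;
    }

    // Integer read from the database -> enum constant.
    public Enum<?> fromDbmsValue(int dbmsValue) {
        ensureInitialized();
        Enum<?> value = byDbmsValue.get(dbmsValue);
        if (value == null) {
            throw new IllegalArgumentException("Unknown dbmsValue: " + dbmsValue);
        }
        return value;
    }
}

class WineTypeLookup extends AnnotatedEnumLookup {
    @Override
    public Enum<?>[] getEnums() {
        return WineType.values();
    }
}

// Assumed annotation declaration (not shown in the post).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface McGEnum {
    int dbmsValue();
}

enum WineType {
    @McGEnum(dbmsValue = 10) Cabernet,
    @McGEnum(dbmsValue = 15) Merlot,
    @McGEnum(dbmsValue = 20) Zinfandel
}
```

Failing fast on a missing annotation or a duplicate integer is a design choice in this sketch: both mistakes would otherwise surface only as wrong data in the database.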

Configuration for custom type handler

Lastly, you need to add the type handler to your iBatis "SqlMapConfig.xml" file so the iBatis framework knows about it. Here's an example:

<?xml version="1.0" encoding="UTF-8" ?>

<!DOCTYPE sqlMapConfig PUBLIC "-//ibatis.apache.org//DTD SQL Map Config 2.0//EN"
    "http://ibatis.apache.org/dtd/sql-map-config-2.dtd">

<sqlMapConfig>

    <typeHandler
        javaType="com.mcgsoftware.myapp.domain.WineType"
        callback="com.mcgsoftware.myapp.domain.repository.ibatisimpl.WineTypeHandler" />

    <sqlMap resource="ibatis/selfservice/brianSqlMap.xml" />

</sqlMapConfig>

You can see from the XML configuration file that iBatis now knows to invoke your custom type handler whenever it sees the "WineType" enum.

You can now use the WineType field in your domain objects and SQL mappings just like any other data type.

Embedded Enum Declarations

You will also have situations where your enum declaration is embedded inside a class declaration. For example:

package com.mcgsoftware.myapp.domain;
import com.mcgsoftware.newframework.McGEnum;

public class DrinkProduct {

    String name;
    Money price;

    enum CupSize {
        @McGEnum(dbmsValue=5)
        Small,

        @McGEnum(dbmsValue=6)
        Medium,

        @McGEnum(dbmsValue=7)
        Large;
    }

    private void foo() {... code here ...}
}

Your configuration file will need to identify the enum by its Java binary class name. For a nested type, Java joins the enclosing class name and the enum type name with a '$'. In this case, you would set the javaType attribute in your SqlMapConfig.xml to "com.mcgsoftware.myapp.domain.DrinkProduct$CupSize".
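
If you are unsure what string to put in the configuration, you can ask the JVM. The small sketch below uses a simplified DrinkProduct and a hypothetical demo class; it prints the binary name (the form with '$') next to the canonical name (the dotted form you write in source code).

```java
// Hypothetical demo class; DrinkProduct is reduced to the nested enum.
class NestedEnumNameDemo {

    static String binaryName() {
        // getName() returns the binary name, the form class loading expects.
        return DrinkProduct.CupSize.class.getName();
    }

    static String canonicalName() {
        // getCanonicalName() returns the dotted form used in source code.
        return DrinkProduct.CupSize.class.getCanonicalName();
    }

    public static void main(String[] args) {
        System.out.println(binaryName());
        System.out.println(canonicalName());
    }
}

class DrinkProduct {
    enum CupSize { Small, Medium, Large }
}
```

The first line printed ends in DrinkProduct$CupSize and the second in DrinkProduct.CupSize, each preceded by the package name when the classes live in a package.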

The trouble with custom frameworks

If there is one thing I hate, it's poorly designed custom frameworks!

Why is it that so many companies invent their own frameworks anyway? I have a few theories: (1) frameworks are more fun for developers to write than application code; (2) managers have been oversold on the idea that writing a framework somehow saves a lot of money; (3) people grossly underestimate the long-term impact, cost and lock-in a "framework" imposes upon a company; (4) most developers overestimate their abilities as framework designers.

Real-world case study - A business object framework

This story is about a company that created their own "business object" framework.

Many companies like to create business logic frameworks to enforce a methodology for constructing business logic. This seems like a reasonable approach: encapsulate the underlying 3rd party infrastructure frameworks, provide a “framework” layer of abstraction so everyone writes business logic the same way, and provide additional services that would normally burden business logic developers.

This company had a history of mixing business logic and user interface logic, which made applications very difficult to maintain. The CTO and developers sold the management team on a solution for this problem, and you guessed it, the solution was a framework. However, this turned out to cause far more problems than it ever solved.

Although the intentions were good, things didn't turn out well. Why?

  • Adding more complexity instead of making things simpler – Most home grown frameworks of this ilk start off with the lofty goal of simplifying business logic development, and initially, with a minimum feature set, it appears to work. However, once all the features that real-life projects ultimately require have been implemented, it is very rare that the home grown framework is actually simpler than not using a framework at all. Without a framework middle-man in the way, developers have easy access to third party frameworks such as Spring, Hibernate, etc. When a home grown framework encapsulates everything, the developer is no longer working with well documented, widely understood 3rd party frameworks; he is working with something entirely different. Moreover, when bugs or problems occur, the business logic developer winds up diving into the framework's source code to fix the problem. The net effect is that the developer's job is now more complex: he has to know and understand the underlying 3rd party frameworks plus the home grown business logic framework that lies on top of them.
  • Preventing development improvements – As with many home grown frameworks, this one was foisted upon development projects. Management simply spent too much money creating it to not insist upon its usage. Developers were told they must use it in the name of consistency for the company. However, when the home grown framework doesn’t have all the features needed for a project or is awkward and difficult to use, the entire development organization is held hostage to the framework’s problems and limitations. In this case, the management push for the framework squelched any dissension and negative feedback, which created further problems.
  • Feature lock-in – Since business logic developers were required to use the home grown framework, it became a bottleneck. All applications and services were affected by it. Every application was limited by the home grown framework’s limitations and complexities. For example, if a specific feature was needed and the home grown framework didn’t provide it, every application in the company was affected.
  • High Cost – A development team needs to design, code, document, bug fix, test and enhance a business logic framework. Unless your company is in the framework selling business, building frameworks is a pure overhead expense. Often the real costs of framework development, maintenance and documentation are glossed over by the developers selling it in the company. This company was no exception.
  • Inadequate staffing – When building your own framework, you need to have dedicated resources to work on it. These resources need to be available not just at its inception, but continuously over the lifetime of the framework. Most companies are in business to provide applications and services – not sell software frameworks. As a result, it is rare that companies have enough staff to adequately maintain a business logic framework. Frameworks require significant on-going efforts. For example, virtually every popular open source framework undergoes a major re-design every few years. In this case, developers sold management on the framework as a "quick and easy" solution, which it was in its initial incarnation, but they neglected to account for the long-term maintenance and enhancement such a beast requires.
  • Underestimated complexity – Most home grown frameworks start off with basic features. Over time, people discover they need more and more features in order to get their work done with it. Technology changes too. What was once a “simple” business logic framework mushroomed into a big framework with lots of classes. Most organizations underestimate the scope of such a framework. This company was no exception.
  • Framework Design is difficult – With open source frameworks, the entire development community determines the popularity of a framework. If it is highly useful, it will become popular and more resources will join the project. If it is not, or if there are better options, the framework withers on the vine. Designing and implementing a good framework is very difficult - more difficult than it seems. For this very reason, only a few open source frameworks ever succeed. Before embarking on such a project, a developer should ask, "Is this good enough to be the next Spring or Hibernate, and do I have the skill and resources to maintain this at that level of quality?".

The bottom line is that this company’s home grown business logic framework hurt the company far more than it helped the development of business logic.

Once a framework gets used, it is viral in nature: it becomes embedded in business logic or application code and is typically very difficult to remove later. Like most companies, they didn’t have the time and resources for a major refactoring effort to remove the home grown framework that wasn’t working, or for a major redesign of the framework to fix its problems. Also, the CTO and management team had great difficulty admitting that such a failure occurred on their watch. Instead, the home grown framework has to be continually enhanced because, in management's short-term view, it is cheaper to live with a problem framework than it is to admit failure and remove it.

There was a better way.

By learning enterprise design patterns and doing good research into 3rd party frameworks, this company could have gotten all the benefits the home grown framework was supposed to provide without any of the cost overhead and limitations it imposed.

With enterprise design patterns, one can enjoy the following benefits:

  • Standardized development methodology for business logic development
  • A deliberately layered architecture
  • Clean separation of concerns
  • High degree of maintainability for business logic code
  • Simple, well-documented use of 3rd party software that everyone can understand
  • No “middle man” software framework to forcibly restrict developers from using the features of the underlying 3rd party frameworks.
  • Ability to easily incorporate the latest features and enhancements as they are added to the underlying 3rd party frameworks

Welcome

Welcome to my blog. This is where I post my technical software development manifestos. Enjoy!