Design Pattern Dilemma: A Localizable Object
Question to be answered: How do you handle an object whose data is stored in an invariant language, but which also supports converting a subset of its strings into a specified language?
Introduction
This is not a trivial problem, as localization is a crosscutting concern. In short, adding this aspect to the code base will result in either code duplication or heavy dependencies between classes. I’ll discuss this as I go over potential solutions. First, a little background on crosscutting concerns.
An exact definition of a crosscutting concern is hard to come by, as they are often abstract. Look at Wikipedia’s definition:
[C]ross-cutting concerns are aspects of a program that affect other concerns.
My explanation would be that there is a responsibility (like logging, security, or in our case localization), and it is referenced throughout the code base as either a class, a static function, or another method. It is key to be able to recognize a crosscutting concern, because in the paper “Do Crosscutting Concerns Cause Defects?” the researchers concluded that ‘concern scattering is correlated with defects’, and that ‘the more scattered the implementation of a concern is, the more likely it is to have defects’. Considering that ‘debugging occupies as much as 50 percent of the total development time’ (quote from Code Complete), the decision to implement a crosscutting concern is not one to be taken lightly. These concerns need plenty of thought put into their design, and can’t be hacked together.
Solutions to crosscutting concerns aren’t intuitive. The naive solution would be to combine all the logic and functionality into a single file, sacrificing code size for locality. However, the paper also found ‘that a refactoring that reduces scattering but increases concern size might actually increase defects’. The implied solution seems obvious: keep each responsibility in a single, small file. Realizing this solution is my goal. The following is a list of potential solutions. Some of them I have implemented and I have written about their advantages and disadvantages, while others I’ve only pondered.
The Decorator
The decorator implementation consists of creating a localized class that inherits from the base class and is instantiated with an instance of the base class. All non-string properties are reimplemented by simply forwarding them to the inner instance. All string properties are intercepted and routed through translation.
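class Obj
{
    public virtual string Name { get; set; }
    public virtual int Id { get; set; }
}

class LocalizedObj : Obj
{
    private readonly Obj inner;
    private readonly ILocalize localizer;

    public LocalizedObj(Obj inner, ILocalize localizer)
    {
        this.inner = inner;
        this.localizer = localizer;
    }

    // String properties are routed through the translator in both directions.
    public override string Name
    {
        get { return localizer.localize(inner.Name); }
        set { inner.Name = localizer.invariant(value); }
    }

    // Non-string properties are forwarded to the inner object untouched.
    public override int Id
    {
        get { return inner.Id; }
        set { inner.Id = value; }
    }
}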
I’ll start with the good, because I can only think of one advantage. A localized object acts exactly like its base class, which allows the client to make a seamless transition from invariant to translation.
There are plenty of potential pitfalls in the snippet. This code cannot localize an object that has already been localized. This is a problem, as there would have to be a constructor check that ensures the inner object isn’t another LocalizedObj. In fact, if the inner object has been localized but isn’t itself a LocalizedObj, which occurs when there is more than one decorator on the object and the LocalizedObj isn’t the most immediate decorator, then there is no way of detecting it. An option is to add a flag enum to the base object, through which decorators tell the base class that it has been decorated with a specific functionality; however, this is error prone and increases complexity. There is, essentially, an identity crisis.
‘From an object identity point of view, a decorated component is not identical to the component itself. Hence you shouldn’t rely on object identity when you use decorators’ (from the book, C# Design Patterns - A Tutorial, page 178)
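The detection problem can be made concrete with a small sketch. Everything beyond Obj, LocalizedObj and ILocalize is an assumption for illustration: SuffixLocalizer is a stand-in translator that just appends ":fr", and PassThroughObj is a hypothetical second decorator that does nothing but forward.

```csharp
using System;

// Hypothetical translator contract, mirroring the localize/invariant calls above.
interface ILocalize
{
    string localize(string text);
    string invariant(string text);
}

// Stand-in translator: tags text with ":fr" instead of doing a real lookup.
class SuffixLocalizer : ILocalize
{
    public string localize(string text) { return text + ":fr"; }

    public string invariant(string text)
    {
        return text.EndsWith(":fr", StringComparison.Ordinal)
            ? text.Substring(0, text.Length - 3)
            : text;
    }
}

class Obj
{
    public virtual string Name { get; set; } = "";
}

class LocalizedObj : Obj
{
    private readonly Obj inner;
    private readonly ILocalize localizer;

    public LocalizedObj(Obj inner, ILocalize localizer)
    {
        // Catches only direct double wrapping.
        if (inner is LocalizedObj)
            throw new ArgumentException("Already localized.", nameof(inner));
        this.inner = inner;
        this.localizer = localizer;
    }

    public override string Name
    {
        get { return localizer.localize(inner.Name); }
        set { inner.Name = localizer.invariant(value); }
    }
}

// Hypothetical unrelated decorator that simply forwards to its inner object.
class PassThroughObj : Obj
{
    private readonly Obj inner;

    public PassThroughObj(Obj inner) { this.inner = inner; }

    public override string Name
    {
        get { return inner.Name; }
        set { inner.Name = value; }
    }
}

static class IdentityDemo
{
    public static string StackedName()
    {
        var localizer = new SuffixLocalizer();
        Obj once = new LocalizedObj(new Obj { Name = "A" }, localizer);
        // The check passes because the immediate inner object is a PassThroughObj.
        Obj twice = new LocalizedObj(new PassThroughObj(once), localizer);
        return twice.Name; // "A:fr:fr": the name was translated twice
    }
}
```

The constructor check rejects a LocalizedObj wrapped directly in another LocalizedObj, but one intervening decorator is enough to hide the first layer from it.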
One can argue that the translator can detect if a string has been translated, but this would be an additional responsibility on the translator class, and in a good design there shouldn’t be a situation where the translator has this worry. The class should only have a single responsibility, and that should be translation. To further my argument, imagine designing the translator class. Most likely, the translations are implemented as a dictionary/map. Now imagine someone calls for a translation, but the string is not a key in the dictionary. If the method had a precondition that the parameter is not already translated, an error can be thrown. However, if the method allowed already translated strings, it would have to iterate over the entire dictionary to see if there was a value equivalent to the string. This is bad for performance, especially if there were multiple localized decorators stacked on top of one another. To get around this performance issue, one could keep a reference set of all translated strings, but this would sacrifice memory. I hope that I’m convincing.
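The three translator designs weighed above can be sketched side by side. This is only an illustration under assumptions: the Translator class shape and its method names are hypothetical, not taken from any real library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical dictionary-backed translator showing the three options.
class Translator
{
    private readonly Dictionary<string, string> translations;
    private readonly HashSet<string> translatedValues;

    public Translator(IDictionary<string, string> translations)
    {
        this.translations = new Dictionary<string, string>(translations);
        // Extra memory: a second copy of every translated string.
        this.translatedValues = new HashSet<string>(this.translations.Values);
    }

    // Option 1: precondition that the text is invariant. One hash lookup, throws on a miss.
    public string TranslateStrict(string text)
    {
        if (translations.TryGetValue(text, out var translated))
            return translated;
        throw new KeyNotFoundException("No translation for: " + text);
    }

    // Option 2: tolerate already translated text by scanning every value on a miss.
    public string TranslateLenientScan(string text)
    {
        if (translations.TryGetValue(text, out var translated))
            return translated;
        if (translations.ContainsValue(text)) // linear in the dictionary size
            return text;
        throw new KeyNotFoundException("No translation for: " + text);
    }

    // Option 3: tolerate already translated text using the reference set.
    public string TranslateLenientSet(string text)
    {
        if (translations.TryGetValue(text, out var translated))
            return translated;
        if (translatedValues.Contains(text)) // constant time, paid for in memory
            return text;
        throw new KeyNotFoundException("No translation for: " + text);
    }
}
```

Both lenient variants also leave an ambiguity the strict one avoids: a string that happens to be both an invariant key and some other entry's translation gets translated again.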
Another bad trait that is obvious about a decorator is that there could be many internal calls forwarding data requests. A class could have a single string among many properties that need translation, so the class would consist mainly of forwarding calls to retrieve the inner class’s value. Very tedious and very error prone. Not to mention all the duplicated logic. I was surprised that when looking through design pattern books, this aspect was omitted or glossed over.
I’ve thought about implementing a T4 template, but have decided against it. Code is a liability (I wish I could say I came up with that). Any code, whether auto-generated or not, is a liability, and a template is not cost-free: it’s another language to learn and debug. I also wouldn’t consider T4 templates ‘mainstream’, so good luck convincing others. There is a guide online on how to use T4 to generate decorator classes, but this isn’t enough to persuade me to go down the rabbit hole.
It may seem that I’m being unfairly harsh on the decorator pattern, but it is because I’ve used it in a project for localization, and the extra code and maintenance makes for low enthusiasm. In the end, I feel like localization is a unique crosscutting concern in that decorating isn’t a viable solution. Take logging, for instance. Whether logging is the first decorator or the last doesn’t matter; the class’s data doesn’t change. For localization, however, there are a variety of problems around where to decorate, plus the constraint that a class can’t have more than one localization decorator. The worst thing that could happen if there were multiple logging decorators is that the same information is logged more than once, which, while undesirable, is hardly a showstopper.
Extension Methods
Another possibility is using extension methods. It would have been nice to have a derived class of string called localizedString whose only difference would be a function/property called Localized, but this is not possible as, per the C# specification, ‘the string type is a sealed class’. This is severely limiting, as it is not possible to decorate the string class. Extension methods would then be the only choice for extension, but as will be explained, there are many aspects wrong with this approach.
‘Use extension methods to provide functionality relevant to every implementation of an interface, if said functionality can be written in terms of the core interface.’ (from Framework Design Guidelines - Conventions, Idioms, and Patterns for Reusable .NET Libraries, page 163)
This excellent advice is a showstopper for this method. There is no way every string in the program can be localized. For instance, if there was a string that held file paths or a URL, it wouldn’t make sense for there to be a localize function. I was going to entertain the idea that if every string could be translated, how to design an extension method, but then I came to my senses. No extension methods on a type if it doesn’t apply to all instances of that type. No way.
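To see the objection in code, here is a minimal sketch. The Localized extension and its "[fr] " marker are hypothetical stand-ins for a real translation lookup.

```csharp
using System;

static class LocalizationExtensions
{
    // Hypothetical extension method; the prefix stands in for a real translation.
    public static string Localized(this string text)
    {
        return "[fr] " + text;
    }
}

static class ExtensionDemo
{
    public static string LocalizedPath()
    {
        string path = @"C:\logs\app.log";
        // Compiles without complaint, yet translating a file path is meaningless.
        return path.Localized();
    }
}
```

The method shows up on every string in scope, and nothing in the type system separates display text from paths, URLs, or identifiers.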
Traditional Inheritance
To circumvent the forwarding of decorators, traditional inheritance is an option.
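public class LocalizedObj : Obj
{
    private readonly ILocalize localizer;
    public LocalizedObj(ILocalize localizer) { this.localizer = localizer; }

    // Only the string property is overridden; Id is inherited unchanged.
    public override string Name
    {
        get { return localizer.localize(base.Name); }
        set { base.Name = localizer.invariant(value); }
    }
}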
Notice that the amount of code has shrunk. The only members that need to be overridden are the string properties. There are a few problems with this approach. Moving away from the decorator pattern, we lose the flexibility associated with it. We are no longer able to ‘tack’ on additional features easily at runtime. This hinders the end goal, because a correctly designed object should be extensible in the future. Since the example I’ve been using throughout this post is so generic, it is hard to come up with additional functionality that could be required, but for the sake of argument, suppose a constraint is added such that all numeric values are rounded down and subtracted from some number. There are now 2² = 4 configurations (vanilla, localized, numerical, localized and numerical). For traditional inheritance to cope with this, we would need a class for each configuration. This is definitely not sustainable given the possibility of additional options – the number of configurations would grow exponentially! This is where the decorator pattern would be useful, but as explained earlier, it was dismissed as a solution. At this point, the choice is to have either lots of forwarding with decorators or many classes.
Attributes
Using attributes would allow a single class, with the class itself and the appropriate properties marked with a localized attribute.
[Localized]
public class Obj
{
    [Localized]
    public string Name { get; set; }
    public int Id { get; set; }
}
There are a couple of options for what to do here, as custom attributes by themselves don’t do anything. One is having an external tool (like PostSharp) weave code into the underlying IL after compilation. The benefit is that after defining the attribute according to the vendor’s specification, one is free to decorate. The result of the compilation is code that can be deployed anywhere – debugging is a different story. I’ve never had to debug an application built with PostSharp, but I can imagine that it is not as straightforward as traditional code. Nevertheless, PostSharp by far outstrips all competition when it comes to code weaving, but unfortunately it is not free, and the alternatives seem to leave a lot to be desired.
Another method is to detect all elements that have that attribute at runtime and modify the function appropriately.
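One simple reading of the runtime approach is to find the marked properties with reflection and rewrite their values in place. The sketch below makes that assumption; LocalizedAttribute is defined here as a plain marker, and AttributeLocalizer with its translate delegate is hypothetical.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Marker attribute; it carries no behavior of its own.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Property)]
sealed class LocalizedAttribute : Attribute { }

[Localized]
class Obj
{
    [Localized]
    public string Name { get; set; } = "";
    public int Id { get; set; }
}

static class AttributeLocalizer
{
    // Rewrites every readable, writable [Localized] string property on the target.
    public static void Apply(object target, Func<string, string> translate)
    {
        var properties = target.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.PropertyType == typeof(string)
                        && p.CanRead && p.CanWrite
                        && p.IsDefined(typeof(LocalizedAttribute), true));

        foreach (var property in properties)
        {
            var current = property.GetValue(target) as string;
            if (current != null)
                property.SetValue(target, translate(current));
        }
    }
}
```

Because this mutates the object, calling Apply twice translates twice, so the double-localization hazard from the decorator section carries over.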
What’s wrong with attributes? I’ll summarize the findings from Mark Seemann in Dependency Injection in .NET:
- Attributes are compiled into the code.
- Limited options for attributes to decorate (you can’t put attributes on everything)
- Attributes are restricted to simple constructors: their arguments must be compile-time constants, so a dependency such as a translator can’t be passed in.
Before going on to the next section, a small aside. Using attributes to control the behavior of an object is a form of aspect-oriented programming. I will be the first to admit that I am unfamiliar with that style of programming, so I decided to pursue a basic understanding. The resources that I found all echoed the same thought: aspect-oriented programming fixes flaws in traditional programming. One author even stated that ‘the main shortcoming of classical modular programming is lack of support for crosscutting concerns.’ (from Using Aspect-Oriented Programming for Trustworthy Software Development, page 4). I find this extraordinarily interesting. All along, I’ve been programming in an object-oriented style with a functional influence, and yet there seems to be an inherent flaw! However, I’m not going to jump ship, as there doesn’t seem to be a good solution yet, one that combines object-oriented, functional, and aspect-oriented programming.
Dynamic Interception
I have to hand it to the book Dependency Injection in .NET, where the author introduced me to a solution I never would have thought of. It’s similar to using attributes and changing the object at runtime. The difference is that the author suggests using an inversion of control tool to rewrite invocation of runtime objects in a strategy known as dynamic interception. Down below is sample code using the open source dependency injector Ninject.
// Implementation of Obj omitted
using (IKernel kernel = new StandardKernel())
{
    kernel.Bind<Obj>().ToSelf();
    // Add a colon after the original value to simulate translation
    kernel.InterceptReplaceGet<Obj>(p => p.Name, (i) => i.ReturnValue += ":");
    var o = kernel.Get<Obj>();
    o.Name = "B";
    Console.Write(o.Name); // prints "B:"
}
Note that there are other DI/IoC frameworks that support interception; I chose Ninject arbitrarily. Also of note is that this is a bad example, as I would never use it in production. The reason being, if instead I had o.Name = "B" + o.Name, the console would print B::! This highlights the dangers of such a method. It is so easy to change vast amounts of code that it is hard to realize what is happening. In the end, I find that the advantages outweigh the disadvantages.
The power of this approach cannot be overstated. It should be possible to name classes and properties a certain way, so that on startup, in a few lines of code, the injector is able to create dynamic decorators for these classes. The code reduction is unimaginable! With that in mind, I plan to write my next application that deals with crosscutting concerns using a tool that supports interception.
But wait, interception is just a dynamic decorator, and it was previously stated that decorators were bad and shouldn’t be used. What’s going on here? Yes, there doesn’t seem to be a perfect solution, but I still find dynamic interception much better than manual decorators.
- All the repetitive code is gone
- While there is still an issue of identity, the problem is mitigated by having all code instantiated in one place, where it can be seen that the localized object is the top decorator.
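For readers without a container at hand, the idea of a dynamic decorator can also be sketched with .NET’s built-in System.Reflection.DispatchProxy. This is a swapped-in technique, not the Ninject mechanism shown earlier, and it assumes a hypothetical IObj interface because DispatchProxy can only proxy interfaces; LocalizingProxy and its translate delegate are illustrative names.

```csharp
using System;
using System.Reflection;

// Hypothetical interface: DispatchProxy can only proxy interfaces.
public interface IObj
{
    string Name { get; set; }
    int Id { get; set; }
}

public class Obj : IObj
{
    public string Name { get; set; } = "";
    public int Id { get; set; }
}

// One generic interceptor replaces all the hand-written forwarding code.
public class LocalizingProxy : DispatchProxy
{
    private IObj target;
    private Func<string, string> translate;

    public static IObj Wrap(IObj target, Func<string, string> translate)
    {
        IObj proxy = DispatchProxy.Create<IObj, LocalizingProxy>();
        var self = (LocalizingProxy)(object)proxy;
        self.target = target;
        self.translate = translate;
        return proxy;
    }

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        // Forward every call to the real object.
        object result = targetMethod.Invoke(target, args);

        // Translate only the string results of property getters.
        if (targetMethod.Name.StartsWith("get_", StringComparison.Ordinal) && result is string text)
            return translate(text);

        return result;
    }
}
```

A call such as LocalizingProxy.Wrap(new Obj(), s => s + ":") yields an IObj whose Name getter returns the stored value with a colon appended, while Id passes through unchanged. Like the Ninject example, it translates on every read, so a value read from the proxy and written back gets translated again on the next read.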
The one problem I have with this solution, and it may be specific to Ninject, is performance: my next project will need to translate about a hundred thousand strings, and I suspect performance will suffer. Instead of worrying about it prematurely, I will just have to try it and benchmark.
By Hand
I can’t call this post substantial if I don’t mention this method. Simply have the client do the translation. Make them instantiate the translator class and every time they want something translated, they have to make the explicit call.
Arguably, this is the most correct way. There aren’t any of the technicalities that the other solutions have. If something goes wrong, the client is always to blame.
// somewhere in the client code
Obj o = new Obj();
o.Name = "A";
var t = new Translator();
string localizedName = t.translate(o.Name);
The downfall of this is that if the client does a lot of translation, they will create decorator/derived classes of their own. This is not desirable. There would be a ton of duplicated errors, the errors already mentioned, across many clients. Therefore, the best choice would be for the designers of the framework to integrate localization, so the code only has to be written once.
Conclusion
I hope that something was learned; I know I learned a lot writing this. I believe that dynamic interception is the method that is closest to my goal. I can define interceptions for all localized strings in a single file in a very concise manner. While drawbacks exist for every method, I would rather choose the method that has the least amount of code, and dynamic interception has this title. Someday a better solution to crosscutting concerns will present itself, but until it does, we have to spend a little extra time designing.