Design Dilemma: Configuration Files

No two users have the same installation, so for an application to work the user may need to edit and maintain configuration files. Ideally this wouldn’t be necessary, but configuration is an easy means to an end. This raises the question: how should these files be written? As far as I am aware, the options are XML, JSON, TOML, YAML, and a custom format. There are pros and cons to each approach, and I’m going to outline them. This article assumes a working knowledge of the various formats. Each format’s heading links to its specification, in case there is any doubt.

Here is what I’m hoping to achieve in a configuration file.

  • Simple, predefined key-value pairs, such as “foo is bar”, where we know that foo is an element common to all configurations
  • Numeric lists: foo is 1, 2, 3, and 4
  • Dictionary of dynamic pairs: We don’t know the keys, the number of them, or their values beforehand. Each configuration will likely have drastically different entries. The following is a pseudocode representation.
<dictionary name> is
    <key 1> is <value 1>
    <key 2> is <value 2>
    .
    .
    .
    <key N> is <value N>
  • File paths will be used in this configuration, so they must be represented simply. Ideally the user copies and pastes a file path from Windows Explorer into the configuration and it just works.
  • Human readable to the extent that a non-technical person with no prior experience can write configuration files.
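
To make the target concrete, the requirements above can be read as the in-memory shape that a parsed file should produce. The following is a minimal Python sketch of that shape. The class name, field names, and sample values are all made up for illustration and are not taken from the real application.

```python
from dataclasses import dataclass, field


@dataclass
class AppConfig:
    """Hypothetical shape of a parsed configuration file."""

    # Fixed key that every configuration is known to contain.
    foo: str
    # Numeric list, e.g. 1, 2, 3, and 4.
    numbers: list = field(default_factory=list)
    # Dictionary of dynamic pairs: keys and values are unknown in advance.
    entries: dict = field(default_factory=dict)
    # File path, stored exactly as the user pasted it.
    data_path: str = ""


example = AppConfig(
    foo="bar",
    numbers=[1, 2, 3, 4],
    entries={"first key": "first value", "second key": "second value"},
    data_path=r"C:\Users\example\data.csv",
)
```

Whatever format is chosen only has to get a non-technical user from a text file to something like `example` with as little ceremony as possible.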

For my purposes, I believe YAML to be the best choice.

XML

pros: I can’t think of any, except perhaps that it is old, well established, and can be parsed quickly by a computer. XML was not meant for everyday user consumption.

cons: Verbose and error prone for those not familiar with XML. Even I don’t pass xmllint every time.

comments: An employer tasked me with rewriting an application that had XML configuration files. He recommended that I drop XML because users had complained, and I heeded that advice.

JSON

pros: More terse and readable than XML, .NET has a great library for parsing JSON, and it is the language of the web, which happened to be how the application I was rewriting was accessed.

cons: I did not find JSON human readable enough. I needed simple file paths, and with JSON one has to escape every backslash (this was a Windows application).
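
The backslash problem is easy to reproduce. Here is a small sketch using Python’s standard `json` module rather than the .NET library the application used; the key name and the paths are hypothetical.

```python
import json

# Hypothetical config line with a path pasted straight from Explorer.
pasted = r'{"input": "C:\Users\example\data.csv"}'

try:
    json.loads(pasted)
    pasted_ok = True
except json.JSONDecodeError:
    # "\U", "\e" and "\d" are not valid JSON escape sequences.
    pasted_ok = False

# The same path with every backslash doubled parses as intended.
escaped = r'{"input": "C:\\Users\\example\\data.csv"}'
parsed = json.loads(escaped)

# Worse than an error: some paths parse "successfully" into the wrong
# value, because "\t" and "\n" are valid JSON escapes (tab and newline).
silent = json.loads(r'{"input": "C:\temp\new.csv"}')

print(pasted_ok)                               # False
print(parsed["input"])                         # C:\Users\example\data.csv
print(silent["input"] == r"C:\temp\new.csv")   # False
```

The last case is the one that worries me for non-technical users: the file loads without complaint, and the path is quietly wrong.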

comments: JSON is actually a decent choice if you don’t have the file path requirement and you need a configuration file for a web-based program. Realistically, the program doesn’t even have to be web based (I’m looking at you, Sublime Text).

TOML

pros: Extremely readable, minimalistic, and simple. Also designed by the same man responsible for Jekyll, which is what powers this website.

cons: Lack of library support. File paths need to be escaped.

comments: I was excited about this format. Not anymore. It seemed to beg to be written by humans, but unfortunately, for .NET at least, there is a severe lack of libraries that work. Either they require .NET 4.5, which I am not able to take advantage of, or they flat out fail when given valid TOML. I’ve raised a couple of issues with one of the .NET implementations, and so far they haven’t been addressed. This makes me sad about open source. However, even if the libraries were perfect, I still wouldn’t have chosen TOML because of the file path issue.

Custom

pros: Can be as simple as you need it to be and tailored to your specific needs.

cons: You have to write the parser and the spec, and maintain both of them. On top of that, none of the users will know the format. They might write something that is neither clearly valid nor clearly invalid, just a case you hadn’t thought of.
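
To illustrate that last point, here is a deliberately naive Python parser for a made-up “key is value” line format. The format and the function name are invented for this example; this is not the parser used in the application.

```python
def parse_line(line):
    """Split a hypothetical 'key is value' line into (key, value)."""
    key, sep, value = line.partition(" is ")
    if not sep:
        raise ValueError("expected '<key> is <value>': " + repr(line))
    return key.strip(), value.strip()


# The case the format was designed for.
print(parse_line("foo is bar"))  # ('foo', 'bar')

# A case nobody specified: the key itself contains " is ".
# The line is accepted, but it is split at the first " is ",
# which is probably not what the user meant.
print(parse_line("this is my key is value"))  # ('this', 'my key is value')
```

Every question like this one needs an answer in the spec, a branch in the parser, and a test, and all three are yours to maintain.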

comments: This is what I went with on the rewrite of the application. Choosing a format and writing a parser was easy because I had already developed one. It took a bit of convincing to get the green light from my boss, but now all of our configuration files are in this format. I regret this somewhat, but the decision was made more than a year ago, so going back on it is unlikely and would be costly. The good part is that we’ve received no complaints. The hard part came when I ported the application to Linux: I had to rewrite the parser, and it ended up being a quick hack. So far it hasn’t failed, but I assume it is only a matter of time before it does, and some poor soul, who could be me if it is soon enough, is going to have to fix it.

YAML

pros: Satisfies all my requirements and is terse.

cons: Overkill

comments: TOML’s spec pokes fun at YAML for having a long specification, but I believe that YAML has it right. Users can copy and paste file paths from Windows Explorer into a YAML configuration file, provided that the path is enclosed in single quotes. Library support for YAML on .NET is also quite good.
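
The single-quote rule is what makes pasted paths work: in a YAML single-quoted scalar the backslash is an ordinary character, and the only escape is a doubled single quote standing for one literal quote. The Python sketch below mimics just that rule for a one-line scalar. It is a toy with a made-up function name, not a YAML parser, and a real application would use a YAML library instead.

```python
def unquote_single(scalar):
    """Decode a one-line YAML single-quoted scalar (toy illustration)."""
    if len(scalar) < 2 or scalar[0] != "'" or scalar[-1] != "'":
        raise ValueError("not a single-quoted scalar: " + repr(scalar))
    # Backslashes pass through untouched; only '' collapses to '.
    return scalar[1:-1].replace("''", "'")


# Hypothetical path pasted between single quotes; nothing to escape.
print(unquote_single(r"'C:\temp\new folder\data.csv'"))
# C:\temp\new folder\data.csv

# A quote inside the value is the one thing that must be doubled.
print(unquote_single(r"'C:\Users\O''Brien\notes.txt'"))
# C:\Users\O'Brien\notes.txt
```

Compared with doubling every backslash, telling users to wrap the path in single quotes is a much smaller ask.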

Conclusion

From now on, YAML will be my configuration file markup of choice. It’s terse, human readable, has working libraries, handles file paths with ease, and has a well-defined specification. None of the other options could match YAML.
