Trading Beauty for Performance: F# active patterns vs Nullable

In F#, there is a feature called active patterns that can greatly simplify and beautify code. The only problem is that ‘active patterns incur a performance overhead’ as stated in the book, Expert F# 3.0. We’ll explore how much by comparing a string parsing implementation using active patterns and Nullables, which are structs that “represents value type that can be assigned null”.

Note that active patterns and nullable objects are generally not interchangeable as nullable only works with value types, while active patterns aren’t limited, so this post is targeted at a very small set of use cases.

Before diving too deep into the technical differences let’s view a usage sample. For our business logic, we need to be able to parse a string into another type, a boolean, a double, a date, or if all else fails, keep it a string. We can represent the logic as a discriminated union:

type Value =
| Bool of bool
| String of string
| Date of DateTime
| Double of float

Below are the two code snippets. The first one is using active patterns:

let parse = function
  | Bool x -> Value.Bool x
  | Double x -> Value.Double x
  | Date x -> Value.Date x
  | x -> Value.String x

Now for Nullable version.

let parse str =
  let boolResult = parseBool str
  if boolResult.HasValue then
    Value.Bool boolResult.Value
  else
    let doubleResult = parseDouble str
    if doubleResult.HasValue then
      Value.Double doubleResult.Value
    else
      let dateResult = parseDate str
      if dateResult.HasValue then
        Value.Date dateResult.Value
      else
        Value.String str

Which one do you like better: a 4 line function or a 13 line function?

Notice how the Nullable version looks like a waterfall. Using active patterns makes life easier on the TAB key. Both functions are functionally equivalent, taking the same input and outputting equivalent values.

The parsing methods themselves are contrived as we aren’t trying to profile those methods, but for the sake of completeness and demonstration on how to use active patterns and nullables in F# I have copied them below.

let (|Bool|_|) = function
  | "true" -> Some(true)
  | "false" -> Some(false)
  | x -> None

let (|Double|_|) = function
  | "1.000" -> Some(1.000)
  | "-1.000" -> Some(-1.000)
  | x -> None

let (|Date|_|) = function
  | "1999.10.8" -> Some(DateTime(1999, 10, 8))
  | "2000.1.2" -> Some(DateTime(2000, 1, 2))
  | x -> None

let parseBool = function
  | "true" -> new Nullable<bool>(true)
  | "false" -> new Nullable<bool>(false)
  | x -> Nullable()

let parseDouble = function
  | "1.000" -> new Nullable<double>(1.000)
  | "-1.000" -> new Nullable<double>(-1.000)
  | x -> Nullable()

let parseDate = function
  | "1999.10.8" -> new Nullable<DateTime>(DateTime(1999, 10, 8))
  | "2000.1.2" -> new Nullable<DateTime>(DateTime(2000, 1, 2))
  | x -> Nullable()

Analysis

Since the stack vs heap will be important in distinguishing the two versions, here’s a quick refresher (yes, I know that it is a Rust page, but it is an amazing in-depth page on stack vs heap).

The Nullable type is a struct and structs are value types, which in C#, can be allocated on the stack. From the book, Writing High-Performance .NET Code, ‘[When a struct] is not on the heap, allocating a struct will never cause a garbage collection’. This eliminates an immense amount of potential complexity because allocating/de-allocating a struct on the stack is instantaneous. Active patterns, on the contrary, are allocated on the heap and require bookkeeping.

It is best not to get caught up in the differences between the stack and the heap because in C#, the stack is an implementation detail. Still when one knows when an instance will be allocated on the stack, there can be performance benefits. Eric Lippert, the a developer from the C# compiler team, has a post covering value types where he discusses when an instance is located on the stack vs the heap.

[I]n the Microsoft implementation of C# on the desktop CLR, value types are stored on the stack when the value is a local variable or temporary that is not a closed-over local variable of a lambda or anonymous method, and the method body is not an iterator block, and the jitter chooses to not enregister the value.

That was a mouthful, let’s see if we can confirm that nullables are being allocated on the stack. We’ll create a mini program that will pump our functions full with a million strings.

For the memory and performance measurements, there is a tool known as Perfview. After running both version of the program, the difference in allocations was 8MB, which coincides approximately with the overhead associated with allocating classes (8 bytes per class instance). For 64bit processes this difference is increased to 16MB. Thus, Nullable was allocated on the stack. Peak memory usage wasn’t affected by the increase of allocations because the garbage collector would kick in.

While according to Perfview, the active patterns versions accrued 16MB more objects and 66% more gen0 (the short-lived objects) garbage collections, the performance differences between the implementations is nearly neglible. Perfview records a 25msec (15%) difference, yet simply time the application results in closer numbers (and occasionally the active pattern code was faster). If anything this a testament to the .NET garbage collector and the speed that it can work with short lived objects (called gen0 objects).

Conclusion

In situations where active patterns and Nullable are interchangeable, keep the code beautiful by using active patterns as the decrease in the amount of allocations caused by Nullable objects on the stack do not increase performance when active pattern objects are kept strictly in gen0.

Comments

If you'd like to leave a comment, please email [email protected]