Trading Beauty for Performance: F# active patterns vs Nullable
In F#, there is a feature called active patterns that can greatly simplify and beautify code. The only problem is that ‘active patterns incur a performance overhead’, as stated in the book Expert F# 3.0. We’ll explore how much by comparing two implementations of a string parser: one using active patterns and one using Nullable, a struct that “represents a value type that can be assigned null”.
Note that active patterns and nullable objects are generally not interchangeable as nullable only works with value types, while active patterns aren’t limited, so this post is targeted at a very small set of use cases.
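To make that limitation concrete, here is a minimal sketch. The `NonEmpty` pattern and the `describe` function are hypothetical names invented for this illustration; they are not part of the parser discussed in this post.

```fsharp
open System

// Hypothetical active pattern: it succeeds with a string, which is a
// reference type and therefore something Nullable<'T> cannot wrap.
let (|NonEmpty|_|) (s: string) =
    if String.IsNullOrWhiteSpace s then None else Some (s.Trim())

let describe = function
    | NonEmpty text -> sprintf "got '%s'" text
    | _ -> "empty"

// Nullable<'T> requires 'T to be a struct, so this line would not compile:
// let bad = Nullable<string>("hello")
```

The rest of the post sticks to bool, double, and DateTime results, where both approaches are possible.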
Before diving too deep into the technical differences, let’s view a usage sample. For our business logic, we need to parse a string into another type: a boolean, a double, a date, or, if all else fails, keep it as a string. We can represent the logic as a discriminated union:
type Value =
    | Bool of bool
    | String of string
    | Date of DateTime
    | Double of float
Below are the two code snippets. The first one is using active patterns:
let parse = function
    | Bool x -> Value.Bool x
    | Double x -> Value.Double x
    | Date x -> Value.Date x
    | x -> Value.String x
Now for the Nullable version:
let parse str =
    let boolResult = parseBool str
    if boolResult.HasValue then
        Value.Bool boolResult.Value
    else
        let doubleResult = parseDouble str
        if doubleResult.HasValue then
            Value.Double doubleResult.Value
        else
            let dateResult = parseDate str
            if dateResult.HasValue then
                Value.Date dateResult.Value
            else
                Value.String str
Which one do you like better: a 4-line function or a 13-line function?
Notice how the Nullable version looks like a waterfall. Using active patterns makes life easier on the TAB key. Both functions are functionally equivalent, taking the same input and outputting equivalent values.
The parsing methods themselves are contrived, as we aren’t trying to profile them, but for the sake of completeness, and as a demonstration of how to use active patterns and nullables in F#, I have copied them below.
let (|Bool|_|) = function
    | "true" -> Some(true)
    | "false" -> Some(false)
    | _ -> None

let (|Double|_|) = function
    | "1.000" -> Some(1.000)
    | "-1.000" -> Some(-1.000)
    | _ -> None

let (|Date|_|) = function
    | "1999.10.8" -> Some(DateTime(1999, 10, 8))
    | "2000.1.2" -> Some(DateTime(2000, 1, 2))
    | _ -> None

let parseBool = function
    | "true" -> new Nullable<bool>(true)
    | "false" -> new Nullable<bool>(false)
    | _ -> Nullable()

let parseDouble = function
    | "1.000" -> new Nullable<double>(1.000)
    | "-1.000" -> new Nullable<double>(-1.000)
    | _ -> Nullable()

let parseDate = function
    | "1999.10.8" -> new Nullable<DateTime>(DateTime(1999, 10, 8))
    | "2000.1.2" -> new Nullable<DateTime>(DateTime(2000, 1, 2))
    | _ -> Nullable()
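For readers who want to try the claim of equivalence themselves, the pieces above can be assembled into one file roughly as follows. This is a sketch under a few assumptions: the names `parseWithPatterns` and `parseWithNullable` are mine (the post calls both functions `parse`), and `RequireQualifiedAccess` is added so that the union cases cannot be confused with the active patterns of the same name.

```fsharp
open System

// Assumption for this sketch: force Value.Bool, Value.Date, etc. so that a
// bare Bool/Double/Date in a pattern always means the active pattern.
[<RequireQualifiedAccess>]
type Value =
    | Bool of bool
    | String of string
    | Date of DateTime
    | Double of float

let (|Bool|_|) = function
    | "true" -> Some true
    | "false" -> Some false
    | _ -> None

let (|Double|_|) = function
    | "1.000" -> Some 1.000
    | "-1.000" -> Some(-1.000)
    | _ -> None

let (|Date|_|) = function
    | "1999.10.8" -> Some(DateTime(1999, 10, 8))
    | "2000.1.2" -> Some(DateTime(2000, 1, 2))
    | _ -> None

let parseBool = function
    | "true" -> Nullable<bool>(true)
    | "false" -> Nullable<bool>(false)
    | _ -> Nullable<bool>()

let parseDouble = function
    | "1.000" -> Nullable<double>(1.000)
    | "-1.000" -> Nullable<double>(-1.000)
    | _ -> Nullable<double>()

let parseDate = function
    | "1999.10.8" -> Nullable<DateTime>(DateTime(1999, 10, 8))
    | "2000.1.2" -> Nullable<DateTime>(DateTime(2000, 1, 2))
    | _ -> Nullable<DateTime>()

// Active pattern version.
let parseWithPatterns = function
    | Bool x -> Value.Bool x
    | Double x -> Value.Double x
    | Date x -> Value.Date x
    | x -> Value.String x

// Nullable version, written compactly but with the same nesting.
let parseWithNullable str =
    let b = parseBool str
    if b.HasValue then Value.Bool b.Value
    else
        let d = parseDouble str
        if d.HasValue then Value.Double d.Value
        else
            let t = parseDate str
            if t.HasValue then Value.Date t.Value
            else Value.String str

// Both versions should agree on every input.
let inputs = [ "true"; "-1.000"; "2000.1.2"; "hello" ]
let sameResults =
    inputs |> List.forall (fun s -> parseWithPatterns s = parseWithNullable s)
```

Union types get structural equality by default, which is why the two results can be compared with `=`.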
Analysis
Since the stack vs heap will be important in distinguishing the two versions, here’s a quick refresher (yes, I know that it is a Rust page, but it is an amazing in-depth page on stack vs heap).
The Nullable type is a struct, and structs are value types, which in .NET can be allocated on the stack. From the book Writing High-Performance .NET Code: ‘[When a struct] is not on the heap, allocating a struct will never cause a garbage collection’. This eliminates an immense amount of potential complexity because allocating and deallocating a struct on the stack is practically free. Active patterns, on the contrary, return their results as option values, and each Some is allocated on the heap and requires bookkeeping.
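The value type versus reference type distinction can be checked directly from F#. The snippet below is a small sketch using reflection and boxing; the binding names are illustrative only.

```fsharp
open System

// Nullable<'T> is a struct, while F# option is a reference type.
let nullableIsStruct = typeof<Nullable<bool>>.IsValueType   // true
let optionIsStruct = typeof<option<bool>>.IsValueType       // false

// None is represented as null at runtime, so only Some creates an object.
let noneIsNull = isNull (box (None: option<bool>))           // true
let someIsNull = isNull (box (Some true))                    // false
```

So a failed match costs no allocation; it is the successful matches, which return Some, that put objects on the heap.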
It is best not to get caught up in the differences between the stack and the heap because in C#, the stack is an implementation detail. Still, when one knows that an instance will be allocated on the stack, there can be performance benefits. Eric Lippert, a developer from the C# compiler team, has a post covering value types where he discusses when an instance is located on the stack vs the heap.
[I]n the Microsoft implementation of C# on the desktop CLR, value types are stored on the stack when the value is a local variable or temporary that is not a closed-over local variable of a lambda or anonymous method, and the method body is not an iterator block, and the jitter chooses to not enregister the value.
That was a mouthful. Let’s see if we can confirm that nullables are being allocated on the stack. We’ll create a mini program that feeds our functions a million strings.
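The original test program is not reproduced in this post, so the following is only a rough sketch of what such a harness could look like. The `measure` helper, the stand-in parsers `viaOption` and `viaNullable`, and the choice of sample strings are all assumptions made for illustration.

```fsharp
open System
open System.Diagnostics

// Hypothetical helper: runs f over every input and reports the elapsed time
// and the number of gen0 collections that happened during the run.
let measure (f: string -> 'a) (inputs: string[]) =
    GC.Collect()                                   // start from a settled heap
    let gen0Before = GC.CollectionCount(0)
    let sw = Stopwatch.StartNew()
    for s in inputs do
        f s |> ignore
    sw.Stop()
    (sw.Elapsed, GC.CollectionCount(0) - gen0Before)

// Stand-ins for the two styles: one returns option, one returns Nullable.
let viaOption = function
    | "true" -> Some true
    | "false" -> Some false
    | _ -> None

let viaNullable = function
    | "true" -> Nullable<bool>(true)
    | "false" -> Nullable<bool>(false)
    | _ -> Nullable<bool>()

// One million inputs cycling through a few fixed strings.
let samples = [| "true"; "false"; "1.000"; "hello" |]
let inputs = Array.init 1000000 (fun i -> samples.[i % samples.Length])

let optionTime, optionGen0 = measure viaOption inputs
let nullableTime, nullableGen0 = measure viaNullable inputs
printfn "option:   %A, gen0 collections: %d" optionTime optionGen0
printfn "nullable: %A, gen0 collections: %d" nullableTime nullableGen0
```

Timings from a loop like this are noisy, so treat them as a rough indication and rely on a profiler for the allocation figures.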
For the memory and performance measurements, there is a tool known as PerfView. After running both versions of the program, the difference in allocations was 8MB, which coincides approximately with the overhead associated with allocating classes (8 bytes per class instance, multiplied by a million inputs). For 64-bit processes this difference increases to 16MB. Thus, Nullable was allocated on the stack.
Peak memory usage wasn’t affected by the increase in allocations because the garbage collector would kick in.
While, according to PerfView, the active patterns version accrued 16MB more objects and 66% more gen0 (short-lived objects) garbage collections, the performance difference between the implementations is nearly negligible. PerfView records a 25 msec (15%) difference, yet simply timing the application results in closer numbers (and occasionally the active pattern code was faster). If anything, this is a testament to the .NET garbage collector and the speed at which it can work with short-lived objects (called gen0 objects).
Conclusion
In situations where active patterns and Nullable are interchangeable, keep the code beautiful by using active patterns, as the decrease in allocations gained by keeping Nullable objects on the stack does not increase performance when active pattern objects are kept strictly in gen0.