Wasm compression benchmarks and the cost of missing compression APIs

Lack of transparent compression

IndexedDB, the Cache API, and network requests will not transparently compress data, and I think this is a problem.

Testing was done on Chrome and behavior may vary by browser.

Let’s say we want a function to persist a large binary payload. Here’s an IndexedDB solution using the idb-keyval library to simplify the API:

import { set } from "idb-keyval";
export async function setRawData(data: Uint8Array) {
  await set("data", data);
}

And here’s the Cache API solution:

export async function persistDataCache(data: Uint8Array) {
  const cache = await caches.open("my-cache");
  await cache.put("/data", new Response(data));
}

After executing these functions, I noted that the data was being stored uncompressed in the following directories:

%LocalAppData%\Google\Chrome\User Data\Default\IndexedDB\http_localhost_3001.indexeddb.blob\1\05
%LocalAppData%\Google\Chrome\User Data\Default\Service Worker\CacheStorage\..

I even tried getting cute and adding content encoding headers to the response to see if that would trigger compression, but to no avail:

// Adding content encoding header doesn't work :(
export async function persistDataCache2(data: Uint8Array) {
  const cache = await caches.open("my-cache");
  const headers = new Headers({ "Content-Encoding": "gzip" });
  await cache.put("/data", new Response(data, { headers }));
}

I was initially optimistic that there might be transparent compression, since browsers already store compressed HTTP responses in their compressed form. But no such luck. Worse, empirical testing shows that even responses that arrive compressed are not stored compressed when put into the Cache API. If the previous article is true, then a fetch with force-cache may be the best avenue for accessing compressed responses, and responses only.
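
To see what that looks like, here’s a minimal force-cache sketch (the "/data" URL is a placeholder):

// Served from the HTTP cache when present, where Chrome keeps the response
// in its original compressed encoding; fetch hands us the decoded bytes.
export async function fetchFromHttpCache(): Promise<Uint8Array> {
  const response = await fetch("/data", { cache: "force-cache" });
  return new Uint8Array(await response.arrayBuffer());
}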

And in what may be the most egregious omission: client POST uploads. It’s 2023, and we, as an industry, are laser-focused on shaving bytes and milliseconds off server responses but are perfectly content with forcing clients to upload data uncompressed. It seems like such an easy win. What server can gzip compress but can’t decompress?

What’s the point?

Why do I care about storing and sending data in a compressed format?

The most obvious reason is disk space and especially network usage, but the second, more nuanced point is performance.

If a SATA SSD can sequentially read around 500 MB/s, a 100 MB file will be read in 200ms. But if we can compress the corpus to 10 MB and inflate it at 1 GB/s, the total time to read the file from disk and decompress is 120ms: 40% less time, or a 1.67x speedup. In the Squash compression benchmarks, there is a visual dedicated to communicating this exact point: introducing compression can be a boon for performance.
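
As a back-of-the-envelope check of that arithmetic (the throughput figures are the assumptions above, not measurements):

// Assumed throughputs from the paragraph above.
const readMBps = 500;     // SATA SSD sequential read
const inflateMBps = 1000; // decompression speed
const uncompressedMs = (100 / readMBps) * 1000;                           // 200ms
const compressedMs = (10 / readMBps) * 1000 + (100 / inflateMBps) * 1000; // 20 + 100 = 120ms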

With mainstream CPUs now shipping with 6 or more cores, I think it may be safe to say that CPUs are more under-utilized than disk I/O.

This performance increase from compression is why the transparent compression for ZFS file systems is recommended. Compressing with LZ4 (or ZSTD for newer installs) saves on time and space at the cost of a marginal increase in CPU usage. And “incompressible data is detected and skipped quickly enough to avoid impacting performance” (source).

So everything should be compressed by default, right, if there are only upsides for disk space and performance? Quora experts weigh in (sorry for the tongue-in-cheek phrasing).

But it looks like, unless one stores Chrome’s data on a file system with built-in compression, we’ll need to handle compression ourselves.

Compression Streams API

Until transparent compression is implemented, we’ll need to handle the compression in user space.

The easiest way is to use the Chromium-only Compression Streams API:

export async function persistDataCache3(data: Uint8Array) {
  const src = new Blob([data]);
  const cs = src.stream().pipeThrough(new CompressionStream('gzip'));
  const cache = await caches.open("my-cache");
  await cache.put("/data", new Response(cs));
}
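
Reading it back is the mirror image: pipe the cached body through a DecompressionStream before consuming it (a minimal sketch, assuming the same cache name and key as above):

export async function readDataCache3(): Promise<Uint8Array | undefined> {
  const cache = await caches.open("my-cache");
  const response = await cache.match("/data");
  if (!response || !response.body) {
    return undefined;
  }
  // Mirror of the write path: gunzip the stored body before use.
  const ds = response.body.pipeThrough(new DecompressionStream("gzip"));
  return new Uint8Array(await new Response(ds).arrayBuffer());
}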

Compression streams are easy but their downsides should give one pause:

Don’t sleep on compression streams, either; you won’t find a better way to incrementally stream a compressed request:

export async function sendToApi(data: Uint8Array) {
  const src = new Blob([data]);
  const body = src.stream().pipeThrough(new CompressionStream('gzip'));
  
  // todo: fill out content-type
  await fetch("my-url", {
    method: "POST",
    duplex: "half",
    headers: {
      "Content-Encoding": "gzip"
    },
    body,
  });
}

There are some limitations:

DIY Compression

I can’t help but wonder if enthusiasm for browsers implementing the CompressionStream API has been dampened by the existence of Wasm and high-quality JS Deflate libraries like pako and fflate. We can see these exact points raised as hesitations in a Mozilla standards position discussion from a couple of years ago.

In the same discussion, a core Chromium developer lent support to a native implementation and touched on performance:

Facebook found that the native implementation was 20x faster than JavaScript and 10% faster than WASM.

Cue the benchmarks! We’re going to find out just how accurate this is.

In addition to the two aforementioned JS libraries, the remaining contestants were written in Rust and compiled to Wasm.

Results

I took a 91 MB file and ran it through the gauntlet on Firefox and Chrome. Keep in mind there are many different Wasm runtimes (see this benchmark of them), runtime performance may differ greatly depending on the underlying platform, and results may not be reflective of native performance, so be careful when making extrapolations.

Chrome compression chart (click to enlarge)

Firefox compression chart (click to enlarge)

Notes about the charts:

Below is a table of the brotli-compressed size of each library:

Name Size (Brotli’d kB)
native 0.0
lz4 13.9
zstd 138.6
miniz 25.2
pako 15.2
fflate 12.8
brotli (read) 137.8
brotli (write) 681.0

A couple of insights:

In an effort to detect if there was a benefit to delivering data compressed to Wasm instead of using the browser’s native Content-Encoding decoder, I ran a small test outside of this benchmark to measure the overhead. I’m happy to report that there isn’t significant overhead in transferring large amounts of data to Wasm. So don’t abandon the Content-Encoding header. The native browser implementation is probably much more efficient than any reimplementation in userland.

Recommendations

  1. Keep using transparent content decoders like brotli on responses. Don’t go re-encoding assets in zstd and decompressing in userland. Even if the response data ends up being processed in Wasm, there is no overhead in marshaling uncompressed data and the native Deflate and brotli implementations are more efficient than Wasm. Re-fetch the data from cache if sporadic access is needed instead of using IndexedDB or the Cache API.
  2. If available, use the native Compression Streams API as it costs nothing, and will be faster than JS and Wasm implementations. Deflate has an acceptable compression ratio.
  3. If unavailable, there’s a chance you’re on Firefox and should polyfill with the miniz-oxide Wasm implementation to reap Firefox’s faster runtime.
  4. But if Wasm is too much of an inconvenience, polyfill with fflate or, if you prefer, pako (see the sketch after this list).
  5. lz4, brotli, and zstd Wasm implementations are highly situational, and server support may need to be tailored. But I would lean towards zstd as it scores highly in throughput, compression ratio, and isn’t obscenely large.
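
To make recommendations 2 and 4 concrete, here’s one shape the native-first fallback could take (a sketch rather than a drop-in polyfill; it buffers the payload instead of streaming it):

import { gzipSync } from "fflate";

export async function gzipBytes(data: Uint8Array): Promise<Uint8Array> {
  if (typeof CompressionStream !== "undefined") {
    // Native path: costs nothing to ship and beats the JS and Wasm implementations.
    const cs = new Blob([data]).stream().pipeThrough(new CompressionStream("gzip"));
    return new Uint8Array(await new Response(cs).arrayBuffer());
  }
  // No native support (e.g. an older Firefox): fall back to the JS implementation.
  return gzipSync(data);
}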

These recommendations apply whether it’s for file uploads or for cases like this netizen’s, storing compressed serialized state in the URL.

I have a tantalizing thought that two of these methods can be used in combination. For instance, a client can use the efficient native Compression Stream API to upload a file, which is intercepted by your favorite serverless environment or Wasm edge compute (like Fastly’s Compute@Edge or Cloudflare Workers). There at the edge, the file is transcoded with brotli for an improved compression ratio. Ideally, this transcoding could be done concurrently with validation, parsing, and any db queries related to the file.
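
Here’s a rough sketch of that edge step, written against the generic fetch-handler shape these platforms share. It assumes the upload body reaches the handler still gzip-encoded, and brotliCompress is a placeholder for whatever Wasm brotli binding would actually be used:

// `brotliCompress` is a stand-in for a Wasm brotli binding, not a real import.
declare function brotliCompress(data: Uint8Array): Uint8Array;

export async function handleUpload(request: Request): Promise<Response> {
  // Undo the client's gzip Compression Stream encoding.
  const gunzipped = request.body!.pipeThrough(new DecompressionStream("gzip"));
  const raw = new Uint8Array(await new Response(gunzipped).arrayBuffer());

  // Validation, parsing, and any db queries on `raw` can run alongside the transcode.
  const recompressed = brotliCompress(raw);

  // Persist `recompressed` wherever the file lives, then acknowledge the upload.
  return new Response(null, { status: 204 });
}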

I used brotli in the above example on the assumption that the user may download the file, so we’d want to take advantage of the browser automatically handling the Content-Encoding. So despite brotli not scoring as well as zstd, it can oftentimes be more practical for a given use case.

I won’t be proclaiming a winner between zstd and brotli. They are both very impressive. There’s an interesting exchange between a core developer of zstd, Nick Terrell, and an author behind brotli, Dr. Jyrki Alakuijala. They go back and forth about the merits of their respective algorithms. In a more recent comment, Dr. Alakuijala offers insight on why they continue to praise brotli. I’m not eloquent enough to properly distill the viewpoints, but it is clear the teams behind zstd and brotli are passionate, so I’ll leave you, dear reader, with these resources for your leisure. All I can say is that I ran the benchmarks as fairly as I could, to the best of my abilities, and am satisfied.

Future Work

Benchmarks are never complete. And I’m sure others have thoughts like:

All shortcuts were done in the interest of time, as this already has been quite the detour. I thought I had a simple question, and now look at me.

Discuss on Hacker News
