Routing Select Docker Containers through Wireguard VPN

Scenario: You have a host running many Docker containers. Several sets of these containers need to route traffic through different VPNs. Below I’ll describe my solution, which doesn’t resort to VMs and doesn’t require modifying any docker images.

This post assumes that one has already set up working wireguard servers, and will focus only on the client side. For a quick wireguard intro, see WireGuard VPN Walkthrough.
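For reference, a client config along the lines of what this post assumes (a sketch only; the keys and endpoint below are placeholders, while the Address and DNS values mirror the ones used later in this post):

[Interface]
# Placeholder key; use your own
PrivateKey = <client-private-key>
Address = 10.192.122.2/24
DNS = 10.192.122.1

[Peer]
# Placeholder server details
PublicKey = <server-public-key>
Endpoint = 203.0.113.10:51820
AllowedIPs = 0.0.0.0/0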

Solution #1

If you’re familiar with the openvpn client trick, this approach will feel similar: we’re going to create a Wireguard container and link all desired containers to this Wireguard container.

First we’re going to create a Wireguard Dockerfile:

FROM ubuntu:16.04

RUN apt-get update && \
    apt-get install -y software-properties-common debconf-utils iptables curl && \
    add-apt-repository --yes ppa:wireguard/wireguard && \
    apt-get update && \
    echo resolvconf resolvconf/linkify-resolvconf boolean false | debconf-set-selections && \
    apt-get install -y iproute2 wireguard-dkms wireguard-tools curl resolvconf

COPY wgnet0.conf /etc/wireguard/.
COPY startup.sh /.

EXPOSE <PORT-LIST?>

ENTRYPOINT ["/startup.sh"]

Some notes about this Dockerfile: it installs WireGuard from the official PPA, preseeds resolvconf so it doesn’t try to take over /etc/resolv.conf (which Docker bind-mounts into the container), and copies in the wgnet0.conf client config along with a startup script that serves as the entrypoint. Replace <PORT-LIST?> with the ports of the services you plan to attach. The startup script:

#!/bin/bash
set -euo pipefail

wg-quick up wgnet0

VPN_IP=$(grep -Po 'Endpoint\s=\s\K[^:]*' /etc/wireguard/wgnet0.conf)

function finish {
    echo "$(date): Shutting down vpn"
    wg-quick down wgnet0
}

# Our IP address should be the VPN endpoint for the duration of the
# container, so this function will give us a true or false if our IP is
# actually the same as the VPN's
function has_vpn_ip {
    curl --silent --show-error --retry 10 --fail http://checkip.dyndns.com/ | \
        grep "$VPN_IP"
}

# If our container is terminated or interrupted, we'll be tidy and bring down
# the vpn
trap finish TERM INT

# Every minute we check our IP address
while has_vpn_ip; do
    sleep 60;
done

echo "$(date): VPN IP address not detected"

Some notes on the script: it brings the tunnel up with wg-quick, pulls the VPN endpoint’s IP out of wgnet0.conf, and then checks once a minute that our public IP still matches it. If the check ever fails, the loop ends and the container exits, which serves as a rudimentary kill switch; a TERM or INT signal tears the tunnel down cleanly.

The best way to see this in action is through a docker compose file. We’ll have Grafana’s traffic routed through the VPN.

version: '3'
services:
  wireguard:
    container_name: 'wireguard'
    build: .
    restart: 'unless-stopped'
    sysctls:
      - "net.ipv4.conf.all.rp_filter=2"
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    ports:
      - '3000:3000' # grafana
  grafana:
    container_name: 'grafana'
    image: 'grafana/grafana'
    restart: 'unless-stopped'
    network_mode: "service:wireguard"

Notes:

When dependent services bind to wireguard’s network, they bind to that container’s id. If you rebuild the wireguard container, you’ll need to recreate all dependent containers, which is somewhat annoying. Ideally, they would bind to whatever container currently has the name wireguard.
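You can see the id binding for yourself (a quick check; the id below is made up):

docker inspect grafana --format '{{ .HostConfig.NetworkMode }}'
# container:6f0b32... <- an id, not a name, so rebuilding wireguard breaks it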

Quick quiz: which of these addresses will resolve to our Grafana instance (tried from the host machine)?

localhost
[host-ip:192.168.1.6]
[vpn-ip:10.192.122.2]
[vpn-external-ip]
[wireguard-docker-interface:172.19.0.1]
[docker0:172.17.0.1]

Correct answers:

localhost
[wireguard-docker-interface:172.19.0.1]
[docker0:172.17.0.1]

If I hadn’t run the experiment, I would have gotten this wrong!
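If you want to run the same experiment from the host, something along these lines works (port 3000 comes from the compose file; substitute your own addresses):

# These should reach Grafana
curl -sS http://localhost:3000
curl -sS http://172.19.0.1:3000   # wireguard-docker-interface
curl -sS http://172.17.0.1:3000   # docker0

# These should hang or be refused
curl -sS -m 5 http://192.168.1.6:3000    # host ip
curl -sS -m 5 http://10.192.122.2:3000   # vpn ip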

If we log onto the VPN server, we see that only curling our client’s IP address will return Grafana. This is good news: it means we are not accidentally exposing services on our VPN’s external IP address. It also allows a cool trick to see the services locally through the host machine without being on the VPN. Normally one would put in http://host-ip.com:3000 to see Grafana, but as we just discovered, that no longer routes to Grafana because it lives on the VPN. We can, however, ssh into the host machine and port forward localhost:3000 to great success!
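The port forward is plain ssh (the hostname and user are placeholders):

# Forward local port 3000 to the host's loopback, where Grafana answers
ssh -L 3000:localhost:3000 user@host-ip.com
# ...then browse to http://localhost:3000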

The end result gives me a good feeling about the security of the implementation. There is no way someone can access the services routed through the VPN unless they are also on the VPN, are on the host machine, or port forward through the host machine. We have a rudimentary kill switch as well for some extra comfort.

Solution #2

Our second solution will involve installing Wireguard on the host machine. This requires gcc and other build tools, which is annoying as the whole point of docker is to keep hosts disposable, but we’ll see how this solution shakes out as it has some nice properties too.
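On an Ubuntu 16.04 host the install mirrors the Dockerfile from Solution #1 (wireguard-dkms builds the kernel module, which is what drags in gcc and friends):

apt-get install -y software-properties-common
add-apt-repository --yes ppa:wireguard/wireguard
apt-get update
apt-get install -y wireguard-dkms wireguard-tools resolvconf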

Initial plans were to follow Wireguard’s official Routing & Network Namespace Integration guide, as it explicitly mentions docker as a use case, but it’s light on docker instructions: it only mentions using the pid of a docker process, which is cumbersome and doesn’t feel like the “docker” approach. If you are a linux networking guru, I may be missing something obvious and that may be your most viable solution. For mere mortals like myself, I’ll show a similar, but more docker friendly, approach.

First, wg-quick has been my crutch, as it abstracts away some of the routing configuration. Running wg-quick up wgnet0 and having all traffic routed through the Wireguard interface is a desirable property, but it was a struggle to figure out how to route only select traffic.

For those coming from wg-quick: we’re going to be doing things manually, so to avoid confusion I’ll create another interface called wg1. Our beloved DNS and Address settings found in wgnet0.conf have to be commented out, as they are wg-quick specific. Those settings are instead applied explicitly in the manual invocation:

ip link add dev wg1 type wireguard
wg setconf wg1 /etc/wireguard/wg1.conf
ip address add 10.192.122.2/24 dev wg1
ip link set up dev wg1
printf 'nameserver %s\n' '10.192.122.1' | resolvconf -a tun.wg1 -m 0 -x
sysctl -w net.ipv4.conf.all.rp_filter=2
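The wg1.conf handed to wg setconf is the same client config minus the wg-quick extensions, something like (placeholder keys and endpoint; Address and DNS are gone because wg setconf doesn't understand them):

[Interface]
PrivateKey = <client-private-key>
# Address and DNS are handled by the ip and resolvconf commands above

[Peer]
PublicKey = <server-public-key>
Endpoint = 203.0.113.10:51820
AllowedIPs = 0.0.0.0/0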

At this point if your VPN is hosted externally you can test that the Wireguard interface is working by comparing these two outputs:

curl 'http://httpbin.org/ip'
curl --interface wg1 'http://httpbin.org/ip'

For my future self, I’m going to break down what just happened by annotating the commands.

# Create a wireguard interface (device) named `wg1`. The kernel knows what a 
# wireguard interface is as we've already installed the kernel module
ip link add dev wg1 type wireguard

# Point our new wireguard interface at the VPN server and allocate addresses
# for the interface
wg setconf wg1 /etc/wireguard/wg1.conf
ip address add 10.192.122.2/24 dev wg1

# Start the interface and add the VPN server as our DNS nameserver. This is so
# our VPN will resolve hostnames like httpbin.org or google.com.
ip link set up dev wg1
printf 'nameserver %s\n' '10.192.122.1' | resolvconf -a tun.wg1 -m 0 -x

# rp_filter is reverse path filtering. By default it will ensure that the
# source of the received packet belongs to the receiving interface. While a nice
# default, it will block data for our VPN client. By switching it to '2' we only
# drop the packet if it is not routable through any of the defined interfaces.
sysctl -w net.ipv4.conf.all.rp_filter=2

Now for the docker fun. We’re going to create a new docker network for our VPN docker containers:

docker network create docker-vpn0 --subnet 10.193.0.0/16

Now to route traffic for docker-vpn0 through our new wg1 interface:

ip rule add from 10.193.0.0/16 table 200
ip route add default via 10.192.122.2 table 200

My layman understanding is that the ip rule tells the kernel to consult routing table 200 for any traffic coming from our docker subnet, kinda like an fwmark. We then give table 200 a default route pointing at our wg1 interface. The default route allows the docker subnet to reach unknown IPs and hosts (ie. everything that is not a docker container in the 10.193.0.0/16 space). Because the rule is consulted before the main routing table, traffic from the docker subnet is routed through wg1 instead of eth0.
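To sanity-check the policy routing, inspect the rule and the table (the output below is roughly what I'd expect; priorities and formatting will vary):

ip rule show
# 0:     from all lookup local
# 32765: from 10.193.0.0/16 lookup 200
# 32766: from all lookup main
# 32767: from all lookup default

ip route show table 200
# default via 10.192.122.2 dev wg1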

You can test it out with:

docker run -ti --rm --net=docker-vpn0 appropriate/curl http://httpbin.org/ip

Once we docker network remove docker-vpn0 (docker compose will manage the network itself from now on), we can slim down our docker compose file.

version: '3'
services:
  grafana:
    container_name: 'grafana'
    image: 'grafana/grafana'
    restart: 'unless-stopped'
    ports:
      - '3000:3000'
    dns: '10.192.122.1'
    networks:
      wireguard: {}
networks:
  wireguard:
    ipam:
      config:
        - subnet: 10.193.0.0/16

Now when we bring up grafana, it will automatically be connected through the VPN thanks to the subnet routing.
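To convince yourself the compose-managed network behaves like the hand-made one, run a throwaway container on it (compose prefixes the network name with the project name, so adjust; "myproject" below is a placeholder):

docker run -ti --rm --net=myproject_wireguard appropriate/curl http://httpbin.org/ip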

If we want to bring the VPN up on boot, we need to create /etc/network/interfaces.d/wg1 with the same commands encoded as interface hooks:

auto wg1
iface wg1 inet manual
pre-up ip link add dev wg1 type wireguard
pre-up wg setconf wg1 /etc/wireguard/wg1.conf
pre-up ip address add 10.192.122.2/24 dev wg1
up ip link set up dev wg1
post-up /bin/bash -c "printf 'nameserver %s\n' '10.192.122.1' | resolvconf -a tun.wg1 -m 0 -x"
post-up ip rule add from 10.193.0.0/16 table 200
post-up ip route add default via 10.192.122.2 table 200
post-up sysctl -w net.ipv4.conf.all.rp_filter=2
post-down ip link del dev wg1

The auto wg1 line is what starts the interface automatically on boot; otherwise we’d have to bring it up and down manually with ifup and ifdown. Everything else should look familiar.
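You can exercise the file without rebooting by cycling the interface (be careful doing this over ssh):

ifdown wg1 || true   # ignore the error if wg1 isn't up yet
ifup wg1
wg show wg1          # confirm the interface and peer handshake come back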

The last thing that needs mentioning is the kill switch. We’ve already seen how to call curl inside a container on our VPN network, and we could use that to periodically check the IP address is as expected. I know we can do better, but I can’t quite formulate a complete solution yet, so I’ll include my work in progress.

We can deny all traffic from our subnet with the following:

ip route add unreachable 0.0.0.0/0 table 200

But how do we run this command when the VPN disintegrates? I’ve thought about putting it in a post-down step for wg1, but I don’t think that’s a surefire approach. The scary part is that if wg1 goes down, the docker subnet’s route in table 200 disappears with it, so instead of dropping packets for an interface that no longer exists, they fall through to the next applicable rule and head out eth0! We have to be smarter. For reference, the kill switch used as an example in wg-quick:

PostUp = iptables -I OUTPUT ! -o %i -m mark ! --mark $(wg show %i fwmark) -j REJECT
PreDown = iptables -D OUTPUT ! -o %i -m mark ! --mark $(wg show %i fwmark) -j REJECT

The end result should look something like this. It will be more complete than any curl kill switch. I didn’t want to bumble through to a halfway decent solution, so I called it a night! If I come across the solution or someone shouts it at me, I’ll update the post.
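For what it's worth, my untested hunch is that the unreachable route could live in table 200 permanently with a worse metric than the wg1 default, so it only takes effect once the wg1 route disappears (a sketch, not a verified fix):

# Untested: while wg1 is up, the default via 10.192.122.2 has a better
# metric and wins; if wg1 is deleted, its route goes with it and this
# unreachable route should catch the docker subnet's traffic instead of
# letting it fall through to eth0.
ip route add unreachable default metric 9999 table 200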

Conclusion

I think both solutions should be in one’s toolkit. At this stage I’m not sure if there is a clear winner. I ran the first solution for about a week. It can feel like a bit of a hack, but knowing that everything is isolated in the container and managed by docker can be a relief.

I’ve been running the second solution for about a day or so, as I’ve only just figured out how all the pieces fit together. This solution feels more flexible. The apps can be deployed anywhere more easily, as nothing about the VPN is encoded in the compose file; the only section that stands out as different is the networking one. I also see pros and cons to having the VPN managed by the host machine. On one hand, having all the linux tools and wg show readily available to monitor the tunnel is nice, and tools like collectd find it easiest to report stats on top-level interfaces. On the other hand, installing build tools is annoying, and managing routing tables makes me anxious if I think about it too much. In testing these solutions I’ve locked myself out of a VM more than once, forcing a reboot, and I don’t take rebooting actual servers lightly.

Only time will tell which solution is best, but I thought I should document both.

Comments

If you'd like to leave a comment, please email [email protected]