Home Network Part 1 - DIY Home Router with NixOS

Introduction

This is an article about my home network. I’m writing it to capture the knowledge I gained setting it up. Hopefully this will be useful to someone embarking on that same journey. My goal is to provide the information needed to set up a similar network, but also to provide insight and depth into each topic.

As recently as two years ago my entire network consisted of one 2.4 GHz ASUS RT-N16, running EasyTomato (a now defunct variant of the Tomato firmware for Broadcom router chipsets). It was connected to a cable modem and placed strategically in the center of a 2,900 sqft house. Last year we upgraded to a Ubiquiti USG, Cloud Key Gen2 Plus, three Unifi switches and three Unifi AC Lite access points. The Ubiquiti gear works fine and has all the Wi-Fi options you need to play settings-induced whack-a-mole with Wi-Fi issues for months.

Motivation for Change

So why build my own router? Well, Ubiquiti, like so many other makers of consumer gear, provides easy-to-use features by connecting your home devices to its cloud services. For example, without an inbound port open, you can manage your network on their website. We also have their cameras and it’s the same deal. The equipment and management experience are nice, but I work as a security engineer, and as it turns out, one can proceed merrily along in one’s career with some knowledge of cryptography and web application vulnerabilities while lacking knowledge and experience in networking. I’d like to fix that by building my own router.

Recommendation

My quest for knowledge is taking me on a compressed tour of home network best practices spanning at least ten years into the past. I hope to eventually arrive in the modern era. You should probably go straight there. What does it look like? Treating your local network as hostile and building an encrypted overlay network on top of it. This overlay network can conveniently span physical locations. This is what’s known as Zero Trust, i.e. not trusting the physical network layer.

But you have to trust somebody. I recommend you trust Eero for your router/access-points. They’re owned by Amazon but operate independently. Like Ubiquiti, they operate cloud web services to make using their gear easier, but they take security seriously. They have nice features like mesh networking and a built-in Zigbee hub.

Use Tailscale for your secure overlay network. Tailscale is managed WireGuard. WireGuard is a simple, modern, and secure VPN that encapsulates best practices into a single version. This keeps the codebase small and auditable, and eliminates the plethora of dangerous options found in other VPNs. A lot of people run WireGuard without Tailscale. You can do that, but then you have to manage dynamic DNS for your internet link and run a persistent server for connections. Tailscale handles the complicated NAT hole-punching for you, which lets you create a mesh network. Tailscale also uses your SSO provider (Google, AzureAD, GitHub, Okta, etc) for authentication, and has an ACL layer for granular permissions within the VPN mesh.

The Lab

When I transitioned to the Unifi equipment, I wired it up live in the house and fiddled with the settings for a few months. I even wrote a service and web UI to control my children’s access to the Wi-Fi on a schedule, something EasyTomato had, but Unifi did not. Because of Covid, the kids were doing online school, and my wife and I were working from home. I spent several bloodshot evenings fixing network issues that I had created. That was a lesson learned the hard way. This time around I’m setting up the network in a lab and testing out all the common devices we use before going live.

Devices

Lab Devices

When you see the Ubiquiti equipment here, you’re going to be thinking, “Hey, I thought you wanted to get away from them!” Well, their hardware is pretty nice and as I’ll show later, it can be managed locally without connectivity to their web services.

Mini Desktop Router I3 Q330G4 Intel Core I3-4005U,1.7Ghz (8Gb Ddr3 Ram 64Gb Ssd) - I would have preferred a four core I5, but those required waiting for overseas shipment and I was in a hurry to get started. This came with pfsense installed, which I overwrote with NixOS. I paid $316 on Amazon.
Ubiquiti Switch Lite 16 PoE - A managed switch with plenty of ports to test out vlans. Eight of the sixteen ports are PoE, but with a paltry 45W of total supply. I paid $199 on their site.
Ubiquiti Access Point AC Lite - This comes with two dual-band antennas, supports 802.11 a/b/g/n/r/k/v/ac, and PoE. I paid $89 on their site.
Switch Flex Mini - A small, inexpensive managed switch. I need two switches because I wanted to closely model my existing home network, which unfortunately is a tree topology and not star. I paid $29 on their site.
Raspberry Pi 4 Model B/4GB - A Raspberry Pi in a passive cooling case. This is used to run Home Assistant and AdGuard Home, an ad blocking DNS server. I paid $55 for the board on PiShop.us, and a bit more for the case and power supply elsewhere.
Crucial M4 256GB SATA SSD - I had this lying around and it seemed like a better storage solution than an SD card. It’s connected to the Pi with a USB 3.0 SATA III Hard Drive Adapter.
ConBee II The Universal Zigbee USB Gateway - A USB Zigbee adapter for connection to home automation devices. I paid $45 on Amazon.
Low Spec HP Laptop - I used this to run the Unifi network controller software. I paid $25 on Craigslist.
Macbook Pro M1 Max - I use this and my iPhone as the typical network clients. I paid a lot to Apple.
USBC Accessory with NIC - Came with another computer I bought on Craigslist.
3rd Gen Apple TV - Used to test out Airplay across vlans.
Amazon Firestick - Used to test casting to TVs.

Chronology

Rather than just dump the finished router configuration into the page, I’m going to show the evolution of the router and network from a minimum peer connected into my home network, to the final configuration.

Installing NixOS

NixOS was recommended to me as being a good Linux distro for routers because the configuration for the whole OS can be tracked in one or more nix files. I haven’t done much automation around Linux deployments, but I imagine the alternatives are something like Puppet or bash scripts. Any configuration-as-code solution is better than tweaking the live OS and (hopefully) taking notes.

I followed the NixOS install instructions. The basic steps were:

Create a bootable USB drive with the NixOS install image.
Boot the router with the NixOS image.
Elevate session to root, sudo -i.
Partition drive following the example UEFI instructions.
Generate the base NixOS configuration, nixos-generate-config --root /mnt
Install, nixos-install.
When the install completes, create a root password when prompted.

I didn’t take good notes as I set up this router the first time, so I decided to start over and build it up step by step, recording and commenting on it here. But, I didn’t want to start fresh with the install media, so this will be a little hand-wavy/fuzzy on the first stage.

Step One - Basic Internet Access

Up to this point, I’ve been using a keyboard and monitor attached to the router to interact with it. I want to get the keyboard off my crowded desk as quickly as possible. To do this I need to access the router over Ethernet. When switching a NixOS configuration, you have a few choices. You can set it to take effect immediately without a reboot. You can set it to be the default on the next reboot, or as a choice on the boot menu. Let that sink in for a minute! The entire OS configuration of NixOS can be done with a file. You can have multiple named configurations, and switch between them from the boot menu! But if things go wrong, in order to see the boot screen, you still need a keyboard and monitor handy.

Below is the step one configuration. This is a bit more full-featured than my original one where I didn’t enable a DHCP service and I may not have had a firewall. I typed in my first configuration, but this router does have internet access so the easiest thing to do is put the config on the internet somewhere and download it from the router.

Network created using this configuration

Step one

Configuration file

Francis Begyn has a great post that helped me get started.

# Edit this configuration file to define what should be installed on
# your system.  Help is available in the configuration.nix(5) man page
# and in the NixOS manual (accessible by running ‘nixos-help’).

{ lib, config, pkgs, ... }:
let
  publicDnsServer = "8.8.8.8";
in
{
  imports =
    [ # Include the results of the hardware scan.
      ./hardware-configuration.nix
    ];

  boot.loader.systemd-boot.enable = true;
  boot.loader.efi.canTouchEfiVariables = true;

  boot.kernel.sysctl = {
    "net.ipv4.conf.all.forwarding" = true;
  };

  networking = {

    hostName = "nix-router";
    nameservers = [ "${publicDnsServer}" ];
    firewall.enable = false;

    interfaces = {
      enp1s0 = {
        useDHCP = true;
      };
      enp2s0 = {
        useDHCP = false;
        ipv4.addresses = [{
          address = "10.13.84.1";
          prefixLength = 24;
        }];
      };
      enp3s0 = {
        useDHCP = false;
      };
      enp4s0 = {
        useDHCP = false;
      };
    };

    nftables = {
      enable = true;
      ruleset = ''
        table ip filter {
          chain input {
            type filter hook input priority 0; policy drop;

            iifname { "enp2s0" } accept comment "Allow local network to access the router"
            iifname "enp1s0" ct state { established, related } accept comment "Allow established traffic"
            iifname "enp1s0" icmp type { echo-request, destination-unreachable, time-exceeded } counter accept comment "Allow select ICMP"
            iifname "enp1s0" counter drop comment "Drop all other unsolicited traffic from wan"
          }
          chain forward {
            type filter hook forward priority 0; policy drop;
            iifname { "enp2s0" } oifname { "enp1s0" } accept comment "Allow trusted LAN to WAN"
            iifname { "enp1s0" } oifname { "enp2s0" } ct state established, related accept comment "Allow established back to LANs"
          }
        }

        table ip nat {
          chain postrouting {
            type nat hook postrouting priority 100; policy accept;
            oifname "enp1s0" masquerade
          } 
        }

        table ip6 filter {
          chain input {
            type filter hook input priority 0; policy drop;
          }
          chain forward {
            type filter hook forward priority 0; policy drop;
          }
        }
      '';
    };
  };

  time.timeZone = "America/New_York";

  environment.systemPackages = with pkgs; [
    pciutils 
    tcpdump
  ];

  services = {
    openssh.enable = true;
    openssh.permitRootLogin = "yes";
    
    dhcpd4 = {
      enable = true;
      interfaces = [ "enp2s0" ];
      extraConfig = ''
        subnet 10.13.84.0 netmask 255.255.255.0 {
          option routers 10.13.84.1;
          option domain-name-servers ${publicDnsServer};
          option subnet-mask 255.255.255.0;
          interface enp2s0;
          range 10.13.84.2 10.13.84.254;
        }
      '';
    };

  };
  
  # https://search.nixos.org/options?channel=21.11&show=system.stateVersion&from=0&size=50&sort=relevance&type=packages&query=stateVersion
  system.stateVersion = "21.05"; 

}

When I first started using Nix, I found the config syntax frustrating. I’d google how to do this or that, and each time I read someone’s configuration file, I didn’t understand the let blocks or the dot versus nested block structure. It wasn’t until I took an hour to read about Nix expression syntax that I felt comfortable.

Let’s go over this configuration file section by section.

A function to make a config

{ lib, config, pkgs, ... }:
let
  publicDnsServer = "8.8.8.8";
in

You can think of this block as a function. If it were in Python it would look something like, def makeConfig(lib: Dict[str, Any], config: Dict[str, Any], pkgs: Dict[str, Any], *args: List[Dict[str, Any]]) -> Dict[str, Any]: <implementation>. This function must be passed lib, config, and pkgs, and because of the ellipsis ... it can be passed additional arguments without an error being thrown, but the function cannot read those additional arguments. So the function takes a set of values, and it returns a set of values. There’s no imperative return statement; the final { ... } block is the return set. The let ... in construct gives us a place to create variables to use. You can read more here and here.

Next we define a variable for our DNS server. The router itself will use this DNS server and it will provide it to clients via DHCP.
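To make the analogy concrete, here is a small runnable Python sketch. It is only an illustration of the shape of the Nix expression, not how NixOS evaluates modules. The name make_config and the modulesPath argument are made up for the example, and it uses keyword arguments because Nix matches the incoming set by attribute name.

```python
from typing import Any, Dict


def make_config(*, lib: Dict[str, Any], config: Dict[str, Any],
                pkgs: Dict[str, Any], **_ignored: Any) -> Dict[str, Any]:
    """Rough analogue of `{ lib, config, pkgs, ... }: let ... in { ... }`."""
    # The `let ... in` part: a name that only exists inside the function.
    public_dns_server = "8.8.8.8"

    # The final `{ ... }` block: the set of values the function evaluates to.
    return {
        "networking": {
            "hostName": "nix-router",
            "nameservers": [public_dns_server],
        },
    }


# Extra named arguments are accepted (like `...`) but never read.
result = make_config(lib={}, config={}, pkgs={}, modulesPath="/hypothetical")
print(result["networking"]["nameservers"])  # ['8.8.8.8']
```

Leaving out one of the three required names raises a TypeError, which is roughly what Nix does when a required attribute is missing from the set it is called with.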

Includes and boot options

imports =
  [ # Include the results of the hardware scan.
    ./hardware-configuration.nix
  ];

boot.loader.systemd-boot.enable = true;
boot.loader.efi.canTouchEfiVariables = true;

This section was created by nixos-generate-config. It imports the file, hardware-configuration.nix, also created by nixos-generate-config. Mine has information about file systems and available kernel modules. Then it enables systemd-boot.

boot.kernel.sysctl = {
  "net.ipv4.conf.all.forwarding" = true;
};

This tells the kernel to forward packets between network interfaces. Turning on this setting means, “make this box a router”.

Networking config

networking = {

  hostName = "nix-router";
  nameservers = [ "${publicDnsServer}" ];
  firewall.enable = false;

Here’s the beginning of the network configuration. We give the router a hostname, then the DNS server(s) it should use (for itself), and finally we disable the simple firewall, since we’re going to be adding our own custom rules.

interfaces = {
  enp1s0 = {
    useDHCP = true;
  };
  enp2s0 = {
    useDHCP = false;
    ipv4.addresses = [{
      address = "10.13.84.1";
      prefixLength = 24;
    }];
  };
  enp3s0 = {
    useDHCP = false;
  };
  enp4s0 = {
    useDHCP = false;
  };
};

Here we define what we want each network adapter to do. We get the names of the network adapters with ifconfig. enp1s0 is the upstream connection so it will use DHCP. enp2s0 is the downstream connection to the rest of the network, so it will get a static IP. We could leave the remaining two adapters out of the config if we like.
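If prefix lengths are new to you, Python’s standard ipaddress module is a quick way to see what address = "10.13.84.1" with prefixLength = 24 implies. This is just a calculator sketch run on any machine, not something the router needs.

```python
import ipaddress

# Address and prefix length taken from the enp2s0 block above.
lan = ipaddress.ip_interface("10.13.84.1/24")

print(lan.network)                    # 10.13.84.0/24
print(lan.netmask)                    # 255.255.255.0
print(lan.network.broadcast_address)  # 10.13.84.255

# Every address except the network and broadcast addresses can go to a host.
usable_hosts = lan.network.num_addresses - 2
print(usable_hosts)                   # 254
```

The netmask printed here is the same 255.255.255.0 that shows up later in the DHCP server settings.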

Firewall rules

Next we will set up the firewall rules. If you’re familiar with iptables or its successor nftables, this will be easy to understand. Before this project I never took the time to understand firewall rules very well. I found their loose and terse syntax especially frustrating. If, like me, you’re starting with some basic knowledge of how firewalls work, but have no comprehensive understanding of their features or how to configure them, then you should set aside some time to read up on them.

Even though nftables is newer than iptables, do not try to learn about Linux firewalls by reading nftables documentation! All you will find is a detailed list of each command, but little guidance or big picture information to frame the topic. Instead, I suggest starting with this excellent tutorial on iptables by Oskar Andreasson. It’s from 2006, so a lot has changed, but the fundamentals of a firewall have not. Read at least chapters 1, 2, 3, 4, 6, 7, 8, 9, 10, and 11. Oskar is a great teacher. You’ll feel comfortable setting up a firewall after reading his free book.

With a thorough understanding of iptables, you’ll be ready to see what’s good about nftables. The nftables wiki has all the information you need, but it’s definitely not going to hold your hand and walk you through it. Compared to the iptables book above, it’s just a pile of doc pages.

Spend some time on the wiki. There is a wealth of information, but don’t get frustrated if you have questions and can’t find the answers there. Just soak in what you can. The diagram on hooks is useful.

Check out the online man pages. Make sure you know what the following are: address families, hooks, tables, chains, rules, and sets. While you’re reading those topics in the manual, flip back and forth to the nice examples on the Gentoo Linux nftables page. I found these helpful and reassuring.

nftables = {
  enable = true;
  ruleset = ''
    table ip filter {
      chain input {
        type filter hook input priority 0; policy drop;

        iifname { "enp2s0" } accept comment "Allow local network to access the router"
        iifname "enp1s0" ct state { established, related } accept comment "Allow established traffic"
        iifname "enp1s0" icmp type { echo-request, destination-unreachable, time-exceeded } counter accept comment "Allow select ICMP"
        iifname "enp1s0" counter drop comment "Drop all other unsolicited traffic from wan"
      }
      chain forward {
        type filter hook forward priority 0; policy drop;
        iifname { "enp2s0" } oifname { "enp1s0" } accept comment "Allow trusted LAN to WAN"
        iifname { "enp1s0" } oifname { "enp2s0" } ct state established, related accept comment "Allow established back to LANs"
      }
    }

    table ip nat {
      chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        oifname "enp1s0" masquerade
      } 
    }

    table ip6 filter {
      chain input {
        type filter hook input priority 0; policy drop;
      }
      chain forward {
        type filter hook forward priority 0; policy drop;
      }
    }
  '';
};

Here we have the nftables configuration set containing an enabled flag and a ruleset. Most nftables docs show rules added as individual commands, but it’s nice to see the rules in one file, like this nix config section or the gentoo examples above.

Unlike iptables, nftables does not ship with any tables. It’s a blank slate. We’ll discuss this in more detail, but at a high level we have filters for incoming traffic, filters for forwarding, and we enable masquerading. Without these filters, the router will not block any inbound packets and it will attempt to forward any packets it receives that are not destined for itself.

First we create a table of family ip named filter. Using the ip family means that the rules in this table’s chains only apply to IP version 4 traffic. The address family man page lists all the families including ip6 and inet, a family that lets you make rules for IPv4 and IPv6 together. For now we will only create IPv4 rules. The table man page explains more on tables.

Input chain

Next we create a chain called input. We must give the chain a type. filter is the most common type of chain and is used to allow and deny packets. We are also required to provide a hook. We choose the input hook so that the rules apply to packets addressed to the router itself, at the point where they enter the kernel’s flow logic as shown in the hooks diagram. We also give the chain an integer priority value. If there is more than one chain on the same hook, the chain with the lower priority value will be processed first. Finally we define a default policy for the chain, drop, which means any packet not matching one of the chain’s rules will be dropped. Notice these descriptive configuration lines for the chain use semicolons. Do other nftables config lines use semicolons? No, not unless you want to put multiple statements on one line. Why do we use semicolons here then? Beats me. But you can see in the chains man page that they are required.

Let’s filter some stuff.

If the input interface name is enp2s0 then accept the packets. enp2s0 is the downstream connection to our network. Currently my laptop is the only device connected to it. This rule allows traffic coming in on enp2s0 to be routed to local processes on the router itself. With this, I can SSH into the router.
If the input interface name is enp1s0 and the connection tracking state is established or related then accept the packets. In other words, the packets must be in response to a request made from within the network or from the router itself. Oskar Andreasson covered connection tracking well in his book on iptables.
If the input interface name is enp1s0 and the packet is one of several ICMP types, then accept it. Some people don’t want their router to respond to pings, but we are going to allow it. There are important ICMP messages that need to be allowed. You can filter by ICMP type and code. The data type man pages have more info. For more info on which ICMP traffic to allow, see Should I block ICMP?
Finally in the input chain, we make a specific rule for whatever traffic remains coming in on enp1s0. It would be dropped anyway, according to the chain policy, but we do this to add a counter. We added a counter to the ICMP rule too. This way, we can see how much potentially undesirable traffic from the internet is hitting our router. nft list ruleset will display the current rules and values for the counters.
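The way a chain is walked can be sketched in a few lines of Python. This is a toy model and not how the kernel implements nftables: rules are tried in order, the first rule whose match succeeds and carries a verdict decides the packet, and the chain policy applies only when nothing matched. The Packet and Rule classes and the evaluate function are invented for the illustration, and unlike nftables this model counts hits on every rule, not just rules with a counter statement.

```python
from dataclasses import dataclass
from typing import Callable, List

WAN, LAN = "enp1s0", "enp2s0"
ALLOWED_ICMP = {"echo-request", "destination-unreachable", "time-exceeded"}


@dataclass
class Packet:
    iifname: str
    ct_state: str = "new"   # "new", "established" or "related"
    icmp_type: str = ""     # empty when the packet is not ICMP


@dataclass
class Rule:
    matches: Callable[[Packet], bool]
    verdict: str            # "accept" or "drop"
    comment: str
    counter: int = 0


# The four input chain rules from the config, in the same order.
input_chain: List[Rule] = [
    Rule(lambda p: p.iifname == LAN, "accept", "LAN to router"),
    Rule(lambda p: p.iifname == WAN and p.ct_state in {"established", "related"},
         "accept", "established traffic"),
    Rule(lambda p: p.iifname == WAN and p.icmp_type in ALLOWED_ICMP,
         "accept", "select ICMP"),
    Rule(lambda p: p.iifname == WAN, "drop", "other unsolicited WAN traffic"),
]


def evaluate(chain: List[Rule], packet: Packet, policy: str = "drop") -> str:
    """Return the verdict of the first matching rule, else the chain policy."""
    for rule in chain:
        if rule.matches(packet):
            rule.counter += 1
            return rule.verdict
    return policy


print(evaluate(input_chain, Packet(iifname=LAN)))                            # accept
print(evaluate(input_chain, Packet(iifname=WAN, icmp_type="echo-request")))  # accept
print(evaluate(input_chain, Packet(iifname=WAN)))                            # drop
```

A packet arriving on an interface that no rule mentions, such as enp3s0, falls through every rule and is dropped by the policy.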

Forward chain

Here we create a chain to hold rules about forwarding packets. We of course use the forward hook. We set a priority of 0 for this chain and set the policy to drop unmatched packets.

If the input interface is enp2s0 and the output interface is enp1s0, then, as the comment says, let the downstream network send packets upstream to the internet. When these packets go out, whether over TCP or UDP, the conntrack logic will mark them as NEW. When replies come back, the conntrack state machine will match them up to the requests (or open connection) and mark them as ESTABLISHED. Oskar’s book explains conntrack in detail.
Our next rule leverages conntrack (ct) to forward established and related packets from the internet back to their internal network destinations.
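Here is a toy Python model of how those two forward rules lean on connection tracking. It is a deliberately simplified sketch, not the real conntrack state machine: it ignores protocols, timeouts and the related state, and the ConnTracker class and forward function are invented for the example. The remote address 203.0.113.10 is a documentation address standing in for some internet server.

```python
from typing import Dict, Tuple

WAN, LAN = "enp1s0", "enp2s0"
Flow = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


def reverse(flow: Flow) -> Flow:
    src_ip, src_port, dst_ip, dst_port = flow
    return (dst_ip, dst_port, src_ip, src_port)


class ConnTracker:
    def __init__(self) -> None:
        # original-direction flow -> has a reply been seen yet?
        self._replied: Dict[Flow, bool] = {}

    def state_of(self, flow: Flow) -> str:
        if reverse(flow) in self._replied:   # reply to a known connection
            return "established"
        if self._replied.get(flow):          # original side, after a reply
            return "established"
        return "new"

    def confirm(self, flow: Flow) -> None:
        """Record an accepted packet (dropped packets leave no entry)."""
        if reverse(flow) in self._replied:
            self._replied[reverse(flow)] = True
        else:
            self._replied.setdefault(flow, False)


def forward(tracker: ConnTracker, iifname: str, oifname: str, flow: Flow) -> str:
    state = tracker.state_of(flow)
    lan_to_wan = iifname == LAN and oifname == WAN
    wan_reply = iifname == WAN and oifname == LAN and state == "established"
    if lan_to_wan or wan_reply:
        tracker.confirm(flow)
        return "accept"
    return "drop"  # chain policy


tracker = ConnTracker()
request = ("10.13.84.2", 50000, "203.0.113.10", 443)
print(forward(tracker, WAN, LAN, reverse(request)))  # drop: nothing asked for it
print(forward(tracker, LAN, WAN, request))           # accept: LAN to WAN
print(forward(tracker, WAN, LAN, reverse(request)))  # accept: reply to the request
```

The same inbound packet is dropped before the laptop sends its request and accepted afterwards, which is the whole point of the second rule.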

NAT chain

Our final table is again targeted at IPv4. We call it nat since it will be used for network address translation rules. We create a chain called postrouting of type nat and hooked in at the postrouting phase. The priority this time is 100. The default is to accept. That just means that we’re not interested in preventing outbound traffic, but we do want to use this chain to enable source network address translation. An IP packet that comes from my laptop, bound for Google, will get forwarded by the router to the gateway interface enp1s0. But the source address in that IP packet’s header will be 10.13.84.2, which is in the address range reserved for internal networks. Google’s servers would have no way to route a reply back to that address. So we use the masquerade statement to tell nftables to replace 10.13.84.2 with the upstream IP address of the router, 192.168.1.200 in this case, but it would typically be the public IP address assigned to my cable modem, and bridged to this router.
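The bookkeeping behind masquerading can be sketched as a pair of lookup tables. This Python model is an assumption-laden simplification: real NAT keys on the full connection tuple and usually tries to keep the original source port, while this sketch simply hands out ports starting at an arbitrary 40000. The Masquerade class is invented for the illustration.

```python
import ipaddress
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


class Masquerade:
    def __init__(self, wan_ip: str, first_port: int = 40000) -> None:
        self.wan_ip = wan_ip
        self._next_port = first_port
        self._out: Dict[Endpoint, int] = {}   # LAN endpoint -> WAN port
        self._back: Dict[int, Endpoint] = {}  # WAN port -> LAN endpoint

    def outbound(self, src: Endpoint) -> Endpoint:
        """Rewrite a LAN source endpoint to the router's WAN address."""
        if src not in self._out:
            self._out[src] = self._next_port
            self._back[self._next_port] = src
            self._next_port += 1
        return (self.wan_ip, self._out[src])

    def inbound(self, wan_port: int) -> Optional[Endpoint]:
        """Find which LAN endpoint a reply to wan_port belongs to."""
        return self._back.get(wan_port)


# The laptop's address is private, so it cannot be used on the internet.
print(ipaddress.ip_address("10.13.84.2").is_private)  # True

nat = Masquerade(wan_ip="192.168.1.200")
public = nat.outbound(("10.13.84.2", 51515))
print(public)                  # ('192.168.1.200', 40000)
print(nat.inbound(public[1]))  # ('10.13.84.2', 51515)
```

The remote server only ever sees the router’s upstream address. The router uses its table to undo the translation when the reply arrives.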

Block IPv6 chain

Here we create a table of family ip6 named filter. We create chains to hook into input and forward, each having only a drop policy. A lot of programs use IPv6. iPhones and TVs are particularly crafty about creating ad hoc networks over IPv6 for streaming. We want a lot of those cool features on our network, but since we’re building this up feature by feature, we need to disable IPv6 now, and focus on traditional IPv4.

Packages and timezone

time.timeZone = "America/New_York";

environment.systemPackages = with pkgs; [
  pciutils 
  tcpdump
];

Here we set the timezone. We tell nixos to install the pciutils package (which we can use to get information on the ethernet adapters). We also install tcpdump.

Services

services = {
  openssh.enable = true;
  openssh.permitRootLogin = "yes";
  
  dhcpd4 = {
    enable = true;
    interfaces = [ "enp2s0" ];
    extraConfig = ''
      subnet 10.13.84.0 netmask 255.255.255.0 {
        option routers 10.13.84.1;
        option domain-name-servers ${publicDnsServer};
        option subnet-mask 255.255.255.0;
        interface enp2s0;
        range 10.13.84.2 10.13.84.254;
      }
    '';
  };

};

First we’re going to enable openssh and allow root login. We’ll disable root login later when we add a certificate from the laptop.

Next we’ll configure the DHCP server to hand out addresses to clients. This is pretty simple. It’s going to listen on the enp2s0 interface. We’re configuring one subnet at 10.13.84.0/24. For this subnet, we provide the router or default gateway, the dns server, the subnet mask, and the interface again, which seems redundant, but since we could have specified multiple interfaces to listen on, we could assign them to individual subnets here. Finally, we specify a range of IP addresses to lease.

You can see all the DHCP server’s options in the dhcpd man pages. There’s also the DHCP RFC, the DHCP optional fields RFC and finally the dhcpd4 nix config docs.
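Before handing a range to dhcpd, it can be worth sanity-checking it. The following Python sketch uses the standard ipaddress module to confirm that the lease range from the config sits inside the subnet and does not include the router’s own address. The check_pool helper is made up for this example.

```python
import ipaddress
from typing import List

subnet = ipaddress.ip_network("10.13.84.0/24")
router = ipaddress.ip_address("10.13.84.1")
first = ipaddress.ip_address("10.13.84.2")
last = ipaddress.ip_address("10.13.84.254")


def check_pool() -> List[str]:
    """Return a list of problems with the lease range (empty means OK)."""
    problems = []
    if first not in subnet or last not in subnet:
        problems.append("range is outside the subnet")
    if first <= router <= last:
        problems.append("range includes the router address")
    if last >= subnet.broadcast_address:
        problems.append("range includes the broadcast address")
    return problems


pool_size = int(last) - int(first) + 1
print(pool_size)     # 253
print(check_pool())  # []
```

So this configuration can lease up to 253 addresses, everything in the /24 except the network address, the router and the broadcast address.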

Default Nixos release settings

system.stateVersion = "21.05";

The documentation for this setting reads as follows.

Every once in a while, a new NixOS release may change configuration defaults in a way incompatible with stateful data. For instance, if the default version of PostgreSQL changes, the new version will probably be unable to read your existing databases. To prevent such breakage, you should set the value of this option to the NixOS release with which you want to be compatible. The effect is that NixOS will use defaults corresponding to the specified release (such as using an older version of PostgreSQL). It’s perfectly fine and recommended to leave this value at the release version of the first install of this system. Changing this option will not upgrade your system. In fact it is meant to stay constant exactly when you upgrade your system. You should only bump this option, if you are sure that you can or have migrated all state on your system which is affected by this option.

Changing the configuration

Since this is a chronology, I’m pretending that I don’t already have network access to the router. In reality, you may have network access to the router if you have its upstream plugged into your home network and there is no firewall enabled. But let’s just assume you do not. You need to get this configuration onto the router. You could put it on a web server somewhere and use curl. Or add it to a git repo, and clone it. If curl or git are not installed, you can install them like this:

$ nix-env -i curl
$ nix-env -i git

If you’re logged into the router as root, you can copy the configuration file to /etc/nixos/configuration.nix overwriting the file that is already there. Now you’re ready to change the configuration. There are a couple choices.

# builds the configuration, sets it to default if it succeeds, and switches to it immediately
nixos-rebuild switch

# builds the configuration and switches to it immediately, but does not make it the boot default.
# if something goes wrong, reboot the router to return to the previous configuration
nixos-rebuild test

The second option is safer when making major changes.

Using the router!

If you were able to build and activate the configuration, you should now have a basic router capable of assigning your computer an IP address and providing NAT access to the internet. It should also route traffic between multiple devices, but we’re not going to try that yet.

Part 1, router and laptop

Here I have my Macbook plugged directly into the router. There is no switch. I have Wi-Fi disabled on the Macbook. We’re going to use some networking tools to see the state of the Macbook’s network config before and after connecting to the router.

First, the routing table. Let’s flush out any left over routes, and then print it.

bash-3.2$ sudo route -n flush
bash-3.2$ sudo route -n flush
bash-3.2$ netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags           Netif Expire
default            link#27            UCSIg       bridge100      !
default            link#29            UCSIg       bridge101      !
10.37.129/24       link#29            UC          bridge101      !
10.211.55/24       link#27            UC          bridge100      !
127                127.0.0.1          UCS               lo0       
127.0.0.1          127.0.0.1          UH                lo0       
224.0.0            link#1             UmCS              lo0       
239.255.255.250    1:0:5e:7f:ff:fa    UHmLWIg     bridge100       
239.255.255.250    1:0:5e:7f:ff:fa    UHmLWIg     bridge101

There is definitely still stuff here, and I don’t know what it all is, save the loopback adapter, but none of the routes are using my ethernet or Wi-Fi adapters. So that’s as empty as we can get.

Here we’re using tcpdump to monitor DHCP and ARP traffic right when we plug in the network cable. The -i flag is used to pass the interface name. You can run ifconfig to get a list of interfaces. arp or port 67 or port 68 is the filter. Ports 67 and 68 are used by DHCP. This is easier in Wireshark as you just use bootp or arp as the filter. Wireshark is much nicer than tcpdump if you have a graphical system. The -n means not to convert addresses to names and -v means to output verbose info.

bash-3.2$ sudo tcpdump -i en7 arp or port 67 or port 68 -n -v
tcpdump: listening on en7, link-type EN10MB (Ethernet), capture size 262144 bytes
13:30:59.986671 IP (tos 0x0, ttl 255, id 12648, offset 0, flags [none], proto UDP (17), length 328)
    0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 3c:18:a0:b7:b9:d3, length 300, xid 0x54c1cea2, Flags [none]
	  Client-Ethernet-Address 3c:18:a0:b7:b9:d3
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Discover
	    Parameter-Request Option 55, length 12: 
	      Subnet-Mask, Classless-Static-Route, Default-Gateway, Domain-Name-Server
	      Domain-Name, Option 108, URL, Option 119
	      Option 252, LDAP, Netbios-Name-Server, Netbios-Node
	    MSZ Option 57, length 2: 1500
	    Client-ID Option 61, length 7: ether 3c:18:a0:b7:b9:d3
	    Lease-Time Option 51, length 4: 7776000
	    Hostname Option 12, length 7: "JJP4G-2"
13:30:59.987402 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    10.13.84.1.67 > 10.13.84.2.68: BOOTP/DHCP, Reply, length 300, xid 0x54c1cea2, Flags [none]
	  Your-IP 10.13.84.2
	  Client-Ethernet-Address 3c:18:a0:b7:b9:d3
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Offer
	    Server-ID Option 54, length 4: 10.13.84.1
	    Lease-Time Option 51, length 4: 7200
	    Subnet-Mask Option 1, length 4: 255.255.255.0
	    Default-Gateway Option 3, length 4: 10.13.84.1
	    Domain-Name-Server Option 6, length 4: 8.8.8.8
13:31:00.398217 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.13.84.2 tell 10.13.84.1, length 46
13:31:00.991800 IP (tos 0x0, ttl 255, id 12649, offset 0, flags [none], proto UDP (17), length 328)
    0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 3c:18:a0:b7:b9:d3, length 300, xid 0x54c1cea2, secs 1, Flags [none]
	  Client-Ethernet-Address 3c:18:a0:b7:b9:d3
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Request
	    Parameter-Request Option 55, length 12: 
	      Subnet-Mask, Classless-Static-Route, Default-Gateway, Domain-Name-Server
	      Domain-Name, Option 108, URL, Option 119
	      Option 252, LDAP, Netbios-Name-Server, Netbios-Node
	    MSZ Option 57, length 2: 1500
	    Client-ID Option 61, length 7: ether 3c:18:a0:b7:b9:d3
	    Requested-IP Option 50, length 4: 10.13.84.2
	    Server-ID Option 54, length 4: 10.13.84.1
	    Hostname Option 12, length 7: "JJP4G-2"
13:31:00.995222 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    10.13.84.1.67 > 10.13.84.2.68: BOOTP/DHCP, Reply, length 300, xid 0x54c1cea2, secs 1, Flags [none]
	  Your-IP 10.13.84.2
	  Client-Ethernet-Address 3c:18:a0:b7:b9:d3
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: ACK
	    Server-ID Option 54, length 4: 10.13.84.1
	    Lease-Time Option 51, length 4: 600
	    Subnet-Mask Option 1, length 4: 255.255.255.0
	    Default-Gateway Option 3, length 4: 10.13.84.1
	    Domain-Name-Server Option 6, length 4: 8.8.8.8
13:31:00.996352 ARP, Ethernet (len 6), IPv4 (len 4), Probe who-has 10.13.84.2 tell 0.0.0.0, length 28
13:31:01.321759 ARP, Ethernet (len 6), IPv4 (len 4), Probe who-has 10.13.84.2 tell 0.0.0.0, length 28
13:31:01.457488 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.13.84.2 tell 10.13.84.1, length 46
13:31:01.646908 ARP, Ethernet (len 6), IPv4 (len 4), Probe who-has 10.13.84.2 tell 0.0.0.0, length 28
13:31:01.972710 ARP, Ethernet (len 6), IPv4 (len 4), Announcement who-has 10.13.84.2 tell 10.13.84.2, length 28
13:31:02.297730 ARP, Ethernet (len 6), IPv4 (len 4), Announcement who-has 10.13.84.2 tell 10.13.84.2, length 28
13:31:02.624058 ARP, Ethernet (len 6), IPv4 (len 4), Announcement who-has 10.13.84.2 tell 10.13.84.2, length 28
13:31:02.627311 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.13.84.1 tell 10.13.84.2, length 28
13:31:02.627859 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.13.84.1 is-at 20:7c:14:a0:4d:38, length 46

Let’s go through these tcpdump results.

We have a DHCP broadcast request from the laptop to the network to discover DHCP servers. We know it’s a broadcast because the destination is 255.255.255.255 which is reserved for broadcasts to the whole network. The DHCP protocol is an extension of the BOOTP protocol. You can learn more in the DHCP RFC, the BOOTP RFC and the DHCP Options RFC.
Next we have a unicast DHCP reply message from the router at 10.13.84.1 to the laptop, suggesting it use IP address 10.13.84.2 and providing a lease time, subnet mask, default gateway, and DNS server address. The destination IP address in the IP header is 10.13.84.2 even though the client has not accepted it yet. The frame is addressed to the laptop’s MAC address, so it arrives anyway. According to the DHCP spec, the laptop should wait a bit for multiple DHCP offers, and then select one.
The router sends out an ARP request asking who has 10.13.84.2, the IP address it just offered the laptop. The laptop hasn’t accepted it yet. I’ve captured this initial traffic a few times, and the router doesn’t always make this ARP request so soon.
The laptop broadcasts a DHCP request message, containing the IP address it was offered, and is now requesting, and the IP address of the router (in case there is more than one router/DHCP-server offering).
The server, after having committed the new lease to storage, sends a unicast DHCP ACK message to the laptop, letting it know that the offered IP address is officially assigned.
The next two lines are the laptop sending ARP probes to see if anyone else has 10.13.84.2. There is an RFC, IPv4 Address Conflict Detection, that explains how DHCP doesn’t give a you-know-what about how to use ARP to make sure there are no address conflicts. The RFC prescribes that the client do some probing, waiting, more probing, and finally, if no one else replies that they are using the IP address, send out a few ARP broadcasts announcing that it is now using the IP address. Some actors on the network might not be so polite, and ARP Cache Poisoning Attacks are definitely a thing.
Next the server is getting a little antsy and wants some confirmation that we’re using the leased IP address.
Finally, the laptop is ready to broadcast an ARP announcement that it is using 10.13.84.2.
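To see the Discover, Offer, Request, ACK handshake at a glance, the verbose capture can be boiled down to just the message types. This is a minimal sketch: the function name dhcp_message_types is made up for illustration, and the pattern assumes the "DHCP-Message Option 53" wording shown in the capture above, which other tcpdump versions may print differently.

```shell
#!/usr/bin/env bash

# Read verbose tcpdump text on stdin and print one DHCP message type per line.
# Assumes lines shaped like: "DHCP-Message Option 53, length 1: Offer"
dhcp_message_types() {
  sed -n 's/.*DHCP-Message Option 53, length 1: \(.*\)$/\1/p'
}

# Example against a live capture (en7 is the laptop's adapter in this article):
#   sudo tcpdump -l -n -vvv -i en7 'udp port 67 or udp port 68' | dhcp_message_types
```

The -l flag keeps tcpdump line buffered so the types show up as the packets arrive.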

Now that our network settings are configured, let’s take another look at the routing table.

bash-3.2$ netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags           Netif Expire
default            10.13.84.1         UGScg             en7       
default            link#27            UCSIg       bridge100      !
default            link#29            UCSIg       bridge101      !
10.13.84/24        link#30            UCS               en7      !
10.13.84.1/32      link#30            UCS               en7      !
10.13.84.1         20:7c:14:a0:4d:38  UHLWIir           en7   1180
10.13.84.2/32      link#30            UCS               en7      !
10.37.129/24       link#29            UC          bridge101      !
10.211.55/24       link#27            UC          bridge100      !
127                127.0.0.1          UCS               lo0       
127.0.0.1          127.0.0.1          UH                lo0       
169.254            link#30            UCS               en7      !
224.0.0/4          link#30            UmCS              en7      !
224.0.0.251        1:0:5e:0:0:fb      UHmLWI            en7       
239.255.255.250    1:0:5e:7f:ff:fa    UHmLWIg     bridge100       
239.255.255.250    1:0:5e:7f:ff:fa    UHmLWIg     bridge101       
239.255.255.250    1:0:5e:7f:ff:fa    UHmLWI            en7       
255.255.255.255/32 link#30            UCS               en7      !

Still a lot of junk, but we’ve added the router 10.13.84.1 as a gateway, and we have new routes to the router, the subnet, and ourselves.

Step 2 - Add a user and an SSH key

Now that we have the laptop on the new network, we can update and apply the nix configuration via SSH and SCP. To do this we’re going to make a new user, add them to the wheel group, and disable the sudo password.

We’ll start by making a new ssh key pair.

ssh-keygen -f ~/.ssh/nix-router

Then add the following lines to ~/.ssh/config

Host nix-router
  HostName nix-router
  User josh
  IdentityFile ~/.ssh/nix-router

Then we’ll add a new user to the configuration.nix file.

security.sudo.wheelNeedsPassword = false;

users.users.josh = {
  isNormalUser = true;
  home = "/home/josh";
  description = "Josh Pearce";
  extraGroups = [ "wheel" "networkmanager" ];
  openssh.authorizedKeys.keys = [ "ssh-rsa AAAA...ZK2Z3UM= josh@JJP4G" ];
};

Now we’ll make a little helper script to apply config changes to the router. The first time we run it, it needs to use the root user, but then we’ll change it to josh after creating the new user. The script takes one argument, which is provided to nixos-rebuild. If you don’t give the argument, it defaults to test, but you can send switch to apply the changes immediately.

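#!/usr/bin/env bash

# Mode passed to nixos-rebuild; defaults to "test" when no argument is given.
rebuild_flag="${1:-test}"

# Copy the config to the router's home directory.
scp configuration.nix root@nix-router:~

# The heredoc delimiter is unquoted, so ${rebuild_flag} is expanded locally
# before the commands are sent to the router.
ssh root@nix-router << EOF
   set -x
   sudo cp configuration.nix /etc/nixos/configuration.nix
   sudo nixos-rebuild --show-trace ${rebuild_flag}
   set +x
EOF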

So now we run the script. You’ll need to enter the root password twice, once for the scp command and again for the ssh command.

./push_to_router.sh switch

If everything succeeds, we change the user in the script to josh and run it one more time to test the SSH. There should be no password prompt now.

Step 3 - Add virtual LANs

Now we’re going to configure vlans on the router. Virtual LANs were first conceived in the 1980s to alleviate broadcast traffic congestion in Ethernet networks. They work by inserting a four byte tag field after the source MAC in the Ethernet frame. Capable network equipment can be configured to filter traffic based on this tag. The Wikipedia VLAN article is pretty good.
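The four byte tag is made up of a two byte type field (0x8100 for 802.1Q) followed by two bytes of tag control information: a 3 bit priority, a 1 bit drop eligible indicator, and the 12 bit vlan ID. As a rough sketch of how those 16 bits break apart, here is a small helper. The name decode_tci is made up for this example.

```shell
#!/usr/bin/env bash

# Split the 16 bit tag control information of an 802.1Q tag into its fields.
# Usage: decode_tci VALUE   (VALUE is decimal or 0x-prefixed hex)
decode_tci() {
  local tci=$(( $1 ))
  local pcp=$(( (tci >> 13) & 0x7 ))   # top 3 bits: priority
  local dei=$(( (tci >> 12) & 0x1 ))   # next bit: drop eligible indicator
  local vid=$(( tci & 0xFFF ))         # low 12 bits: vlan ID
  echo "pcp=${pcp} dei=${dei} vid=${vid}"
}

decode_tci 0x0054
```

With 12 bits for the ID there is room for 4096 values, which is plenty for a house.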

I’m going to use virtual LANs to isolate my physical network into separate areas for:

lan - Trusted devices like family laptops, servers, and phones.
guest - Guest devices for when someone is visiting.
iot - For shady Internet of Crap devices, typically made by low-margin hardware vendors who can’t be bothered with basic security practices. This includes TVs, most home automation, etc.

To create the vlans, we need to make the following changes to configuration.nix.

Add a vlan section to networking.
Add an entry for each vlan in networking.interfaces.
Add and modify rules in networking.nftables.
Create a subnet for each vlan in services.dhcpd4.

Define the vlans

firewall.enable = false;

vlans = {
  lan = {
    interface = "enp2s0";
    id = 84;
  };
  iot = {
    interface = "enp2s0";
    id = 93;
  };
  guest = {
    interface = "enp2s0";
    id = 83;
  };
};

interfaces = {

Here we define the vlans. It’s pretty simple. Each vlan gets a name, a physical interface, and a vlan tag number.
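NixOS creates these interfaces for us, but it can help to know roughly what the manual equivalent looks like on Linux. The sketch below only prints the iproute2 command for each vlan rather than running it. The helper name vlan_link_command is made up, and I have not checked that this is exactly what NixOS runs under the hood.

```shell
#!/usr/bin/env bash

# Print (not run) the iproute2 command that would create one vlan interface.
# Usage: vlan_link_command NAME PARENT ID
vlan_link_command() {
  local name=$1 parent=$2 id=$3
  echo "ip link add link ${parent} name ${name} type vlan id ${id}"
}

vlan_link_command lan   enp2s0 84
vlan_link_command iot   enp2s0 93
vlan_link_command guest enp2s0 83

# After a rebuild, the real interfaces can be inspected on the router with:
#   ip -d link show lan
```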

Define the networking interfaces

interfaces = {
  enp1s0.useDHCP = true;
  enp3s0.useDHCP = false;
  enp4s0.useDHCP = false;

  enp2s0 = {
    useDHCP = false;
    ipv4.addresses = [{
        address = "192.168.0.1";
        prefixLength = 24;
    }];
  };
  lan = {
    ipv4.addresses = [{
      address = "10.13.84.1";
      prefixLength = 24;
    }];
  };
  iot = {
    ipv4.addresses = [{
      address = "10.13.93.1";
      prefixLength = 24;
    }];
  };
  guest = {
    ipv4.addresses = [{
      address = "10.13.83.1";
      prefixLength = 24;
    }];
  };
};

We’ll use subnets in the 10.0.0.0 – 10.255.255.255 range, one of the three IPv4 ranges available for private networks. I chose these numbers because I’m a Miami Dolphins fan. 13 is Dan Marino’s number. 83, 84, and 93 are each years of his career with significant meaning. 83 was his rookie year, and he only played half the season. So, like our guest network he had limited functionality. In 84 Dan played all year and set every significant single-season passing record. This works well for our privileged lan network. Finally, we have 93 for IoT. In 1993, Dan tore his Achilles and missed most of the year. I think that’s a good match for IoT gear.
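If you want to double check that an address and prefix length land in the subnet you expect, the mask arithmetic is easy to do in the shell. This is a small sketch with made-up helper names (ip_to_int, int_to_ip, network_of). It assumes plain dotted IPv4 addresses without leading zeros.

```shell
#!/usr/bin/env bash

# Convert a dotted IPv4 address to a 32 bit integer.
ip_to_int() {
  local IFS=.
  # Unquoted on purpose: word splitting on "." yields the four octets.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Convert a 32 bit integer back to dotted form.
int_to_ip() {
  local n=$1
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

# Print the network address for ADDRESS with the given PREFIXLEN.
network_of() {
  local addr mask
  addr=$(ip_to_int "$1")
  mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
  int_to_ip $(( addr & mask ))
}

for addr in 10.13.83.1 10.13.84.1 10.13.93.1; do
  echo "${addr}/24 is in subnet $(network_of "$addr" 24), inside $(network_of "$addr" 8)/8"
done
```

All three router addresses mask down to 10.0.0.0 with an 8 bit prefix, which is what puts them in the private 10.0.0.0 – 10.255.255.255 range.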

The last network is the default non-vlan subnet. Any hardware that does not specifically opt into a vlan will be on this.

Update the firewall

nftables = {
  enable = true;
  ruleset = ''
    table ip filter {
      chain input {
        type filter hook input priority 0; policy drop;
        iifname { "enp2s0", "lan" } accept comment "Allow trusted local network to access the router"
        iifname "enp1s0" ct state { established, related } accept comment "Allow established traffic"
        iifname "enp1s0" icmp type { echo-request, destination-unreachable, time-exceeded } counter accept comment "Allow select ICMP"
        iifname "enp1s0" counter drop comment "Drop all other unsolicited traffic from wan"
      }
      chain forward {
        type filter hook forward priority 0; policy drop;
        iifname { "enp2s0", "lan", "iot", "guest" } oifname { "enp1s0" } accept comment "Allow LAN to WAN"
        iifname { "enp1s0" } oifname { "enp2s0", "lan", "iot", "guest" } ct state { established, related } accept comment "Allow established back to LANs"
        iifname { "lan" } oifname { "iot" } counter accept comment "Allow trusted LAN to IoT"
        iifname { "iot" } oifname { "lan" } ct state { established, related } counter accept comment "Allow established back to LANs"
      }
    }

    table ip nat {
      chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        oifname "enp1s0" masquerade
      } 
    }

    table ip6 filter {
      chain input {
        type filter hook input priority 0; policy drop;
      }
      chain forward {
        type filter hook forward priority 0; policy drop;
      }
    }
  '';
};

The first change is that we’re only allowing traffic from the lan vlan and the default non-vlan adapter enp2s0 to access processes on the router. Remember from the netfilter hooks diagram that the input hook happens after the routing decision between local and forwarding. You might be thinking: How are devices on the guest and iot vlans going to get assigned IP addresses since the DHCP service is a local process? I wondered that too, and I’ll talk about it in the next section on debugging with nftrace.

Netfilter local routing decision

The next two changes allow traffic from any of the vlans to be forwarded to the wan interface, and stateful traffic from wan to flow back to the vlans.

Finally, we allow traffic from lan to be forwarded to iot, and stateful traffic from iot back to lan. This is so an iPhone app can control a TV for example.
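To sanity check my reading of the forward chain, here is a toy model of it as a shell function. It is only a sketch of the four forward rules above, not how nftables actually evaluates packets, and the name forward_verdict is made up. It takes the input interface, the output interface, and a connection state (new, established, or related).

```shell
#!/usr/bin/env bash

# Toy model of the forward chain above; prints "accept" or "drop".
# Usage: forward_verdict IIF OIF CT_STATE
forward_verdict() {
  local path="$1>$2" state=$3
  # Rules that accept regardless of connection state.
  case "$path" in
    "enp2s0>enp1s0"|"lan>enp1s0"|"iot>enp1s0"|"guest>enp1s0"|"lan>iot")
      echo accept; return ;;
  esac
  # Rules that only accept replies on an existing connection.
  case "$state" in
    established|related)
      case "$path" in
        "enp1s0>enp2s0"|"enp1s0>lan"|"enp1s0>iot"|"enp1s0>guest"|"iot>lan")
          echo accept; return ;;
      esac ;;
  esac
  # Nothing matched, so the chain policy applies.
  echo drop
}

forward_verdict lan iot new          # the phone app reaching the TV
forward_verdict iot lan new          # the TV trying to start a connection
forward_verdict iot lan established  # the TV replying to the phone
```

The three example calls print accept, drop, and accept. The real ruleset on the router can be listed with sudo nft list ruleset.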

Update DHCP Service

dhcpd4 = {
  enable = true;
  interfaces = [ "enp2s0" "lan" "iot" "guest" ];
  extraConfig = ''

    subnet 10.13.84.0 netmask 255.255.255.0 {
      option broadcast-address 10.13.84.255;
      option routers 10.13.84.1;
      option domain-name-servers ${publicDnsServer};
      option subnet-mask 255.255.255.0;
      interface lan;
      range 10.13.84.2 10.13.84.254;
    }

    subnet 10.13.93.0 netmask 255.255.255.0 {
      option broadcast-address 10.13.93.255;
      option routers 10.13.93.1;
      option domain-name-servers ${publicDnsServer};
      option subnet-mask 255.255.255.0;
      interface iot;
      range 10.13.93.2 10.13.93.254;
    }

    subnet 10.13.83.0 netmask 255.255.255.0 {
      option broadcast-address 10.13.83.255;
      option routers 10.13.83.1;
      option domain-name-servers ${publicDnsServer};
      option subnet-mask 255.255.255.0;
      interface guest;
      range 10.13.83.2 10.13.83.254;
    }

    subnet 192.168.0.0 netmask 255.255.255.0 {
      option broadcast-address 192.168.0.255;
      option routers 192.168.0.1;
      option domain-name-servers ${publicDnsServer};
      option subnet-mask 255.255.255.0;
      interface enp2s0;
      range 192.168.0.2 192.168.0.254;
    }
  '';
};

This is pretty simple. We just add a subnet for each interface, and the DHCP server knows what to do.

Update the router

We’ll just use our shell script to update the router.

./push_to_router.sh switch

After the nixos configuration changes take effect, I found I needed to disable and reenable my network adapter for it to get a new DHCP lease. After that’s done, our simple network looks like this.

Step three

Join a vlan

VLANs really only make sense when you use switches and access points that support them, and we’ll get there. But in the interest of keeping it very simple, we’re going to use the same direct connection from the laptop to the router. In macOS, I took the following steps to create a virtual network interface to join the iot vlan.

Open System Preferences and go to Network.
At the bottom of the interface list, click the ellipsis and choose Manage Virtual Interfaces.
From that dialog, click the (+) to add a new VLAN.
Select the current physical interface, and enter 93 for the tag.

I created virtual adapters for all three vlans. They all are able to connect simultaneously and get separate configurations over DHCP.

IMPORTANT: See the section on MTU issues below for a very important setting that is required when creating virtual vlan adapters.

Virtual VLAN Adapters

Let’s take a look at a capture of the DHCP discover broadcast and offer. Good ol' number 84 shows up as the vlan ID in the 802.1Q header.

No.     Time           Source                Destination           Protocol Length Info
    288 4.529580       0.0.0.0               255.255.255.255       DHCP     346    DHCP Discover - Transaction ID 0x2026330c

Frame 288: 346 bytes on wire (2768 bits), 346 bytes captured (2768 bits) on interface en7, id 0
Ethernet II, Src: Luxshare_b7:b9:d3 (3c:18:a0:b7:b9:d3), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
    Destination: Broadcast (ff:ff:ff:ff:ff:ff)
    Source: Luxshare_b7:b9:d3 (3c:18:a0:b7:b9:d3)
    Type: 802.1Q Virtual LAN (0x8100)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 84
    000. .... .... .... = Priority: Best Effort (default) (0)
    ...0 .... .... .... = DEI: Ineligible
    .... 0000 0101 0100 = ID: 84
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 0.0.0.0, Dst: 255.255.255.255
User Datagram Protocol, Src Port: 68, Dst Port: 67
Dynamic Host Configuration Protocol (Discover)

No.     Time           Source                Destination           Protocol Length Info
    289 4.530663       10.13.84.1            10.13.84.2            DHCP     346    DHCP Offer    - Transaction ID 0x2026330c

Frame 289: 346 bytes on wire (2768 bits), 346 bytes captured (2768 bits) on interface en7, id 0
Ethernet II, Src: Qotom_a0:4d:38 (20:7c:14:a0:4d:38), Dst: Luxshare_b7:b9:d3 (3c:18:a0:b7:b9:d3)
    Destination: Luxshare_b7:b9:d3 (3c:18:a0:b7:b9:d3)
    Source: Qotom_a0:4d:38 (20:7c:14:a0:4d:38)
    Type: 802.1Q Virtual LAN (0x8100)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 84
    000. .... .... .... = Priority: Best Effort (default) (0)
    ...0 .... .... .... = DEI: Ineligible
    .... 0000 0101 0100 = ID: 84
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.13.84.1, Dst: 10.13.84.2
User Datagram Protocol, Src Port: 67, Dst Port: 68
Dynamic Host Configuration Protocol (Offer)

And of course, now that we are connected to four different networks, the routing table is going to look a lot different.

bash-3.2$ netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags           Netif Expire
default            192.168.0.1        UGScg             en7       
default            10.13.84.1         UGScIg          vlan0       
default            link#27            UCSIg       bridge100      !
default            link#29            UCSIg       bridge101      !
default            10.13.83.1         UGScIg          vlan1       
default            10.13.93.1         UGScIg          vlan2       
10.13.83/24        link#32            UCS             vlan1      !
10.13.83.1/32      link#32            UCS             vlan1      !
10.13.83.1         20.7c.14.a0.4d.38  UHLWIir         vlan1   1146
10.13.83.2/32      link#32            UCS             vlan1      !
10.13.84/24        link#18            UCS             vlan0      !
10.13.84.1/32      link#18            UCS             vlan0      !
10.13.84.2/32      link#18            UCS             vlan0      !
10.13.93/24        link#33            UCS             vlan2      !
10.13.93.1/32      link#33            UCS             vlan2      !
10.13.93.1         20.7c.14.a0.4d.38  UHLWIir         vlan2   1146
10.13.93.2/32      link#33            UCS             vlan2      !
10.13.93.50        link#33            UHLWIi          vlan2      !
10.37.129/24       link#29            UC          bridge101      !
10.211.55/24       link#27            UC          bridge100      !
127                127.0.0.1          UCS               lo0       
127.0.0.1          127.0.0.1          UH                lo0       
169.254            link#30            UCS               en7      !
169.254            link#18            UCSI            vlan0      !
169.254            link#32            UCSI            vlan1      !
169.254            link#33            UCSI            vlan2      !
192.168.0          link#30            UCS               en7      !
192.168.0.1/32     link#30            UCS               en7      !
192.168.0.1        20:7c:14:a0:4d:38  UHLWIir           en7   1170
192.168.0.2/32     link#30            UCS               en7      !

Debugging with nftrace

As I mentioned in the firewall section above, I had some issues with the hook input chain rules and DHCP. Or at least I thought I did. When I first connected my Macbook to all the networks: enp2s0, lan, guest, and iot, the last two guest and iot failed to be assigned an IP address through DHCP. I looked at other people’s router configurations, and they were similarly blocking some vlans from communicating with local router processes, but still expecting DHCP to work.

I decided to use nftrace to investigate. There is a simple page on ruleset debugging on the netfilter wiki. After reading that, I added the following chain to my ip filter table, making sure to give the trace chain a lower priority than the input chain.

chain trace_chain {
  type filter hook prerouting priority -1;
  iifname { "iot" } nftrace set 1
}

I originally called this chain “trace” and I was not able to see any trace debugging. Only after finally running journalctl -u nftables.service -n100 did I see that trace is a reserved word, and I needed to rename the chain.

I switched nixos to the new configuration with tracing. I disabled the virtual iot adapter on the Macbook. Then I logged directly into the router and ran nft monitor trace to start viewing the trace messages. Finally, I re-enabled the virtual iot adapter on the Macbook, and captured the following trace messages. Each packet is given an id. The DHCP Discover packet has IP header ID 12745. We see that it’s accepted by the trace_chain chain and flows to the input chain, where it is dropped. The exact same thing happens for the DHCP Request packet, IP header ID 12746. Interestingly, I think the trace ID is supposed to be different for each packet, but nftrace seems to be getting confused here as both have trace ID 949555e6.

trace id 949555e6 ip filter trace_chain packet: iif "iot" ether saddr 3c:18:a0:b7:b9:d3 ether daddr ff:ff:ff:ff:ff:ff 
   vlan pcp 0 vlan dei 0 vlan id 93 ip saddr 0.0.0.0 ip daddr 255.255.255.255 ip dscp cs0 ip ecn not-ect ip ttl 255 
   ip id 12745 ip length 328 udp sport 68 udp dport 67 udp length 308 @th,64,96 310722286284201070451228672 
trace id 949555e6 ip filter trace_chain rule iifname "iot" meta nftrace set 1 (verdict continue)
trace id 949555e6 ip filter trace_chain verdict continue 
trace id 949555e6 ip filter trace_chain policy accept 
trace id 949555e6 ip filter input packet: iif "iot" ether saddr 3c:18:a0:b7:b9:d3 ether daddr ff:ff:ff:ff:ff:ff
   vlan pcp 0 vlan dei 0 vlan id 93 ip saddr 0.0.0.0 ip daddr 255.255.255.255 ip dscp cs0 ip ecn not-ect ip ttl 255 
   ip id 12745 ip length 328 udp sport 68 udp dport 67 udp length 308 @th,64,96 310722286284201070451228672 
trace id 949555e6 ip filter input verdict continue 
trace id 949555e6 ip filter input policy drop 
trace id 949555e6 ip filter trace_chain packet: iif "iot" ether saddr 3c:18:a0:b7:b9:d3 ether daddr ff:ff:ff:ff:ff:ff 
   vlan pcp 0 vlan dei 0 vlan id 93 ip saddr 0.0.0.0 ip daddr 255.255.255.255 ip dscp cs0 ip ecn not-ect ip ttl 255 
   ip id 12746 ip length 328 udp sport 68 udp dport 67 udp length 308 @th,64,96 310722286284201070451359744 
trace id 949555e6 ip filter trace_chain rule iifname "iot" meta nftrace set 1 (verdict continue)
trace id 949555e6 ip filter trace_chain verdict continue 
trace id 949555e6 ip filter trace_chain policy accept 
trace id 949555e6 ip filter input packet: iif "iot" ether saddr 3c:18:a0:b7:b9:d3 ether daddr ff:ff:ff:ff:ff:ff 
   vlan pcp 0 vlan dei 0 vlan id 93 ip saddr 0.0.0.0 ip daddr 255.255.255.255 ip dscp cs0 ip ecn not-ect ip ttl 255 
   ip id 12746 ip length 328 udp sport 68 udp dport 67 udp length 308 @th,64,96 310722286284201070451359744 
trace id 949555e6 ip filter input verdict continue 
trace id 949555e6 ip filter input policy drop
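The trace output is noisy, so it can help to keep only the lines that report a chain verdict or policy. This is a small sketch: the function name trace_verdicts is made up, and the field positions assume the "trace id ... ip filter CHAIN ..." layout shown above.

```shell
#!/usr/bin/env bash

# Read `nft monitor trace` text on stdin and print, for each verdict or
# policy line: trace id, chain name, line kind, and result.
trace_verdicts() {
  awk '$1 == "trace" && $2 == "id" && ($7 == "verdict" || $7 == "policy") {
    print $3, $6, $7, $8
  }'
}

# Example on the router:
#   sudo nft monitor trace | trace_verdicts
```

Run against the trace above, the last line for each packet is the input chain's policy drop.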

A lot of people online are confused about this when setting up their own router. It turns out that DHCP is special and bypasses the firewall: the DHCP server reads frames through a raw packet socket, which sees them before the netfilter input hook, so the drop in the input chain never reaches it. This is actually great! It’s fewer rules that I have to write. Devices on iot and guest are allowed to forward packets to the internet, but not allowed to talk to local services, except DHCP. Remember that DHCP works hand-in-hand with ARP? ARP packets are also getting through, so devices and the router can confirm who has which IP addresses. Our firewall rules don’t prevent this, because ARP is a different address family in nftables, and we don’t have any rules in place for it.

VLANs and MTU issues

As we just saw, the DHCP messages are working, and we’re getting assigned different IP addresses for each virtual interface. We should be ready to browse the internet and such now. Unfortunately the VLAN connections are likely worthless for most usages without a small configuration change. I found this out the hard way. I was trying to scp down the configuration file from the router, since I had made changes to it, and the transfer kept hanging. It took me a while to figure out the issue. Even if I tried to cat the file while in an SSH session, it hung. Eventually someone pointed me in the right direction.

When we enable VLAN tagging, a 32-bit tag is inserted between the source MAC address and the EtherType fields.

VLAN tag in packet

The way I understand the problem is that the virtual interface on macOS doesn’t really know that it’s in a VLAN. It behaves as if it’s just another interface, and when it passes its packets up to the physical interface, a vlan tag is added to the ethernet frame. When this happens, the ethernet frame can get larger than 1518 bytes. 1518 bytes is the standard maximum frame size for ethernet. It’s made up of:

A 14 byte ethernet header.
Up to 1500 bytes of payload: the IP header, protocol headers (TCP, for example), and the protocol payload.
A four byte checksum.

In order to fix this we need to set the MTU (maximum transmission unit) on the virtual device to something smaller. It’s going to default to 1500 bytes, and we can likely just set it to 1496, subtracting the four bytes for the vlan tag, and everything will work. But, we can use ping to test out how large packets have to get to start failing.

Using the following script, we can increment the ping payload size until the ping fails. The script will report the largest successful ping.

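#!/usr/bin/env bash

# Destination to ping; defaults to yahoo.com when no argument is given.
dest="${1:-yahoo.com}"

# macOS ping flags used below: -D sets the Don't Fragment bit, -o exits
# after one reply, -W is the wait time in milliseconds.

# Coarse pass: step the payload size by 10 bytes until a ping fails.
max=0
for s in $(seq 1300 10 1600); do
  if ping -D -c 2 -W 2000 -o -s "$s" "$dest"; then
    max=$s
  else
    break
  fi
done

# Fine pass: step by 1 byte, starting from the last coarse success.
lastsuccess=0
for s in $(seq "$max" 1 "$((max + 10))"); do
  if ping -D -c 2 -W 2000 -o -s "$s" "$dest"; then
    lastsuccess=$s
  else
    break
  fi
done

echo "largest successful payload: ${lastsuccess}"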

Using a non-vlan connection, the script reports the largest successful payload as 1472 bytes. Let’s take a look at the Wireshark capture for that frame.

Frame 174: 1514 bytes on wire (12112 bits), 1514 bytes captured (12112 bits) on interface en0, id 0
Ethernet II, Src: Apple_09:8a:57 (f0:2f:4b:09:8a:57), Dst: Ubiquiti_7f:47:30 (74:83:c2:7f:47:30)
Internet Protocol Version 4, Src: 192.168.1.190, Dst: 74.6.143.25
Internet Control Message Protocol
    Type: 8 (Echo (ping) request)
    Code: 0
    Checksum: 0x24da [correct]
    [Checksum Status: Good]
    Identifier (BE): 53437 (0xd0bd)
    Identifier (LE): 48592 (0xbdd0)
    Sequence Number (BE): 0 (0x0000)
    Sequence Number (LE): 0 (0x0000)
    [Response frame: 175]
    Timestamp from icmp data: Feb 23, 2022 15:50:37.297181000 EST
    [Timestamp from icmp data (relative): 0.000137000 seconds]
    Data (1464 bytes)

So the frame is listed as 1514 bytes. Adding the 4 byte checksum, that gives us the maximum of 1518 bytes. Here is a breakdown.

14 byte ethernet header
20 byte IP header.
8 byte ICMP header
8 byte timestamp + 1464 bytes of extra data for a total of the reported 1472 byte ping payload.
4 byte checksum.

Using a vlan connection, the script reports the largest successful payload as 1468 bytes. Let’s take a look at the Wireshark capture for that frame.

*Note, Wireshark lets you choose which adapter to capture traffic on. I am choosing the physical adapter for both captures. If I chose the virtual adapter, I would not see the vlan tags.

Frame 4334: 1514 bytes on wire (12112 bits), 1514 bytes captured (12112 bits) on interface en7, id 0
Ethernet II, Src: Luxshare_b7:b9:d3 (3c:18:a0:b7:b9:d3), Dst: Qotom_a0:4d:38 (20:7c:14:a0:4d:38)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 84
Internet Protocol Version 4, Src: 10.13.84.2, Dst: 74.6.143.26
Internet Control Message Protocol
    Type: 8 (Echo (ping) request)
    Code: 0
    Checksum: 0x1c78 [correct]
    [Checksum Status: Good]
    Identifier (BE): 65214 (0xfebe)
    Identifier (LE): 48894 (0xbefe)
    Sequence Number (BE): 0 (0x0000)
    Sequence Number (LE): 0 (0x0000)
    [Response frame: 4335]
    Timestamp from icmp data: Feb 23, 2022 16:38:31.250754000 EST
    [Timestamp from icmp data (relative): 0.000103000 seconds]
    Data (1460 bytes)

What to notice here is that we are reaching the largest successful payload size four bytes sooner (1468 vs 1472). If we don’t set the MTU to be four bytes smaller, the network adapter will fail to fragment large packets at the right boundary and our connection will be almost useless.
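The byte counting from the two captures can be written down as a couple of tiny functions. This is just a sketch of the arithmetic, with made-up names (max_ping_payload, frame_size), and it assumes a 20 byte IPv4 header with no options and an 8 byte ICMP header.

```shell
#!/usr/bin/env bash

# Largest ICMP echo payload that fits in one unfragmented IPv4 packet.
# Usage: max_ping_payload MTU
max_ping_payload() {
  local mtu=$1 ip_header=20 icmp_header=8
  echo $(( mtu - ip_header - icmp_header ))
}

# Size of the ethernet frame on the wire, including the 4 byte checksum.
# Usage: frame_size PING_PAYLOAD [VLAN_TAG_BYTES]
frame_size() {
  local payload=$1 vlan_tag=${2:-0}
  local eth_header=14 ip_header=20 icmp_header=8 checksum=4
  echo $(( eth_header + vlan_tag + ip_header + icmp_header + payload + checksum ))
}

echo "non-vlan, MTU 1500: payload $(max_ping_payload 1500), frame $(frame_size 1472)"
echo "vlan,     MTU 1496: payload $(max_ping_payload 1496), frame $(frame_size 1468 4)"
echo "vlan,     MTU 1500: payload $(max_ping_payload 1500), frame $(frame_size 1472 4)"
```

The first two cases both come out to a 1518 byte frame. The third, a tagged frame with the MTU left at 1500, comes out to 1522 bytes, four over the limit.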

The MTU setting for the virtual adapter is under the hardware tab.

Macos virtual adapter MTU setting

The MTU could be made smaller, but not larger than 1496. With the MTU reduced, websites are loading properly, and I can SCP more than tiny files.

Adding Switches and APs to Create a Home Network

Check out part two, Adding Switches and APs to Create a Home Network