OPNsense and PPPoE with high availability: tricky, but doable

Why

Recently, my fiber-to-the-home (FTTH) connection changed from a slightly easier DHCP-based setup to one that requires an additional PPPoE authentication step to connect to the Internet. The advantage is that, for IPv4, I no longer need double NAT: my own OPNsense firewall now gets the IPv4 WAN address directly (dynamically assigned through PPPoE, but statically allocated at the ISP end). Previously, an ISP router sat in between and handed out non-routable IPv4 addresses from a transfer net via DHCP.

However, the disadvantage is that now only one of the two firewalls in my OPNsense cluster can get an actual WAN address at the same time (the one statically allocated by my ISP). That is, only one of the two can successfully connect its PPPoE WAN interface at any given time.

How

Without real dynamic routing, e.g. with BGP towards the ISP (which none of the ISPs supports for non-business accounts), the best solution is to run the firewalls in an active-passive cluster configuration in which only the active/master one is in charge of routing/NAT and holds the single WAN IPv4 address. OPNsense supports this configuration easily with CARP. As I already had such a CARP setup for all my local network interfaces, and previously for the WAN transfer net towards the ISP router, the only missing link was to tie PPPoE dial-up to the CARP state. As of OPNsense version 24.1, the necessary scripts (particularly /usr/local/etc/rc.syshook.d/carp/20-ppp) have been integrated upstream, but the setup is slightly tricky.

For additional context, my new ISP (Infotech) uses the BBOÖ FTTH network, which terminates at a Huawei EG8010Hv6-10 ONT offering a single 1 Gbps Ethernet interface towards the customer. The admin account is locked, but a separate user called “Epuser” (with, as far as I can tell, a unique password for each device) can be used to query some state info. The default local (transfer net) IP is 192.168.18.1/24, which I didn’t change.

To connect both OPNsense instances to the ONT, I created a separate VLAN on my Mikrotik CRS326-24G-2S+ core switch to bridge the APU4d4 hardware instance, the VM instance (running on Proxmox), and the ONT.

After some trial and error, this setup works for me:

  1. Interfaces -> Assignments: Assign the actual Ethernet interface connecting to the ONT (e.g., igb0) to a new interface with a name different from the wan interface. I called it WAN_Port, and internally on my system the assigned port name is opt12 (but that assigned name is not important). Do note the physical Ethernet interface name, though, as it is needed again in step 6.
  2. Interfaces -> WAN_Port: Assign a Static IPv4 address. I use 192.168.18.2/24 for the master, and 192.168.18.3/24 for the backup firewall.
  3. Interfaces -> Virtual IPs -> Settings: Create a CARP IP address on that newly assigned interface (WAN_Port) with the same VHID group and password for both firewalls. I use 192.168.18.10/24 as virtual IP. Note that the specific interface name and IP address are unimportant and not used as part of the actual PPPoE setup. For the HA integration scripts to work, the Ethernet interface simply needs to have a CARP virtual IP address working and switching over between the firewalls.
  4. System -> High Availability -> Settings: Turn on Disconnect dialup interfaces.
  5. Interfaces -> WAN:
    • Switch IPv4 Configuration Type to PPPoE
    • Switch IPv6 Configuration Type to DHCPv6 for BBOÖ/Infotech
    • Under section PPPoE configuration, set Username and Password appropriately
    • Under section DHCPv6 client configuration, select Use IPv4 connectivity and set Prefix delegation size appropriately (60 in my case)
    • Save those settings, then click the link “Click here for additional PPPoE configuration options.” under PPPoE configuration
  6. On the Interfaces -> Point-to-Point -> Devices screen that this brings up, set Link interface(s) to the actual Ethernet interface connecting to the ONT (in my case, igb0).
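
The addressing used in steps 2 and 3 can be sanity-checked with a few lines of Python. This is only a minimal sketch, not part of OPNsense: the addresses are the ones from this post, while the names ONT_NET, PLAN and check_plan are made up for illustration. It checks that every address sits inside the ONT transfer net, does not collide with the ONT itself, and is not used twice.

```python
import ipaddress

# Addresses taken from the steps above; all names here are illustrative.
ONT_NET = ipaddress.ip_network("192.168.18.0/24")
ONT_IP = ipaddress.ip_address("192.168.18.1")
PLAN = {
    "master WAN_Port": "192.168.18.2",
    "backup WAN_Port": "192.168.18.3",
    "CARP virtual IP": "192.168.18.10",
}


def check_plan(plan, net=ONT_NET, reserved=(ONT_IP,)):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    seen = {}  # address -> name of the first entry that used it
    for name, text in plan.items():
        addr = ipaddress.ip_address(text)
        if addr not in net:
            problems.append(f"{name}: {addr} is outside {net}")
        if addr in reserved:
            problems.append(f"{name}: {addr} collides with the ONT")
        if addr in seen:
            problems.append(f"{name}: {addr} already used by {seen[addr]}")
        seen.setdefault(addr, name)
    return problems


print(check_plan(PLAN))
```

An empty list means the three addresses are consistent with the ONT's default 192.168.18.1/24. None of these addresses take part in the PPPoE session itself; they only give CARP something to run on.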

Notes

Note 1: With OPNsense version 24.1.5_3, at the time of this writing, the script /usr/local/etc/rc.syshook.d/carp/20-ppp only stops the PPPoE connection when the associated CARP interface goes into “BACKUP” mode, but not when manually setting Temporarily Disable CARP under Interfaces -> Virtual IPs -> Status. This makes debugging and testing harder. I have therefore patched both of my firewalls to also disable PPPoE when CARP is in “INIT” mode (which is the case when temporarily disabling CARP):

--- /tmp/20-ppp.orig    2024-04-18 12:53:43.767650142 +0200
+++ /tmp/20-ppp.fixed   2024-04-18 12:55:51.638549481 +0200
@@ -36,7 +36,7 @@
 
 $a_hasync = &config_read_array('hasync');
 if (!empty($a_hasync['disconnectppps'])) {
-    if ($type != 'MASTER' && $type != 'BACKUP') {
+    if ($type != 'MASTER' && $type != 'BACKUP' && $type != 'INIT') {
         log_msg("Carp '$type' event unknown from source '{$subsystem}'");
         exit(1);
     } elseif (!strstr($subsystem, '@')) {
@@ -51,7 +51,7 @@
             foreach($config['interfaces'] as $ifkey => $interface) {
                 if ($ppp['if'] == $interface['if']) {
                     log_msg("{$iface} is connected to ppp interface {$ifkey} set new status {$type}");
-                    if ($type == 'BACKUP') {
+                    if ($type == 'BACKUP' || $type == 'INIT') {
                         interface_suspend($ifkey);
                     } else {
                         interface_ppps_configure($ifkey);
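
To make the effect of the patch easier to see, here is a minimal Python model of the decision it changes. It is a sketch only: the function name ppp_action and the patched flag are invented for illustration, and the real script additionally checks the subsystem string and matches the CARP interface against the configured PPP devices, which is left out here.

```python
def ppp_action(carp_event, patched=True):
    """Return 'connect', 'suspend' or 'reject' for a CARP event type.

    Simplified model of the two conditions touched by the patch above.
    """
    extra = {"INIT"} if patched else set()

    # First hunk: event types the script accepts at all.
    if carp_event not in {"MASTER", "BACKUP"} | extra:
        return "reject"  # the script logs "event unknown" and exits

    # Second hunk: event types that suspend the PPPoE interface.
    if carp_event in {"BACKUP"} | extra:
        return "suspend"  # corresponds to interface_suspend()
    return "connect"  # corresponds to interface_ppps_configure()


for event in ("MASTER", "BACKUP", "INIT"):
    print(event, ppp_action(event, patched=False), ppp_action(event))
```

In this model, the unpatched script rejects “INIT” as an unknown event, while the patched one treats it like “BACKUP” and suspends the PPPoE interface.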

Note 2: The overall solution is unstable at this point, mostly because PPPoE connection setup after an active/passive switch-over often fails with timeouts (no replies from the OLT headend or PPPoE server). This seems to be caused by a MAC address filter on the OLT side that allows only 2 MAC addresses behind the ONT (a configuration applied by Energie-AG as part of the BBOÖ fiber network), and not by the CARP-based switching on OPNsense. I am still trying to work around that limitation, potentially with bridge/switch filter rules on my Mikrotik switch that keep, for example, the virtual CARP MAC addresses or the Mikrotik switch MAC addresses from leaking towards the ONT/OLT and filling the MAC tables there. If successful, I will update this post with details.
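
A rough count shows why a limit of 2 MAC addresses is tight for this setup. The sketch below is illustrative only: the firewall MAC addresses and the VHID are hypothetical placeholders, and whether the OLT actually counts all of these addresses is exactly the open question described above. The only fixed fact it relies on is that CARP uses the virtual MAC address 00:00:5e:00:01:xx, where xx is the VHID in hex.

```python
MAC_LIMIT = 2  # limit reported in the note above


def carp_virtual_mac(vhid):
    """Return the virtual MAC address CARP uses for a given VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    return f"00:00:5e:00:01:{vhid:02x}"


def macs_seen(firewall_macs, vhid, other_macs=()):
    """Collect the distinct source MACs that may show up behind the ONT."""
    seen = {mac.lower() for mac in firewall_macs}
    seen.add(carp_virtual_mac(vhid))
    seen.update(mac.lower() for mac in other_macs)
    return seen


if __name__ == "__main__":
    # Hypothetical, locally administered MACs for the two firewalls.
    seen = macs_seen(["02:00:00:00:00:01", "02:00:00:00:00:02"], vhid=1)
    print(sorted(seen))
    print("over limit:", len(seen) > MAC_LIMIT)
```

The two firewall interfaces plus the CARP virtual MAC already make three distinct addresses, before any address of the switch itself is counted, which is why filtering on the switch looks like a plausible workaround.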

Conclusions

Active/passive OPNsense firewall clusters are possible with PPPoE upstream WAN interfaces, but getting them to run is tricky and the official OPNsense documentation doesn’t mention this option at all. Finding the right combination of options took reading a lot of forum posts and eventually adding log messages to the CARP ppp script to find out how to set the Link interface(s) parameter to make it work.

René Mayrhofer
Professor of Networks and Security & Director of Engineering at Android Platform Security; pacifist, privacy fan, recovering hypocrite; generally here to question and learn