@spali
Last active July 10, 2025 21:42
Disable WAN Interface on CARP Backup
#!/usr/local/bin/php
<?php
require_once("config.inc");
require_once("interfaces.inc");
require_once("util.inc");

// CARP hook arguments: $argv[1] is the event source, $argv[2] is the new state.
$subsystem = !empty($argv[1]) ? $argv[1] : '';
$type = !empty($argv[2]) ? $argv[2] : '';

// Only MASTER and BACKUP transitions are handled.
if ($type != 'MASTER' && $type != 'BACKUP') {
    log_error("Carp '$type' event unknown from source '{$subsystem}'");
    exit(1);
}

// A valid event source must contain an '@'.
if (!strstr($subsystem, '@')) {
    log_error("Carp '$type' event triggered from wrong source '{$subsystem}'");
    exit(1);
}

// Logical name of the interface to toggle.
$ifkey = 'wan';

if ($type === "MASTER") {
    // Becoming master: enable the interface, save the config, and apply it.
    log_error("enable interface '$ifkey' due CARP event '$type'");
    $config['interfaces'][$ifkey]['enable'] = '1';
    write_config("enable interface '$ifkey' due CARP event '$type'", false);
    interface_configure(false, $ifkey, false, false);
} else {
    // Becoming backup: disable the interface, save the config, and apply it.
    log_error("disable interface '$ifkey' due CARP event '$type'");
    unset($config['interfaces'][$ifkey]['enable']);
    write_config("disable interface '$ifkey' due CARP event '$type'", false);
    interface_configure(false, $ifkey, false, false);
}

@jwbryan

jwbryan commented Jan 3, 2025

Thank you for your efforts on this. I've got it set up and working when failing over. However, when the other device comes back online, I'm experiencing an issue. At that point, both firewalls are active and - since I duplicated the MAC address - competing for the IP address from the ISP. Has anyone else experienced this issue? How have you worked around it?

@vc1cv1

vc1cv1 commented Jan 3, 2025

Which revision of the code are you using? Normally, the backup's interface should remain disabled unless the CARP status changes.

Also, under HA -> Settings -> "Disable preempt": do you have that checked or unchecked? Mine is unchecked; maybe you have it checked.

"When this device is configured as CARP master it will try to switch to master when powering up, this option will keep this one slave if there already is a master on the network. A reboot is required to take effect."

@jwbryan

jwbryan commented Jan 3, 2025

I'm using the one from above; I think you posted it "last week". I did update it to handle my second ISP (I have two ISPs, but neither provides a second IP). Preempt is disabled.

I THINK that even though it comes up as a backup, it still tries to grab an IP address at boot because CARP has not been initialized yet. I see increased packet loss on the master's WAN links right as the other (backup) system boots, at the points in the boot where it reports configuring the WAN interfaces. This makes sense: the backup has no awareness of CARP on those interfaces yet (they are not configured for CARP), so as it brings them up it tries to get an IP with the duplicated MAC. I may spend some time in the other RC directories to see if there is a logical place to down the WAN interfaces until CARP is up and the system's role can be determined. I wasn't sure whether others had seen the same issue and, if so, what they did to work around it.

@lcasale

lcasale commented Feb 18, 2025

Has anyone tried this on 25.x yet? Either I'm being very dumb or there's a bug where additional scripts in /usr/local/etc/rc.syshook.d/carp/ are not executed. If I move the code to 20-openvpn, it works. If I copy all the code from 20-openvpn into 10-wancarp, it does not execute. Permissions should be correct.


Am I missing something obvious?

@toddgonzo74

Been on 25.x for a couple of weeks. Took the plunge after taking a snapshot of both firewalls. Zero issues on this end; the scripts are working as intended.

@magomez96

magomez96 commented Jun 18, 2025

I'm also seeing the same issue on 25.1.8_1. Did you ever find a solution?

@magomez96

Got this fixed. The #! line has to be the first line of the script, and I had a comment above it.
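
A quick way to check for the problem described above is to look at the first two bytes of the hook script. This is only a minimal sketch: `has_shebang_first` is a hypothetical helper name, not part of OPNsense, and the file name in the usage comment is the one mentioned earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file's first two bytes are "#!".
has_shebang_first() {
    # head -c 2 reads exactly the first two bytes of the file
    [ "$(head -c 2 "$1" 2>/dev/null)" = '#!' ]
}

# Usage (path assumed from the thread above):
#   has_shebang_first /usr/local/etc/rc.syshook.d/carp/10-wancarp \
#       && echo "shebang is on line 1" \
#       || echo "something precedes the shebang"
```

A comment or blank line above the `#!` line makes the check fail, which matches the symptom reported here.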

@lcasale

lcasale commented Jun 18, 2025

Glad you were able to fix it. My problem was an encoding issue when uploading through scp. Once I created and edited the files directly on the router, things worked as expected.
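
The comment above does not say what the encoding problem was. As an assumption, two common culprits when a script is edited on another machine and then copied over are Windows (CRLF) line endings and a UTF-8 byte order mark, either of which can break the `#!` line. The sketch below checks for both; `has_bom` and `has_crlf` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers; the actual encoding problem in the thread is not stated.

# Succeed if the file starts with a UTF-8 byte order mark (bytes EF BB BF).
has_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = 'efbbbf' ]
}

# Succeed if any line ends with a carriage return (CRLF line endings).
has_crlf() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# Usage:
#   has_bom  10-wancarp && echo "file starts with a BOM"
#   has_crlf 10-wancarp && echo "file has CRLF line endings"
```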

@lavacano

lavacano commented Jul 10, 2025

I was vibing this. Haven't tried it yet.

#!/usr/local/bin/php
<?php
/*
 * OPNsense HA Failover Script for Single Static WAN IP
 *
 * Manages a single static WAN IP in a High Availability cluster,
 * ensuring the backup node retains internet access via the master and
 * that stateful connections fail over cleanly.
 *
 * v2.1 - 2025-07-15
 * - Integrated user feedback for robust multi-VIP support.
 * - Added lock file to prevent race conditions.
 * - Replaced raw exec() with mwexecf() for security.
 * - Replaced manual route manipulation with system_default_route() for robustness.
 * - Added state killing on BACKUP event for seamless failover.
 * - Implemented verbose, configurable logging.
 * - Added error handling and cleanup routines.
 */

// #################### CONFIGURATION ####################
// The logical interface name for your WAN (e.g., 'wan').
$ifkey = 'wan';
// The CARP VIP on your LAN for gateway redirection.
$lan_vip_v4 = '10.0.1.1';
$lan_vip_v6 = '2006::1';
// Set to 'true' for detailed logging in System -> Log Files -> General.
$verbose_logging = true;
// Path for the lock file to prevent concurrent execution.
$lock_file = '/tmp/wan_failover.lock';
// #######################################################

// Required OPNsense libraries
require_once("config.inc");
require_once("interfaces.inc");
require_once("util.inc");
require_once("system.inc");

// --- Helper Functions ---

/**
 * Custom logger for this script.
 * @param string $message The message to log.
 */
function log_failover($message)
{
    global $verbose_logging;
    if ($verbose_logging) {
        log_msg("WAN Failover: ". $message, LOG_NOTICE);
    }
}

// --- Main Execution ---

// Prevent concurrent execution
$lock_handle = fopen($lock_file, 'w');
if ($lock_handle === false || !flock($lock_handle, LOCK_EX | LOCK_NB)) {
    log_msg("WAN Failover: Script is already running. Exiting to prevent race condition.", LOG_WARNING);
    exit(1);
}

// Remove the lock file on exit. Registered only after the lock is held, so an
// instance that failed to acquire it never deletes another instance's lock file.
register_shutdown_function(function () use ($lock_file) {
    if (file_exists($lock_file)) {
        unlink($lock_file);
    }
});

// Read CARP event arguments
$subsystem = !empty($argv[1]) ? $argv[1] : '';
$type = !empty($argv[2]) ? $argv[2] : '';

// Exit if the event type isn't one we care about (only MASTER and BACKUP are handled below).
if (!in_array($type, ['MASTER', 'BACKUP'])) {
    log_failover("Ignoring event type '{$type}' on '{$subsystem}'.");
    exit(0);
}

// Exit if the event source isn't a CARP VIP
if (!strstr($subsystem, '@')) {
    log_msg("WAN Failover: Script triggered from non-CARP source '{$subsystem}'. Ignoring.", LOG_WARNING);
    exit(1);
}

global $config;

if ($type === "MASTER") {
    /**********************
     * BECOME MASTER NODE *
     **********************/
    log_msg("WAN Failover: CARP MASTER event on {$subsystem}. Enabling WAN interface.", LOG_NOTICE);

    // Set WAN interface to be enabled with its static IP config
    log_failover("Setting interface '{$ifkey}' to enabled and ipaddr 'static'.");
    $config['interfaces'][$ifkey]['enable'] = true;
    $config['interfaces'][$ifkey]['ipaddr'] = 'static';
    write_config("WAN Failover: Set {$ifkey} to enabled (MASTER)", false);

    // Apply the interface configuration. This brings the interface up, assigns the static IP,
    // and triggers a routing recalculation to set the default gateway to the ISP.
    log_failover("Applying interface configuration for '{$ifkey}'.");
    interface_configure(false, $ifkey, true, false);

    // Explicitly reconfigure routing to ensure a clean state.
    log_failover("Triggering system routing configuration.");
    system_routing_configure();

} else { // Handles "BACKUP" state
    /**********************
     * BECOME BACKUP NODE *
     **********************/
    log_msg("WAN Failover: CARP BACKUP event on {$subsystem}. Disabling WAN IP and rerouting traffic.", LOG_NOTICE);

    // This is the critical step for seamless failover. Kill all firewall states
    // that are associated with traffic going through the WAN interface. This forces
    // clients to re-establish their connections through the new master.
    log_failover("Killing states on interface '{$ifkey}' to ensure clean failover.");
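    // pfctl expects the device name, not the logical interface key, so resolve it first.
    mwexecf('/sbin/pfctl -i %s -F states', [get_real_interface($ifkey)]);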

    // Set WAN IPv4 to "none" to release the static IP but keep the interface link up.
    log_failover("Setting interface '{$ifkey}' ipaddr to 'none'.");
    $config['interfaces'][$ifkey]['ipaddr'] = 'none';
    unset($config['interfaces'][$ifkey]['enable']);
    write_config("WAN Failover: Set {$ifkey} IP to none (BACKUP)", false);

    // Apply the interface configuration without a full reload to avoid routing conflicts.
    log_failover("Applying light interface configuration for '{$ifkey}'.");
    interface_configure(false, $ifkey, false, false);

    // Find the real LAN interface to use for the gateway by searching all VIPs.
    $lan_if = null;
    foreach ($config['virtualip']['vip'] as $vip) {
        if (isset($vip['subnet']) && $vip['subnet'] == $lan_vip_v4) {
            $lan_if = $vip['interface'];
            break;
        }
    }

    if ($lan_if) {
        $real_lan_if = get_real_interface($lan_if);
        log_failover("Rerouting default gateways through LAN VIPs on interface '{$real_lan_if}'.");

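        // NOTE: the trailing commas in the two system_default_route() calls below suggest
        // an argument was lost when this was posted. Check the function's signature in
        // system.inc before relying on these calls.

        // Reroute IPv4 default gateway
        $gw_v4 = ['gateway' => $lan_vip_v4, 'if' => $real_lan_if];
        system_default_route($gw_v4,);

        // Reroute IPv6 default gateway
        $gw_v6 = ['gateway' => $lan_vip_v6, 'if' => $real_lan_if];
        system_default_route($gw_v6,);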
    } else {
        log_msg("WAN Failover: Could not find LAN interface for VIP {$lan_vip_v4}. Cannot set backup gateway.", LOG_ERR);
    }
}

log_failover("Script finished for event '{$type}'.");
exit(0);
?>

Deployment and Verification
The following steps should be performed on both nodes of the HA cluster:

  • Placement: Place the refactored script, named 10-wan-failover.php, in the directory /usr/local/etc/rc.syshook.d/carp/.
  • Configuration: Edit the script's configuration section to match your environment's WAN interface key ($ifkey) and LAN CARP VIP addresses ($lan_vip_v4, $lan_vip_v6).
  • Permissions: Set the execute permission on the script file: chmod +x /usr/local/etc/rc.syshook.d/carp/10-wan-failover.php.
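
The placement and permission steps above can be sketched as a small shell function. This is an illustration only: `install_hook` and the `HOOK_DIR` variable are hypothetical names, and the default directory is simply the path given in the steps. Overriding `HOOK_DIR` lets you try it against a scratch directory first.

```shell
#!/bin/sh
# Sketch: copy a hook script into the syshook directory and mark it executable.
# HOOK_DIR defaults to the path from the steps above; override it to experiment safely.
HOOK_DIR="${HOOK_DIR:-/usr/local/etc/rc.syshook.d/carp}"

install_hook() {
    src=$1
    dest="$HOOK_DIR/$(basename "$src")"
    cp "$src" "$dest" || return 1
    chmod +x "$dest" || return 1
    # Succeed only if the installed copy really is executable
    [ -x "$dest" ]
}

# Usage on each node:
#   install_hook ./10-wan-failover.php && echo "installed"
```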

To test the failover functionality:

  • Navigate to Interfaces -> Virtual IPs -> Status on the current MASTER node.
  • Click the "Enter Persistent CARP Maintenance Mode" button. This will force the node into a permanent BACKUP state and trigger a failover.
  • Observe the system logs (System -> Log Files -> General) on the BACKUP node. You should see log entries from the "WAN Failover" script indicating the transition to MASTER.
  • On the new MASTER node, verify the routing table using the shell command netstat -rn. The default route should now go out the physical WAN interface to your ISP's gateway.
  • On the old MASTER node (now in maintenance mode), verify its routing table. The default route should now point to your LAN CARP VIP.
  • From a client machine on the LAN, test outbound connectivity (e.g., browse a website, run a continuous ping). The transition should be nearly seamless, though new connections may have a brief delay.
  • To test failback, click "Leave Persistent CARP Maintenance Mode" on the original MASTER node. The system should revert to its original state.
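
For the routing checks in the steps above, it can help to pull just the default gateways out of the `netstat -rn` output. A minimal sketch, assuming the usual column layout (destination first, gateway second); `default_gateways` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the gateway column of every "default" row read from stdin.
# Assumes the usual netstat -rn layout: Destination, Gateway, Flags, Netif, ...
default_gateways() {
    awk '$1 == "default" { print $2 }'
}

# Usage on a node:
#   netstat -rn | default_gateways
```

On the node in BACKUP state, the printed gateway should be the LAN CARP VIP configured in the script; on the MASTER it should be the ISP's gateway.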
