High Availability Settings

To configure High Availability settings, navigate to the following page in the Panzura CloudFS WebUI:

Configuration > High Availability

Panzura Node HA Solutions

The following HA solutions are supported:

  • HA Local: An active node is protected by a dedicated, passive standby. When the active node fails, the standby assumes its identity, takes ownership of the file system, and takes over node operations. The takeover can be automatic or manual. This model is similar to the methods used by legacy enterprise storage products. The following HA Local options are supported:
      • Local: The active and standby nodes have different hostnames and IP addresses.
      • Local with shared address: The active node and passive standby have an additional shared hostname and IP address, which simplifies the takeover process. A shared address is required if you are using Auto Failover. (The shared hostname can be at most 15 characters.)
  • HA Global: One or more nodes are protected by one or more shared standbys, which can be separated geographically from the nodes they protect.

Sample HA Deployment

The following figure shows a Panzura deployment with three working sites—Los Angeles, London, and Paris—and two sites provisioned for HA—Phoenix and Amsterdam. A Panzura node is physically deployed at each site. Users at the three working sites connect to their local node, have a complete view of the shared file system, and experience LAN access speeds to the data in the global file system.

The figure shows HA options deployed as follows:

  • HA-Global: The node in Amsterdam protects the subordinate nodes in London and Paris, as well as the Master node in Los Angeles.
  • HA-Local: The node in Phoenix is dedicated to protecting the Master node in Los Angeles.

Auto Failover

HA Local can be configured for Auto Failover. Auto Failover enables an active-standby pair of CloudFS nodes to perform a failover automatically, without administrator intervention.

In an Auto Failover configuration, the Active and Standby nodes require a shared virtual IP (VIP) address and regularly exchange health and status information in two ways:

  • Directly, over a peer-to-peer (SSH) connection
  • Indirectly, by posting status information to the cloud in state files; the status of both nodes is available to both the Active and the Standby node

Note: Most public clouds do not support VIPs. As a result, Auto Failover is not supported for in-cloud deployments.

Auto Failover Scenarios

There are three scenarios in which Auto Failover can occur:

1. Active node loses communication to both the cloud object store and to the Standby node. This is determined by the Active node when the following two conditions occur:

  • Communication to the cloud object store has failed for 160 consecutive attempts (download or upload) over a period of ten minutes or more, AND
  • Communication to the Standby node has failed for 10 minutes. (State information is exchanged with the Standby every 30 seconds, so twenty (20) consecutive failures meet the 10-minute threshold.)

If both conditions are met, the Active node changes its own state to “Standby” and stops accepting user connections, to avoid the possibility of a split-brain scenario.

The Standby node will monitor communications from the Active node and initiate a takeover process to become the Active node when:

  • Communication from the current Active node has failed for 10 minutes (state information is exchanged every 30 seconds, so twenty (20) consecutive failures meet the 10-minute threshold), AND
  • The Active node has not updated its state file in the cloud for at least 10 minutes.

In this case, Panzura CloudFS’s architecture ensures that the Standby node assumes the role of the Active node.
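The scenario 1 thresholds above can be sketched as two decision functions, one per node. This is a minimal illustration built from the intervals and counts stated in the text; the function and variable names are assumptions, not CloudFS internals.

```python
# Hypothetical sketch of the scenario 1 failover decisions.
# The 30-second interval, 160-attempt count, and 10-minute (600 s)
# thresholds come from the text; everything else is illustrative.

PEER_INTERVAL_S = 30           # state exchanged with the peer every 30 s
PEER_FAILURES_NEEDED = 20      # 20 consecutive failures = 10 minutes
CLOUD_FAILURES_NEEDED = 160    # consecutive cloud upload/download failures
CLOUD_OUTAGE_S = 600           # cloud unreachable for 10 minutes or more

def active_should_demote(cloud_failures: int, cloud_outage_s: int,
                         peer_failures: int) -> bool:
    """The Active node demotes itself to Standby (split-brain avoidance)
    only when BOTH the cloud and the Standby are unreachable."""
    cloud_down = (cloud_failures >= CLOUD_FAILURES_NEEDED
                  and cloud_outage_s >= CLOUD_OUTAGE_S)
    peer_down = peer_failures >= PEER_FAILURES_NEEDED
    return cloud_down and peer_down

def standby_should_take_over(peer_failures: int,
                             active_state_file_age_s: int) -> bool:
    """The Standby takes over only when the Active is silent over SSH AND
    has stopped updating its state file in the cloud."""
    peer_down = peer_failures >= PEER_FAILURES_NEEDED
    state_stale = active_state_file_age_s >= 600   # 10 minutes
    return peer_down and state_stale
```

Requiring both conditions on each side is what prevents a split brain: a node never acts on a single lost channel alone.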

2. A second scenario for failover requires that the Active node perform health checks on itself. The Active node must make the following determinations in order to change its own state to Standby:

  • The Active node assesses its own critical operational processes (including file system status) and finds itself in an unhealthy state, AND
  • The Standby node is in a healthy state, AND
  • The Standby node is fewer than 50 system snapshots behind the Active node.

If all conditions are met, the Active node will change its own state to “Standby” and communicate to the Standby node to become the Active node.
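This self-check can be sketched as a single condition test. The 50-snapshot limit comes from the text; the function itself is hypothetical, not a CloudFS API.

```python
# Hypothetical sketch of the scenario 2 self-check (names illustrative).
MAX_DIRTY_SNAPSHOTS = 50   # Standby must be fewer than 50 snapshots behind

def active_should_step_down(active_healthy: bool, standby_healthy: bool,
                            snapshots_behind: int) -> bool:
    """The Active steps down only if it is unhealthy, the Standby is
    healthy, and the Standby is close enough in snapshot sync to
    safely take over."""
    return (not active_healthy
            and standby_healthy
            and snapshots_behind < MAX_DIRTY_SNAPSHOTS)
```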

3. A third scenario can occur during maintenance of the Active node, when the Active node requires a reboot. The Standby node is aware of the planned reboot. If the Standby node does not receive communication from the Active node within 12 minutes after the reboot is initiated, the Standby node takes over as the Active node.
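The reboot-window check can be sketched as follows. The 12-minute value comes from the text (it corresponds to the Time To Wait for Maintenance Reboot option); the function name is hypothetical.

```python
# Hypothetical sketch of the scenario 3 maintenance-reboot window.
REBOOT_WAIT_MIN = 12   # minutes the Standby waits after a planned reboot

def standby_takes_over_after_reboot(minutes_since_reboot: float,
                                    heard_from_active: bool) -> bool:
    """During a planned Active reboot, the Standby takes over only if
    the Active stays silent past the 12-minute wait window."""
    return (not heard_from_active) and minutes_since_reboot >= REBOOT_WAIT_MIN
```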

Maintenance note: On release 7.1.x and later, if the Active node needs to be powered down during a planned maintenance window, Auto Failover may need to be disabled first. Consider the requirements for both the Active node and the Standby node before proceeding.

Status Information Exchanged by the Active and Standby Nodes

The Active and Standby nodes (peers) exchange information in order to make a coordinated decision on triggering a takeover. The following information is exchanged between the two peers, over an SSH connection that is automatically established between the two nodes when the HA pair is configured:

  • Cloud status: Determined by a configured number of upload/download failures over a period of time.
  • File system status: Status of the file system (based on number of metaslab errors encountered).
  • Import status: Based on whether the node successfully imports the file system after a reboot (either forced or unscheduled).
  • Critical process failed: Triggers failover if all remaining retries for a process fail.
  • Snapshot sync status: Determines eligibility for a failover based on snapshot sync status. If the snapshot sync status is 50 or more snapshots behind, failover is not allowed.
  • Scheduled reboot: If the active node is scheduled to reboot, HA takes this into account and does not perform a failover when the rebooting node goes offline during its reboot.
  • State change: Used to determine whether a takeover is impending.
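The exchanged fields above might be represented as a JSON state record like the following. This is a sketch only; the actual CloudFS state-file format is internal, and the field names here are assumptions.

```python
# Illustrative structure for the status record exchanged between the peers.
# Field names are assumptions; CloudFS's actual state-file format is internal.
import json
import time

def build_status_record(role: str, cloud_ok: bool, fs_ok: bool,
                        import_ok: bool, critical_process_failed: bool,
                        snapshots_behind: int, reboot_scheduled: bool,
                        state_change_pending: bool) -> str:
    """Serialize the health/status fields listed above into a JSON state
    record suitable for posting to the cloud or sending over SSH."""
    record = {
        "role": role,                        # "active" or "standby"
        "timestamp": time.time(),            # when this record was built
        "cloud_status": "ok" if cloud_ok else "failed",
        "filesystem_status": "ok" if fs_ok else "metaslab_errors",
        "import_status": "ok" if import_ok else "failed",
        "critical_process_failed": critical_process_failed,
        "snapshots_behind": snapshots_behind,
        "reboot_scheduled": reboot_scheduled,
        "state_change_pending": state_change_pending,
    }
    return json.dumps(record)
```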

Requirements for Auto Failover

  • Both the Active and Standby nodes must be in the same subnet.
  • The Active-Standby pair must use a shared IP address and hostname, which must be registered in DNS. Clients and other devices will reach whichever node is Active using the shared address. This prevents the need to reconfigure DNS following a failover.
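The same-subnet requirement can be verified with Python's standard ipaddress module. The addresses and /24 prefix length below are examples only; substitute your own network values.

```python
# Check whether two node addresses fall in the same IPv4 subnet.
import ipaddress

def same_subnet(ip_a: str, ip_b: str, prefix_len: int = 24) -> bool:
    """Return True if both addresses belong to the same subnet for the
    given prefix length (strict=False lets us pass host addresses)."""
    net_a = ipaddress.ip_network(f"{ip_a}/{prefix_len}", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/{prefix_len}", strict=False)
    return net_a == net_b
```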

Auto Failover Default Setting

Auto Failover is enabled by default in new HA Local configurations created on nodes running release 7.1.x or later. For HA configurations created with earlier software versions, you can upgrade to the latest Panzura release and enable Auto Failover using the WebUI.


  • Auto Failover is supported in deployments where the Standby node is configured as an HA-Local node.
  • Auto Failover requires a virtual IP (VIP) address. Auto Failover is supported only for HA configurations that use a VIP.
  • Auto Failover is not supported on Panzura CloudFS Virtual Hard Disk (VHD) for Microsoft. This is because VIPs are not supported in the Azure Cloud.
  • Prior to PZOS 8.1, the manual takeover option is not enabled when Auto Failover is enabled, because the two features follow different code paths.

Setting Up HA

HA can be configured either during initial setup using the setup wizard, or later using the management WebUI.

Setting Up Local HA with Auto Failover

To set up Local HA with Auto Failover, use the following steps.

During Initial Setup

When configuring the secondary node (the one that initially will be the standby), use the following settings in the Role section of the wizard:

  • Configuration Mode: HA Local
  • Auto Failover: Enable
  • Shared DNS Hostname: DNS hostname shared by the active-standby pair.
  • Shared IP Address: IP address that is mapped to the shared hostname on the DNS server.
  • Peer-to-Peer Authentication Key: Click Upload to load the Master node’s authentication key onto the secondary (HA Local) node.

No specific settings are required on the primary node (the one that initially will be the active node).

Setting Up Auto Failover After Software Upgrade

If the nodes you plan to configure for Auto Failover are already deployed, use the following steps:

  1. Navigate to the Master node and log in to the WebUI.
  2. Navigate to Management > High Availability.
  3. Set the Virtual IP option to enable, if not already enabled.
  4. Enter the shared hostname and IP address for the active-standby pair of nodes.
  5. Set the Auto Failover option to enable.
  6. Click Done.
  7. Click Save to write the changes to the configuration.

Additional Information

If the Active node in this HA Local pair is the Master node, use the following steps:

  1. Update the master-cc-host setting on all subordinate nodes to point to the Master's shared hostname.
  2. The master-cc-host field can be changed via the CLI or the WebUI (WebUI > Configuration > Configuration Mode > Subordinate > Master: enter the shared hostname, not the active hostname of the Master node).
  3. Access the node with \\sharedhostname\ instead of \\active-node-hostname\.
  4. Update DFS-N targets to use \\sharedhostname\ instead of \\active-node-hostname\.

These steps ensure that subordinates and clients use the shared hostname and shared virtual IP (which maps to that hostname in DNS), so no reconfiguration is needed after a failover.

Auto Failover Configuration Options

You can configure the following Auto Failover options.

  • Primary node: The node that initially will be the active node in the Auto Failover pair. This node remains the active node until a failover occurs. (Default: none)
  • Secondary node: The node that will begin as the standby node. (Default: none)
  • Virtual IP: Enables the active-standby pair to use a shared IP address and hostname, so that clients and other devices can reach whichever node is active, even after a failover. (Default: none)
  • Shared Hostname: DNS name shared by the active-standby pair. (Default: none)
  • Shared IP: IP address shared by the active-standby pair. This is the address that is mapped to the pair's shared hostname in DNS. (Default: none)
  • Time To Wait for Maintenance Reboot: If the active node has a scheduled reboot (typically for maintenance), the number of minutes the standby node allows for the reboot to occur. This timer prevents unnecessary failovers that would occur if the standby node assumed the active node were unavailable. (Default: 12)
  • Number of Allowed Dirty Snapshots: Maximum number of un-synced snapshots the active node can have and still remain eligible for failover to the standby. The un-synced snapshots reside in the active node's dirty cache, in the lost+found folder on the failed node. If the node can be rebooted, the un-synced snapshots can be recovered from this folder. (Default: 50)
  • Peer Update Threshold: Maximum number of seconds the active and standby nodes wait for updates from one another. These updates are exchanged directly between the nodes over SSH. If the standby node does not receive an update from the active node before this threshold expires, failover may occur (the other failover criteria must also be met). (Default: 200)
  • Cloud Update Threshold: Maximum number of seconds the active and standby nodes are allowed to take to send status updates to the cloud. These updates are not exchanged directly between the nodes; instead, each node reads the other's updates from the cloud. (Default: 10)
  • Cloud Failure Count: Maximum number of acceptable cloud failures. (Default: 20)
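The options and defaults above can be summarized as a configuration object. This is a sketch; the field names are illustrative and not part of the CloudFS API, but the default values match the table.

```python
# Hypothetical configuration object mirroring the documented Auto Failover
# options and their defaults (field names are assumptions).
from dataclasses import dataclass
from typing import Optional

@dataclass
class AutoFailoverConfig:
    primary_node: Optional[str] = None        # initially the Active node
    secondary_node: Optional[str] = None      # initially the Standby node
    virtual_ip_enabled: bool = False
    shared_hostname: Optional[str] = None     # limited to 15 characters
    shared_ip: Optional[str] = None
    maintenance_reboot_wait_min: int = 12     # Time To Wait for Maintenance Reboot
    allowed_dirty_snapshots: int = 50         # Number of Allowed Dirty Snapshots
    peer_update_threshold_s: int = 200        # Peer Update Threshold
    cloud_update_threshold_s: int = 10        # Cloud Update Threshold
    cloud_failure_count: int = 20             # Cloud Failure Count

    def validate(self) -> None:
        """Enforce the documented 15-character shared hostname limit."""
        if self.shared_hostname and len(self.shared_hostname) > 15:
            raise ValueError("shared hostname is limited to 15 characters")
```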

Setting Up HA Local (no Auto Failover)

To set up HA Local (with no Auto Failover), use the following steps.

During Initial Setup

When configuring the secondary node, use the following settings in the Role section of the wizard:

  • Configuration Mode: HA Local
  • Auto Failover: Disable
  • (optional) Shared DNS Hostname: DNS hostname shared by the active-standby pair.
  • (optional) Shared IP Address: IP address that is mapped to the shared hostname on the DNS server.
  • Peer-to-Peer Authentication Key: Click Upload to load the Master node’s authentication key onto the secondary (HA Local) node.

The shared hostname and IP address are optional. If you do not configure them, you will need to update DNS to point to this node following a failover.

Using the WebUI

After the HA Local active-standby nodes are deployed, you can change HA settings from the WebUI of the Master node.

  1. Navigate to the Master node and log in to the WebUI.
  2. Navigate to Management > High Availability.
  3. Set the Virtual IP option to enable, if not already enabled.
  4. Enter the shared hostname and IP address for the active-standby pair of nodes.
  5. Verify that the Auto Failover option is set to disable.
  6. Click Done.
  7. Click Save to write the changes to the configuration.

Setting Up HA Global (no Auto Failover)

To set up HA Global (with no Auto Failover), use the following steps.

During Initial Setup

When configuring the node that will be the global standby, use the following settings in the Role section of the wizard:

  • Configuration Mode: HA Global
  • Peer-to-Peer Authentication Key: Click Upload to load the Master node’s authentication key onto the secondary (HA Global) node.

Using the WebUI

Use the setup wizard to configure the global standby; no additional WebUI configuration is required.