High Availability for Panzura Filers

This guide describes the High Availability (HA) solutions for protecting Panzura Filers. HA enables a standby filer to take over for an active filer that becomes unavailable.

Panzura Filer HA Solutions

The following HA solutions are supported:

HA Local: An active filer is protected by a dedicated, passive standby, similar to the methods used by legacy enterprise storage products. When the active filer fails, the standby assumes its identity and takes over ownership of the file system and the filer operations. The takeover can be automatic or manual. The following HA Local options are supported:

Local: The active and standby filers have different hostnames and IP addresses.

Local with shared address: The active filer and passive standby have an additional shared hostname and IP address, which simplifies the takeover process. This is required for Auto Failover. (Maximum length of the shared hostname is 15 characters.)

HA Global: One or more filers are protected by one or more shared standbys, which can be separated geographically from the filers they protect.

Sample HA Deployment

The following figure shows a Panzura deployment with three working sites (Los Angeles, London, and Paris) and two sites provisioned for HA (Phoenix and Amsterdam). A Panzura Filer is physically deployed at each site. Users at the three working sites connect to their local filer, have a complete view of the shared file system, and experience LAN access speeds to the data in the global file system.

The figure shows HA options deployed as follows:

HA-Global: The filer in Amsterdam protects the subordinate filers in London and Paris, as well as the Master Filer in Los Angeles.

HA-Local: The filer in Phoenix is dedicated to protecting the Master Filer in Los Angeles.

Auto Failover

HA Local can be configured for Auto Failover. Auto Failover enables either filer in an active-standby pair that has a shared virtual IP (VIP) address to perform a failover automatically.

In an Auto Failover configuration, the active and standby filers regularly exchange health and status information in two ways:

  • Directly, over a peer-to-peer connection (SSH)

  • Indirectly, by posting status information to the cloud in state files
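As a rough illustration of the second channel, the following sketch shows what a state record posted to the cloud might look like and how a peer could tell that it has gone stale. The `StateRecord` class, its field names, and the JSON encoding are assumptions made for illustration; the actual state file format is not described in this guide.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StateRecord:
    """Hypothetical state record; the field names are illustrative only."""
    hostname: str
    role: str         # "active" or "standby"
    healthy: bool
    timestamp: float  # seconds since the epoch when the record was written


def encode_state(record: StateRecord) -> str:
    """Serialize a record so it could be written to cloud storage."""
    return json.dumps(asdict(record), sort_keys=True)


def decode_state(payload: str) -> StateRecord:
    """Rebuild a record that the peer filer read back from cloud storage."""
    return StateRecord(**json.loads(payload))


def is_stale(record: StateRecord, now: float, max_age_seconds: float) -> bool:
    """True if the record has not been refreshed within max_age_seconds."""
    return (now - record.timestamp) > max_age_seconds
```

Because each filer only reads what the other has written, this channel keeps working when the direct SSH connection between the filers is down.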

Auto Failover Triggers

In an HA Local pair configured for Auto Failover, either the active or standby can trigger a failover.

Causes for Active Filer To Initiate Failover

When the active filer loses connectivity to both the cloud and its peer, it changes its state to standby and stops user connections. This avoids split brain, because the standby has most likely triggered a failover and become active.
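The rule can be written as a one-line check. This is a minimal sketch; the function name and the string role values are hypothetical and are not part of the product.

```python
def next_role_for_active(cloud_reachable: bool, peer_reachable: bool) -> str:
    """Return the role an active filer should hold after a connectivity check.

    Only the loss of BOTH paths makes the active filer step down, because in
    that case it cannot tell whether the standby has already taken over.
    """
    if not cloud_reachable and not peer_reachable:
        return "standby"  # step down and stop serving user connections
    return "active"
```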

Causes for Standby Filer To Initiate Failover

The standby triggers a failover when any of the following conditions occurs:

  • The peer connection is down while cloud upload and download are good, and the timestamp in the active filer’s state file has not been updated for a configured time.

  • The active state file indicates that the active filer changed state. This happens when the active filer decides its health is bad and cannot communicate with the standby over the local path.

  • The active filer triggers a takeover.

If the standby cannot reach the cloud, it keeps its current state.
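The conditions above can be read as a small decision function. The sketch below is one interpretation of them, not the product's implementation: the `StandbyView` class, its field names, and the `stale_after` parameter are hypothetical, and it assumes that a standby without cloud access never initiates a failover.

```python
from dataclasses import dataclass


@dataclass
class StandbyView:
    """What the standby can observe; all field names are hypothetical."""
    cloud_ok: bool                   # standby's own cloud upload/download work
    peer_link_up: bool               # direct SSH connection to the active filer
    active_state_age: float          # seconds since the active's state timestamp changed
    active_changed_state: bool       # active's state file says it left the active state
    active_requested_takeover: bool  # active explicitly asked for a takeover


def standby_should_fail_over(view: StandbyView, stale_after: float) -> bool:
    """Apply the trigger conditions; stale_after is the configured time in seconds."""
    # Without cloud access the standby keeps its current state.
    if not view.cloud_ok:
        return False
    # Peer link down AND the active's state file has stopped being refreshed.
    peer_lost = (not view.peer_link_up) and view.active_state_age > stale_after
    return peer_lost or view.active_changed_state or view.active_requested_takeover
```

Note that a broken peer link alone is not enough: the active filer's state file must also have stopped updating before the standby treats the active filer as gone.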

Status Information Exchanged by the Active and Standby Filers

The active and standby filers exchange information that lets them decide whether to trigger a takeover. The following information is exchanged between the two peers over an SSH connection that is automatically established between the two filers.

  • Cloud status: Determined by a configured number of upload/download failures over a period of time.

  • File system status: Status of the file system (based on number of metaslab errors encountered).

  • Import status: Based on whether the filer successfully imports the file system after a reboot (either forced or unscheduled).

  • Critical process failed: Triggers failover if all remaining retries for a process fail.

  • Snapshot sync status: Determines eligibility for a failover based on snapshot sync status. If the snapshot sync status is 50 or more snapshots behind, failover is not allowed.

  • Scheduled reboot: If the active filer is scheduled to reboot, HA takes this into account and does not perform a failover when the rebooting filer goes offline during its reboot.

  • State change: Used to determine whether a takeover is impending.
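To make the role of each item concrete, the following sketch groups the exchanged items into a single report and separates health checks from the checks that block a failover. The `HealthReport` class and both function names are hypothetical, and the sketch treats a scheduled reboot as a simple flag rather than a timed window.

```python
from dataclasses import dataclass

SNAPSHOT_LAG_LIMIT = 50  # "50 or more snapshots behind" blocks failover


@dataclass
class HealthReport:
    """Hypothetical container for the status items listed above."""
    cloud_ok: bool
    file_system_ok: bool
    import_ok: bool
    critical_process_failed: bool
    snapshots_behind: int
    scheduled_reboot: bool
    state_change_pending: bool  # carried for completeness; not evaluated here


def active_looks_unhealthy(report: HealthReport) -> bool:
    """True if any health item reported by the active filer is bad."""
    return (
        not report.cloud_ok
        or not report.file_system_ok
        or not report.import_ok
        or report.critical_process_failed
    )


def failover_permitted(report: HealthReport) -> bool:
    """Gate checks that block a failover regardless of health."""
    if report.scheduled_reboot:
        return False  # the outage is expected, so do not fail over
    return report.snapshots_behind < SNAPSHOT_LAG_LIMIT
```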

Requirements for Auto Failover

  • Both the active and standby filers must be in the same subnet.

  • They must use a shared IP address and hostname, both registered in DNS. Clients and other devices reach whichever filer is active using the shared address. This avoids the need to reconfigure DNS following a failover.
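Before enabling Auto Failover, you may want to check these requirements with a short script. The sketch below uses Python's standard `ipaddress` module; the function name and its parameters are hypothetical, the addresses in the usage note are documentation-only examples, and the check does not query DNS.

```python
import ipaddress
from typing import List

MAX_SHARED_HOSTNAME_LEN = 15  # limit stated for the shared hostname


def check_auto_failover_network(active_ip: str, standby_ip: str,
                                subnet: str, shared_hostname: str) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    net = ipaddress.ip_network(subnet, strict=False)
    problems = []
    # Both filers must sit in the same subnet.
    for label, addr in (("active", active_ip), ("standby", standby_ip)):
        if ipaddress.ip_address(addr) not in net:
            problems.append(f"{label} filer {addr} is outside {net}")
    # The shared hostname must be present and no longer than 15 characters.
    if not shared_hostname or len(shared_hostname) > MAX_SHARED_HOSTNAME_LEN:
        problems.append("shared hostname must be 1 to 15 characters long")
    return problems
```

For example, `check_auto_failover_network("192.0.2.10", "192.0.2.11", "192.0.2.0/24", "filer-ha")` returns an empty list.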

Auto-Failover Default Setting

Auto Failover is enabled by default in new HA Local configurations created on filers running 7.1. For HA configurations created in earlier software releases, you can upgrade to the 7.1 release and enable auto-failover using the WebUI.

 

  • Auto Failover is supported in deployments where the standby is configured as an HA-Local filer.

  • Auto Failover requires a virtual IP (VIP) address. Auto Failover is supported only for HA configurations that use a VIP.

  • Auto Failover is not supported on Panzura Freedom Virtual Hard Disk (VHD) for Microsoft. This is because VIPs are not supported in the Azure cloud.
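The three support conditions above combine into a single check. This is only a sketch: the function name, the `"HA Local"` mode string, and the `on_azure_vhd` flag are hypothetical ways of representing the deployment.

```python
def auto_failover_supported(config_mode: str, has_vip: bool, on_azure_vhd: bool) -> bool:
    """True only for an HA Local standby that uses a VIP and is not an Azure VHD."""
    return config_mode == "HA Local" and has_vip and not on_azure_vhd
```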

Setting Up HA

HA can be configured either during initial setup using the setup wizard, or later using the management WebUI.

Setting Up Local HA with Auto Failover

To set up Local HA with Auto Failover, use the following steps.

During Initial Setup

When configuring the secondary filer (the one that initially will be the standby), use the following settings in the Role section of the wizard:

  • Configuration Mode: HA Local

  • Auto Failover: Enable

  • Shared DNS Hostname: DNS hostname shared by the active-standby pair.

  • Shared IP Address: IP address that is mapped to the shared hostname on the DNS server.

  • Peer-to-Peer Authentication Key: Click Upload to load the Master Filer’s authentication key onto the secondary (HA Local) filer.

No specific settings are required on the primary filer (the one that initially will be the active filer).

Setting Up Auto Failover After Software Upgrade

If the filers you plan to configure for Auto Failover are already deployed, use the following steps:

  1. Navigate to the Master Filer and log in to the WebUI.

  2. Navigate to Management > High Availability.

  3. Set the Virtual IP option to enable, if not already enabled.

  4. Enter the shared hostname and IP address for the active-standby pair of filers.

  5. Set the Auto Failover option to enable.

  6. Click Done.

  7. Click Save to write the changes to the configuration.

Auto-Configuration Options

| Option | Description | Default |
| --- | --- | --- |
| Primary Filer | Filer that initially will be the active filer in the Auto Failover pair. This filer remains the active filer until there is a failover. | (none) |
| Virtual IP | Enables the active-standby pair to use a shared IP address and hostname. The Virtual IP (VIP) option allows clients and other devices to reach whichever filer is active, even if a failover has occurred. | (none) |
| Shared IP | IP address shared by the active-standby pair. This is the address that is mapped to the pair’s shared hostname in DNS. | (none) |
| Time To Wait for Maintenance Reboot | If the active filer has a scheduled reboot (typically for maintenance), this is the number of minutes the standby filer allows for the reboot to occur. This timer prevents unnecessary failovers that would occur if the standby filer assumed the active filer was unavailable. | 12 |
| Number of Allowed Dirty Snapshots | Maximum number of unsynced snapshots the active filer can have and still remain eligible for failover to the standby. The unsynced snapshots reside in the active filer’s dirty cache, in the lost+found folder on the failed filer. If the filer can be rebooted, the unsynced snapshots can be recovered from this folder. | 50 |
| Peer Update Threshold | Maximum number of seconds the active and standby filers wait for updates from one another. These updates are exchanged directly between the filers over SSH. If the standby filer does not receive an update from the active filer before this threshold expires, failover may occur. (The other failover criteria also must be met. See Status Information Exchanged by the Active and Standby Filers.) | 200 |
| Cloud Update Threshold | Maximum number of seconds the active and standby filers are allowed to take to send status updates to the cloud. These updates are not exchanged directly between the filers but instead are read by each filer from the cloud. | 10 |
| Cloud Failure Count | Maximum number of acceptable cloud failures. | 20 |
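For scripts that need to reason about these settings, the numeric defaults can be collected in one place. The sketch below is illustrative only: the `AutoFailoverOptions` class, its attribute names, and the two helper functions are hypothetical and do not correspond to a Panzura API. Only the default values come from the options described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoFailoverOptions:
    """Documented defaults; the attribute names themselves are hypothetical."""
    maintenance_reboot_wait_minutes: int = 12
    allowed_dirty_snapshots: int = 50
    peer_update_threshold_seconds: int = 200
    cloud_update_threshold_seconds: int = 10
    cloud_failure_count: int = 20


def peer_update_overdue(seconds_since_update: float, opts: AutoFailoverOptions) -> bool:
    """True once the peer has been silent longer than the peer update threshold."""
    return seconds_since_update > opts.peer_update_threshold_seconds


def within_maintenance_window(minutes_since_reboot: float, opts: AutoFailoverOptions) -> bool:
    """True while a scheduled reboot is still inside its allowed time."""
    return minutes_since_reboot <= opts.maintenance_reboot_wait_minutes
```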

Setting Up HA Local (no Auto Failover)

To set up HA Local (with no Auto Failover), use the following steps.

During Initial Setup

When configuring the secondary filer, use the following settings in the Role section of the wizard:

  • Configuration Mode: HA Local

  • Auto Failover: Disable

  • (optional) Shared DNS Hostname: DNS hostname shared by the active-standby pair.

  • (optional) Shared IP Address: IP address that is mapped to the shared hostname on the DNS server.

  • Peer-to-Peer Authentication Key: Click Upload to load the Master Filer’s authentication key onto the secondary (HA Local) filer.

The shared hostname and IP address are optional. If you do not configure them, you will need to update DNS to point to this filer following a failover.

Using the WebUI

After the HA Local active-standby filers are deployed, you can change HA settings from the WebUI of the Master Filer.

  1. Navigate to the Master Filer and log in to the WebUI.

  2. Navigate to Management > High Availability.

  3. Set the Virtual IP option to enable, if not already enabled.

  4. Enter the shared hostname and IP address for the active-standby pair of filers.

  5. Leave the Auto Failover option disabled. (Set it to enable only if you want to turn on Auto Failover for the pair.)

  6. Click Done.

  7. Click Save to write the changes to the configuration.

Setting Up HA Global (no Auto Failover)

To set up HA Global (with no Auto Failover), use the following steps.

During Initial Setup

When configuring the filer that will be the global standby, use the following settings in the Role section of the wizard:

  • Configuration Mode: HA Global

  • Peer-to-Peer Authentication Key: Click Upload to load the Master Filer’s authentication key onto the secondary (HA Global) filer.

Using the WebUI

Use the setup wizard to configure the global standby.

Viewing HA Status

To view HA status, log in to the WebUI and navigate to the Cloud File System Dashboard. HA status is shown in the HA Filer Status dashlet (located by default under the Active Filer Status dashlet).

 

The HA Filer Status dashlet shows the following information:

| Column | Description |
| --- | --- |
| File System | Name of the file system protected by HA. To find the filer associated with the file system, see the File System and Filer Hostname columns in the Active Filer Status dashlet. Note: For Global HA, the file system name is “All”. |
| Primary / Secondary | File system names that are being protected. Local HA: Names of the file systems on the filers in the active-standby pair. The primary filer is the one that initially is the active filer. The asterisk ( * ) indicates the filer that is currently active. Global HA: Name of the filer that is configured as the standby for Global HA. |
| State | The HA state of the file system. Ready: Failover is possible. Not Ready: Failover is not possible because at least one of the filers is currently not meeting the criteria for failover. For example, in an Auto Failover configuration, if the active filer’s number of dirty snapshots is too high, an automated failover is not allowed. Down: The file system is down. |
| Auto Failover | State of the Auto Failover feature. |

Auto Failover Details

To display Auto Failover details for an active-standby pair of filers, click on a row in the Primary / Secondary column of the HA Filer Status dashlet. The Auto Failover Status dialog appears:

 

In this example, the dialog shows Auto Failover details for the active-standby pair cc8 / cc7. The active filer in the pair is cc8.

High Availability Takeover

To perform a takeover, the active filer must have failed or been powered down. The HA standby can be activated only if the original active filer is no longer online. There is a delay between the time that the active filer becomes unavailable and the time when the takeover is possible.

Takeover for HA-Global or HA-Local (no shared address)

Follow this procedure for HA-Global or HA-Local without shared address. See Takeover for HA-Local with shared address for takeover with shared address.

  1. Verify that the Filer that is the target of the takeover is down.

  2. Log in to the standby filer if you are not already logged in, and verify that it is the standby. Click the "i" in the upper right corner of the WebUI and verify that the Configuration Mode is Standby. If you have configured HA-Local and there are multiple HA-Local pairs in the CloudFS, verify that you are on the correct one.

  3. If this is a planned takeover, check the sync status by opening the Dashboard page and looking at the Active Filer Status and Spare Filer Status sections. If it is not a planned takeover, the process will take longer if the Filers are not synchronized.

  4. Select Maintenance > High Availability.

  5. Click Takeover.

  6. Click OK to continue.
    The Takeover process log appears and provides process information. IMPORTANT: Do not close the process log window.

  7. When the message “Takeover Complete” appears, click OK to continue.
    IMPORTANT: Do not change any DNS settings until the Takeover Complete message is displayed. Doing so could cause the filer to become confused over which filer is the source.

  8. On the DNS server, locate the active records for the previous active filer and the standby filer.

  9. Switch the IP addresses so the standby filer is now the active Filer.

  10. On the new active filer, go to Configuration > Active Directory.

  11. Rejoin the filer to Active Directory.

  12. Verify that DNS is pointing to the correct filer and that the IP addresses were properly switched.

  13. When the previously active filer becomes available, you can bring it up as the new standby filer.

The process is now complete. The new active filer should be up and running.
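The DNS change in steps 8, 9, and 12 amounts to exchanging the addresses of two records and then confirming the result. The following sketch models this on an in-memory mapping of hostname to IP address; it does not talk to a DNS server, and the function names, hostnames, and addresses are hypothetical.

```python
from typing import Dict


def swap_dns_addresses(records: Dict[str, str], old_active: str, standby: str) -> Dict[str, str]:
    """Return a copy of the records with the two hosts' IP addresses exchanged."""
    updated = dict(records)
    updated[old_active], updated[standby] = records[standby], records[old_active]
    return updated


def points_to_new_active(records: Dict[str, str], old_active: str, new_active_ip: str) -> bool:
    """Step 12: the name clients already use must now resolve to the new active filer."""
    return records.get(old_active) == new_active_ip
```

After the swap, clients that keep using the hostname of the failed filer reach the former standby without any change on the client side.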

Takeover for HA-Local with shared address

As described in Important Information About HA-Local with Shared Address, the shared IP address/hostname that you configured when setting up the HA-Local standby is the one that is used to access the active Filer during normal operations. You will use the same address/hostname to access the standby Filer when it comes up as the new active filer.

  1. Verify that the active filer is down.

  2. Log in to the standby filer, and verify that it is the standby. Click the "i" in the upper right corner of the WebUI and verify that the Configuration Mode is Standby. If there are multiple HA-Local pairs in the CloudFS, verify that you are on the correct one.

  3. If this is a planned takeover, check the sync status by opening the Dashboard page and looking at the Active Filer Status and Spare Filer Status sections. If it is not a planned takeover, the process will take longer if the filers are not synchronized.

  4. Select Maintenance > High Availability.

  5. Click Takeover.

  6. Click OK to continue.
    The Takeover process log appears and provides process information. IMPORTANT: Do not close the process log window.

  7. When the message “Takeover Complete” appears, click OK to continue.

  8. On the new active filer, go to Configuration > Active Directory.

  9. Rejoin the filer to Active Directory and proceed with normal operations.

  10. When the previously active filer becomes available, you can bring it up as the new standby filer.

The process is now complete. The new active filer should be up and running.

 

Failback (HA-Local only)

Failback applies only to HA Local (with or without shared address). For HA Global, there is no failback process. Following an HA Global takeover, you must reinitialize and reconfigure the failed filer to add it again as either an active or standby filer.

  1. Verify that snapshots are in sync by checking the Dashboard page on the current active filer (the former standby filer), and that there is no dirty cache.

  2. Shut down the current active filer.

  3. Bring the current standby filer (the original active filer) up and sign in.

  4. Follow the steps in Takeover to switch back to the original active filer.
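The failback steps depend on the check in step 1, so it can help to treat that check as a hard precondition. The sketch below is a hypothetical helper that refuses to produce the remaining steps until snapshots are in sync and the dirty cache is empty; it does not perform any action on a filer.

```python
from typing import List


def failback_plan(snapshots_in_sync: bool, dirty_cache_empty: bool) -> List[str]:
    """Return the ordered failback steps, or raise if step 1 has not passed."""
    if not (snapshots_in_sync and dirty_cache_empty):
        raise ValueError("failback blocked: sync snapshots and clear the dirty cache first")
    return [
        "Shut down the current active filer (the former standby).",
        "Bring up the original active filer and sign in.",
        "Follow the takeover procedure to switch back to the original active filer.",
    ]
```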