Monitoring System Settings

To view or assign monitoring settings, navigate to the following page in the filer's WebUI:

Configuration > Monitoring

The following table describes the monitoring settings you can configure.

Monitoring Settings Description
Syslog
Syslog Server

Enter the IP address or hostname of the syslog server.

Note: Only UDP is supported for syslog messages.

Logging Level Select the minimum level of messages to send to the server. The selected level and all higher levels are sent. For example, if you specify Error, then messages of type Error, Critical, Alert, and Emergency are sent.
Trace Logs Click Add Trace Log to specify the applications to be logged in addition to the standard syslog logging. Select a service and a logging level, and click Add. Add additional entries as needed.
Email

SMTP Server

Enter the hostname or IP address of the email server.

Sender email address Enter the email address to appear in the From field for alert notifications.
SMTP Port Enter the port on the email server.
Use encryption Select to encrypt the alert messages using SMTPS.
Use authentication Select to require authentication for the SMTP server. If the SMTP server requires authentication, enable this setting and enter a username and password.
Username Enter the username for access to the SMTP server.
Password Enter the password for the user accessing the SMTP server.
Test Click to test connection.
Test email address Email address of the recipient for the email test.
Email Alert Settings
Allow Repeating Email

Enable to send only one message for any given alert. If this setting is disabled, alerts continue to be sent until the alert clears. This setting is disabled by default.

Notes regarding repeating alerts:

  • If repeating alerts are not suppressed and multiple alerts are sent for the same event, the timestamp of all the repeated alerts is the time when the event first occurred.
  • Filer Rebooted and Disk Offline events are reported only once, even if repeated alerts are not suppressed.
Email interval (minutes) Select the interval for aggregating events before sending email notification. Default is 5 minutes.
Email Alert Recipients Click Add Recipient. Enter an email address and select the types of alerts to notify the recipient about (hardware, system, filesystem, network). To delete a recipient, deselect all of the alert types for the recipient.
Type of Alerts
Hardware System Alert

This type of alert includes information about:

  • Fan Failure
  • Power Supply Failure
  • Chassis Intrusion Detection
  • System Hardware Temperature Near Critical (over 42°C and over 47°C)
  • Disk Offline
  • Managed Capacity Memory Error
  • Memory Error
System Alerts

This type of alert includes information about:

  • Memory available is less than or equal to 1 GB
  • System CPU usage over 70%
  • Filer Reboot
  • Disk Blockage
  • Disk Thrashing
  • Disk Fill
  • Datapath and system generation
  • Filesystem existence in case of Disaster Recovery
  • CCID presence and snapshotsync information
  • PFOS version
  • Metadata, memory and load average alerts
  • General distress and link change
  • Reboots
  • Product License
  • Non-threshold traps
  • Network config alerts such as DNS information
    File System Alerts

    This type of alert includes information about:

    • Snapshots Upload/Download
    • Disk Resilvering
    • MetaData and Cloud Space
    • State of controller
    • Filesystem existence
    Network Alerts

    This type of alert includes information about:

    • Email Failures
    • Site to Site Latency
    SNMP

    Read Community String

    Enter the community string for read-only communication between the filer and the trap receiver. If you specify a custom community string, the public community string is disabled.

    Recipient IP

    Recipient Community String

    Enter the hostnames or IP addresses of up to two SNMP trap receivers and the associated community strings.
    SNMP Trap Threshold Settings Enter the usage thresholds (percent) that will trigger SNMP trap messages of particular types: CPU usage, memory usage, disk usage, and cloud usage. The filer generates SNMP traps of the specified type if the usage meets or exceeds the threshold.
    SNMP Users
    Add User Click to add an SNMP user. Added users are listed. Listed users can be deleted from the list.
    Username Enter a user name for the SNMP user.
    Authentication Protocol Select the algorithm to use for authenticating the SNMP user (SHA or MD5).
    Authentication Password Enter a password to authenticate the SNMP user.
    Authentication Data Privacy Method Select an option for data encryption (DES or AES).

    Privacy Password

    Privacy Password Confirm

    Enter a password for the DES or AES encryption.

    Simple Network Management Protocol

    Simple Network Management Protocol (SNMP) is an Internet Standard protocol that lets customers monitor networked devices through a single tool. Devices that support SNMP include routers, modems, switches, and servers. SNMP monitors values exported by the SNMP agent and supports push notifications, which SNMP calls traps. SNMP has two components: SNMP agents and SNMP managers. An SNMP agent is software running on a managed device; it allows the SNMP manager to communicate with the device using SNMP. The SNMP manager is external software that queries devices, receives events, and gets responses from them. A managed device implements an SNMP interface for node-specific information.

    A Management Information Base (MIB) is a collection of information that is organized hierarchically and accessed using a protocol such as SNMP. MIBs are created by managed-device vendors (in this case, Panzura) and are stored on the device. MIBs must also be provided to the SNMP manager so that it knows how to translate the MIB values from the managed device. MIBs use a hierarchical namespace that contains object identifiers (OIDs). An OID identifies a variable that can be read using SNMP.
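As a rough illustration of the hierarchical OID namespace, the following Python sketch converts the symbolic form used for the trap IDs later in this document (SNMPv2-SMI::enterprises...) into dotted numeric form, and checks whether an OID falls under the Panzura subtree. It relies on the standard assignment of the enterprises node to 1.3.6.1.4.1. The function names are hypothetical and are not part of any Panzura tool.

```python
# "enterprises" is the standard SNMPv2-SMI node 1.3.6.1.4.1.
ENTERPRISES = (1, 3, 6, 1, 4, 1)
# Panzura enterprise number, as it appears in the OIDs in this document.
PANZURA = ENTERPRISES + (32853,)


def parse_oid(text):
    """Parse a numeric or SNMPv2-SMI::enterprises-prefixed OID into a tuple."""
    text = text.strip()
    prefix = "SNMPv2-SMI::enterprises"
    if text.startswith(prefix):
        rest = text[len(prefix):]
        return ENTERPRISES + tuple(int(p) for p in rest.split(".") if p)
    return tuple(int(p) for p in text.split(".") if p)


def is_under(oid, subtree):
    """Return True if oid lies in the subtree rooted at subtree."""
    return oid[:len(subtree)] == subtree


def to_numeric(oid):
    """Format an OID tuple in dotted numeric form with a leading dot."""
    return "." + ".".join(str(n) for n in oid)


trap = parse_oid("SNMPv2-SMI::enterprises.32853.1.2.1.2.1000")
print(to_numeric(trap))         # .1.3.6.1.4.1.32853.1.2.1.2.1000
print(is_under(trap, PANZURA))  # True
```

Representing an OID as a tuple of integers makes the "is this variable inside that branch" question a simple prefix comparison.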

    There are two transaction types: polling and traps. When polling occurs, the SNMP manager sends an SNMP request to a managed device at the default polling interval, which is 120 seconds. The managed device then responds with an SNMP response status. The second type of transaction is a trap or push notification. SNMP managers are always ready to receive a trap from a managed device. In order to receive and translate a trap from a managed device, the managed device must be configured to send traps to the manager, and the SNMP manager must be provided with a trap MIB from the managed device.
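The polling transaction can be pictured as a simple loop. The Python sketch below does not use a real SNMP library: fetch is a hypothetical callable standing in for whatever SNMP GET operation your manager software provides, and sleep is injectable so the loop can be exercised without waiting. Only the 120-second interval comes from the text above. Traps are not modeled here because they arrive unsolicited rather than in response to a request.

```python
import time

DEFAULT_POLL_INTERVAL = 120  # seconds, the default polling interval cited above


def poll(fetch, oids, rounds, interval=DEFAULT_POLL_INTERVAL, sleep=time.sleep):
    """Query every OID once per round and collect the responses.

    fetch(oid) is a hypothetical stand-in for an SNMP GET request.
    """
    history = []
    for round_number in range(rounds):
        history.append({oid: fetch(oid) for oid in oids})
        if round_number < rounds - 1:
            sleep(interval)  # wait before the next polling round
    return history


# Usage with a fake device, so no network access is needed.
fake_device = {".1.3.6.1.4.1.32853.1.3.1.1.1": 12}
waits = []
result = poll(fake_device.get, list(fake_device), rounds=2, sleep=waits.append)
print(result)
print(waits)  # [120]
```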

    The following are the three versions of SNMP security:

    • Version 1. Plain text authentication.
    • Version 2. Improved authentication. Community strings are still transmitted over the wire in clear, plain text.
    • Version 3. Provides the following three security levels:
      o NoAuthNoPriv. Users who use this level don't have authentication or privacy when they send and receive messages.
      o AuthNoPriv. This level requires users to authenticate, but does not encrypt sent or received messages.
      o AuthPriv. This level is the most secure. Authentication is required and sent and received messages are encrypted.
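The three Version 3 levels follow from two yes/no choices: whether authentication is used and whether messages are encrypted. A minimal Python sketch of that mapping is shown here; the function name is hypothetical. It rejects privacy without authentication because SNMPv3 does not define that combination.

```python
def snmpv3_security_level(auth_enabled, privacy_enabled):
    """Map the two SNMPv3 options to the level names listed above."""
    if privacy_enabled and not auth_enabled:
        # SNMPv3 does not define encryption without authentication.
        raise ValueError("privacy requires authentication")
    if auth_enabled and privacy_enabled:
        return "AuthPriv"
    if auth_enabled:
        return "AuthNoPriv"
    return "NoAuthNoPriv"


print(snmpv3_security_level(True, True))    # AuthPriv
print(snmpv3_security_level(True, False))   # AuthNoPriv
print(snmpv3_security_level(False, False))  # NoAuthNoPriv
```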

    You can download the MIB archive (a gzip-compressed tar file) using the following URL: https://docs.panzura.com/PANZURA_SNMP.tgz

    After downloading the MIB archive, decompress it and load files into your SNMP manager.
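If you prefer to script the decompression step, the following Python sketch unpacks a .tgz archive with the standard tarfile module. It assumes the archive has already been downloaded to a local path. The function name and the destination directory in the usage comment are hypothetical, and the contents of the real archive are not assumed here.

```python
import tarfile
from pathlib import Path


def extract_mib_archive(archive_path, dest_dir):
    """Extract regular files from a gzip-compressed tar archive into dest_dir.

    Returns the list of extracted file paths.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and special files
            target = (dest / member.name).resolve()
            if dest not in target.parents:
                continue  # skip entries that would land outside dest_dir
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(tar.extractfile(member).read())
            extracted.append(target)
    return extracted


# Example call (paths are placeholders):
# files = extract_mib_archive("PANZURA_SNMP.tgz", "mibs")
```

The extracted files can then be loaded into your SNMP manager using that tool's own import procedure.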

    This download is also available by selecting Maintenance > System Operations > Download MIB.

    To configure the SNMP settings on a Panzura filer:

    1. Log in to your Panzura filer.

    2. Select CONFIGURATION > Monitoring > SNMP Users > SNMP Settings.

    3. In the Read Community String field, enter the community string for read-only communication between the filer and SNMP manager.

    If you specify a custom community string, the public community string is disabled.

    4. In the Recipient IP field, enter the IP address of the SNMP manager. The filer sends traps to this address.

    5. In the Recipient Community String field, enter the community string associated with the SNMP manager.

    6. In the SNMP Trap Threshold Settings field, enter the usage thresholds (percent) that will trigger SNMP trap messages.
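Before entering values in the WebUI, it can help to sanity-check them. The Python sketch below validates a hypothetical dictionary of planned settings against two constraints stated in this document: at most two trap receivers, and thresholds expressed as percentages for CPU, memory, disk, and cloud usage. The dictionary layout and function name are assumptions for illustration only; the filer does not expose this structure.

```python
THRESHOLD_KEYS = ("cpu", "memory", "disk", "cloud")  # usage types named in this document
MAX_TRAP_RECEIVERS = 2  # the settings table allows up to two receivers


def validate_snmp_settings(settings):
    """Return a list of problems found in a hypothetical settings dict."""
    problems = []
    receivers = settings.get("receivers", [])
    if len(receivers) > MAX_TRAP_RECEIVERS:
        problems.append("more than two trap receivers")
    for host, community in receivers:
        if not host or not community:
            problems.append("receiver needs both an address and a community string")
    thresholds = settings.get("thresholds", {})
    for key in THRESHOLD_KEYS:
        value = thresholds.get(key)
        if value is not None and not 0 <= value <= 100:
            problems.append(f"{key} threshold must be a percentage from 0 to 100")
    return problems


planned = {
    "receivers": [("192.0.2.10", "monitoring")],  # example address and string
    "thresholds": {"cpu": 80, "memory": 90, "disk": 85, "cloud": 150},
}
print(validate_snmp_settings(planned))
# ['cloud threshold must be a percentage from 0 to 100']
```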

    To edit a trap:

    1. Log in to your Panzura filer.

    2. Select CONFIGURATION > Monitoring > SNMP Users > SNMP Settings.

    3. In the SNMP Trap Thresholds section, select Actions > Edit SNMP Trap.

    To add an SNMP user:

    1. Log in to your Panzura filer.

    2. Select CONFIGURATION > SNMP Users.

    3. Click the ADD button.

    The following table displays the SNMP traps that Panzura supports:

    Name | Trigger Condition | ID
    pzCloudControllerHighCPUUsage | cpu_load > threshold_value | SNMPv2-SMI::enterprises.32853.1.2.1.2.1000
    pzCloudControllerHighMemoryUsage | (used_memory * 100 / total_memory) > threshold_value | SNMPv2-SMI::enterprises.32853.1.2.1.2.1001
    pzCloudControllerHighDiskUsage | (used_disk * 100 / total_disk) > threshold_value | SNMPv2-SMI::enterprises.32853.1.2.1.2.1002
    pzCloudControllerHighCloudUsage | (used_cloud * 100 / total_cloud) > threshold_value | SNMPv2-SMI::enterprises.32853.1.2.1.2.1003
    pzTrapMetaSpill | (meta_space_used * 100 / total_meta_ssd) > threshold_value | SNMPv2-SMI::enterprises.32853.1.2.1.2.1004
    pzTrapMetaAllocFail | vfs.zfs.metaslab.stats.mg_spill > 100 and still increasing | SNMPv2-SMI::enterprises.32853.1.2.1.2.1005
    pzTrapActiveDown | /opt/pixel8/bin/pz_ping <host> has failed 3 times, so the host is considered down | SNMPv2-SMI::enterprises.32853.1.2.1.2.1006
    pzAutoFailover | AutoFailover occurred | SNMPv2-SMI::enterprises.32853.1.2.1.2.1007
    pzRegularFailover | RegularFailover occurred | SNMPv2-SMI::enterprises.32853.1.2.1.2.1008
    pzAlertTrap | An alert is shown in the GUI and an email is sent to the customer. Trap 1009 is sent as well. | SNMPv2-SMI::enterprises.32853.1.2.1.2.1009
    pzCloudWriteFailureTrap | Sent when cloud write failures exceed threshold | SNMPv2-SMI::enterprises.32853.1.2.1.2.1011
    pzWarnTrap | A warning is shown in the GUI and an email is sent to the customer. Trap 1009 is sent as well. | SNMPv2-SMI::enterprises.32853.1.2.1.2.1012
    pzInfoTrap | Info is shown in the GUI and an email is sent to the customer. Trap 1009 is sent as well. | SNMPv2-SMI::enterprises.32853.1.2.1.2.1013
    pzSwapUsage | Swap usage in the /usr/sbin/swapinfo output is greater than 50% | SNMPv2-SMI::enterprises.32853.1.2.1.2.1014
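Several trigger conditions in the table share the form (used * 100 / total) > threshold_value. The following Python sketch works through that arithmetic. The function names are hypothetical, and the strict greater-than comparison is taken from the table; the settings description earlier in this document says "meets or exceeds", so treat the exact comparison as an assumption.

```python
def usage_percent(used, total):
    """Compute used * 100 / total, the expression used in the table."""
    if total <= 0:
        raise ValueError("total must be positive")
    return used * 100 / total


def trap_triggered(used, total, threshold_value):
    """Apply the strict '>' comparison shown in the table."""
    return usage_percent(used, total) > threshold_value


print(usage_percent(48, 64))       # 75.0
print(trap_triggered(48, 64, 70))  # True
print(trap_triggered(48, 64, 75))  # False
```

For example, 48 GB used out of 64 GB is 75 percent, which exceeds a threshold of 70 but not a threshold of 75.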

    Syslog Message Categories

    When using syslog to monitor for events needing attention, Panzura recommends monitoring for LOG_EMERG, LOG_ALERT, and LOG_CRIT.

    • LOG_EMERG and LOG_ALERT messages typically represent conditions requiring immediate attention.
    • LOG_CRIT events represent events that can become EMERG or ALERT if not addressed.

    The following table provides a list of Syslog message categories.

    Syslog Message Category Description
    LOG_NOTICE Conditions that are not error conditions, but should possibly be handled specially.
    LOG_INFO Informational messages.
    LOG_WARNING Warning messages.
    LOG_ALERT A condition that should be corrected immediately, such as a corrupted system database.
    LOG_ERR Indicates a general error for general notification. Can occur often even on a filer that is operating normally.
    LOG_CRIT Critical conditions, such as hard device errors.
    LOG_EMERG A panic condition. This message is normally broadcast to all users.
    LOG_DEBUG Messages that contain information normally of use only when debugging.
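The table above is not listed in severity order. The Python sketch below uses the standard syslog severity numbers (0 is the most severe) to show which categories are sent for a given Logging Level setting, and which categories fall within the monitoring recommendation above. The function names are hypothetical.

```python
# Standard syslog severities; a smaller number means a more severe message.
SEVERITY = {
    "LOG_EMERG": 0,
    "LOG_ALERT": 1,
    "LOG_CRIT": 2,
    "LOG_ERR": 3,
    "LOG_WARNING": 4,
    "LOG_NOTICE": 5,
    "LOG_INFO": 6,
    "LOG_DEBUG": 7,
}


def levels_sent(minimum_level):
    """Return the selected level and every more severe level."""
    limit = SEVERITY[minimum_level]
    return [name for name, value in SEVERITY.items() if value <= limit]


def needs_attention(level):
    """True for the categories recommended for monitoring (CRIT and above)."""
    return SEVERITY[level] <= SEVERITY["LOG_CRIT"]


print(levels_sent("LOG_ERR"))
# ['LOG_EMERG', 'LOG_ALERT', 'LOG_CRIT', 'LOG_ERR']
print(needs_attention("LOG_ERR"))  # False
```

This matches the Logging Level example earlier in this document: selecting Error sends Error, Critical, Alert, and Emergency messages.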

     

    Panzura MIB Objects

    The Panzura Filer MIB provides access to the following types of system information and statistics:

    • Filer ID, CloudFS version, and hostname
    • Visibility into caching:
      • Hot, warm, and cold for automated caching
      • Hot, warm, cold for pinned files
      • Cache hits/misses for automated caching
      • Cache hits/misses for pinned
    • Cloud statistics
      • Number of drive files uploaded to cloud
      • Number of drive files downloaded
      • Number of upload failures
      • Number of download failures
    • SMB users
      • Total number of SMB users currently connected to the filer
      • Total number of files locked by SMB users currently connected
    • CloudFS local configuration
      • Mode of filer: master/subordinate
      • Hostname of master filer
    • Local system snapshot information
      • Latest system snapshot generated reference #
      • Latest system snapshot uploaded to cloud reference #
      • Date and time when the latest master snapshot was successfully generated
    • CloudFS remote configuration
      • Remote filer hostnames
      • Latency from filer to remote filer
      • Remote filer up or down
      • Down means there is no communication
    • Remote snapshot information from all other filers
      • Latest snapshot reference # synchronized from the specified remote filer
      • The latest snapshot reference # uploaded by the specified remote filer

    MIB Download*

    The following table lists the objects in the Panzura Filer Management Information Base (MIB).

    Object Name Object ID (OID) Type Description
    System Identification
    ccSysCCID .1.3.6.1.4.1.32853.1.4.1.1.0 Sensor The filer’s CCID
    ccSysVersion .1.3.6.1.4.1.32853.1.4.1.2.0 Sensor PFOS version
    System Usage
    cpuLoad .1.3.6.1.4.1.32853.1.3.1.1.1 Sensor CPU usage averaged over the previous 5 minutes. Measures the number of processes waiting for CPU resources.
    pzCloudFilerHighCPUUsage .1.3.6.1.4.1.32853.1.2.1.2.1000 Trap CPU usage averaged over the previous 5 minutes. Measures the number of processes waiting for CPU resources.
    memUsed .1.3.6.1.4.1.32853.1.3.1.2.1 Sensor Memory usage at the time of the measurement in KB.
    pzCloudFilerHighMemoryUsage .1.3.6.1.4.1.32853.1.2.1.2.1001 Trap Memory usage at the time of the measurement in KB.
    localHDUsed .1.3.6.1.4.1.32853.1.3.1.3.1 Sensor Disk usage at the time of the measurement in KB.
    pzCloudFilerHighDiskUsage .1.3.6.1.4.1.32853.1.2.1.2.1002 Trap Disk usage at the time of the measurement in KB.
    cloudStatsUsed .1.3.6.1.4.1.32853.1.3.1.4.1 Sensor Cloud usage at the time of the measurement in KB.
    pzCloudFilerHighCloudUsage .1.3.6.1.4.1.32853.1.2.1.2.1003 Trap Cloud usage at the time of the measurement in KB.
    High Availability
    pzTrapActiveDown .1.3.6.1.4.1.32853.1.2.1.2.1006 Trap HA‐Local active filer failure notification.
    Cache
    ccStatCaHotAutoCache .1.3.6.1.4.1.32853.1.4.2.1.1.0 Sensor The total number of bytes in data cache storage with an Auto Cache Smart Cache rule that were accessed during the last week.
    ccStatCaHotAutoPinned .1.3.6.1.4.1.32853.1.4.2.1.2.0 Sensor The total number of bytes in data cache storage with a Pinned Smart Cache rule that have been accessed in the last week.
    ccStatCaWarmAutoCache .1.3.6.1.4.1.32853.1.4.2.1.3.0 Sensor The total number of bytes in data cache storage with an Auto Cache Smart Cache rule that were accessed more than a week ago but less than one month ago.
    ccStatCaWarmAutoPinned .1.3.6.1.4.1.32853.1.4.2.1.4.0 Sensor The total number of bytes in data cache storage with a Pinned Smart Cache rule that were accessed more than a week ago but less than one month ago.
    ccStatCaColdAutoCache .1.3.6.1.4.1.32853.1.4.2.1.5.0 Sensor The total number of bytes in data cache storage with an Auto Cache Smart Cache rule that were accessed more than one month ago.
    ccStatCaColdAutoPinned .1.3.6.1.4.1.32853.1.4.2.1.6.0 Sensor The total number of bytes in data cache storage with a Pinned Smart Cache rule that were accessed more than one month ago.
    ccStatCaCacheHits .1.3.6.1.4.1.32853.1.4.2.1.7.0 Sensor The total number of cache hit bytes in data cache storage with an Auto_Cache Data Locality rule.
    ccStatCaPinnedHits .1.3.6.1.4.1.32853.1.4.2.1.8.0 Sensor The total number of cache hit bytes in data cache storage with a Pinned Data Locality rule.
    ccStatCaCacheMissed .1.3.6.1.4.1.32853.1.4.2.1.9.0 Sensor The total number of cache missed bytes in data cache storage with an Auto_Cache Data Locality rule.
    ccStatCaPinnedMissed .1.3.6.1.4.1.32853.1.4.2.1.10.0 Sensor The total number of cache missed bytes in data cache storage with a Pinned Data Locality rule.
    ccStatCaEvited .1.3.6.1.4.1.32853.1.4.2.1.11.0 Sensor The total number of evicted bytes in data cache storage.
    Drive File Operations
    ccStatClUploads .1.3.6.1.4.1.32853.1.4.2.2.1.0 Sensor The total number of drive files uploaded to cloud storage.
    ccStatClUploadFails .1.3.6.1.4.1.32853.1.4.2.2.2.0 Sensor The total number of failed attempts to upload a drive file to cloud storage.
    ccStatClDownloads .1.3.6.1.4.1.32853.1.4.2.2.3.0 Sensor The total number of drive files downloaded from cloud storage.
    ccStatClDownloadFails .1.3.6.1.4.1.32853.1.4.2.2.4.0 Sensor The total number of failed attempts to download a drive file from cloud storage.
    SMB Users
    ccStatSmbUsers .1.3.6.1.4.1.32853.1.4.2.3.1.0 Sensor The total number of SMB users currently connected to the filer.
    ccStatSmbLockedFiles .1.3.6.1.4.1.32853.1.4.2.3.2.0 Sensor The total number of files locked by SMB users currently connected to the filer.
    Snapshots
    ccInfoLoSnLastGenSnapNum .1.3.6.1.4.1.32853.1.4.3.1.1.0 Sensor The reference number of the latest snapshot generated by the filer.
    ccInfoLoSnLastUploadSnapNum .1.3.6.1.4.1.32853.1.4.3.1.2.0 Sensor The reference number of the latest snapshot uploaded to cloud storage.
    ccInfoLoSnLastMasterSnap .1.3.6.1.4.1.32853.1.4.3.1.3.0 Sensor The date when the latest master snapshot was generated successfully.
    Status of snapshot synchronization from remote filers
    ccInfoReSnIdx .1.3.6.1.4.1.32853.1.4.3.2.1.1.1 Sensor Index number.
    ccInfoReSnHostname .1.3.6.1.4.1.32853.1.4.3.2.1.1.2 Sensor The remote filer’s hostname.
    ccInfoReSnLastSyncSnapNum .1.3.6.1.4.1.32853.1.4.3.2.1.1.3 Sensor The reference number of the latest snapshot synchronized from the specified remote filer.
    ccInfoReSnLastUploadSnapNum .1.3.6.1.4.1.32853.1.4.3.2.1.1.4 Sensor The reference number of the latest snapshot uploaded by the specified remote filer.
    CloudFS
    ccInfoCfsCfgIdx .1.3.6.1.4.1.32853.1.4.3.3.1.1.1 Sensor Index number.
    ccInfoCfsCfgHostname .1.3.6.1.4.1.32853.1.4.3.3.1.1.2 Sensor The hostname of the filer.
    ccInfoCfsCfgFilesystemName .1.3.6.1.4.1.32853.1.4.3.3.1.1.3 Sensor The filesystem name that the filer is hosting.
    ccInfoCfsCfgState .1.3.6.1.4.1.32853.1.4.3.3.1.1.4 Sensor The operational state of the filer.
    ccInfoCfsCfgStatus .1.3.6.1.4.1.32853.1.4.3.3.1.1.5 Sensor The status of the filer.
    Network latency from a filer to other filers in the CloudFS
    ccInfocfsLaIdx .1.3.6.1.4.1.32853.1.4.3.3.2.1.1 Sensor Index number.
    ccInfocfsLaHostname .1.3.6.1.4.1.32853.1.4.3.3.2.1.2 Sensor The hostname of the filer included in the CloudFS.
    ccInfocfsLaHelloLatency .1.3.6.1.4.1.32853.1.4.3.3.2.1.3 Sensor The network latency in milliseconds from the remote filer.
    ccInfoCfsCfgLoMode .1.3.6.1.4.1.32853.1.4.3.3.3.1.0 Sensor The configuration mode of the filer, i.e. master or subordinate.
    ccInfoCfsCfgLoCfgMaster .1.3.6.1.4.1.32853.1.4.3.3.3.2.0 Sensor The name of the master configuration filer.
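Counters from the table can be combined into derived metrics in your SNMP manager or in a script. As one possible example, the Python sketch below computes a cache hit ratio from the byte counts reported by ccStatCaCacheHits and ccStatCaCacheMissed. The function name is hypothetical, and the values are passed in as plain numbers; retrieving them over SNMP is outside the scope of this sketch.

```python
def cache_hit_ratio(hit_bytes, missed_bytes):
    """Return hits / (hits + misses), or None when there is no traffic yet."""
    total = hit_bytes + missed_bytes
    if total == 0:
        return None
    return hit_bytes / total


# Sample values standing in for ccStatCaCacheHits and ccStatCaCacheMissed.
print(cache_hit_ratio(900, 100))  # 0.9
print(cache_hit_ratio(0, 0))      # None
```

The same function applies to the pinned counters, ccStatCaPinnedHits and ccStatCaPinnedMissed.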

    Monitoring Recommendations for CPU Load

    The amount of CPU load placed on a filer is measured in terms of the number of processes that are ready to run and in the queue awaiting CPU resources. These are the recommended thresholds by model. (CPU load is measured by OID .1.3.6.1.4.1.32853.1.3.1.1.1.)

    Filer Model | Number of Processes in the Queue and Ready To Run | Recommendation
    28xx models | Above 320 | Monitor closely
    28xx models | Above 400 | Look into it
    28xx models | Above 800 | Take action
    40xx models | Above 640 | Monitor closely
    40xx models | Above 800 | Look into it
    40xx models | Above 1600 | Take action
    5100 model | Above 480 | Monitor closely
    5100 model | Above 600 | Look into it
    5100 model | Above 1200 | Take action
    5300 and 5500 models | Above 960 | Monitor closely
    5300 and 5500 models | Above 1200 | Look into it
    5300 and 5500 models | Above 2400 | Take action
    6xxx models | Above 1200 | Look into it
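The recommendations in the table can be applied to a polled CPU load value with a small lookup. In the Python sketch below, the dictionary keys and the function name are hypothetical labels. The 6xxx models are left out because the table lists only a single threshold for them.

```python
# (monitor closely, look into it, take action) thresholds from the table above.
CPU_LOAD_THRESHOLDS = {
    "28xx": (320, 400, 800),
    "40xx": (640, 800, 1600),
    "5100": (480, 600, 1200),
    "5300/5500": (960, 1200, 2400),
}


def cpu_load_recommendation(model, load):
    """Return the recommendation for a CPU load value, or None if below all thresholds."""
    monitor, look, act = CPU_LOAD_THRESHOLDS[model]
    if load > act:
        return "Take action"
    if load > look:
        return "Look into it"
    if load > monitor:
        return "Monitor closely"
    return None


print(cpu_load_recommendation("28xx", 450))   # Look into it
print(cpu_load_recommendation("40xx", 1700))  # Take action
print(cpu_load_recommendation("5100", 100))   # None
```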

     

    VM Filers

    For filers operating in virtualized environments, such as VMware ESXi or Amazon Web Services (AWS), first determine the number of CPU cores assigned to the filer. When 4 cores are assigned, use the values given for the 28xx models.

    For a larger number of cores, scale the values upward. For example, a filer operating within AWS can have 8 cores.

    To scale the values for this example, first divide the number of cores by 4 (8 / 4 = 2). Next, multiply the result by each of the values for the 28xx models. The CPU load recommendations become 640 (2 × 320), 800 (2 × 400), and 1600 (2 × 800).
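The scaling rule for VM filers can be written as a short calculation. This Python sketch uses the 28xx values as the 4-core baseline, as described above. The function name is hypothetical, and core counts that are not multiples of 4 are rounded down by the integer division, which is an assumption this document does not address.

```python
BASE_CORES = 4
BASE_THRESHOLDS = (320, 400, 800)  # 28xx values: monitor closely, look into it, take action


def scaled_thresholds(cores):
    """Scale the 28xx thresholds by cores / 4 using integer arithmetic."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return tuple(value * cores // BASE_CORES for value in BASE_THRESHOLDS)


print(scaled_thresholds(4))  # (320, 400, 800)
print(scaled_thresholds(8))  # (640, 800, 1600)
```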

    *Pertains to those running Panzura's version 8 filers