Panzura Data Services Pulse

Pulse gathers operational performance information about Panzura CloudFS and CloudFS nodes and presents it in an at-a-glance format.

Pulse receives operational performance metrics from all connected Panzura nodes, every five minutes, and presents them as follows:

  • Overview: which presents a combination of the most important metrics for the best- or worst-performing nodes
  • System view: which presents information mainly focused on hardware metrics
  • Storage view: which provides a variety of information about the local and Cloud storage
  • Cloud view: which provides metrics related to cloud communication, inter-node communication, and snapshots
  • Events view: which is dedicated to Panzura node events and their statistics

Gathering performance metrics for all nodes is on by default.

Pulse ingests data from all nodes within your CloudFS deployment. However, it presents the data for only one CloudFS deployment at a time. If you run multiple CloudFS deployments, first select the CloudFS to display.

data-services_selectcloudfs

Optional filters include setting the time range and selecting specific nodes to display, as shown in the two images below.

data-services_pulsedaterange

data-services_pulsenodeselect

Pulse: Overview

The Overview provides a panel of the highest (and, when relevant, lowest) performing nodes in a cluster, shown side by side in the following charts for comparison. The charts in this view are not historical; they are based on the most up-to-date data that Data Services has received from each node.

At the top right, the view displays the number of active nodes among those chosen in the node selection menu. To its right, it displays the number of active users (i.e. the number of active SMB connections) for the selected nodes.

The rest of the Overview view presents the following charts and tables:

  • Top nodes - CPU Utilization: which shows the nodes with the highest CPU utilization in the cluster, ordered by CPU utilization percentage
  • Top nodes - Memory Utilization: which shows the nodes with the highest memory utilization in the cluster, ordered by memory utilization percentage
  • Top nodes - Local Disk Usage: which shows the nodes with the highest local disk usage in the cluster, ordered by local disk usage percentage
  • Lowest nodes - Cache Hit Ratio: which shows the nodes with the lowest cache hit ratio in the cluster, ordered by cache hit ratio percentage
  • Top nodes - LAN Upload: which shows the nodes with the highest LAN upload in the cluster, ordered by upload data volume
  • Top nodes - LAN Download: which shows the nodes with the highest LAN download in the cluster, ordered by download data volume
  • Top nodes - WAN Upload: which shows the nodes with the highest WAN upload in the cluster, ordered by upload data volume
  • Top nodes - WAN Download: which shows the nodes with the highest WAN download in the cluster, ordered by download data volume
  • Top nodes - Snapshots Behind: which shows the nodes in the cluster that are the most snapshots behind in synchronization, ordered by the number of snapshots behind
  • Node List: which is a table that shows the nodes' operational summaries, as follows:
    • Node: which shows the node name
    • Version: which shows the CloudFS version the node is running
    • Last Active: which shows the last timestamp at which Data Services communicated with the node
    • Uptime (YY:MM:DD:hh:mm:ss): which shows how long the node has been running
    • Cores: which shows the number of CPU cores the node has access to
    • RAM: which shows the amount of memory the node is operating with
    • SMB Count: which shows the number of SMB connections to the node

Pulse: System

The System view of the Pulse service focuses on system and hardware performance metrics of the nodes in the cluster. As in the other views, users can select the nodes they are interested in through the dropdown menu at the top.

The charts and tables in the System view are as follows:

  • CPU Utilization: which shows the CPU utilization percentages for the selected nodes over the selected time period
  • CPU Load Peak: which shows the average peak CPU load per minute for the selected nodes over the selected time period
  • Memory Utilization: which shows the memory utilization, as both percentage and size, for the selected nodes over the selected time period
  • Bandwidth Limit: which shows the maximum upload and download bandwidths for the selected nodes
  • Network Traffic: which shows the data rate of incoming and outgoing traffic for the selected nodes, for both LAN and WAN, over the selected time period
  • SMB Count: which shows the number of SMB connections for the selected nodes over the selected time period, both as a graph and as a table. The table data can be exported for external use as Raw data or Formatted
  • Disk I/O Stat: which shows the rate of disk I/O operations, for both read and write, for the selected nodes over the selected time period

data-services_pulsesystem

Pulse: Storage

The Storage view of the Pulse service mainly provides charts and tables on the volumes of stored data and the cache statistics of the nodes in the cluster. The charts and tables in the Storage view are as follows:

  • Cache Statistics: which is a table that lists the most recent sizes of the local disk and the various types of cache for the selected nodes. The table data can be exported for external use as Raw data or Formatted
  • Local Disk: which shows the proportion of used versus unused local disk for the selected nodes, as a percentage. Clicking a chart also shows these values as volumes
  • Cloud Disk: which shows the proportion of used versus unused Cloud disk for the selected nodes, as a percentage. Clicking a chart also shows these values as volumes
  • Metadata Storage: which shows the proportion of the whole local disk used by metadata for the selected nodes, as a percentage. Clicking a chart also shows these values as volumes
  • Metadata Storage Utilization: which shows the metadata usage of the local disk for the selected nodes, in terms of volume and percentage, over the selected time period
  • Managed Capacity Usage: which shows the usage of the licensed Managed Capacity by the selected nodes, in terms of volume and percentage, over the selected time period
  • Working Cache: which shows the usage of the Total Cache, Total Pinned Cache, and Dirty Cache for the selected nodes, in terms of volume and percentage, over the selected time period
  • Cached Usage: which shows the Cold Cache, Warm Cache, and Hot Cache for the selected nodes, in terms of volume and percentage, over the selected time period
  • Cache Hit: which shows the volume of the Hit Cache over the selected time period
  • Cache Missed: which shows the volume of the Missed Cache over the selected time period
  • Cache Hit Ratio: which shows the Cache Hit Ratio for the selected nodes, as a percentage, over the selected time period. The Cache Hit Ratio is defined as the Cache Hit volume divided by the sum of the Cache Hit volume and the Cache Missed volume
  • Cache Stats: which shows the volume of the Cached, Deduped, Evicted, and Freed data over the selected time period
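
The Cache Hit Ratio definition above can be illustrated with a short calculation. The sketch below is only a worked example of the stated formula, not Pulse's actual implementation: the function name, the parameter names, and the choice to report 0% when no cache traffic was recorded are all assumptions made for illustration.

```python
def cache_hit_ratio(hit_volume: float, missed_volume: float) -> float:
    """Return the cache hit ratio as a percentage.

    Follows the definition in the text:
    Cache Hit volume / (Cache Hit volume + Cache Missed volume).
    Both volumes must use the same unit (for example, bytes).
    """
    total = hit_volume + missed_volume
    if total == 0:
        # Assumption: with no cache traffic at all, report 0% rather than fail.
        return 0.0
    return 100.0 * hit_volume / total


# Hypothetical sample: 750 units served from cache, 250 units missed.
print(cache_hit_ratio(750, 250))  # 75.0
```

A node that appears in the Lowest nodes - Cache Hit Ratio chart of the Overview would be one for which this value is small compared with the other nodes.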

data-services_pulsestorage

Pulse: Cloud

The Cloud view of the Pulse service provides data and statistics on the nodes' cloud and one-to-one (inter-node) communications and on their snapshot status. The Cloud view presents the following charts and tables:

  • Cloud Upload/Download Failure Rate: which shows the percentage of upload and download failures for the selected nodes over the selected time period, for both Cloud Service Providers (CSPs) and Cloud Mirror Providers (CMPs)
  • Site To Site Latency: which shows the latency of communications between nodes, for the selected nodes, over the selected time period. A specific destination node can be chosen as a filter in the node2 dropdown menu
  • Snap Sync/min: which shows the number of snapshots by which the selected nodes are behind in synchronizing with each other, over the selected time period
  • Current Snap Status: which is a table that shows the snapshot synchronization status of each selected node individually, for the selected time period

data-services_pulsecloud

Pulse: Events

This dashboard relies on CloudFS node event logs. In this view, in addition to selecting nodes, users can select the Event Level in order to limit the metrics and statistics to specific event severities.

In addition to the number of active nodes and the total number of events, the Events view presents the following charts and tables:

  • Event Heatmap: which is a world map that shows the events that occurred by location, color-coded by severity according to the Event Level, for the selected nodes
  • Event Time Count: which shows the number of events over time for the selected time period and the selected nodes
  • Event Per Level: which shows the number and percentage of events that occurred at each level, in a pie chart
  • Event Per node: which shows the number and level of events for each of the selected nodes
  • Events: which is a table that lists the events, their timestamps, the source nodes, the event level, and the event message for the selected nodes