
How volume latency metrics are calculated

Hardikl edited this page Feb 3, 2025 · 7 revisions

1. How the latency counter is cooked

The perf metric values are calculated from two consecutive polls (therefore, no metrics are emitted after the first poll). The calculation algorithm depends on the property and base-counter attributes of each metric. The following properties are supported:

| property | formula | description |
|----------|---------|-------------|
| raw | x = x_i | no post-processing; value x is submitted as is |
| delta | x = x_i - x_{i-1} | delta of two poll values, x_i and x_{i-1} |
| rate | x = (x_i - x_{i-1}) / (t_i - t_{i-1}) | delta divided by the interval between the two polls, in seconds |
| average | x = (x_i - x_{i-1}) / (y_i - y_{i-1}) | delta divided by the delta of the base counter y |
| percent | x = 100 * (x_i - x_{i-1}) / (y_i - y_{i-1}) | average multiplied by 100 |
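
The five formulas can be sketched in a few lines of Python. This is only an illustration of the table, not Harvest's actual code: the function name `cook` and its parameters are hypothetical, and the sketch does not guard against a zero interval or a zero base-counter delta.

```python
def cook(prop, x_cur, x_prev, t_cur=None, t_prev=None, y_cur=None, y_prev=None):
    """Apply one post-processing formula to the values of two consecutive polls.

    Hypothetical helper for illustration only.
    x_* are counter values, t_* are poll timestamps in seconds,
    y_* are base-counter values.
    """
    if prop == "raw":
        return x_cur                      # x = x_i
    delta = x_cur - x_prev                # x_i - x_{i-1}
    if prop == "delta":
        return delta
    if prop == "rate":
        return delta / (t_cur - t_prev)   # delta per second
    if prop in ("average", "percent"):
        ratio = delta / (y_cur - y_prev)  # delta over base-counter delta
        return 100 * ratio if prop == "percent" else ratio
    raise ValueError(f"unsupported property: {prop}")
```

For example, `cook("average", 1500, 1000, y_cur=300, y_prev=200)` divides a delta of 500 by a base-counter delta of 100.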

latency_io_reqd is a special field used only for latency metrics.

| parameter | type | description | default |
|-----------|------|-------------|---------|
| latency_io_reqd | int, optional | threshold of IOPs for calculating latency metrics (latencies based on very few IOPs are unreliable) | 10 |

For latency calculation, the base-counter is mandatory. Latency falls under the average case in the table above.

Step1:

First, take the delta of the latency counter between the current poll and the previous poll.

Step2:

Take the delta of the base-counter between the current poll and the previous poll. This step differs slightly from a plain average because a threshold is also involved in the calculation.

Step3:

The optional latency_io_reqd field controls the threshold value; its default is 10.

A minimumBase value is derived from latency_io_reqd. The latency counter is processed further only if the delta of the base-counter is greater than minimumBase; otherwise it is set to 0 (zero), because latencies based on very few IOPs are unreliable.

Step4:

If the Step3 condition is fulfilled, apply the average formula (delta of latency counter / delta of base-counter) and export the latency counter.

Step5:

If the delta of the latency counter is less than 0, or the delta of the base-counter is less than 0, or in any case other than Step4, the latency counter is not exported.
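
The five steps can be sketched as a single function. This is a minimal illustration under stated assumptions, not Harvest's implementation: the name `cook_latency` is hypothetical, `minimum_base` is passed in directly because this page does not spell out how it is derived from latency_io_reqd, and `None` stands in for "not exported".

```python
from typing import Optional


def cook_latency(lat_cur: float, lat_prev: float,
                 base_cur: float, base_prev: float,
                 minimum_base: float) -> Optional[float]:
    """Hypothetical sketch of Steps 1-5 for one latency counter.

    Returns the cooked latency, 0.0 when the base-counter delta does not
    exceed the threshold, or None when a delta is negative.
    """
    lat_delta = lat_cur - lat_prev        # Step1: delta of latency counter
    base_delta = base_cur - base_prev     # Step2: delta of base-counter

    # Step5: a negative delta means the value is not exported
    if lat_delta < 0 or base_delta < 0:
        return None

    # Step3: process only when enough operations were observed
    if base_delta > minimum_base:
        return lat_delta / base_delta     # Step4: average formula

    return 0.0                            # too few IOPs to be reliable
```

Because the comparison is strict, a base-counter delta of 0 never reaches the division, even when `minimum_base` is 0.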

2. Applying the threshold to any counter whose name ends with latency and whose property is average or percent

Latency calculation falls under the average case in the table above. However, instead of the normal delta division used for average, it applies delta division with a threshold through the DivideWithThreshold function. As mentioned in section 1, the latency_io_reqd field is used to derive minimumBase, which decides whether the latency counters are processed, because latencies based on very few IOPs are unreliable.
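
The selection rule in this section's title can be sketched as a small predicate. The function name `uses_threshold` and the sample counter names are hypothetical; the rule itself (name ends with latency, property is average or percent) is taken from the heading above.

```python
def uses_threshold(counter_name: str, prop: str) -> bool:
    """Hypothetical check: does this counter get thresholded division?"""
    return counter_name.endswith("latency") and prop in ("average", "percent")


# Example: split a set of (illustrative) counters by division type
counters = {
    "read_latency": "average",
    "read_ops": "rate",
    "total_data": "average",
}
thresholded = sorted(name for name, prop in counters.items()
                     if uses_threshold(name, prop))
```

In this example only `read_latency` is selected; `total_data` has the average property but would use the normal delta division.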

This logic is used in all perf collectors:

ZapiPerf, RestPerf, KeyPerf

3. Special cases when applying the minimum number of IOPs as a threshold

There are some special cases where the minimumBase value does not help when calculating latencies: some objects, like ontaps3_svm, and a few latency metrics, like optimal_point_latency and scan_latency, have a base counter that always has only a few IOPs. For these, minimumBase is intentionally set to 0 so that processing proceeds and the latencies are calculated appropriately.
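
A minimal sketch of this exemption follows. It assumes the exemption applies when either the object or the metric is on the list, which this page does not state explicitly; the function name and the set names are hypothetical, and only the three names mentioned above are included.

```python
# Names taken from the text above; the real lists may contain more entries
EXEMPT_OBJECTS = {"ontaps3_svm"}
EXEMPT_METRICS = {"optimal_point_latency", "scan_latency"}


def minimum_base_for(object_name: str, metric_name: str,
                     default_minimum_base: float) -> float:
    """Hypothetical helper: pick the threshold for one latency metric."""
    if object_name in EXEMPT_OBJECTS or metric_name in EXEMPT_METRICS:
        return 0  # base counter always has few IOPs, so skip the threshold
    return default_minimum_base
```

With a default of 10, `minimum_base_for("ontaps3_svm", "total_latency", 10)` yields 0, while an object and metric that are not on either list keep the default.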

This logic is part of the Harvest code flow.

4. Comparison of Prometheus and/or Grafana, ONTAP CLI statistics, PAS, and System Manager screenshots for the same time period
