Specify optional Exponential Histogram Aggregation, add example code in the data model (#2252)
jmacd committed May 13, 2022
1 parent e043ee4 commit 3788987
Showing 3 changed files with 191 additions and 12 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -13,6 +13,17 @@ release.

### Metrics

- Clarify that API support for multi-instrument callbacks is permitted.
([#2263](https://github.com/open-telemetry/opentelemetry-specification/pull/2263)).
- Clarify SDK behavior when view conflicts are present
([#2462](https://github.com/open-telemetry/opentelemetry-specification/pull/2462)).
- Clarify MetricReader.Collect result
([#2495](https://github.com/open-telemetry/opentelemetry-specification/pull/2495)).
- Add database connection pool metrics semantic conventions
([#2273](https://github.com/open-telemetry/opentelemetry-specification/pull/2273)).
- Specify optional support for an Exponential Histogram Aggregation.
([#2252](https://github.com/open-telemetry/opentelemetry-specification/pull/2252))

### Logs

### Resource
77 changes: 71 additions & 6 deletions specification/metrics/datamodel.md
@@ -22,6 +22,7 @@
* [Sums](#sums)
* [Gauge](#gauge)
* [Histogram](#histogram)
+ [Histogram: Bucket inclusivity](#histogram-bucket-inclusivity)
* [ExponentialHistogram](#exponentialhistogram)
+ [Exponential Scale](#exponential-scale)
+ [Exponential Buckets](#exponential-buckets)
@@ -33,6 +34,7 @@
- [Positive Scale: Use a Lookup Table](#positive-scale-use-a-lookup-table)
+ [ExponentialHistogram: Producer Recommendations](#exponentialhistogram-producer-recommendations)
+ [ExponentialHistogram: Consumer Recommendations](#exponentialhistogram-consumer-recommendations)
+ [ExponentialHistogram: Bucket inclusivity](#exponentialhistogram-bucket-inclusivity)
* [Summary (Legacy)](#summary-legacy)
- [Exemplars](#exemplars)
- [Single-Writer](#single-writer)
@@ -522,6 +524,8 @@ Bucket counts are optional. A Histogram without buckets conveys a
population in terms of only the sum and count, and may be interpreted
as a histogram with single bucket covering `(-Inf, +Inf)`.

#### Histogram: Bucket inclusivity

Bucket upper-bounds are inclusive (except for the case where the
upper-bound is +Inf) while bucket lower-bounds are exclusive. That is,
buckets express the number of values that are greater than their lower
@@ -716,6 +720,21 @@ func GetExponent(value float64) int32 {
}
```

Implementations are permitted to round subnormal values up to the
smallest normal value, which may permit the use of a built-in function:

```golang

func GetExponent(value float64) int {
	// Note: Frexp() rounds subnormal values to the smallest normal
	// value and returns an exponent corresponding to fractions in the
	// range [0.5, 1), whereas we want [1, 2), so subtract 1 from the
	// exponent.
	_, exp := math.Frexp(value)
	return exp - 1
}
```

##### Negative Scale: Extract and Shift the Exponent

For negative scales, the index of a value equals the normalized
@@ -727,19 +746,59 @@ correct rounding for the negative indices. This may be written as:
return GetExponent(value) >> -scale
```

The reverse mapping function is:

```golang
return math.Ldexp(1, index << -scale)
```

Note that the reverse mapping function is expected to produce
subnormal values even when the mapping function rounds them into
normal values, since the lower boundary of the bucket containing the
smallest normal value may be subnormal. For example, at scale -4 the
smallest normal value `0x1p-1022` falls into a bucket with lower
boundary `0x1p-1024`.

##### All Scales: Use the Logarithm Function

For any scale, the built-in natural logarithm function can be used to
compute the bucket index. A multiplicative factor equal to `2**scale
/ ln(2)` proves useful (where `ln()` is the natural logarithm), for
example:

```golang
scaleFactor := math.Ldexp(math.Log2E, scale)
return math.Floor(math.Log(value) * scaleFactor)
```

Note that in the example Golang code above, the built-in `math.Log2E`
is defined as the inverse of the natural logarithm of 2, i.e., `1 / ln(2)`.
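
For illustration, the logarithm-based mapping might be written as a
complete function with the integer conversion made explicit. This is a
minimal sketch: the name `indexByLogarithm` is hypothetical, and the
input is assumed to be a positive, finite value that is not close
enough to a bucket boundary for rounding error to matter.

```go
package main

import "math"

// indexByLogarithm maps a positive, finite value to a bucket index at
// the given scale using the natural logarithm.
func indexByLogarithm(value float64, scale int) int {
	// scaleFactor equals 2**scale / ln(2).
	scaleFactor := math.Ldexp(math.Log2E, scale)
	return int(math.Floor(math.Log(value) * scaleFactor))
}
```

For example, `indexByLogarithm(10, 3)` evaluates to `26`, since
`log2(10) * 2**3` is approximately 26.58.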

The reverse mapping function is:

```golang
inverseFactor := math.Ldexp(math.Ln2, -scale)
return math.Exp(index * inverseFactor), nil
```

Implementations are expected to verify that their mapping function and
inverse mapping function are correct near the lowest and highest IEEE
floating point values. A mathematically correct formula may produce
the wrong result because of accumulated floating point calculation error
or underflow/overflow of intermediate results. In the Golang
reference implementation, for example, the above formula computes
`+Inf` for the maximum-index bucket. In this case, it is appropriate
to subtract `1<<scale` from the index and multiply the result by `2`.

```golang
// Use this form in case the equation above computes +Inf
// as the lower boundary of a valid bucket.
inverseFactor := math.Ldexp(math.Ln2, -scale)
return 2.0 * math.Exp((index - (1 << scale)) * inverseFactor), nil
```

*Note that floating-point to integer type conversions have been
omitted from the code fragments above, to improve readability.*

##### Positive Scale: Use a Lookup Table

@@ -781,6 +840,12 @@ bucket indices that overflow or underflow this representation.
Consumers that reject such data SHOULD warn the user through error
logging that out-of-range data was received.

#### ExponentialHistogram: Bucket inclusivity

The [specification on bucket inclusivity made for explicit-boundary
Histogram data](#histogram-bucket-inclusivity) applies equally to
ExponentialHistogram data.

### Summary (Legacy)

[Summary](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L268)
115 changes: 109 additions & 6 deletions specification/metrics/sdk.md
@@ -17,7 +17,13 @@
+ [Default Aggregation](#default-aggregation)
+ [Sum Aggregation](#sum-aggregation)
+ [Last Value Aggregation](#last-value-aggregation)
- [Histogram Aggregation common behavior](#histogram-aggregation-common-behavior)
+ [Explicit Bucket Histogram Aggregation](#explicit-bucket-histogram-aggregation)
+ [Exponential Histogram Aggregation](#exponential-histogram-aggregation)
- [Exponential Histogram Aggregation: Handle all normal values](#exponential-histogram-aggregation-handle-all-normal-values)
- [Exponential Histogram Aggregation: Support a minimum and maximum scale](#exponential-histogram-aggregation-support-a-minimum-and-maximum-scale)
- [Exponential Histogram Aggregation: Use the maximum scale for single measurements](#exponential-histogram-aggregation-use-the-maximum-scale-for-single-measurements)
- [Exponential Histogram Aggregation: Maintain the ideal scale](#exponential-histogram-aggregation-maintain-the-ideal-scale)
* [Observations inside asynchronous callbacks](#observations-inside-asynchronous-callbacks)
* [Resolving duplicate instrument registration conflicts](#resolving-duplicate-instrument-registration-conflicts)
- [Attribute limits](#attribute-limits)
@@ -339,6 +345,10 @@ The SDK MUST provide the following `Aggregation` to support the
- [Last Value](./sdk.md#last-value-aggregation)
- [Explicit Bucket Histogram](./sdk.md#explicit-bucket-histogram-aggregation)

The SDK MAY provide the following `Aggregation`:

- [Exponential Histogram Aggregation](./sdk.md#exponential-histogram-aggregation)

#### Drop Aggregation

The Drop Aggregation informs the SDK to ignore/drop all Instrument Measurements
@@ -397,6 +407,16 @@ This Aggregation informs the SDK to collect:
- The last `Measurement`.
- The timestamp of the last `Measurement`.

##### Histogram Aggregation common behavior

All histogram Aggregations inform the SDK to collect:

- Count of `Measurement` values in population.
- Arithmetic sum of `Measurement` values in population. This SHOULD NOT be collected when used with
instruments that record negative measurements (e.g. `UpDownCounter` or `ObservableGauge`).
- Min (optional) `Measurement` value in population.
- Max (optional) `Measurement` value in population.

#### Explicit Bucket Histogram Aggregation

The Explicit Bucket Histogram Aggregation informs the SDK to collect data for
@@ -410,13 +430,96 @@ This Aggregation honors the following configuration parameters:
| Boundaries | double\[\] | [ 0, 5, 10, 25, 50, 75, 100, 250, 500, 1000 ] | Array of increasing values representing explicit bucket boundary values.<br><br>The Default Value represents the following buckets:<br>(-&infin;, 0], (0, 5.0], (5.0, 10.0], (10.0, 25.0], (25.0, 50.0], (50.0, 75.0], (75.0, 100.0], (100.0, 250.0], (250.0, 500.0], (500.0, 1000.0], (1000.0, +&infin;) |
| RecordMinMax | true, false | true | Whether to record min and max. |

Explicit buckets are stated in terms of their upper boundary. Buckets
are exclusive of their lower boundary and inclusive of their upper
bound (except at positive infinity). A measurement is defined to fall
into the lowest-numbered bucket whose upper boundary is greater than
or equal to the measurement.

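
As an illustration of how a measurement can be assigned to an explicit
bucket, the sketch below uses a binary search over the configured
boundaries. The function name `bucketIndex` is hypothetical. It assumes
the boundaries are sorted in increasing order, that bucket `i` has
`boundaries[i]` as its inclusive upper boundary, and that the final
bucket (index `len(boundaries)`) extends to positive infinity, as in
the default buckets listed in the table above.

```go
package main

import "sort"

// bucketIndex returns the index of the bucket that value falls into.
// sort.SearchFloat64s returns the smallest i such that
// boundaries[i] >= value, or len(boundaries) if there is none, which
// matches upper-inclusive, lower-exclusive buckets.
func bucketIndex(boundaries []float64, value float64) int {
	return sort.SearchFloat64s(boundaries, value)
}
```

With the default boundaries, a measurement of `5` lands in bucket 1,
`(0, 5.0]`, while `5.1` lands in bucket 2, `(5.0, 10.0]`.
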
#### Exponential Histogram Aggregation

The Exponential Histogram Aggregation informs the SDK to collect data
for the [Exponential Histogram Metric
Point](./datamodel.md#exponentialhistogram), which uses an exponential
formula to determine bucket boundaries and an integer `scale`
parameter to control resolution.

Scale is not a configurable property of this Aggregation; the
implementation will adjust it as necessary given the data. This
Aggregation honors the following configuration parameter:

| Key | Value | Default Value | Description |
|---------|---------|---------------|--------------------------------------------------------------------------------------------------------------|
| MaxSize | integer | 160 | Maximum number of buckets in each of the positive and negative ranges, not counting the special zero bucket. |

The default of 160 buckets is selected to establish default support
for a high-resolution histogram able to cover a long-tail latency
distribution from 1ms to 100s with less than 5% relative error.
Because 160 can be factored into `10 * 2**K`, maximum contrast is
relatively simple to derive for scale `K`:

| Scale | Maximum data contrast at `10 * 2**K` buckets |
|-------|----------------------------------------------|
| K+2   | 5.657 (`2**(10/4)`)                          |
| K+1   | 32 (`2**(10/2)`)                             |
| K     | 1024 (`2**10`)                               |
| K-1   | 1048576 (`2**20`)                            |

The following table shows how the ideal scale for 160 buckets is
calculated as a function of the input range:

| Input range | Contrast | Ideal Scale | Base | Relative error |
|-------------|----------|-------------|----------|----------------|
| 1ms - 4ms | 4 | 6 | 1.010889 | 0.542% |
| 1ms - 20ms | 20 | 5 | 1.021897 | 1.083% |
| 1ms - 1s | 10**3 | 4 | 1.044274 | 2.166% |
| 1ms - 100s | 10**5 | 3 | 1.090508 | 4.329% |
| 1μs - 10s | 10**7 | 2 | 1.189207 | 8.643% |

Note that relative error is calculated as half of the bucket width
divided by the bucket midpoint, which is the same in every bucket.
Using the bucket from [1, base), we have `(bucketWidth / 2) /
bucketMidpoint = ((base - 1) / 2) / ((base + 1) / 2) = (base - 1) /
(base + 1)`.

This Aggregation uses the notion of "ideal" scale. The ideal scale is
either:

1. The maximum supported scale, generally used for single-value histogram Aggregations where scale is not otherwise constrained.
2. The largest value of scale such that no more than the maximum number of buckets are needed to represent the full range of input data in either of the positive or negative ranges.

##### Exponential Histogram Aggregation: Handle all normal values

Implementations are REQUIRED to accept the entire normal range of IEEE
floating point values (i.e., all values except for +Inf, -Inf and NaN
values).

Implementations SHOULD NOT incorporate non-normal values (i.e., +Inf,
-Inf, and NaNs) into the `sum`, `min`, and `max` fields, because these
values do not map into a valid bucket.

Implementations MAY round subnormal values away from zero to the
nearest normal value.

##### Exponential Histogram Aggregation: Support a minimum and maximum scale

The implementation MUST maintain reasonable minimum and maximum scale
parameters that the automatic scale parameter will not exceed.

##### Exponential Histogram Aggregation: Use the maximum scale for single measurements

When the histogram contains no more than one value in either of the
positive or negative ranges, the implementation SHOULD use the maximum
scale.

##### Exponential Histogram Aggregation: Maintain the ideal scale

Implementations SHOULD adjust the histogram scale as necessary to
maintain the best resolution possible, within the constraint of
maximum size (max number of buckets). Best resolution (highest scale)
is achieved when the number of positive or negative range buckets
exceeds half the maximum size, such that increasing scale by one would
not be possible given the size constraint.
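
As a sketch of how an implementation might decide how far to lower the
scale when a new measurement does not fit, the function below counts
the number of scale reductions needed for a range of bucket indices to
fit within the maximum size. The name `scaleReduction` is hypothetical.
The sketch relies on the property that lowering the scale by one maps
each bucket index to that index shifted right by one bit.

```go
package main

// scaleReduction returns how many times the scale must be decreased so
// that the buckets from lowIndex through highIndex (inclusive) number
// no more than maxSize. Assumes lowIndex <= highIndex and maxSize >= 1.
func scaleReduction(lowIndex, highIndex, maxSize int) int {
	change := 0
	for highIndex-lowIndex >= maxSize {
		// Halving the resolution merges adjacent pairs of buckets.
		lowIndex >>= 1
		highIndex >>= 1
		change++
	}
	return change
}
```

For example, with a maximum size of 160, indices spanning `-200`
through `200` require the scale to be lowered by 2, after which the
range covers indices `-50` through `50`.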

### Observations inside asynchronous callbacks
