Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The attribute.Set map key values are not always comparisons of value #3702

Closed
MrAlias opened this issue Feb 9, 2023 · 7 comments
Closed
Assignees
Labels
bug Something isn't working

Comments

@MrAlias
Copy link
Contributor

MrAlias commented Feb 9, 2023

Description

The attribute.Set type is used as a key for many maps in the metric SDK.

The attribute.Set is syntactically comparable, but for sets that contain slice values it does not compare the underlying value.

Steps To Reproduce

go.mod:

module testing/attrcmp

go 1.19

require (
	github.com/stretchr/testify v1.7.1
	go.opentelemetry.io/otel v1.9.0
)

require (
	github.com/davecgh/go-spew v1.1.0 // indirect
	github.com/pmezard/go-difflib v1.0.0 // indirect
	gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c // indirect
)
import (
	"strings"
	"testing"

	"github.com/stretchr/testify/assert"
	"go.opentelemetry.io/otel/attribute"
)

const key = "key"

type name []string

func (n name) String() string { return strings.Join(n, " ") }

func TestMapComparison(t *testing.T) {
	m := map[attribute.Set]bool{
		attribute.NewSet(attribute.Bool(key, true)):                                 true,
		attribute.NewSet(attribute.BoolSlice(key, []bool{true})):                    true,
		attribute.NewSet(attribute.Int(key, 1)):                                     true,
		attribute.NewSet(attribute.IntSlice(key, []int{1})):                         true,
		attribute.NewSet(attribute.Int64(key, 2)):                                   true,
		attribute.NewSet(attribute.Int64Slice(key, []int64{2})):                     true,
		attribute.NewSet(attribute.Float64(key, 2)):                                 true,
		attribute.NewSet(attribute.Float64Slice(key, []float64{2})):                 true,
		attribute.NewSet(attribute.String(key, "val")):                              true,
		attribute.NewSet(attribute.StringSlice(key, []string{"val"})):               true,
		attribute.NewSet(attribute.Stringer(key, name([]string{"Alice", "Apple"}))): true,
	}

	assert.True(t, m[attribute.NewSet(attribute.Bool(key, true))], "invalid bool lookup")
	assert.True(t, m[attribute.NewSet(attribute.BoolSlice(key, []bool{true}))], "invalid []bool lookup")
	assert.True(t, m[attribute.NewSet(attribute.Int(key, 1))], "invalid int lookup")
	assert.True(t, m[attribute.NewSet(attribute.IntSlice(key, []int{1}))], "invalid []int lookup")
	assert.True(t, m[attribute.NewSet(attribute.Int64(key, 2))], "invalid int64 lookup")
	assert.True(t, m[attribute.NewSet(attribute.Int64Slice(key, []int64{2}))], "invalid []int64 lookup")
	assert.True(t, m[attribute.NewSet(attribute.Float64(key, 2))], "invalid float64 lookup")
	assert.True(t, m[attribute.NewSet(attribute.Float64Slice(key, []float64{2}))], "invalid []float64 lookup")
	assert.True(t, m[attribute.NewSet(attribute.String(key, "val"))], "invalid string lookup")
	assert.True(t, m[attribute.NewSet(attribute.StringSlice(key, []string{"val"}))], "invalid []string lookup")
	assert.True(t, m[attribute.NewSet(attribute.Stringer(key, name([]string{"Alice", "Apple"})))], "invalid Stringer lookup")
}
$ go test ./...
--- FAIL: TestMapComparison (0.00s)
    cmp_test.go:46:
        	Error Trace:	cmp_test.go:46
        	Error:      	Should be true
        	Test:       	TestMapComparison
        	Messages:   	invalid []bool lookup
    cmp_test.go:48:
        	Error Trace:	cmp_test.go:48
        	Error:      	Should be true
        	Test:       	TestMapComparison
        	Messages:   	invalid []int lookup
    cmp_test.go:50:
        	Error Trace:	cmp_test.go:50
        	Error:      	Should be true
        	Test:       	TestMapComparison
        	Messages:   	invalid []int64 lookup
    cmp_test.go:52:
        	Error Trace:	cmp_test.go:52
        	Error:      	Should be true
        	Test:       	TestMapComparison
        	Messages:   	invalid []float64 lookup
    cmp_test.go:54:
        	Error Trace:	cmp_test.go:54
        	Error:      	Should be true
        	Test:       	TestMapComparison
        	Messages:   	invalid []string lookup
FAIL
FAIL	testing/attrcmp	0.003s
FAIL

Expected behavior

The provided test should pass.

@MrAlias MrAlias added the bug Something isn't working label Feb 9, 2023
@MrAlias MrAlias added this to the Metric v0.37.0 milestone Feb 9, 2023
@MrAlias
Copy link
Contributor Author

MrAlias commented Feb 9, 2023

A proposal from today's SIG meeting is to look into having a set return a "hash". That hash could generated by value and be used as a map key.

Also, our attributes should be converting the slice types to arrays which should compare values. However, it is storing them all as an interface which is likely the reason the reference is compared.

slice interface{}

An investigation into possibly storing this differently might help resolve the issue.

@MrAlias MrAlias self-assigned this Feb 10, 2023
@MrAlias
Copy link
Contributor Author

MrAlias commented Feb 10, 2023

PoC hashing function for attribute sets using built-in fnv hash:

import (
	"hash/fnv"
	"math"

	"go.opentelemetry.io/otel/attribute"
)

func hash(s attribute.Set) uint64 {
	h := fnv.New64a()
	iter := s.Iter()
	for iter.Next() {
		a := iter.Attribute()

		_, _ = h.Write([]byte(a.Key))

		val := a.Value
		switch val.Type() {
		case attribute.BOOL:
			b := boolBytes(val.AsBool())
			_, _ = h.Write(b[:])
		case attribute.BOOLSLICE:
			for _, v := range val.AsBoolSlice() {
				b := boolBytes(v)
				_, _ = h.Write(b[:])
			}
		case attribute.INT64:
			b := intBytes(val.AsInt64())
			_, _ = h.Write(b[:])
		case attribute.INT64SLICE:
			for _, v := range val.AsInt64Slice() {
				b := intBytes(v)
				_, _ = h.Write(b[:])
			}
		case attribute.FLOAT64:
			b := float64Bytes(val.AsFloat64())
			_, _ = h.Write(b[:])
		case attribute.FLOAT64SLICE:
			for _, v := range val.AsFloat64Slice() {
				b := float64Bytes(v)
				_, _ = h.Write(b[:])
			}
		case attribute.STRING:
			_, _ = h.Write([]byte(val.AsString()))
		case attribute.STRINGSLICE:
			for _, v := range val.AsStringSlice() {
				_, _ = h.Write([]byte(v))
			}
		}
	}
	return h.Sum64()
}

func boolBytes(val bool) [1]byte {
	if val {
		return [1]byte{1}
	}
	return [1]byte{0}
}

func intBytes[N int64 | uint64](val N) [8]byte {
	// Used for hashing, endianness doesn't matter.
	return [8]byte{
		byte(0xff & val),
		byte(0xff & (val >> 8)),
		byte(0xff & (val >> 16)),
		byte(0xff & (val >> 24)),
		byte(0xff & (val >> 32)),
		byte(0xff & (val >> 40)),
		byte(0xff & (val >> 48)),
		byte(0xff & (val >> 56)),
	}
}

func float64Bytes(val float64) [8]byte {
	return intBytes(math.Float64bits(val))
}

tests/benchmarks:

import (
	"testing"

	"github.com/stretchr/testify/assert"
	"go.opentelemetry.io/otel/attribute"
)

var tests = []struct {
	name       string
	attr       attribute.Set
	attrPrime  attribute.Set
	attrAltKey attribute.Set
	attrAltVal attribute.Set
}{
	{
		name:       "Bool",
		attr:       attribute.NewSet(attribute.Bool("k", true)),
		attrPrime:  attribute.NewSet(attribute.Bool("k", true)),
		attrAltKey: attribute.NewSet(attribute.Bool("j", true)),
		attrAltVal: attribute.NewSet(attribute.Bool("k", false)),
	},
	{
		name:       "BoolSlice",
		attr:       attribute.NewSet(attribute.BoolSlice("k", []bool{true, false, true})),
		attrPrime:  attribute.NewSet(attribute.BoolSlice("k", []bool{true, false, true})),
		attrAltKey: attribute.NewSet(attribute.BoolSlice("j", []bool{true, false, true})),
		attrAltVal: attribute.NewSet(attribute.BoolSlice("k", []bool{false, true, true})),
	},
	{
		name:       "Int",
		attr:       attribute.NewSet(attribute.Int("k", 1)),
		attrPrime:  attribute.NewSet(attribute.Int("k", 1)),
		attrAltKey: attribute.NewSet(attribute.Int("j", 1)),
		attrAltVal: attribute.NewSet(attribute.Int("k", -1)),
	},
	{
		name:       "IntSlice",
		attr:       attribute.NewSet(attribute.IntSlice("k", []int{-3, 0, 1})),
		attrPrime:  attribute.NewSet(attribute.IntSlice("k", []int{-3, 0, 1})),
		attrAltKey: attribute.NewSet(attribute.IntSlice("j", []int{-3, 0, 1})),
		attrAltVal: attribute.NewSet(attribute.IntSlice("k", []int{-4, 0, 2})),
	},
	{
		name:       "Int64",
		attr:       attribute.NewSet(attribute.Int64("k", -1)),
		attrPrime:  attribute.NewSet(attribute.Int64("k", -1)),
		attrAltKey: attribute.NewSet(attribute.Int64("j", -1)),
		attrAltVal: attribute.NewSet(attribute.Int64("k", 1)),
	},
	{
		name:       "Int64Slice",
		attr:       attribute.NewSet(attribute.Int64Slice("k", []int64{-4, 0, 2})),
		attrPrime:  attribute.NewSet(attribute.Int64Slice("k", []int64{-4, 0, 2})),
		attrAltKey: attribute.NewSet(attribute.Int64Slice("j", []int64{-4, 0, 2})),
		attrAltVal: attribute.NewSet(attribute.Int64Slice("k", []int64{-3, 0, 1})),
	},
	{
		name:       "Float64",
		attr:       attribute.NewSet(attribute.Float64("k", 2.3)),
		attrPrime:  attribute.NewSet(attribute.Float64("k", 2.3)),
		attrAltKey: attribute.NewSet(attribute.Float64("j", 2.3)),
		attrAltVal: attribute.NewSet(attribute.Float64("k", -1e3)),
	},
	{
		name:       "Float64Slice",
		attr:       attribute.NewSet(attribute.Float64Slice("k", []float64{-1.2, 0.32, 1e9})),
		attrPrime:  attribute.NewSet(attribute.Float64Slice("k", []float64{-1.2, 0.32, 1e9})),
		attrAltKey: attribute.NewSet(attribute.Float64Slice("j", []float64{-1.2, 0.32, 1e9})),
		attrAltVal: attribute.NewSet(attribute.Float64Slice("k", []float64{-1.2, 1e9, 0.32})),
	},
	{
		name:       "String",
		attr:       attribute.NewSet(attribute.String("k", "val")),
		attrPrime:  attribute.NewSet(attribute.String("k", "val")),
		attrAltKey: attribute.NewSet(attribute.String("j", "val")),
		attrAltVal: attribute.NewSet(attribute.String("k", "alt")),
	},
	{
		name:       "StringSlice",
		attr:       attribute.NewSet(attribute.StringSlice("k", []string{"zero", "one", ""})),
		attrPrime:  attribute.NewSet(attribute.StringSlice("k", []string{"zero", "one", ""})),
		attrAltKey: attribute.NewSet(attribute.StringSlice("j", []string{"zero", "one", ""})),
		attrAltVal: attribute.NewSet(attribute.StringSlice("k", []string{"", "one", "zero"})),
	},
	{
		name: "All",
		attr: attribute.NewSet(
			attribute.Bool("k", true),
			attribute.BoolSlice("k", []bool{true, false, true}),
			attribute.Int("k", 1),
			attribute.IntSlice("k", []int{-3, 0, 1}),
			attribute.Int64("k", -1),
			attribute.Int64Slice("k", []int64{-4, 0, 2}),
			attribute.Float64("k", 2.3),
			attribute.Float64Slice("k", []float64{-1.2, 0.32, 1e9}),
			attribute.String("k", "val"),
			attribute.StringSlice("k", []string{"zero", "one", ""}),
		),
		attrPrime: attribute.NewSet(
			attribute.Bool("k", true),
			attribute.BoolSlice("k", []bool{true, false, true}),
			attribute.Int("k", 1),
			attribute.IntSlice("k", []int{-3, 0, 1}),
			attribute.Int64("k", -1),
			attribute.Int64Slice("k", []int64{-4, 0, 2}),
			attribute.Float64("k", 2.3),
			attribute.Float64Slice("k", []float64{-1.2, 0.32, 1e9}),
			attribute.String("k", "val"),
			attribute.StringSlice("k", []string{"zero", "one", ""}),
		),
		attrAltKey: attribute.NewSet(
			attribute.Bool("j", true),
			attribute.BoolSlice("j", []bool{true, false, true}),
			attribute.Int("j", 1),
			attribute.IntSlice("j", []int{-3, 0, 1}),
			attribute.Int64("j", -1),
			attribute.Int64Slice("j", []int64{-4, 0, 2}),
			attribute.Float64("j", 2.3),
			attribute.Float64Slice("j", []float64{-1.2, 0.32, 1e9}),
			attribute.String("j", "val"),
			attribute.StringSlice("j", []string{"zero", "one", ""}),
		),
		attrAltVal: attribute.NewSet(
			attribute.Bool("k", false),
			attribute.BoolSlice("k", []bool{false, true, true}),
			attribute.Int("k", -1),
			attribute.IntSlice("k", []int{-4, 0, 2}),
			attribute.Int64("k", 1),
			attribute.Int64Slice("k", []int64{-3, 0, 1}),
			attribute.Float64("k", -1e3),
			attribute.Float64Slice("k", []float64{-1.2, 1e9, 0.32}),
			attribute.String("k", "alt"),
			attribute.StringSlice("k", []string{"", "one", "zero"}),
		),
	},
}

func TestHash(t *testing.T) {
	for _, test := range tests {
		t.Run(test.name, testHash(test.attr, test.attrPrime, test.attrAltKey, test.attrAltVal))
	}
}

func testHash(a, ap, ak, av attribute.Set) func(*testing.T) {
	return func(t *testing.T) {
		h := hash(a)
		assert.Equal(t, h, hash(ap), "same keys/values")
		assert.NotEqual(t, h, hash(ak), "different keys")
		assert.NotEqual(t, h, hash(av), "different values")
	}
}

var benchmarks = []struct {
	name string
	attr attribute.Set
}{
	{name: "Bool", attr: attribute.NewSet(attribute.Bool("k", true))},
	{name: "BoolSlice", attr: attribute.NewSet(attribute.BoolSlice("k", []bool{true, false, true}))},
	{name: "Int", attr: attribute.NewSet(attribute.Int("k", 1))},
	{name: "IntSlice", attr: attribute.NewSet(attribute.IntSlice("k", []int{-3, 0, 1}))},
	{name: "Int64", attr: attribute.NewSet(attribute.Int64("k", -1))},
	{name: "Int64Slice", attr: attribute.NewSet(attribute.Int64Slice("k", []int64{-4, 0, 2}))},
	{name: "Float64", attr: attribute.NewSet(attribute.Float64("k", 2.3))},
	{name: "Float64Slice", attr: attribute.NewSet(attribute.Float64Slice("k", []float64{-1.2, 0.32, 1e9}))},
	{name: "String", attr: attribute.NewSet(attribute.String("k", "val"))},
	{name: "StringSlice", attr: attribute.NewSet(attribute.StringSlice("k", []string{"zero", "one", ""}))},
	{
		name: "All",
		attr: attribute.NewSet(
			attribute.Bool("k", true),
			attribute.BoolSlice("k", []bool{true, false, true}),
			attribute.Int("k", 1),
			attribute.IntSlice("k", []int{-3, 0, 1}),
			attribute.Int64("k", -1),
			attribute.Int64Slice("k", []int64{-4, 0, 2}),
			attribute.Float64("k", 2.3),
			attribute.Float64Slice("k", []float64{-1.2, 0.32, 1e9}),
			attribute.String("k", "val"),
			attribute.StringSlice("k", []string{"zero", "one", ""}),
		),
	},
}

func BenchmarkHash(b *testing.B) {
	for _, bench := range benchmarks {
		b.Run(bench.name, benchHash(bench.attr))
	}
}

var hashResult uint64

func benchHash(s attribute.Set) func(*testing.B) {
	return func(b *testing.B) {
		b.ReportAllocs()
		for n := 0; n < b.N; n++ {
			hashResult = hash(s)
		}
	}
}
$ go test -bench=BenchmarkHash
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/metric/internal/attrmap
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkHash/Bool-8         	21972621	        48.18 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/BoolSlice-8    	 5012506	       230.4 ns/op	      27 B/op	       2 allocs/op
BenchmarkHash/Int-8          	19157548	        61.13 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/IntSlice-8     	 4312996	       275.4 ns/op	      48 B/op	       2 allocs/op
BenchmarkHash/Int64-8        	18901846	        60.37 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/Int64Slice-8   	 4343298	       273.8 ns/op	      48 B/op	       2 allocs/op
BenchmarkHash/Float64-8      	21570406	        56.45 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/Float64Slice-8 	 4494706	       262.6 ns/op	      48 B/op	       2 allocs/op
BenchmarkHash/String-8       	23715417	        51.26 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/StringSlice-8  	 4096106	       286.0 ns/op	      72 B/op	       2 allocs/op
BenchmarkHash/All-8          	 4162212	       285.6 ns/op	      72 B/op	       2 allocs/op
PASS
ok  	go.opentelemetry.io/otel/sdk/metric/internal/attrmap	14.855s

@MrAlias
Copy link
Contributor Author

MrAlias commented Feb 10, 2023

If the underlying slice storage of the attribute.Value is accessible to the hash function, we can remove all allocations during the hash computation. E.g.

func hash(s Set) uint64 {
	h := fnv.New64a()
	iter := s.Iter()
	for iter.Next() {
		a := iter.Attribute()

		_, _ = h.Write([]byte(a.Key))

		val := a.Value
		switch val.Type() {
		case BOOL:
			b := boolBytes(val.AsBool())
			_, _ = h.Write(b[:])
		case BOOLSLICE:
			rv := reflect.ValueOf(val.slice)
			if rv.Type().Kind() != reflect.Array {
				// FIXME: continue and skip
				return 0
			}
			for i := 0; i < rv.Len(); i++ {
				v := rv.Index(i)
				if v.Type().Kind() != reflect.Bool {
					// Should never happen.
					continue
				}
				b := boolBytes(v.Bool())
				_, _ = h.Write(b[:])
			}
		case INT64:
			b := intBytes(val.AsInt64())
			_, _ = h.Write(b[:])
		case INT64SLICE:
			rv := reflect.ValueOf(val.slice)
			if rv.Type().Kind() != reflect.Array {
				// FIXME: continue and skip
				return 0
			}
			for i := 0; i < rv.Len(); i++ {
				v := rv.Index(i)
				if v.Type().Kind() != reflect.Int64 {
					// Should never happen.
					continue
				}
				b := intBytes(v.Int())
				_, _ = h.Write(b[:])
			}
		case FLOAT64:
			b := float64Bytes(val.AsFloat64())
			_, _ = h.Write(b[:])
		case FLOAT64SLICE:
			rv := reflect.ValueOf(val.slice)
			if rv.Type().Kind() != reflect.Array {
				// FIXME: continue and skip
				return 0
			}
			for i := 0; i < rv.Len(); i++ {
				v := rv.Index(i)
				if v.Type().Kind() != reflect.Float64 {
					// Should never happen.
					continue
				}
				b := float64Bytes(v.Float())
				_, _ = h.Write(b[:])
			}
		case STRING:
			_, _ = h.Write([]byte(val.AsString()))
		case STRINGSLICE:
			rv := reflect.ValueOf(val.slice)
			if rv.Type().Kind() != reflect.Array {
				// FIXME: continue and skip
				return 0
			}
			for i := 0; i < rv.Len(); i++ {
				v := rv.Index(i)
				if v.Type().Kind() != reflect.String {
					// Should never happen.
					continue
				}
				_, _ = h.Write([]byte(v.String()))
			}
		}
	}
	return h.Sum64()
}
go test -bench=BenchmarkHash
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/metric/internal/attrmap
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkHash/Bool-8         	20826471	        49.77 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/BoolSlice-8    	15753393	        73.98 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/Int-8          	19913074	        59.37 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/IntSlice-8     	 9514028	       119.7 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/Int64-8        	19977120	        59.82 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/Int64Slice-8   	 9953578	       118.4 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/Float64-8      	22275105	        53.32 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/Float64Slice-8 	13028379	        93.49 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/String-8       	23566215	        49.64 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/StringSlice-8  	13284459	        94.14 ns/op	       0 B/op	       0 allocs/op
BenchmarkHash/All-8          	12743206	        88.03 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	go.opentelemetry.io/otel/sdk/metric/internal/attrmap	13.762s

@MrAlias
Copy link
Contributor Author

MrAlias commented Feb 10, 2023

If the underlying slice storage of the attribute.Value is accessible to the hash function, we can remove all allocations during the hash computation.

Adding a Hash method to the attribute.Set that returns a hash of the contained attributes could solve this.

@MrAlias
Copy link
Contributor Author

MrAlias commented Feb 11, 2023

If the underlying slice storage of the attribute.Value is accessible to the hash function, we can remove all allocations during the hash computation.

Adding a Hash method to the attribute.Set that returns a hash of the contained attributes could solve this.

Looking into updating the attribute.Distinct to be based on this hash given it is the existing type that says it can be used as a map key.

@MrAlias
Copy link
Contributor Author

MrAlias commented Feb 12, 2023

Testing against the latest version of otel resolves the test failures.

go 1.19

require (
	github.com/stretchr/testify v1.8.1
	go.opentelemetry.io/otel v1.13.0
)

require (
	github.com/davecgh/go-spew v1.1.1 // indirect
	github.com/pmezard/go-difflib v1.0.0 // indirect
	gopkg.in/yaml.v3 v3.0.1 // indirect
)
go test -v ./...
=== RUN   TestMapComparison
--- PASS: TestMapComparison (0.00s)
PASS
ok  	testing/attrcmp	0.003s

@MrAlias
Copy link
Contributor Author

MrAlias commented Feb 12, 2023

This looks to have been addressed in #3252

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Status: Done
Development

No branches or pull requests

1 participant