
Add() is not thread-safe when called with more than one attribute #3943

Closed
asafm opened this issue Mar 28, 2023 · 39 comments · Fixed by #3971
Labels
bug Something isn't working

Comments

@asafm

asafm commented Mar 28, 2023

Description

.Add() is not thread-safe when used with more than one attribute

Environment

  • OS: OS X
  • Architecture: M1
  • Go Version: 1.19
  • opentelemetry-go version: 1.14.0

Steps To Reproduce

  1. Create a Counter
  2. Call .Add() with 2 attributes from multiple goroutines.
  3. See error

Expected behavior

It should work without errors.

Here's the race detector stack trace:

==================
WARNING: DATA RACE
Read at 0x00c000591100 by goroutine 654:
  go.opentelemetry.io/otel/attribute.(*Sortable).Less()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel@v1.14.0/attribute/set.go:423 +0xb4
  sort.insertionSort()
      /opt/homebrew/Cellar/go/1.19.3/libexec/src/sort/zsortinterface.go:12 +0xd8
  sort.stable()
      /opt/homebrew/Cellar/go/1.19.3/libexec/src/sort/zsortinterface.go:343 +0x70
  sort.Stable()
      /opt/homebrew/Cellar/go/1.19.3/libexec/src/sort/sort.go:208 +0x44
  go.opentelemetry.io/otel/attribute.NewSetWithSortableFiltered()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel@v1.14.0/attribute/set.go:257 +0x90
  go.opentelemetry.io/otel/attribute.NewSet()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel@v1.14.0/attribute/set.go:194 +0xa4
  go.opentelemetry.io/otel/sdk/metric.(*instrumentImpl[...]).aggregate()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel/sdk/metric@v0.37.0/instrument.go:198 +0xdc
  go.opentelemetry.io/otel/sdk/metric.(*instrumentImpl[...]).Add()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel/sdk/metric@v0.37.0/instrument.go:186 +0xa0
  go.opentelemetry.io/otel/sdk/metric.(*instrumentImpl[...]).Add()
      <autogenerated>:1 +0x68
  aacb/common/metrics.(*counter).Add()
      /Users/abcd/prg/aacb/common/metrics/counter.go:43 +0xa8
  ...
  (REDACTED)

Previous write at 0x00c000591100 by goroutine 651:
  go.opentelemetry.io/otel/attribute.NewSetWithSortableFiltered()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel@v1.14.0/attribute/set.go:275 +0x2a8
  go.opentelemetry.io/otel/attribute.NewSet()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel@v1.14.0/attribute/set.go:194 +0xa4
  go.opentelemetry.io/otel/sdk/metric.(*instrumentImpl[...]).aggregate()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel/sdk/metric@v0.37.0/instrument.go:198 +0xdc
  go.opentelemetry.io/otel/sdk/metric.(*instrumentImpl[...]).Add()
      /Users/abcd/go/pkg/mod/go.opentelemetry.io/otel/sdk/metric@v0.37.0/instrument.go:186 +0xa0
  go.opentelemetry.io/otel/sdk/metric.(*instrumentImpl[...]).Add()
      <autogenerated>:1 +0x68
  aacb/common/metrics.(*counter).Add()
      /Users/abcd/prg/aacb/common/metrics/counter.go:43 +0xa8
  ...
  (REDACTED)

==================
@asafm asafm added the bug Something isn't working label Mar 28, 2023
@asafm
Author

asafm commented Mar 28, 2023

Thanks, @jmacd, for helping me pinpoint the source (I'm a newbie in Go):

We have

func (i *instrumentImpl[N]) aggregate(ctx context.Context, val N, attrs []attribute.KeyValue) {
	if err := ctx.Err(); err != nil {
		return
	}
	// Do not use single attribute.Sortable and attribute.NewSetWithSortable,
	// this method needs to be concurrent safe. Let the sync.Pool in the
	// attribute package handle allocations of the Sortable.
	s := attribute.NewSet(attrs...)
	for _, agg := range i.aggregators {
		agg.Aggregate(val, s)
	}
}

The call to NewSet leads here:

func NewSetWithSortableFiltered(kvs []KeyValue, tmp *Sortable, filter Filter) (Set, []KeyValue) {
	// Check for empty set.
	if len(kvs) == 0 {
		return empty(), nil
	}

	*tmp = kvs

	// Stable sort so the following de-duplication can implement
	// last-value-wins semantics.
	sort.Stable(tmp)

So the user-provided slice is sorted in place for them, without their knowledge. Since they won't expect that, they won't guard the slice with a lock, and when the same slice is reused (as it typically is, since values keep being recorded) across multiple goroutines running on different native threads, it causes the race condition shown above. Hence, a bug.

@Kaushal28
Contributor

Hi,

Can I take this bug?

@pellared
Member

@Kaushal28 I assigned you

@asafm
Author

asafm commented Mar 29, 2023

The issue is that you construct the KeyValue array once and use the same one from multiple goroutines.

@Kaushal28
Contributor

Thanks @asafm @pellared. Could you help me reproduce? When you say "Create a Counter", which counter are you talking about? I tried it with Int64Counter with the race detector (go run -race main.go) and it's not giving me this warning. Here's my code:

func main() {
	provider := metric.NewMeterProvider()
	meter := provider.Meter("test_meter")
	counter, err := meter.Int64Counter("int64counter")
	if err != nil {
		panic(err)
	}

	ctx := context.Background()
	attrs := []attribute.KeyValue{
		{
			Key:   "key1",
			Value: attribute.StringValue("val1"),
		},
		{
			Key:   "key2",
			Value: attribute.StringValue("val2"),
		},
	}

	go func() {
		for i := 0; i < 1000; i++ {
			counter.Add(ctx, 1, attrs...)
		}
	}()

	go func() {
		for i := 0; i < 1000; i++ {
			counter.Add(ctx, 1, attrs...)
		}
	}()
}
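
One likely reason the warning doesn't appear: main returns before the goroutines get a chance to run, so no two Add calls ever overlap. A sketch of the same reproduction that joins the goroutines (add "sync" to the imports); whether the detector fires can still depend on scheduling:

var wg sync.WaitGroup
for g := 0; g < 2; g++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			counter.Add(ctx, 1, attrs...)
		}
	}()
}
wg.Wait()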

@Aneurysm9 Aneurysm9 modified the milestone: v1.15.0 Mar 29, 2023
@MrAlias
Contributor

MrAlias commented Mar 29, 2023

I'm not sure this is a problem that can be solved in the SDK outside of copying the passed attributes before making a set out of them. That still leaves the potential for a data race during the copy.

It might be the case that the API needs to be updated to accept attribute sets instead. Given the immutability of attribute sets, this will allow them to be reused concurrently. One drawback is the API will lose the nice variadic attribute design. For example, counter.Add(1, attribute.String("user", "Alice")) becomes counter.Add(1, attribute.NewSet(attribute.String("user", "Alice"))), and counter.Add(1) becomes counter.Add(1, attribute.EmptySet()). Though it may look more cumbersome, it might be a necessary change if we intend to provide concurrency guarantees.
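
Spelled out with a context argument, the before/after of that proposal would look roughly like this (a sketch; EmptySet returns a *Set, hence the dereference):

// Today: variadic attributes, racy when the backing slice is shared.
counter.Add(ctx, 1, attribute.String("user", "Alice"))
counter.Add(ctx, 1)

// Proposed: an immutable attribute.Set that can be reused concurrently.
counter.Add(ctx, 1, attribute.NewSet(attribute.String("user", "Alice")))
counter.Add(ctx, 1, *attribute.EmptySet())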

@merlimat

@MrAlias Yes please :). Taking an immutable set will also avoid allocating and copying the attributes each time we need to record an event.

@MrAlias
Contributor

MrAlias commented Mar 29, 2023

I'm not sure this is a problem that can be solved in the SDK outside of copying the passed attributes before making a set out of them. That still leaves the potential for a data race during the copy.

It might be the case that the API needs to be updated to accept attribute sets instead. Given the immutability of attribute sets, this will allow them to be reused concurrently. One drawback is the API will lose the nice variadic attribute design. For example, counter.Add(1, attribute.String("user", "Alice")) becomes counter.Add(1, attribute.NewSet(attribute.String("user", "Alice"))), and counter.Add(1) becomes counter.Add(1, attribute.EmptySet()). Though it may look more cumbersome, it might be a necessary change if we intend to provide concurrency guarantees.

#3947

@jmacd
Contributor

jmacd commented Mar 30, 2023

In my opinion, the current API is working as designed. Comments on the synchronous instruments should state that the input slice of ...attribute.KeyValue is meant to be the one a compiler would generate using the arguments supplied at the call site, because the SDK will perform the necessary sorting step in place without an allocation.

By the way, this conversation is related to open-telemetry/opentelemetry-collector#7455 where I've stated that even calling attribute.NewSet() is too expensive for every synchronous operation.

I think it should be possible to work around the race condition by pre-sorting the list of attributes. For example, I expect attribute.NewSet(input...).ToSlice() to return the correctly sorted, deduplicated attributes, and I think it ought not to race. (Does sort.Stable() modify an already-sorted slice? If the answer is yes, we should probably check whether the slice is sorted before sorting it, to avoid this race.)
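
A sketch of that workaround, reusing the counter, ctx, and attribute names from the reproduction above (whether it fully avoids the race hinges on the sort.Stable question just raised):

// Done once, during initialization, before the slice is shared.
s := attribute.NewSet(
	attribute.String("key1", "val1"),
	attribute.String("key2", "val2"),
)
attrs := s.ToSlice() // sorted, de-duplicated copy

// Concurrent callers now pass an already sorted slice; the SDK's
// in-place sort should then have nothing to move.
counter.Add(ctx, 1, attrs...)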

I think it would be appealing to support alternative APIs with both variations, meaning for every Add(context.Context, value int64, ...attribute.KeyValue) there is a less ergonomic form like AddWithAttributes(context.Context, value int64, attribute.Set). This would give the user the choice to have a simple (familiar) API or a more sophisticated one when performance really matters. I also think this is a nice step short of "bound" instruments. I think #3947 is a step too far. By the way, I think the specification supports this direction, with this text:

The [OpenTelemetry API](../overview.md#api) authors MAY decide to allow flexible
[attributes](../common/README.md#attribute) to be passed in as arguments. If
the attribute names and types are provided during the [counter
creation](#counter-creation), the [OpenTelemetry API](../overview.md#api)
authors MAY allow attribute values to be passed in using a more efficient way
(e.g. strong typed struct allocated on the callstack, tuple). The API MUST allow
callers to provide flexible attributes at invocation time rather than having to
register all the possible attribute names during the instrument creation.

@asafm
Author

asafm commented Mar 30, 2023

@jmacd What if AddWithAttributes(context.Context, value int64, attribute.Set) is Add(context.Context, value int64, attribute.Set), alongside Add(context.Context, value int64, ...attribute.KeyValue)?

@Kaushal28 Kaushal28 removed their assignment Mar 30, 2023
@MrAlias
Contributor

MrAlias commented Mar 30, 2023

In my opinion, the current API is working as designed. Comments on the synchronous instruments should state that the input slice of ...attribute.KeyValue is meant to be the one a compiler would generate using the arguments supplied at the call site, because the SDK will perform the necessary sorting step in place without an allocation.

I think it should be possible to work around the race condition by pre-sorting the list of attributes. For example, I expect attribute.NewSet(input...).ToSlice() to return the correctly sorted, deduplicated attributes, and I think it ought not to race. (Does sort.Stable() modify an already-sorted slice? If the answer is yes, we should probably check whether the slice is sorted before sorting it, to avoid this race.)

It does not seem appropriate to solve our race condition by saying a user needs to pass us "cleaned" values of an argument. Especially if doing so would have them create a set anyway.

Asking users to clean their input also does not solve the underlying issue: the API would not be concurrent-safe in certain situations. This is non-compliant with the specification.

By the way, this conversation is related to open-telemetry/opentelemetry-collector#7455 where I've stated that even calling attribute.NewSet() is too expensive for every synchronous operation.

This supports the change to the API, right? Currently for all our synchronous measurement operations we create a set.

I think it would be appealing to support alternative APIs with both variations, meaning for every Add(context.Context, value int64, ...attribute.KeyValue) there is a less ergonomic form like AddWithAttributes(context.Context, value int64, attribute.Set).

Given we are still pre-1.0 for this API, I prefer not to add methods to fix an existing method. Rather, we should just fix the existing method.

@jmacd
Contributor

jmacd commented Mar 30, 2023

@MrAlias I'm worried about loss of performance for existing use-cases that are correct as they stand. We can work around this issue by copying the input slice for every caller, but only special callers where the compiler does not insert a temporary slice need this support.

Rather, we should just fix the existing method.

I think requiring an attribute.Set burdens the user with an inconvenient API. My preference would be to document that users must either use a compiler-generated slice or must take their own lock.

Also, I believe the optimizations in the Lightstep SDK correctly work around this race (when IgnoreCollisions is false) by:

  1. A fingerprint is computed from the exact input slice with its duplicates and out-of-order keys
  2. The exact input slice is then copied, as we need one copy to identify equivalence in subsequent lookups
  3. The copy of the input slice is sorted to create an attribute set if needed
  4. The map continues to use the fingerprint and check equivalence of the unsorted, possibly duplicated data.
  5. The set continues to be used to convey the correct information to the aggregators.

On the other hand, when I set the IgnoreCollisions setting to true, my implementation avoids the copy and does not hold the memory. To fix this I would have to copy the slice unconditionally, but at least it would only happen when a fingerprint misses. (Still, I'd prefer one less memory allocation, to document that the user's slice will be sorted in place, and to let users lock their attribute lists if they're not using compiler temporaries.)

I think I've demonstrated how the race can be fixed without changing our API here; this is also discussed in open-telemetry/opentelemetry-collector#7455.
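
Roughly, that lookup path looks like the sketch below; fingerprint, equivalent, metric, and record are hypothetical stand-ins (and locking is elided), not the actual Lightstep code:

type record struct {
	keys []attribute.KeyValue // copy of the raw input, for equivalence checks
	set  attribute.Set        // sorted, de-duplicated set for the aggregators
}

type metric struct {
	records map[uint64][]*record // fingerprint -> possibly colliding entries
}

func (m *metric) lookup(attrs []attribute.KeyValue) *record {
	fp := fingerprint(attrs) // hashed over the unsorted, possibly duplicated input
	for _, r := range m.records[fp] {
		if equivalent(r.keys, attrs) {
			return r
		}
	}
	// Miss: copy before sorting so the caller's slice is never mutated.
	cp := make([]attribute.KeyValue, len(attrs))
	copy(cp, attrs)
	r := &record{keys: cp, set: attribute.NewSet(cp...)}
	m.records[fp] = append(m.records[fp], r)
	return r
}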

@jmacd
Contributor

jmacd commented Mar 30, 2023

By the way, this conversation is related to open-telemetry/opentelemetry-collector#7455 where I've stated that even calling attribute.NewSet() is too expensive for every synchronous operation.

This supports the change to the API, right? Currently for all our synchronous measurement operations we create a set.

In a way, yes. Although I find it unergonomic, changing the API to accept attribute.Set avoids one kind of performance problem, but it creates a new kind of performance problem as a result of the burdensome API.

If a caller is currently repeating attribute sets but has no good way to cache the result of calculating an attribute.Set, now they'll require two allocations at every call site, one to construct a temporary list and one to construct a set. Then, my feeling is that even the cost of using map[attribute.Set]... (i.e., attribute.Set as map key) is too expensive for the synchronous code path, i.e., I'd prefer not to invoke a compiler-generated hash-and-map-equals function on attribute.Set, that's more expensive than the 64-bit fingerprint lookup which can be had based on hashing the []attribute.KeyValue.

The original OTel conversation about "bound instruments" was another angle on this performance problem. If the user is able to store and cache something to benefit performance, that's nice but it could force them into a major code reorganization. This is where I get the idea that two APIs would be great -- the convenient API and the fast API. Even if we have a way to pass attribute.Set to the metrics API it will not be as fast as the hypothetical bound instrument API, so I imagine users will continue to ask for more performance (and really I can see three various APIs with different costs/benefits).

@merlimat

If a caller is currently repeating attribute sets but has no good way to cache the result of calculating an attribute.Set, now they'll require two allocations at every call site, one to construct a temporary list and one to construct a set

I think that's what's really happening right now, no?

You have:

myCounter.Add(context.Background(), 1, attribute.String("x", "y"), attribute.Int("z", 2))

The attrs ...attribute.KeyValue argument gets converted into a []attribute.KeyValue, which needs an allocation. Later, it gets internally converted into an attribute.Set, allocating the set.
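
Roughly, the two allocations look like this (a sketch, not the exact compiler output):

// The compiler materializes the variadic arguments as a slice:
tmp := []attribute.KeyValue{attribute.String("x", "y"), attribute.Int("z", 2)} // allocation 1
myCounter.Add(context.Background(), 1, tmp...)

// ...and inside the SDK, aggregate() builds a set from that slice:
s := attribute.NewSet(tmp...) // allocation 2, for the Set's storage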

@MrAlias
Contributor

MrAlias commented Mar 30, 2023

  • A fingerprint is computed from the exact input slice with its duplicates and out-of-order keys
  • The exact input slice is then copied, as we need one copy to identify equivalence in subsequent lookups
  • The copy of the input slice is sorted to create an attribute set if needed
  • The map continues to use the fingerprint and check equivalence of the unsorted, possibly duplicated data.
  • The set continues to be used to convey the correct information to the aggregators.

How is copying data and then creating a set more performant than just accepting the set?

If a caller is currently repeating attribute sets but has no good way to cache the result of calculating an attribute.Set, now they'll require two allocations at every call site, one to construct a temporary list and one to construct a set.

The set construction does not require an allocation unless it is the only set created. The set temporary pointer is pooled.

@MrAlias
Contributor

MrAlias commented Mar 30, 2023

Even if we have a way to pass attribute.Set to the metrics API it will not be as fast as the hypothetical bound instrument API.

Sure, but is that cost greater than the gain a user gets from re-using a set instead of allocating a new group of attribute args each time? It seems like there is an order of magnitude between the two, and accepting a set allows the user to choose how performant they want to be without us having to add an additional bound-instrument API.

@MrAlias
Contributor

MrAlias commented Mar 30, 2023

Another reason to accept a set instead of a slice of attributes is that a set is what is actually used. There are syntactical niceties to accepting variadic attribute.KeyValue, but it gives the false impression that duplicate keys are allowed.

Accepting a set makes it clear to the user that duplicate keys are not allowed.

@MrAlias
Contributor

MrAlias commented Mar 30, 2023

I'd prefer not to invoke a compiler-generated hash-and-map-equals function on attribute.Set, that's more expensive than the 64-bit fingerprint lookup which can be had based on hashing the []attribute.KeyValue.

Do you have benchmark numbers on this? I'd be interested to see the difference.

For what it is worth, I have, in a local branch, extended the Set to have a Hash method that could return an int64. If we ever wanted to use that hashing for map keys it is not out of the question if we accept a Set.

@MrAlias
Contributor

MrAlias commented Mar 30, 2023

Another suggestion from the SIG meeting today (cc @MadVikingGod):

Have methods that accept no attributes and an equivalent *WithAttributes method to accept the attributes as an attribute.Set. For example the following methods would be used:

type SyncAdder interface {
	Add(ctx context.Context, value N)
	AddWithAttributes(ctx context.Context, value N, attrs attribute.Set)
}

type SyncRecorder interface {
	Record(ctx context.Context, value N)
	RecordWithAttributes(ctx context.Context, value N, attrs attribute.Set)
}

type AsyncObserver interface {
	Observe(value N)
	ObserveWithAttributes(value N, attrs attribute.Set)
}

This would help eliminate the boilerplate of calling *attribute.EmptySet() when no attributes are wanted.

It's not immediately clear if this will comply with the specification though, i.e.:

Add

Increment the Counter by a fixed amount.

This API SHOULD NOT return a value (it MAY return a dummy value if required by
certain programming languages or systems, for example null, undefined).

This API MUST accept the following parameter:

[...]

Users can provide attributes to associate with the increment value, but it is
up to their discretion. Therefore, this API MUST be structured to accept a
variable number of attributes, including none.

If the specification is talking about a logical API, I think having multiple methods to satisfy it is compliant.

@MrAlias
Contributor

MrAlias commented Mar 31, 2023

Another suggestion from the SIG meeting today (cc @MadVikingGod):

Have methods that accept no attributes and an equivalent *WithAttributes method to accept the attributes as an attribute.Set. For example the following methods would be used:

type SyncAdder interface {
	Add(ctx context.Context, value N)
	AddWithAttributes(ctx context.Context, value N, attrs attribute.Set)
}

type SyncRecorder interface {
	Record(ctx context.Context, value N)
	RecordWithAttributes(ctx context.Context, value N, attrs attribute.Set)
}

type AsyncObserver interface {
	Observe(value N)
	ObserveWithAttributes(value N, attrs attribute.Set)
}

PoC: #3955

This approach works well for API users that do not pass attributes. For example, this:

counter.Add(ctx, 100, *attribute.EmptySet())

Becomes:

counter.Add(ctx, 100)

However, when a user includes attributes, there is a noticeable number of characters added to the line. For example, the original:

counter.Add(ctx, 100, attrs)

Becomes:

counter.AddWithAttributes(ctx, 100, attrs)

It seems that this proposal trades minimized boilerplate in the no-attribute case for added boilerplate in the attribute case. Quantitatively, *attribute.EmptySet() is 21 characters of boilerplate vs. WithAttributes at 14, but how often each case occurs is only quantifiable by the end user.

@MrAlias
Contributor

MrAlias commented Mar 31, 2023

A user could always reduce their boilerplate by declaring var noAttr = *attribute.EmptySet() thereby making:

counter.Add(ctx, 100, *attribute.EmptySet())
hist.Record(ctx, 10, *attribute.EmptySet())

into:

noAttr := *attribute.EmptySet()
counter.Add(ctx, 100, noAttr)
hist.Record(ctx, 10, noAttr)

Conversely, trying to reduce the boilerplate of AddWithAttributes by declaring it as a var:

var ctrAdd = counter.AddWithAttributes
ctrAdd(ctx, 100, attr)
var histRec = hist.RecordWithAttributes
histRec(ctx, 100, attr)

would have limited functionality as it would only apply to one instrument.

@MadVikingGod
Contributor

MadVikingGod commented Mar 31, 2023

The concern I have is less about the boilerplate, and more that we would be encouraging people to use a particular empty set (*attribute.EmptySet()), which differs from the norms of Go. If nil or attribute.Set{} were our recommendation then there really isn't a need for this extra API.

@MrAlias
Contributor

MrAlias commented Mar 31, 2023

The concern I have is less about the boilerplate, and more that we would be encouraging people to use a particular empty set (*attribute.EmptySet()), which differs from the norms of Go. If nil or attribute.Set{} were our recommendation then there really isn't a need for this extra API.

After #3957 they will be able to pass attribute.Set{} and they can currently just use attribute.NewSet() instead of *attribute.EmptySet().

We could comment attribute.EmptySet to say to use it if you need a pointer (though new(attribute.Set) will also do).
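
For reference, the options being compared, assuming a Set-accepting Add and the #3957 change that makes the zero value usable:

counter.Add(ctx, 1, attribute.Set{})       // zero value, usable after #3957
counter.Add(ctx, 1, attribute.NewSet())    // empty set via the constructor
counter.Add(ctx, 1, *attribute.EmptySet()) // dereferenced package-level empty set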

@pellared
Member

pellared commented Apr 2, 2023

The concern I have is less about the boilerplate, and more that we would be encouraging people to use a particular empty set (*attribute.EmptySet()), which differs from the norms of Go. If nil or attribute.Set{} were our recommendation then there really isn't a need for this extra API.

After #3957 they will be able to pass attribute.Set{} and they can currently just use attribute.NewSet() instead of *attribute.EmptySet().

We could comment attribute.EmptySet to say to use it if you need a pointer (though new(attribute.Set) will also do).

I think I would prefer to have a single method. It reduces the API surface, which IMO has benefits for both users of the API and developers of API implementations. Users who do not like boilerplate can always create their own helpers/adapters.

@jmacd
Contributor

jmacd commented Apr 3, 2023

I'm still concerned about an additional, obligatory memory allocation that results from having a single API accept the current attribute.Set. It may be possible to work around some of the limitations by redesigning attribute.Set (and I think we may want to support context-scoped attributes) but as the current code stands, the problem allocation comes from code organized to build an attribute set incrementally, like:

var attrsPool = sync.Pool{
	New: func() any {
		p := new([]attribute.KeyValue)
		*p = make([]attribute.KeyValue, 0, 16)
		return p
	},
}

func (thing *Thing) instrumentOperation(ctx context.Context, value1, value2, value3 float64, moreAttrs ...attribute.KeyValue) {
  attrs := attrsPool.Get().(*[]attribute.KeyValue)

  // get some attributes from the "scope" (thing.myAttributes...)
  // get some attributes from the context (e.g., from baggage...)
  // get the attributes from the call site, if any (i.e., moreAttrs...) 
  // the slice of attributes is assembled
  instrument1.Add(ctx, value1, *attrs...)
  instrument2.Add(ctx, value2, *attrs...)
  instrument3.Record(ctx, value3, *attrs...)
  attrsPool.Put(attrs)
}

The user in this case has already organized their code to produce a list of attributes, as is typical of a statsd library. Replacing the three instrument operations with a shared attribute.Set is an improvement in a sense: if the SDK would otherwise compute the attribute.Set on each call, letting the user compute it once is a win.

In the case of the Lightstep metrics SDK, I mentioned the use of a map[uint64]*record which avoids constructing the attribute.Set if the entry already exists, as determined by a fingerprint function. That is, there are no allocations inside the SDK, whereas attribute.Set would be a new allocation if it were to be required. The LS metrics SDK will compute the fingerprint 3 times, but it would still not allocate. (Note that the Prometheus server and its Go client library use a similar organization, a map from fingerprint to a list of entries.)

This leads me to think that we should find a way to eliminate attribute.Set in favor of something better, call it the AttributeSetBuilder to be concrete. I don't have a prototype to show, but I wonder if it's possible to take the best of both cases, something that allows an incrementally-built list of KeyValue to be reported, so that the attribute.Set is only computed if necessary (once, lazily) and that a fingerprint is also computed only once.

I guess this leads me to agree that it would be nice for high-performance applications to have an option to amortize the construction cost of attribute sets; it's also possible for SDKs to avoid the attribute.Set allocation itself. Can we have the best of both?
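
One possible shape for such a builder; the type, its methods, and the FNV fingerprint are invented for illustration (imports: hash/fnv and go.opentelemetry.io/otel/attribute):

// AttributeSetBuilder accumulates KeyValues and lazily computes, at most
// once each, a fingerprint and an attribute.Set.
type AttributeSetBuilder struct {
	kvs []attribute.KeyValue
	fp  uint64
	set *attribute.Set
}

func (b *AttributeSetBuilder) Append(kvs ...attribute.KeyValue) {
	b.kvs = append(b.kvs, kvs...)
	b.fp, b.set = 0, nil // invalidate any cached results
}

func (b *AttributeSetBuilder) Fingerprint() uint64 {
	if b.fp == 0 { // 0 doubles as "not yet computed"; fine for a sketch
		h := fnv.New64a()
		for _, kv := range b.kvs {
			h.Write([]byte(kv.Key))
			h.Write([]byte(kv.Value.Emit()))
		}
		b.fp = h.Sum64()
	}
	return b.fp
}

func (b *AttributeSetBuilder) Set() attribute.Set {
	if b.set == nil {
		// The builder owns kvs, so NewSet's in-place sort is safe here.
		s := attribute.NewSet(b.kvs...)
		b.set = &s
	}
	return *b.set
}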

@pellared
Member

pellared commented Apr 4, 2023

Has anyone checked how other languages' APIs are handling this problem? I did a quick analysis.

So Java, Python, and Node.js all accept an attributes object that CAN be concurrent-safe. Everything depends on the SDK and the caller.

It does not seem appropriate to solve our race condition by saying a user needs to pass us "cleaned" values of an argument.

As far as I understand this is what Java, Python and Node.js are doing 😬

However, right now in the Go API we cannot even pass a concurrent-safe attribute slice.

Personally, I think that @MrAlias's proposal #3947 is very clean and pragmatic.

My open question: should the API accept an interface, so that a custom Set implementation could be used instead of the one provided by the API?

EDIT: I think @MrAlias's proposal is the way to go. How would the API user know which Set implementation should be used?

@jmacd
Contributor

jmacd commented Apr 4, 2023

Personally, I think that @MrAlias's proposal #3947 is very clean and pragmatic.

@pellared Thank you for the research. In a way, it looks like other repositories have validated my first position in this debate, which is that it's working as intended: the user shouldn't expect the ... args of a varargs list to be concurrent-safe. Since we can't overload a method in Go, the thinking goes, we have to make our one method concurrent-safe. Still, this has a cost, and the user who needs concurrent-safe arguments could IMO be asked to use a different calling convention (or to use a sync.Pool as I did in my example).

Ultimately, this debate is about the performance hit to a specific kind of user. I agree that #3947 is clean and pragmatic, but I believe it's going to cause a performance hit to a different kind of user: an obligatory allocation happens to construct the set if we don't redesign attribute.Set somehow.

I was able to correct the race condition in the Lightstep metrics SDK such that no attribute.Set is constructed when an entry already exists in the map. This shows that it is possible to work around the race condition without changing our API or sacrificing performance. I'm interested in the outcome of this discussion because I want to make sure both kinds of user are optimized for, and then I want even more optimization: if the attribute.Set could be shared across call sites and I could keep my fingerprinting optimization, that would be ideal.

I believe we could arrange a new API that would support repeated use of the attribute set and allow the SDK to avoid allocating the set when it already has the same one in memory. I'm afraid the thing we'll lose is the ability to write a one-line metrics update with attributes passed as varargs at the call site. I see this as a minor loss, and if everyone else is OK with it, that's fine.

@pellared
Member

pellared commented Apr 4, 2023

performance hit to a specific kind of user

@jmacd, do we have any benchmark or reported issue which we can use as a reference? I do not see any problems when I look at these benchmark results.

@MrAlias
Contributor

MrAlias commented Apr 4, 2023

var attrsPool = sync.Pool{
	New: func() any {
		p := new([]attribute.KeyValue)
		*p = make([]attribute.KeyValue, 0, 16)
		return p
	},
}

func (thing *Thing) instrumentOperation(ctx context.Context, value1, value2, value3 float64, moreAttrs ...attribute.KeyValue) {
  attrs := attrsPool.Get().(*[]attribute.KeyValue)

  // get some attributes from the "scope" (thing.myAttributes...)
  // get some attributes from the context (e.g., from baggage...)
  // get the attributes from the call site, if any (i.e., moreAttrs...) 
  // the slice of attributes is assembled
  instrument1.Add(ctx, value1, *attrs...)
  instrument2.Add(ctx, value2, *attrs...)
  instrument3.Record(ctx, value3, *attrs...)
  attrsPool.Put(attrs)
}

Why can't this pooling approach be used with the attribute.Set? I.e.

func (thing *Thing) instrumentOperation(ctx context.Context, value1, value2, value3 float64, moreAttrs ...attribute.KeyValue) {
  attrs := attrsPool.Get().(*[]attribute.KeyValue)

  // get some attributes from the "scope" (thing.myAttributes...)
  // get some attributes from the context (e.g., from baggage...)
  // get the attributes from the call site, if any (i.e., moreAttrs...) 
  // the slice of attributes is assembled
  s := attribute.NewSet(*attrs...)
  instrument1.Add(ctx, value1, s)
  instrument2.Add(ctx, value2, s)
  instrument3.Record(ctx, value3, s)
  attrsPool.Put(attrs)
}

This seems like it would still cut down allocations given Add and Record do not need to copy the passed Set the way they would need to copy the passed ...attribute.KeyValue. E.g.

var attrsPool = sync.Pool{
	New: func() any {
		p := new([]attribute.KeyValue)
		*p = make([]attribute.KeyValue, 0, 16)
		return p
	},
}

type Adder struct {
	name string
	w    io.Writer
}

func (a *Adder) AddAttr(v int64, attr ...attribute.KeyValue) {
	cp := make([]attribute.KeyValue, len(attr))
	copy(cp, attr)
	fmt.Fprintf(a.w, "%q.AddAttr(%d, %v)\n", a.name, v, cp)
}

func (a *Adder) AddSet(v int64, s attribute.Set) {
	fmt.Fprintf(a.w, "%q.AddSet(%d, %v)\n", a.name, v, s)
}

type Inst struct {
	attr []attribute.KeyValue

	adder0 *Adder
	adder1 *Adder
}

func NewInst(w io.Writer, attr ...attribute.KeyValue) *Inst {
	return &Inst{
		attr:   attr,
		adder0: &Adder{name: "adder0", w: w},
		adder1: &Adder{name: "adder1", w: w},
	}
}

func (i *Inst) AddAttr(v0, v1 int64, moreAttrs ...attribute.KeyValue) {
	attrs := attrsPool.Get().(*[]attribute.KeyValue)

	*attrs = (*attrs)[:0]
	*attrs = append(*attrs, i.attr...)
	*attrs = append(*attrs, moreAttrs...)

	i.adder0.AddAttr(v0, *attrs...)

	*attrs = append(*attrs, attribute.Bool("additional", true))
	i.adder1.AddAttr(v1, *attrs...)

	attrsPool.Put(attrs)
}

func (i *Inst) AddSet(v0, v1 int64, moreAttrs ...attribute.KeyValue) {
	attrs := attrsPool.Get().(*[]attribute.KeyValue)

	*attrs = (*attrs)[:0]
	*attrs = append(*attrs, i.attr...)
	*attrs = append(*attrs, moreAttrs...)

	s := attribute.NewSet(*attrs...)
	i.adder0.AddSet(v0, s)

	*attrs = append(*attrs, attribute.Bool("additional", true))
	s = attribute.NewSet(*attrs...)
	i.adder1.AddSet(v1, s)

	attrsPool.Put(attrs)
}

func BenchmarkAddAttr(b *testing.B) {
	i := NewInst(io.Discard, attribute.String("user", "alice"))

	b.ReportAllocs()
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		i.AddAttr(1, 2, attribute.Bool("admin", true))
	}
}

func BenchmarkAddSet(b *testing.B) {
	i := NewInst(io.Discard, attribute.String("user", "alice"))

	b.ReportAllocs()
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		i.AddSet(1, 2, attribute.Bool("admin", true))
	}
}
$ go test -run='^$' -bench=.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkAddAttr-8   	  407809	      2879 ns/op	    1040 B/op	      21 allocs/op
BenchmarkAddSet-8    	  574300	      1933 ns/op	     432 B/op	       8 allocs/op
PASS

@MrAlias
Contributor

MrAlias commented Apr 4, 2023

DistinctIter proposal

Another proposal considered was to have Add (and all others) accept an implementation of this interface:

type DistinctIter interface {
	Distinct() Distinct
	Iter() Iterator
}

which would be added to the otel/attribute package. The Distinct would be used as a map key and Iterator would be called when the attributes need to be referenced.

This would be accompanied with a "lazy set" implementation:

// LazySet is a non-comparable and immutable set.
type LazySet struct {
	notComparable [0]func()
	// ...
}

func NewLazySet(...KeyValue) *LazySet

func (l *LazySet) Distinct() Distinct

func (l *LazySet) WithAttributes(...KeyValue) *LazySet

func (l *LazySet) Iter() Iterator

And then either Set or LazySet could be passed to Add.

This approach has drawbacks that I feel exclude it from consideration.

  1. There would now be two implementations in attribute that implement DistinctIter. Users who just want to use the API now need to go research what they can use for DistinctIter and which implementation to pick. This adds a considerable burden to the API user.
  2. Add(1, NewSet(...)) would not be possible since NewSet returns a Set, but *Set implements DistinctIter, not Set.

Add LazySet proposal

Iterating on the DistinctIter proposal, it was considered whether Add should be updated to just accept a LazySet:

Add(/*...*/, *LazySet)

This essentially means we could provide a v2 version of a Set. It could have a creation function that returns a pointer, and Add(/*...*/, nil) could be used for recording no attributes.

This, however, still doesn't solve the extra allocation of copying user-supplied attributes. Both func NewLazySet(...KeyValue) *LazySet and func (l *LazySet) WithAttributes(...KeyValue) *LazySet need to make a copy of the passed attributes to ensure no race condition.

This solution would really only address the following other issues raised about #3947:

  1. The new function would now return a pointer so Add could accept a pointer
  2. A hashing function could be used to generate a map key

As for (2), the existing Set can be updated to support this, and given LazySet's copy-allocation issue above, this isn't really additional behavior that cannot already be supported by extending Set.

As for (1), there is also the possibility of just adding another creation function ...

Add NewSetPtr proposal

Having to pass an empty set to Add when a user doesn't want any attributes is a concern with #3947. Instead of adding a whole new set implementation, we could add another creation function:

func NewSetPtr(...KeyValue) *Set

  1. This would bring the total ways to create a Set up to 5.
  2. This will likely require an additional allocation if the implementation isn't designed carefully enough to keep the compiler from thinking it retains the reference.

As for (1), this is not ideal, but not necessarily a deal-breaker either.

But regarding (2), providing the user an API that promotes an additional allocation is probably not what we want here.

The current approach of #3947 means users using non-reusable sets like this:

i0.Add(1, attribute.NewSet(...))
i1.Add(2, attribute.NewSet(...))

are going to be hit with an allocation per NewSet when it copies their attributes.

However if Add accepts a pointer, they are also hit with an additional allocation of the *Set escaping.

This additional *Set allocation can obviously be amortized with a pool or even just reused if a user is smart enough in how they construct the set and then how they pass it to Add. However, the API design itself could remove the need for the user to understand this, just by accepting a Set instead of a *Set.

@MrAlias
Contributor

MrAlias commented Apr 4, 2023

This specification issue is looking to add an additional argument to Record and possibly Add. I had considered migrating the attribute parameter to our options pattern (i.e. WithAttributes(/*...*/)), but originally thought it would be overkill for the methods as I didn't expect any other options to be added.

That assumption appears to have been incorrect. I'll plan to look into a proposal that will take that approach.

@pellared
Member

pellared commented Apr 5, 2023

@jmacd

lightstep/otel-launcher-go#422 (comment). This shows that it is possible to work around the race condition without changing our API or sacrificing performance.

I think there is still a possible race condition in the PR. Doing a copy() of a slice does not prevent a race, because slices are mutable. A separate goroutine can affect the result of e.g. copy(acpy, attrs).

From the PR:

// We have to copy the attributes here because the caller may
// modify their copy of the list after the call returns.

So the attributes can be modified "during" the call by a separate goroutine.
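
Concretely, the disputed scenario is something like:

attrs := []attribute.KeyValue{attribute.String("k", "v1")}

go counter.Add(ctx, 1, attrs...) // the SDK's copy(acpy, attrs) runs here...

attrs[0] = attribute.String("k", "v2") // ...racing with this write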

@jmacd
Contributor

jmacd commented Apr 5, 2023

So the attributes can be modified "during" the call by a separate goroutine.

I don't think so. The original caller is still active on the call stack, so if the slice is modified at this point it means the slice was modified by the caller during its own call.

@jmacd
Contributor

jmacd commented Apr 5, 2023

After reading these proposals, I'm left with the impression that supporting two separate APIs would be the least complicated solution. I'm so interested in making sure that callers have a way to avoid all allocations at their call site that I'll take any of the style proposals we can work out.

This additional *Set allocation can obviously be amortized with a pool or even just reused if a user is smart enough in how they construct the set and then how they pass it to Add.

I think I disagree, unless we can completely change the attribute.Set type. That type is not a reusable object, so making a pool of them won't help us, I think? Actually, a lot of the dialog above rests on the assumption that attribute.Set can't change, but that type is not part of the surface area of the tracing/metrics API at this time, so a completely new solution could be built. If there were a way for me to use a pool of set-builder objects instead of a pool of []attribute.KeyValue that would let me avoid allocations somehow, I'd take it, but I haven't seen a way to build attribute sets w/o an intermediate, mutable slice being used.

@MrAlias
Contributor

MrAlias commented Apr 5, 2023

I think I disagree, unless we can completely change the attribute.Set type.

Right, agreed, the proposal would be a pool for *Set so that reference could be reused. Not a pool of Set.

@jmacd
Contributor

jmacd commented Apr 5, 2023

I'm not worried about allocation of Set objects.

I'm worried about the allocation of the [N]attribute.KeyValue distinct lookup key. The array has to be sized to the number of distinct items, which we only know after sorting.

@MrAlias
Contributor

MrAlias commented Apr 5, 2023

I'm not worried about allocation of Set objects.

I'm worried about the allocation of the [N]attribute.KeyValue distinct lookup key. The array has to be sized to the number of distinct items, which we only know after sorting.

I think we are talking past each other, my statements are in regard to the "Add NewSetPtr proposal" above and how it would add an additional allocation on top of the Set construction.

@MrAlias
Contributor

MrAlias commented Apr 5, 2023

I'm so interested in making sure that callers have a way to avoid all allocations at their call site

@jmacd can you explain this? I'm still not following how accepting a Set would prevent users from optimizing their use of NewSet?

It seems like in your distro you have tried to optimize the use of NewSet on behalf of the user. That optimization, and more, can be implemented by the user if we allow them to pass a set.

@MrAlias
Contributor

MrAlias commented Apr 5, 2023

This specification issue is looking to add an additional argument to Record and possibly Add. I had considered migrating the attribute parameter to our options pattern (i.e. WithAttributes(/*...*/)), but originally thought it would be overkill for the methods as I didn't expect any other options to be added.

That assumption appears to have been incorrect. I'll plan to look into a proposal that will take that approach.

Proposal to add options to measurement methods
