Optimize zap.Any to use less memory on stack (inline version) #1305

rabbbit · 2023-07-28T03:34:26Z

This is an alternative to:

Improve Any stack usage #1301 and Improve Any stack usage #1302, and Improve Any stack usage #1304 - a series of PRs that are faster than this one. However, they rely on unsafe.
Optimize zap.Any to allocate less memory on stack #1303 - my own PR that uses closures to reduce the stack size by 60%.

This PR reduces the stack size from:

 field.go:420          0xd16c3                 4881ecf8120000          SUBQ $0x12f8, SP   // 4856

to

  field.go:420          0xcb603                 4881ecb8000000          SUBQ $0xb8, SP // 184

so by ~96%. More crucially, zap.Any is now as fast as correctly typed methods, like zap.String, etc.

The downside is the (slight) increase in the code maintenance - we inline and rely on the compiler correctly re-using small variable sizes. While this is not pretty, it feels safe - the changes were purely mechanical. Future changes and extensions should be easy to review.

Additionally, the new code is (slightly) faster in all cases since we remove 1-2 function calls from all paths. The "in new goroutine" path is most affected, as shown in the benchmarks below.

This was largely inspired by conversations with @cdvr1993. We started looking at this in parallel, but I would have given up if it wasn't for our conversations.
This particular version was inspired by an earlier version of #1304 - where I realized that @cdvr1993 is doing a similar dispatching mechanism that Zap already does via zapcore - a possible optimization.

Longer version:

We have identified an issue where zap.Any can cause performance degradation due to stack size.

This is apparently caused by the compiler assigning 4.8kb (a zap.Field per arm of the switch statement) for zap.Any on stack. This can result in an unnecessary runtime.newstack+runtime.copystack. A GitHub issue against Go language is pending.

This can be particularly bad if zap.Any was to be used in a new goroutine since the default goroutine sizes can be as low as 2kb (it can vary depending on average stack size - see golang/go#18138).

Most crucially, zap.Any is now as fast as a direct dispatch like zap.String.

Results can be compared with #1311

❯  go test -bench BenchmarkAny -benchmem -cpu 1

goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string/field-only/typed    152969097                7.460 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      90461920                13.21 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            3073038               389.1 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              3030273               393.0 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed         1000000              1141 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any           1000000              1143 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  159827430                7.613 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    52378398                22.89 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          3283455               372.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            3125164               378.3 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed       1000000              1108 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any         1000000              1145 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 19.275s
❯  gotip test -bench BenchmarkAny -benchmem -cpu 1

goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string/field-only/typed    159988640                7.383 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      88060196                13.90 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            2805666               430.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              2714698               438.8 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed         1000000              1163 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any           1000000              1178 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  156450748                7.673 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    51093813                24.31 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          2974875               414.9 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            2853394               434.3 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed       1000000              1177 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any          993225              1188 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 19.440s
 % go test -bench BenchmarkAny -benchmem -cpu 1
goos: linux
goarch: amd64
pkg: go.uber.org/zap
cpu: AMD EPYC 7B13
BenchmarkAny/string/field-only/typed    47207966                25.48 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      48939066                24.15 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            1651234               739.9 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              1624704               753.6 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed          526701              2288 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any            505273              2331 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  48647834                24.69 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    30350282                38.55 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          1801276               673.0 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            1584859               733.6 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed        523908              2307 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any          507704              2418 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 19.012s

codecov · 2023-07-28T03:49:41Z

Codecov Report

Merging #1305 (af1eecf) into master (ae3953e) will increase coverage by 0.00%.
The diff coverage is 98.63%.

❗ Current head af1eecf differs from pull request most recent head ac84e32. Consider uploading reports for the commit ac84e32 to get more accurate results

@@           Coverage Diff            @@
##           master    #1305    +/-   ##
========================================
  Coverage   98.08%   98.08%            
========================================
  Files          50       50            
  Lines        3242     3399   +157     
========================================
+ Hits         3180     3334   +154     
- Misses         53       55     +2     
- Partials        9       10     +1

Files Changed	Coverage Δ
field.go	`99.33% <98.63%> (-0.67%)`	⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Tiny test refactoring to make debugging easier. Spotted in #1305.

rabbbit · 2023-07-28T16:00:45Z

field.go

+		if val == nil {
+			t = zapcore.ReflectType
+			break


I also considered pulling all the *p types out of the switch into a separate switch, like

switch val := value.(type) { case *int, *bool, *int64, ...: if val == nil { t = zapcore.ReflectType return Field ... }

but that doesn't work, each of the types would need it's own branch, like

switch val := value.(type) { case *int: if val == nil { t = zapcore.ReflectType return Field ... case *int64: if val == nil { t = zapcore.ReflectType return Field ... }

so each of the * cases would need to be duplicated in both switches.

Slight benefit in reduction of if/else nil pointer checks, but increase in the total function length. Not sure.

cdvr1993 · 2023-07-28T16:09:11Z

Even though this doesn't use unsafe, I feel it is still harder to maintain than #1303. Here you are dropping all the typed constructors. Then if in the future anything changes on how we created Fields, we would need to update in 2 places: constructors and Any().

abhinav · 2023-07-28T16:55:59Z

Even though this doesn't use unsafe, I feel it is still harder to maintain than #1303. Here you are dropping all the typed constructors. Then if in the future anything changes on how we created Fields, we would need to update in 2 places: constructors and Any().

Agreed! I don't think we should merge this as-is.
If I have time this weekend, I may explore auto-generating the unrolled version by having it inspect the current implementation.

rabbbit · 2023-07-29T04:17:56Z

I pushed a slightly smaller version of this to https://github.com/uber-go/zap/compare/pawel/any-int5 - it reduced the LOC changed around a half. It does seem slightly easier to parse than ^.

…ion) This is an alternative to: - #1301 and #1302 and #1304 - a series of PRs that are faster than this one. However, they rely on unsafe. - #1303 - my own PR that uses closures, to reduce the stack size by 60%. - #1305 - my own PR that inline bunch of loops - https://github.com/uber-go/zap/compare/pawel/any-int5 that does the same as above, but is slightly easier to parse - #1307 - a reflect.TypeOf lookup version THIS PR IS INCOMPLETE - it shows a possible approach, but I wanted to get reviewers thoughts before typing everything in. I originally thought we can use a `type props strucy` intermediary struct to store the data, but that hits the same problem: every `props` would get it's own slot on the stack. This avoids this by returning the raw data. Pros: - the implementation is shared between `Any` and strongly typed Fields - no reflect or unsafe - reduced the stack significantly - we should be able to get to the same ~180 bytes as ~1305. - no peft penalty for strongly typed versions, at least on ARM64 it's compiled away. Cons: - the code gets a bit harder to maintain. It's significantly better than #1305 I would say though.

Practically, the cases added are today covered by the tests for strongly typed functions, but are missing if someone was to do a refactoring like #1305 or #1307. More theoretically, since `zap.Any` is an external interface, IMO it makes sense to test it separately too.

@rabbbit

Pulls the benchmark added by @rabbbit in uber-go#1305.

Yet another attempt at reducing the stack size of zap.Any, borrowing from uber-go#1301, uber-go#1303, uber-go#1304, uber-go#1305, uber-go#1307, and 1308. This approach defines a generic data type for field constructors of a specific type. This is similar to the lookup map in uber-go#1307, minus the map lookup, the interface match, or reflection. type anyFieldC[T any] func(string, T) Field The generic data type provides a non-generic method matching the interface: interface{ Any(string, any) Field } Stack size: The stack size of zap.Any following this change is 0xc0. % go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0 This is just 8 bytes more than uber-go#1305, which is the smallest stack size of all other attempts. Allocations: Everything appears to get inlined with no heap escapes: go build -gcflags -m 2>&1 | grep field.go | perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' | grep 'escapes' (Line 413 declares anyFieldC) Besides that, the output of `-m` for the relevant section of code consists of almost entirely: ./field.go:415:6: can inline anyFieldC[go.shape.bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any [...] ./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any Followed by: ./field.go:428:10: leaking param: key ./field.go:428:22: leaking param: value Maintainability: Unlike some of the other approaches, this variant is more maintainable. The `zap.Any` function looks roughly the same. Adding new branches there is obvious, and requires no duplication. Performance: This is a net improvement on all BenchmarkAny calls except "any-no-logger" which calls `zap.Any` and discards the result. ``` name old time/op new time/op delta Any/str-no-logger-2 8.77ns ± 0% 8.75ns ± 1% ~ (p=0.159 n=4+5) Any/any-no-logger-2 54.1ns ± 0% 81.6ns ± 0% +50.71% (p=0.016 n=5+4) Any/str-with-logger-2 1.38µs ± 3% 1.38µs ± 4% ~ (p=0.841 n=5+5) Any/any-with-logger-2 1.60µs ±22% 1.37µs ± 1% ~ (p=0.151 n=5+5) Any/str-in-go-2 3.41µs ± 1% 3.42µs ± 5% ~ (p=0.905 n=4+5) Any/any-in-go-2 5.98µs ± 1% 3.68µs ± 6% -38.44% (p=0.008 n=5+5) Any/str-in-go-with-stack-2 3.42µs ± 2% 3.46µs ± 3% ~ (p=0.421 n=5+5) Any/any-in-go-with-stack-2 5.98µs ± 3% 3.65µs ± 3% -38.95% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Any/str-no-logger-2 0.00B 0.00B ~ (all equal) Any/any-no-logger-2 0.00B 0.00B ~ (all equal) Any/str-with-logger-2 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/any-with-logger-2 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/str-in-go-2 88.5B ± 1% 88.0B ± 0% ~ (p=0.429 n=4+4) Any/any-in-go-2 88.0B ± 0% 88.0B ± 0% ~ (all equal) Any/str-in-go-with-stack-2 88.0B ± 0% 88.0B ± 0% ~ (all equal) Any/any-in-go-with-stack-2 88.0B ± 0% 88.0B ± 0% ~ (all equal) name old allocs/op new allocs/op delta Any/str-no-logger-2 0.00 0.00 ~ (all equal) Any/any-no-logger-2 0.00 0.00 ~ (all equal) Any/str-with-logger-2 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/any-with-logger-2 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/str-in-go-2 2.00 ± 0% 2.00 ± 0% ~ (all equal) Any/any-in-go-2 2.00 ± 0% 2.00 ± 0% ~ (all equal) Any/str-in-go-with-stack-2 2.00 ± 0% 2.00 ± 0% ~ (all equal) Any/any-in-go-with-stack-2 2.00 ± 0% 2.00 ± 0% ~ (all equal) ``` I believe this is acceptable because that's not a real use case; we expect the result to be used with a logger.

@prashantv

This is a prefactor for #1301, #1302, #1304, #1305, #1307, #1308 and #1310. We're writing various approaches to reduce the stock size and it's painful to keep copy-pasting the tests between PRs. This was suggested in @prashantv in #1307. The tests are mostly based on tests in #1303, but made "more generic", as #1307 we might want to test across more than just a single type. It does make the tests a bit harder to setup. Some of the setup is inconvenient (we duplicate the value in both `typed` and `any` version of the tests) but hopefully okay to understand. A fully non-duplicated alternative would likely require something like #1310 itself. For #1307 in particular a test against interface type would likely be needed, so adding it here too. The tests compare two code paths, with the same arguments, one using a strongly typed method and second using `zap.Any`. We have: - a simple "create field" case for a baseline - a "create and log" case for a realistic case (we typically log the fields) - a "create and log in a goroutine" case for the pathological case we're trying to solve for. - a "create and long in goroutine in a pre-warmed system" that does the above, but before tries to affect the starting goroutine stack size to provide an realistic example. Without this, for any tests with 2+ goroutines, the cost of `zap.Any` is not visible, as we always end up expanding the stack even in the strongly typed methods. The test results are: ``` goos: darwin goarch: arm64 pkg: go.uber.org/zap BenchmarkAny/string-typ-no-logger 166879518 6.988 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-typ-no-logger-12 167398297 6.973 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-any-no-logger 87669631 13.97 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-any-no-logger-12 86760837 14.11 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-typ-logger 3059485 395.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-typ-logger-12 3141176 379.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-any-logger 2995699 401.3 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-any-logger-12 3071046 391.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-typ-logger-go 784323 1351 ns/op 146 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-12 2000835 613.9 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-any-logger-go 477486 2479 ns/op 117 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-12 1830955 680.0 ns/op 112 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-stack 841566 1328 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-stack-12 2625226 479.6 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-stack 486084 2493 ns/op 112 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-stack-12 2658640 667.9 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-typ-no-logger 147314238 8.034 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-typ-no-logger-12 157857937 7.436 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-any-no-logger 58872349 20.19 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-any-no-logger-12 60532305 20.27 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-typ-logger 3094204 411.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-typ-logger-12 3163489 383.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-any-logger 2981940 427.2 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-any-logger-12 2777792 394.0 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-typ-logger-go 911761 1335 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-12 2006440 605.2 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go 467934 2518 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-12 1786076 683.1 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-stack 855794 1316 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-stack-12 2598783 434.5 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-stack 473282 2474 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-stack-12 2020183 651.9 ns/op 112 B/op 2 allocs/op PASS ok go.uber.org/zap 53.516s ```

@prashantv

This is a prefactor for #1301, #1302, #1304, #1305, #1307, #1308 and #1310. We're writing various approaches to reduce the stock size and it's painful to keep copy-pasting the tests between PRs. This was suggested in @prashantv in #1307. The tests are mostly based on tests in #1303, but made "more generic", as #1307 we might want to test across more than just a single type. It does make the tests a bit harder to setup. Some of the setup is inconvenient (we duplicate the value in both `typed` and `any` version of the tests) but hopefully okay to understand. A fully non-duplicated alternative would likely require something like #1310 itself. For #1307 in particular a test against interface type would likely be needed, so adding it here too. The tests compare two code paths, with the same arguments, one using a strongly typed method and second using `zap.Any`. We have: - a simple "create field" case for a baseline - a "create and log" case for a realistic case (we typically log the fields) - a "create and log in a goroutine" case for the pathological case we're trying to solve for. - -a "create and long in goroutine in a pre-warmed system" that does the above- we decided it's not worth the complication. The test results are: ``` ❯ go test -bench BenchmarkAny -benchmem -cpu 1 goos: darwin goarch: arm64 pkg: go.uber.org/zap BenchmarkAny/string/field-only/typed 161981473 7.374 ns/op 0 B/op 0 allocs/op BenchmarkAny/string/field-only/any 82343354 14.67 ns/op 0 B/op 0 allocs/op BenchmarkAny/string/log/typed 2965648 416.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/string/log/any 2920292 418.8 ns/op 64 B/op 1 allocs/op BenchmarkAny/string/log-go/typed 1000000 1158 ns/op 112 B/op 3 allocs/op BenchmarkAny/string/log-go/any 553144 2152 ns/op 128 B/op 3 allocs/op BenchmarkAny/stringer/field-only/typed 160509367 7.548 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer/field-only/any 51330402 23.45 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer/log/typed 3221404 377.0 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer/log/any 2726443 393.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer/log-go/typed 1000000 1129 ns/op 112 B/op 3 allocs/op BenchmarkAny/stringer/log-go/any 558602 2147 ns/op 128 B/op 3 allocs/op PASS ok go.uber.org/zap 19.426s ``` On gotip: ``` ❯ gotip test -bench BenchmarkAny -benchmem -cpu 1 goos: darwin goarch: arm64 pkg: go.uber.org/zap BenchmarkAny/string/field-only/typed 155084869 7.603 ns/op 0 B/op 0 allocs/op BenchmarkAny/string/field-only/any 82740788 14.55 ns/op 0 B/op 0 allocs/op BenchmarkAny/string/log/typed 2800495 411.6 ns/op 64 B/op 1 allocs/op BenchmarkAny/string/log/any 2896258 411.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/string/log-go/typed 1000000 1155 ns/op 112 B/op 3 allocs/op BenchmarkAny/string/log-go/any 551599 2168 ns/op 128 B/op 3 allocs/op BenchmarkAny/stringer/field-only/typed 159505488 7.578 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer/field-only/any 51406354 23.78 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer/log/typed 3011210 388.6 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer/log/any 3010370 395.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer/log-go/typed 1000000 1161 ns/op 112 B/op 3 allocs/op BenchmarkAny/stringer/log-go/any 553860 2161 ns/op 128 B/op 3 allocs/op PASS ok go.uber.org/zap 19.391s ``` on amd64 (similar, 2x worse stack growth impact) ``` % go test -bench BenchmarkAny -benchmem -cpu 1 goos: linux goarch: amd64 pkg: go.uber.org/zap cpu: AMD EPYC 7B13 BenchmarkAny/string/field-only/typed 47534053 25.23 ns/op 0 B/op 0 allocs/op BenchmarkAny/string/field-only/any 36913526 32.57 ns/op 0 B/op 0 allocs/op BenchmarkAny/string/log/typed 1693508 725.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/string/log/any 1576172 765.3 ns/op 64 B/op 1 allocs/op BenchmarkAny/string/log-go/typed 516832 2343 ns/op 112 B/op 3 allocs/op BenchmarkAny/string/log-go/any 243692 4404 ns/op 128 B/op 3 allocs/op BenchmarkAny/stringer/field-only/typed 48735537 24.73 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer/field-only/any 26115684 47.24 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer/log/typed 1761630 677.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer/log/any 1646913 705.4 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer/log-go/typed 534187 2275 ns/op 112 B/op 3 allocs/op BenchmarkAny/stringer/log-go/any 273787 4348 ns/op 128 B/op 3 allocs/op PASS ok go.uber.org/zap 18.890s ```

@cdvr1993

This is an alternative to: - #1301 and #1302 and #1304 - a series of PRs that are faster than this one. However, they rely on unsafe. - #1303 - my own PR that uses closures, to reduce the stack size by 60%. This PR reduces the stack size from: ``` field.go:420 0xd16c3 4881ecf8120000 SUBQ $0x12f8, SP // 4856 ``` to ``` field.go:420 0xcb603 4881ecb8000000 SUBQ $0xb8, SP // 184 ``` so by ~96%. More crucially, `zap.Any` is now as fast as correctly typed methods, like `zap.String`, etc. The downside is the (slight) incrase in the code maitenance - we unroll as much as we can and rely on the compiler correctly re-using small variable sizes. While this is not pretty, it feels safe - the changes were purely mechanical. Future changes and extensions should be easy to review. Additionally, the new code is (slightly) faster in all cases since we remove 1-2 function calls from all paths. The "in new goroutine" path is most affected, as shown in benchmarks below. This was largely inspired by conversations with @cdvr1993. We started looking at this in parallel, but I would have given up if it wasn't for our conversations. This particular version was inspired by an earlier version of #1304 - where I realized that @cdvr1993 is doing a similar dispatching mechanism that zap is already doing via `zapcore` - a possible optimization. Longer version: We have identified an issue where zap.Any can cause performance degradation due to stack size. This is apparently cased by the compiler assigning 4.8kb (a zap.Field per arm of the switch statement) for zap.Any on stack. This can result in an unnecessary runtime.newstack+runtime.copystack. A github issue against Go language is pending. This can be particularly bad if `zap.Any` was to be used in a new goroutine, since the default goroutine sizes can be as low as 2kb (it can vary depending on average stack size - see golang/go#18138). *Most crucially, `zap.Any` is now as fast as a direct dispatch like `zap.String`.* 10 runs. ``` goos: darwin goarch: arm64 pkg: go.uber.org/zap │ before-final.txt │ after-final.txt │ │ sec/op │ sec/op vs base │ Any/str-no-logger 3.106n ± 2% 3.160n ± 1% +1.75% (p=0.025 n=10) Any/str-no-logger-2 3.171n ± 4% 3.142n ± 1% ~ (p=0.593 n=10) Any/str-no-logger-4 3.108n ± 3% 3.139n ± 2% +0.97% (p=0.004 n=10) Any/str-no-logger-8 3.099n ± 2% 3.143n ± 2% ~ (p=0.086 n=10) Any/any-no-logger 13.89n ± 2% 12.98n ± 2% -6.59% (p=0.000 n=10) Any/any-no-logger-2 13.97n ± 2% 12.96n ± 2% -7.27% (p=0.000 n=10) Any/any-no-logger-4 13.83n ± 2% 12.89n ± 2% -6.83% (p=0.000 n=10) Any/any-no-logger-8 13.77n ± 2% 12.88n ± 2% -6.43% (p=0.000 n=10) Any/str-with-logger 384.1n ± 2% 383.9n ± 6% ~ (p=0.810 n=10) Any/str-with-logger-2 367.8n ± 2% 368.5n ± 3% ~ (p=0.971 n=10) Any/str-with-logger-4 372.4n ± 2% 368.6n ± 4% ~ (p=0.912 n=10) Any/str-with-logger-8 369.8n ± 3% 368.3n ± 3% ~ (p=0.698 n=10) Any/any-with-logger 383.8n ± 3% 383.3n ± 6% ~ (p=0.838 n=10) Any/any-with-logger-2 370.0n ± 3% 367.6n ± 1% ~ (p=0.239 n=10) Any/any-with-logger-4 370.0n ± 3% 368.2n ± 4% ~ (p=0.631 n=10) Any/any-with-logger-8 367.6n ± 2% 369.7n ± 3% ~ (p=0.756 n=10) Any/str-in-go 1.334µ ± 3% 1.347µ ± 3% ~ (p=0.271 n=10) Any/str-in-go-2 754.5n ± 3% 744.8n ± 5% ~ (p=0.481 n=10) Any/str-in-go-4 420.2n ± 11% 367.7n ± 31% ~ (p=0.086 n=10) Any/str-in-go-8 557.6n ± 4% 547.1n ± 12% ~ (p=0.579 n=10) Any/any-in-go 2.562µ ± 4% 1.447µ ± 3% -43.53% (p=0.000 n=10) Any/any-in-go-2 1361.0n ± 4% 761.4n ± 7% -44.06% (p=0.000 n=10) Any/any-in-go-4 732.1n ± 9% 397.1n ± 11% -45.76% (p=0.000 n=10) Any/any-in-go-8 541.3n ± 13% 564.6n ± 5% +4.30% (p=0.041 n=10) Any/str-in-go-with-stack 1.420µ ± 1% 1.428µ ± 3% ~ (p=0.670 n=10) Any/str-in-go-with-stack-2 749.5n ± 4% 771.8n ± 4% ~ (p=0.123 n=10) Any/str-in-go-with-stack-4 433.2n ± 15% 400.7n ± 14% ~ (p=0.393 n=10) Any/str-in-go-with-stack-8 494.0n ± 7% 490.1n ± 10% ~ (p=0.853 n=10) Any/any-in-go-with-stack 2.586µ ± 3% 1.471µ ± 4% -43.14% (p=0.000 n=10) Any/any-in-go-with-stack-2 1343.0n ± 3% 773.7n ± 4% -42.39% (p=0.000 n=10) Any/any-in-go-with-stack-4 697.7n ± 8% 403.4n ± 9% -42.17% (p=0.000 n=10) Any/any-in-go-with-stack-8 490.8n ± 9% 492.8n ± 8% ~ (p=0.796 n=10) geomean 206.3n 182.9n -11.35% ``` On absolute terms: Before, on arm64: ``` goos: darwin goarch: arm64 pkg: go.uber.org/zap BenchmarkAny/str-with-logger 3154850 387.9 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-2 3239221 371.0 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-4 3273285 363.8 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-8 3251991 372.4 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger 2944020 401.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-2 2984863 368.2 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-4 3265248 363.0 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-8 3301592 365.9 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-in-go 764239 1423 ns/op 140 B/op 2 allocs/op BenchmarkAny/str-in-go-2 1510189 753.0 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-4 3013986 369.1 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-8 2128927 540.0 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go 464083 2551 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-2 818104 1347 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-4 1587925 698.9 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-8 2452558 466.7 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack 767626 1440 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-2 1534382 771.1 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-4 2384058 433.4 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-8 3146942 450.0 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack 434194 2524 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-2 851312 1304 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-4 1570944 710.1 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-8 2546115 604.6 ns/op 88 B/op 2 allocs/op PASS ok go.uber.org/zap 50.238s ``` After: ``` ❯ go test -bench BenchmarkAny -benchmem -run errs -cpu 1,2,4,8 goos: darwin goarch: arm64 pkg: go.uber.org/zap BenchmarkAny/str-with-logger 3202051 382.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-2 3301683 371.2 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-4 3186028 364.8 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-8 3061030 371.0 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger 3203704 378.2 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-2 3281462 372.8 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-4 3252879 371.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-8 3246148 373.9 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-in-go 804132 1404 ns/op 133 B/op 2 allocs/op BenchmarkAny/str-in-go-2 1686093 758.0 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-4 3075596 430.7 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-8 2101650 543.7 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go 845822 1424 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-2 1531311 736.6 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-4 2618665 464.6 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-8 2130280 536.2 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack 818583 1440 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-2 1533379 739.4 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-4 2507131 399.6 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-8 2348804 453.1 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack 807199 1526 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-2 1590476 783.0 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-4 3026263 383.3 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-8 2615467 493.8 ns/op 88 B/op 2 allocs/op PASS ok go.uber.org/zap 51.077s ``` And amd64, before: ``` % go test -bench BenchmarkAny -benchmem -run errs -cpu 1,2,4,8 goos: linux goarch: amd64 pkg: go.uber.org/zap cpu: AMD EPYC 7B13 BenchmarkAny/str-no-logger 100000000 11.58 ns/op 0 B/op 0 allocs/op BenchmarkAny/str-no-logger-2 100000000 11.52 ns/op 0 B/op 0 allocs/op BenchmarkAny/str-no-logger-4 100000000 11.56 ns/op 0 B/op 0 allocs/op BenchmarkAny/str-no-logger-8 100000000 11.50 ns/op 0 B/op 0 allocs/op BenchmarkAny/any-no-logger 39399811 30.35 ns/op 0 B/op 0 allocs/op BenchmarkAny/any-no-logger-2 39448304 30.63 ns/op 0 B/op 0 allocs/op BenchmarkAny/any-no-logger-4 39647024 30.32 ns/op 0 B/op 0 allocs/op BenchmarkAny/any-no-logger-8 39479619 30.46 ns/op 0 B/op 0 allocs/op BenchmarkAny/str-with-logger 1798702 669.3 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-2 1862551 647.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-4 1848636 642.2 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-8 1878465 656.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger 1776140 684.3 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-2 1868102 668.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-4 1869589 639.9 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-8 1782540 648.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-in-go 458112 2594 ns/op 91 B/op 2 allocs/op BenchmarkAny/str-in-go-2 820398 1344 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-4 1392148 969.6 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-8 1790403 644.8 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go 220327 4897 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-2 494391 2701 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-4 823672 1399 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-8 1591206 746.8 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack 384094 2820 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-2 809073 1530 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-4 1464598 933.0 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-8 1943251 578.0 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack 233019 4967 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-2 356689 2848 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-4 791342 1385 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-8 1662126 746.0 ns/op 88 B/op 2 allocs/op PASS ok go.uber.org/zap 51.671s ``` After: ``` % go test -bench BenchmarkAny -benchmem -run errs -cpu 1,2,4,8 goos: linux goarch: amd64 pkg: go.uber.org/zap cpu: AMD EPYC 7B13 BenchmarkAny/str-no-logger 100000000 11.77 ns/op 0 B/op 0 allocs/op BenchmarkAny/str-no-logger-2 100000000 11.75 ns/op 0 B/op 0 allocs/op BenchmarkAny/str-no-logger-4 100000000 11.76 ns/op 0 B/op 0 allocs/op BenchmarkAny/str-no-logger-8 100000000 11.69 ns/op 0 B/op 0 allocs/op BenchmarkAny/any-no-logger 49795383 24.33 ns/op 0 B/op 0 allocs/op BenchmarkAny/any-no-logger-2 48821454 24.31 ns/op 0 B/op 0 allocs/op BenchmarkAny/any-no-logger-4 49452686 24.79 ns/op 0 B/op 0 allocs/op BenchmarkAny/any-no-logger-8 49359926 24.26 ns/op 0 B/op 0 allocs/op BenchmarkAny/str-with-logger 1808188 700.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-2 1894179 643.9 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-4 1858263 649.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-with-logger-8 1879894 645.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger 1817276 663.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-2 1906438 637.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-4 1837354 641.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/any-with-logger-8 1909658 648.3 ns/op 64 B/op 1 allocs/op BenchmarkAny/str-in-go 468484 2463 ns/op 96 B/op 2 allocs/op BenchmarkAny/str-in-go-2 726475 1465 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-4 1285284 958.9 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-8 1746547 573.1 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go 426568 2715 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-2 611106 1703 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-4 1000000 1017 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-8 2220459 625.7 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack 429721 2673 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-2 637306 1593 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-4 1301713 902.1 ns/op 88 B/op 2 allocs/op BenchmarkAny/str-in-go-with-stack-8 2012583 651.6 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack 391810 2833 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-2 675589 1639 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-4 1219318 970.3 ns/op 88 B/op 2 allocs/op BenchmarkAny/any-in-go-with-stack-8 1825632 574.6 ns/op 88 B/op 2 allocs/op PASS ok go.uber.org/zap 50.294s ```

…ion) This is an alternative to: - #1301 and #1302 and #1304 - a series of PRs that are faster than this one. However, they rely on unsafe. - #1303 - my own PR that uses closures, to reduce the stack size by 60%. - #1305 - my own PR that inline bunch of loops - https://github.com/uber-go/zap/compare/pawel/any-int5 that does the same as above, but is slightly easier to parse - #1307 - a reflect.TypeOf lookup version THIS PR IS INCOMPLETE - it shows a possible approach, but I wanted to get reviewers thoughts before typing everything in. I originally thought we can use a `type props strucy` intermediary struct to store the data, but that hits the same problem: every `props` would get it's own slot on the stack. This avoids this by returning the raw data. Pros: - the implementation is shared between `Any` and strongly typed Fields - no reflect or unsafe - reduced the stack significantly - we should be able to get to the same ~180 bytes as ~1305. - no peft penalty for strongly typed versions, at least on ARM64 it's compiled away. Cons: - the code gets a bit harder to maintain. It's significantly better than #1305 I would say though.

rabbbit · 2023-08-01T03:11:21Z

Closed in favor of #1310.

Yet another attempt at reducing the stack size of zap.Any, borrowing from uber-go#1301, uber-go#1303, uber-go#1304, uber-go#1305, uber-go#1307, and 1308. This approach defines a generic data type for field constructors of a specific type. This is similar to the lookup map in uber-go#1307, minus the map lookup, the interface match, or reflection. type anyFieldC[T any] func(string, T) Field The generic data type provides a non-generic method matching the interface: interface{ Any(string, any) Field } Stack size: The stack size of zap.Any following this change is 0xc0. % go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0 This is just 8 bytes more than uber-go#1305, which is the smallest stack size of all other attempts. Allocations: Everything appears to get inlined with no heap escapes: go build -gcflags -m 2>&1 | grep field.go | perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' | grep 'escapes' (Line 413 declares anyFieldC) Besides that, the output of `-m` for the relevant section of code consists of almost entirely: ./field.go:415:6: can inline anyFieldC[go.shape.bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any [...] ./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any Followed by: ./field.go:428:10: leaking param: key ./field.go:428:22: leaking param: value Maintainability: Unlike some of the other approaches, this variant is more maintainable. The `zap.Any` function looks roughly the same. Adding new branches there is obvious, and requires no duplication. Performance: This is a net improvement on all BenchmarkAny calls except "any-no-logger" which calls `zap.Any` and discards the result. ``` name old time/op new time/op delta Any/str-no-logger-2 8.77ns ± 0% 8.75ns ± 1% ~ (p=0.159 n=4+5) Any/any-no-logger-2 54.1ns ± 0% 81.6ns ± 0% +50.71% (p=0.016 n=5+4) Any/str-with-logger-2 1.38µs ± 3% 1.38µs ± 4% ~ (p=0.841 n=5+5) Any/any-with-logger-2 1.60µs ±22% 1.37µs ± 1% ~ (p=0.151 n=5+5) Any/str-in-go-2 3.41µs ± 1% 3.42µs ± 5% ~ (p=0.905 n=4+5) Any/any-in-go-2 5.98µs ± 1% 3.68µs ± 6% -38.44% (p=0.008 n=5+5) Any/str-in-go-with-stack-2 3.42µs ± 2% 3.46µs ± 3% ~ (p=0.421 n=5+5) Any/any-in-go-with-stack-2 5.98µs ± 3% 3.65µs ± 3% -38.95% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Any/str-no-logger-2 0.00B 0.00B ~ (all equal) Any/any-no-logger-2 0.00B 0.00B ~ (all equal) Any/str-with-logger-2 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/any-with-logger-2 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/str-in-go-2 88.5B ± 1% 88.0B ± 0% ~ (p=0.429 n=4+4) Any/any-in-go-2 88.0B ± 0% 88.0B ± 0% ~ (all equal) Any/str-in-go-with-stack-2 88.0B ± 0% 88.0B ± 0% ~ (all equal) Any/any-in-go-with-stack-2 88.0B ± 0% 88.0B ± 0% ~ (all equal) name old allocs/op new allocs/op delta Any/str-no-logger-2 0.00 0.00 ~ (all equal) Any/any-no-logger-2 0.00 0.00 ~ (all equal) Any/str-with-logger-2 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/any-with-logger-2 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/str-in-go-2 2.00 ± 0% 2.00 ± 0% ~ (all equal) Any/any-in-go-2 2.00 ± 0% 2.00 ± 0% ~ (all equal) Any/str-in-go-with-stack-2 2.00 ± 0% 2.00 ± 0% ~ (all equal) Any/any-in-go-with-stack-2 2.00 ± 0% 2.00 ± 0% ~ (all equal) ``` I believe this is acceptable because that's not a real use case; we expect the result to be used with a logger.

Yet another attempt at reducing the stack size of zap.Any, borrowing from #1301, #1303, #1304, #1305, #1307, and #1308. This approach defines a generic data type for field constructors of a specific type. This is similar to the lookup map in #1307, minus the map lookup, the interface match, or reflection. type anyFieldC[T any] func(string, T) Field The generic data type provides a non-generic method matching the interface: interface{ Any(string, any) Field } **Stack size**: The stack size of zap.Any following this change is 0xc0 (192 bytes). % go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0 This is just 8 bytes more than #1305, which is the smallest stack size of all other attempts. **Allocations**: Everything appears to get inlined with no heap escapes: % go build -gcflags -m 2>&1 | grep field.go | perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' | grep 'escapes' [no output] (Line 413 declares anyFieldC) Besides that, the output of `-m` for the relevant section of code consists of almost entirely: ./field.go:415:6: can inline anyFieldC[go.shape.bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any [...] ./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any Followed by: ./field.go:428:10: leaking param: key ./field.go:428:22: leaking param: value **Maintainability**: Unlike some of the other approaches, this variant is more maintainable. The `zap.Any` function looks roughly the same. Adding new branches there is obvious, and requires no duplication. **Performance**: This is a net improvement against master on BenchmarkAny's log-go checks that log inside a new goroutine. ``` name old time/op new time/op delta Any/string/field-only/typed 25.2ns ± 1% 25.6ns ± 2% ~ (p=0.460 n=5+5) Any/string/field-only/any 56.9ns ± 3% 79.4ns ± 0% +39.55% (p=0.008 n=5+5) Any/string/log/typed 1.47µs ± 0% 1.49µs ± 4% +1.58% (p=0.016 n=4+5) Any/string/log/any 1.53µs ± 2% 1.55µs ± 1% +1.37% (p=0.016 n=5+5) Any/string/log-go/typed 5.97µs ± 6% 5.99µs ± 1% ~ (p=0.151 n=5+5) Any/string/log-go/any 10.9µs ± 0% 6.2µs ± 0% -43.32% (p=0.008 n=5+5) Any/stringer/field-only/typed 25.3ns ± 1% 25.5ns ± 1% +1.09% (p=0.008 n=5+5) Any/stringer/field-only/any 85.5ns ± 1% 124.5ns ± 0% +45.66% (p=0.008 n=5+5) Any/stringer/log/typed 1.43µs ± 1% 1.42µs ± 2% ~ (p=0.175 n=4+5) Any/stringer/log/any 1.50µs ± 1% 1.56µs ± 6% +4.20% (p=0.008 n=5+5) Any/stringer/log-go/typed 5.94µs ± 0% 5.92µs ± 0% -0.40% (p=0.032 n=5+5) Any/stringer/log-go/any 11.1µs ± 2% 6.3µs ± 0% -42.93% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Any/string/field-only/typed 0.00B 0.00B ~ (all equal) Any/string/field-only/any 0.00B 0.00B ~ (all equal) Any/string/log/typed 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/string/log/any 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/string/log-go/typed 112B ± 0% 112B ± 0% ~ (all equal) Any/string/log-go/any 128B ± 0% 128B ± 0% ~ (all equal) Any/stringer/field-only/typed 0.00B 0.00B ~ (all equal) Any/stringer/field-only/any 0.00B 0.00B ~ (all equal) Any/stringer/log/typed 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/stringer/log/any 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/stringer/log-go/typed 112B ± 0% 112B ± 0% ~ (all equal) Any/stringer/log-go/any 128B ± 0% 128B ± 0% ~ (all equal) name old allocs/op new allocs/op delta Any/string/field-only/typed 0.00 0.00 ~ (all equal) Any/string/field-only/any 0.00 0.00 ~ (all equal) Any/string/log/typed 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/string/log/any 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/string/log-go/typed 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/string/log-go/any 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/stringer/field-only/typed 0.00 0.00 ~ (all equal) Any/stringer/field-only/any 0.00 0.00 ~ (all equal) Any/stringer/log/typed 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/stringer/log/any 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/stringer/log-go/typed 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/stringer/log-go/any 3.00 ± 0% 3.00 ± 0% ~ (all equal) ``` It causes a regression in "field-only" which calls the field constructor and discards the result without using it in a logger. I believe this is acceptable because that's not a real use case; we expect the result to be used with a logger.

Practically, the cases added are today covered by the tests for strongly typed functions, but are missing if someone was to do a refactoring like #1305 or #1307. More theoretically, since `zap.Any` is an external interface, IMO it makes sense to test it separately too.

@prashantv

This is a prefactor for #1301, #1302, #1304, #1305, #1307, #1308 and #1310. We're writing various approaches to reduce the stack size and it's painful to keep copy-pasting the tests between PRs. This was suggested in @prashantv in #1307. The tests are mostly based on tests in #1303, but made "more generic", as #1307 we might want to test across more than just a single type. It does make the tests a bit harder to setup. Some of the setup is inconvenient (we duplicate the value in both `typed` and `any` version of the tests) but hopefully okay to understand. A fully non-duplicated alternative would likely require something like #1310 itself. For #1307 in particular a test against interface type would likely be needed, so adding it here too. The tests compare two code paths, with the same arguments, one using a strongly typed method and second using `zap.Any`. We have: - a simple "create field" case for a baseline - a "create and log" case for a realistic case (we typically log the fields) - a "create and log in a goroutine" case for the pathological case we're trying to solve for. - a "create and long in goroutine in a pre-warmed system" that does the above, but before tries to affect the starting goroutine stack size to provide an realistic example. Without this, for any tests with 2+ goroutines, the cost of `zap.Any` is not visible, as we always end up expanding the stack even in the strongly typed methods. The test results are: ``` goos: darwin goarch: arm64 pkg: go.uber.org/zap BenchmarkAny/string-typ-no-logger 166879518 6.988 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-typ-no-logger-12 167398297 6.973 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-any-no-logger 87669631 13.97 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-any-no-logger-12 86760837 14.11 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-typ-logger 3059485 395.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-typ-logger-12 3141176 379.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-any-logger 2995699 401.3 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-any-logger-12 3071046 391.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-typ-logger-go 784323 1351 ns/op 146 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-12 2000835 613.9 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-any-logger-go 477486 2479 ns/op 117 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-12 1830955 680.0 ns/op 112 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-stack 841566 1328 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-stack-12 2625226 479.6 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-stack 486084 2493 ns/op 112 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-stack-12 2658640 667.9 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-typ-no-logger 147314238 8.034 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-typ-no-logger-12 157857937 7.436 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-any-no-logger 58872349 20.19 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-any-no-logger-12 60532305 20.27 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-typ-logger 3094204 411.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-typ-logger-12 3163489 383.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-any-logger 2981940 427.2 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-any-logger-12 2777792 394.0 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-typ-logger-go 911761 1335 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-12 2006440 605.2 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go 467934 2518 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-12 1786076 683.1 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-stack 855794 1316 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-stack-12 2598783 434.5 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-stack 473282 2474 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-stack-12 2020183 651.9 ns/op 112 B/op 2 allocs/op PASS ok go.uber.org/zap 53.516s ```

Yet another attempt at reducing the stack size of zap.Any, borrowing from #1301, #1303, #1304, #1305, #1307, and #1308. This approach defines a generic data type for field constructors of a specific type. This is similar to the lookup map in #1307, minus the map lookup, the interface match, or reflection. type anyFieldC[T any] func(string, T) Field The generic data type provides a non-generic method matching the interface: interface{ Any(string, any) Field } **Stack size**: The stack size of zap.Any following this change is 0xc0 (192 bytes). % go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0 This is just 8 bytes more than #1305, which is the smallest stack size of all other attempts. **Allocations**: Everything appears to get inlined with no heap escapes: % go build -gcflags -m 2>&1 | grep field.go | perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' | grep 'escapes' [no output] (Line 413 declares anyFieldC) Besides that, the output of `-m` for the relevant section of code consists of almost entirely: ./field.go:415:6: can inline anyFieldC[go.shape.bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any [...] ./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any Followed by: ./field.go:428:10: leaking param: key ./field.go:428:22: leaking param: value **Maintainability**: Unlike some of the other approaches, this variant is more maintainable. The `zap.Any` function looks roughly the same. Adding new branches there is obvious, and requires no duplication. **Performance**: This is a net improvement against master on BenchmarkAny's log-go checks that log inside a new goroutine. ``` name old time/op new time/op delta Any/string/field-only/typed 25.2ns ± 1% 25.6ns ± 2% ~ (p=0.460 n=5+5) Any/string/field-only/any 56.9ns ± 3% 79.4ns ± 0% +39.55% (p=0.008 n=5+5) Any/string/log/typed 1.47µs ± 0% 1.49µs ± 4% +1.58% (p=0.016 n=4+5) Any/string/log/any 1.53µs ± 2% 1.55µs ± 1% +1.37% (p=0.016 n=5+5) Any/string/log-go/typed 5.97µs ± 6% 5.99µs ± 1% ~ (p=0.151 n=5+5) Any/string/log-go/any 10.9µs ± 0% 6.2µs ± 0% -43.32% (p=0.008 n=5+5) Any/stringer/field-only/typed 25.3ns ± 1% 25.5ns ± 1% +1.09% (p=0.008 n=5+5) Any/stringer/field-only/any 85.5ns ± 1% 124.5ns ± 0% +45.66% (p=0.008 n=5+5) Any/stringer/log/typed 1.43µs ± 1% 1.42µs ± 2% ~ (p=0.175 n=4+5) Any/stringer/log/any 1.50µs ± 1% 1.56µs ± 6% +4.20% (p=0.008 n=5+5) Any/stringer/log-go/typed 5.94µs ± 0% 5.92µs ± 0% -0.40% (p=0.032 n=5+5) Any/stringer/log-go/any 11.1µs ± 2% 6.3µs ± 0% -42.93% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Any/string/field-only/typed 0.00B 0.00B ~ (all equal) Any/string/field-only/any 0.00B 0.00B ~ (all equal) Any/string/log/typed 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/string/log/any 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/string/log-go/typed 112B ± 0% 112B ± 0% ~ (all equal) Any/string/log-go/any 128B ± 0% 128B ± 0% ~ (all equal) Any/stringer/field-only/typed 0.00B 0.00B ~ (all equal) Any/stringer/field-only/any 0.00B 0.00B ~ (all equal) Any/stringer/log/typed 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/stringer/log/any 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/stringer/log-go/typed 112B ± 0% 112B ± 0% ~ (all equal) Any/stringer/log-go/any 128B ± 0% 128B ± 0% ~ (all equal) name old allocs/op new allocs/op delta Any/string/field-only/typed 0.00 0.00 ~ (all equal) Any/string/field-only/any 0.00 0.00 ~ (all equal) Any/string/log/typed 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/string/log/any 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/string/log-go/typed 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/string/log-go/any 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/stringer/field-only/typed 0.00 0.00 ~ (all equal) Any/stringer/field-only/any 0.00 0.00 ~ (all equal) Any/stringer/log/typed 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/stringer/log/any 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/stringer/log-go/typed 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/stringer/log-go/any 3.00 ± 0% 3.00 ± 0% ~ (all equal) ``` It causes a regression in "field-only" which calls the field constructor and discards the result without using it in a logger. I believe this is acceptable because that's not a real use case; we expect the result to be used with a logger.

Practically, the cases added are today covered by the tests for strongly typed functions, but are missing if someone was to do a refactoring like #1305 or #1307. More theoretically, since `zap.Any` is an external interface, IMO it makes sense to test it separately too.

@prashantv

This is a prefactor for #1301, #1302, #1304, #1305, #1307, #1308 and #1310. We're writing various approaches to reduce the stock size and it's painful to keep copy-pasting the tests between PRs. This was suggested in @prashantv in #1307. The tests are mostly based on tests in #1303, but made "more generic", as #1307 we might want to test across more than just a single type. It does make the tests a bit harder to setup. Some of the setup is inconvenient (we duplicate the value in both `typed` and `any` version of the tests) but hopefully okay to understand. A fully non-duplicated alternative would likely require something like #1310 itself. For #1307 in particular a test against interface type would likely be needed, so adding it here too. The tests compare two code paths, with the same arguments, one using a strongly typed method and second using `zap.Any`. We have: - a simple "create field" case for a baseline - a "create and log" case for a realistic case (we typically log the fields) - a "create and log in a goroutine" case for the pathological case we're trying to solve for. - a "create and long in goroutine in a pre-warmed system" that does the above, but before tries to affect the starting goroutine stack size to provide an realistic example. Without this, for any tests with 2+ goroutines, the cost of `zap.Any` is not visible, as we always end up expanding the stack even in the strongly typed methods. The test results are: ``` goos: darwin goarch: arm64 pkg: go.uber.org/zap BenchmarkAny/string-typ-no-logger 166879518 6.988 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-typ-no-logger-12 167398297 6.973 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-any-no-logger 87669631 13.97 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-any-no-logger-12 86760837 14.11 ns/op 0 B/op 0 allocs/op BenchmarkAny/string-typ-logger 3059485 395.5 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-typ-logger-12 3141176 379.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-any-logger 2995699 401.3 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-any-logger-12 3071046 391.1 ns/op 64 B/op 1 allocs/op BenchmarkAny/string-typ-logger-go 784323 1351 ns/op 146 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-12 2000835 613.9 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-any-logger-go 477486 2479 ns/op 117 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-12 1830955 680.0 ns/op 112 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-stack 841566 1328 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-typ-logger-go-stack-12 2625226 479.6 ns/op 96 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-stack 486084 2493 ns/op 112 B/op 2 allocs/op BenchmarkAny/string-any-logger-go-stack-12 2658640 667.9 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-typ-no-logger 147314238 8.034 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-typ-no-logger-12 157857937 7.436 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-any-no-logger 58872349 20.19 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-any-no-logger-12 60532305 20.27 ns/op 0 B/op 0 allocs/op BenchmarkAny/stringer-typ-logger 3094204 411.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-typ-logger-12 3163489 383.7 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-any-logger 2981940 427.2 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-any-logger-12 2777792 394.0 ns/op 64 B/op 1 allocs/op BenchmarkAny/stringer-typ-logger-go 911761 1335 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-12 2006440 605.2 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go 467934 2518 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-12 1786076 683.1 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-stack 855794 1316 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-typ-logger-go-stack-12 2598783 434.5 ns/op 96 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-stack 473282 2474 ns/op 112 B/op 2 allocs/op BenchmarkAny/stringer-any-logger-go-stack-12 2020183 651.9 ns/op 112 B/op 2 allocs/op PASS ok go.uber.org/zap 53.516s ``` reviews rename review farewell, my friend dummy

Yet another attempt at reducing the stack size of zap.Any, borrowing from #1301, #1303, #1304, #1305, #1307, and #1308. This approach defines a generic data type for field constructors of a specific type. This is similar to the lookup map in #1307, minus the map lookup, the interface match, or reflection. type anyFieldC[T any] func(string, T) Field The generic data type provides a non-generic method matching the interface: interface{ Any(string, any) Field } **Stack size**: The stack size of zap.Any following this change is 0xc0 (192 bytes). % go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0 This is just 8 bytes more than #1305, which is the smallest stack size of all other attempts. **Allocations**: Everything appears to get inlined with no heap escapes: % go build -gcflags -m 2>&1 | grep field.go | perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' | grep 'escapes' [no output] (Line 413 declares anyFieldC) Besides that, the output of `-m` for the relevant section of code consists of almost entirely: ./field.go:415:6: can inline anyFieldC[go.shape.bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any ./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any [...] ./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any ./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any Followed by: ./field.go:428:10: leaking param: key ./field.go:428:22: leaking param: value **Maintainability**: Unlike some of the other approaches, this variant is more maintainable. The `zap.Any` function looks roughly the same. Adding new branches there is obvious, and requires no duplication. **Performance**: This is a net improvement against master on BenchmarkAny's log-go checks that log inside a new goroutine. ``` name old time/op new time/op delta Any/string/field-only/typed 25.2ns ± 1% 25.6ns ± 2% ~ (p=0.460 n=5+5) Any/string/field-only/any 56.9ns ± 3% 79.4ns ± 0% +39.55% (p=0.008 n=5+5) Any/string/log/typed 1.47µs ± 0% 1.49µs ± 4% +1.58% (p=0.016 n=4+5) Any/string/log/any 1.53µs ± 2% 1.55µs ± 1% +1.37% (p=0.016 n=5+5) Any/string/log-go/typed 5.97µs ± 6% 5.99µs ± 1% ~ (p=0.151 n=5+5) Any/string/log-go/any 10.9µs ± 0% 6.2µs ± 0% -43.32% (p=0.008 n=5+5) Any/stringer/field-only/typed 25.3ns ± 1% 25.5ns ± 1% +1.09% (p=0.008 n=5+5) Any/stringer/field-only/any 85.5ns ± 1% 124.5ns ± 0% +45.66% (p=0.008 n=5+5) Any/stringer/log/typed 1.43µs ± 1% 1.42µs ± 2% ~ (p=0.175 n=4+5) Any/stringer/log/any 1.50µs ± 1% 1.56µs ± 6% +4.20% (p=0.008 n=5+5) Any/stringer/log-go/typed 5.94µs ± 0% 5.92µs ± 0% -0.40% (p=0.032 n=5+5) Any/stringer/log-go/any 11.1µs ± 2% 6.3µs ± 0% -42.93% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Any/string/field-only/typed 0.00B 0.00B ~ (all equal) Any/string/field-only/any 0.00B 0.00B ~ (all equal) Any/string/log/typed 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/string/log/any 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/string/log-go/typed 112B ± 0% 112B ± 0% ~ (all equal) Any/string/log-go/any 128B ± 0% 128B ± 0% ~ (all equal) Any/stringer/field-only/typed 0.00B 0.00B ~ (all equal) Any/stringer/field-only/any 0.00B 0.00B ~ (all equal) Any/stringer/log/typed 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/stringer/log/any 64.0B ± 0% 64.0B ± 0% ~ (all equal) Any/stringer/log-go/typed 112B ± 0% 112B ± 0% ~ (all equal) Any/stringer/log-go/any 128B ± 0% 128B ± 0% ~ (all equal) name old allocs/op new allocs/op delta Any/string/field-only/typed 0.00 0.00 ~ (all equal) Any/string/field-only/any 0.00 0.00 ~ (all equal) Any/string/log/typed 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/string/log/any 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/string/log-go/typed 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/string/log-go/any 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/stringer/field-only/typed 0.00 0.00 ~ (all equal) Any/stringer/field-only/any 0.00 0.00 ~ (all equal) Any/stringer/log/typed 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/stringer/log/any 1.00 ± 0% 1.00 ± 0% ~ (all equal) Any/stringer/log-go/typed 3.00 ± 0% 3.00 ± 0% ~ (all equal) Any/stringer/log-go/any 3.00 ± 0% 3.00 ± 0% ~ (all equal) ``` It causes a regression in "field-only" which calls the field constructor and discards the result without using it in a logger. I believe this is acceptable because that's not a real use case; we expect the result to be used with a logger.

rabbbit force-pushed the pawel/any-int4 branch from bb5c1f0 to eb45591 Compare July 28, 2023 03:36

rabbbit added a commit that referenced this pull request Jul 28, 2023

Table test TestFieldConstructors

1310691

Tiny test refactoring to make debugging easier. Spotted in #1305.

rabbbit mentioned this pull request Jul 28, 2023

Table test TestFieldConstructors #1306

Merged

rabbbit changed the title ~~Optimize zap.Any to use less mamory on stack (unrolled version)~~ Optimize zap.Any to use less mamory on stack (inline version) Jul 28, 2023

rabbbit added a commit that referenced this pull request Jul 28, 2023

Table test TestFieldConstructors (#1306)

8d1bd8c

Tiny test refactoring to make debugging easier. Spotted in #1305.

rabbbit commented Jul 28, 2023

View reviewed changes

rabbbit force-pushed the pawel/any-int4 branch from a8e46a2 to af1eecf Compare July 28, 2023 16:02

rabbbit mentioned this pull request Jul 29, 2023

Optimize zap.Any to allocate less memory on stack #1303

Closed

rabbbit mentioned this pull request Jul 30, 2023

Optimize zap.Any to use less memory on stack (intermediate props verion) #1308

Closed

rabbbit changed the title ~~Optimize zap.Any to use less mamory on stack (inline version)~~ Optimize zap.Any to use less memory on stack (inline version) Jul 30, 2023

rabbbit mentioned this pull request Jul 30, 2023

Add direct coverage to zap.Any #1309

Merged

abhinav mentioned this pull request Jul 30, 2023

zap.Any: Reduce stack size with generics #1310

Merged

abhinav pushed a commit to abhinav/zap that referenced this pull request Jul 30, 2023

Cherry-pick benchmark from uber-go#1305

983a8fb

Pulls the benchmark added by @rabbbit in uber-go#1305.

rabbbit mentioned this pull request Jul 30, 2023

zap.Any add benchmarks #1311

Merged

rabbbit force-pushed the pawel/any-int4 branch from af1eecf to a700e10 Compare August 1, 2023 03:03

Remove duplication

ac84e32

rabbbit closed this Aug 1, 2023

rabbbit deleted the pawel/any-int4 branch August 1, 2023 03:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize zap.Any to use less memory on stack (inline version) #1305

Optimize zap.Any to use less memory on stack (inline version) #1305

rabbbit commented Jul 28, 2023 •

edited

codecov bot commented Jul 28, 2023 •

edited

rabbbit Jul 28, 2023

cdvr1993 commented Jul 28, 2023

abhinav commented Jul 28, 2023

rabbbit commented Jul 29, 2023

rabbbit commented Aug 1, 2023

Optimize zap.Any to use less memory on stack (inline version) #1305

Optimize zap.Any to use less memory on stack (inline version) #1305

Conversation

rabbbit commented Jul 28, 2023 • edited

codecov bot commented Jul 28, 2023 • edited

Codecov Report

rabbbit Jul 28, 2023

Choose a reason for hiding this comment

cdvr1993 commented Jul 28, 2023

abhinav commented Jul 28, 2023

rabbbit commented Jul 29, 2023

rabbbit commented Aug 1, 2023

rabbbit commented Jul 28, 2023 •

edited

codecov bot commented Jul 28, 2023 •

edited