Migrate github.com/json-iterator/go to sigs.k8s.io/json #257

inteon · 2024-04-11T16:58:01Z

fixes #202

Replaces "github.com/json-iterator/go" with "sigs.k8s.io/json" and "encoding/json".

TODO: compare benchmarks before and after <- help would be appreciated here.

k8s-ci-robot · 2024-04-11T16:58:07Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: inteon
Once this PR has been reviewed and has the lgtm label, please assign apelisse for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

apelisse · 2024-04-11T17:14:26Z

Yeah, that's ambitious. There's a reason we had written the code the way we did, but let's see what the benchmarks say.

You can do ... IIRC

go test -bench= ./fieldpath/

to run the benchmark. Do that against master and the tip of your branch, save the output in files, run benchstat master.txt pr.txt, see what comes out (you might need to do add a -count=10 to your go test.

jpbetz · 2024-04-11T17:16:25Z

Benchmarks will be essential here. This code was hyper-optimized because it was on a hot path.

apelisse · 2024-04-11T17:23:45Z

This is what I get:

> benchstat master.txt pr.txt 
goos: darwin
goarch: arm64
pkg: sigs.k8s.io/structured-merge-diff/v4/fieldpath
                             │  master.txt  │                pr.txt                 │
                             │    sec/op    │    sec/op      vs base                │
FieldSet/serialize-20-14       3.369µ ± ∞ ¹   14.304µ ± ∞ ¹  +324.58% (p=0.008 n=5)
FieldSet/deserialize-20-14     10.18µ ± ∞ ¹    38.24µ ± ∞ ¹  +275.51% (p=0.008 n=5)
FieldSet/serialize-50-14       8.939µ ± ∞ ¹   38.017µ ± ∞ ¹  +325.29% (p=0.008 n=5)
FieldSet/deserialize-50-14     26.22µ ± ∞ ¹   105.66µ ± ∞ ¹  +303.03% (p=0.008 n=5)
FieldSet/serialize-100-14      29.20µ ± ∞ ¹   108.68µ ± ∞ ¹  +272.22% (p=0.008 n=5)
FieldSet/deserialize-100-14    67.92µ ± ∞ ¹   327.90µ ± ∞ ¹  +382.79% (p=0.008 n=5)
FieldSet/serialize-500-14      143.1µ ± ∞ ¹    514.4µ ± ∞ ¹  +259.32% (p=0.008 n=5)
FieldSet/deserialize-500-14    310.6µ ± ∞ ¹   1650.6µ ± ∞ ¹  +431.36% (p=0.008 n=5)
FieldSet/serialize-1000-14     316.4µ ± ∞ ¹   1081.6µ ± ∞ ¹  +241.84% (p=0.008 n=5)
FieldSet/deserialize-1000-14   664.7µ ± ∞ ¹   3654.3µ ± ∞ ¹  +449.78% (p=0.008 n=5)
geomean                        52.16µ          219.8µ        +321.31%
¹ need >= 6 samples for confidence interval at level 0.95

                             │  master.txt   │                 pr.txt                 │
                             │     B/op      │      B/op       vs base                │
FieldSet/serialize-20-14       1.563Ki ± ∞ ¹    8.446Ki ± ∞ ¹  +440.22% (p=0.008 n=5)
FieldSet/deserialize-20-14     9.722Ki ± ∞ ¹   29.034Ki ± ∞ ¹  +198.65% (p=0.008 n=5)
FieldSet/serialize-50-14       4.530Ki ± ∞ ¹   23.527Ki ± ∞ ¹  +419.34% (p=0.008 n=5)
FieldSet/deserialize-50-14     19.94Ki ± ∞ ¹    81.76Ki ± ∞ ¹  +310.11% (p=0.008 n=5)
FieldSet/serialize-100-14      15.67Ki ± ∞ ¹    82.53Ki ± ∞ ¹  +426.74% (p=0.008 n=5)
FieldSet/deserialize-100-14    58.83Ki ± ∞ ¹   300.17Ki ± ∞ ¹  +410.27% (p=0.008 n=5)
FieldSet/serialize-500-14      72.61Ki ± ∞ ¹   396.84Ki ± ∞ ¹  +446.52% (p=0.008 n=5)
FieldSet/deserialize-500-14    270.7Ki ± ∞ ¹   1548.6Ki ± ∞ ¹  +472.07% (p=0.008 n=5)
FieldSet/serialize-1000-14     157.9Ki ± ∞ ¹    845.1Ki ± ∞ ¹  +435.13% (p=0.008 n=5)
FieldSet/deserialize-1000-14   593.6Ki ± ∞ ¹   3325.8Ki ± ∞ ¹  +460.25% (p=0.008 n=5)
geomean                        34.42Ki          170.0Ki        +394.01%
¹ need >= 6 samples for confidence interval at level 0.95

                             │  master.txt  │                 pr.txt                 │
                             │  allocs/op   │   allocs/op    vs base                 │
FieldSet/serialize-20-14        8.000 ± ∞ ¹   334.000 ± ∞ ¹  +4075.00% (p=0.008 n=5)
FieldSet/deserialize-20-14      232.0 ± ∞ ¹     582.0 ± ∞ ¹   +150.86% (p=0.008 n=5)
FieldSet/serialize-50-14        14.00 ± ∞ ¹    926.00 ± ∞ ¹  +6514.29% (p=0.008 n=5)
FieldSet/deserialize-50-14      659.0 ± ∞ ¹    1649.0 ± ∞ ¹   +150.23% (p=0.008 n=5)
FieldSet/serialize-100-14       40.00 ± ∞ ¹   3125.00 ± ∞ ¹  +7712.50% (p=0.008 n=5)
FieldSet/deserialize-100-14    2.298k ± ∞ ¹    5.893k ± ∞ ¹   +156.44% (p=0.008 n=5)
FieldSet/serialize-500-14       185.0 ± ∞ ¹   15607.0 ± ∞ ¹  +8336.22% (p=0.008 n=5)
FieldSet/deserialize-500-14    11.47k ± ∞ ¹    29.62k ± ∞ ¹   +158.09% (p=0.008 n=5)
FieldSet/serialize-1000-14      393.0 ± ∞ ¹   33653.0 ± ∞ ¹  +8463.10% (p=0.008 n=5)
FieldSet/deserialize-1000-14   25.00k ± ∞ ¹    64.48k ± ∞ ¹   +157.87% (p=0.008 n=5)
geomean                         356.2          4.720k        +1225.15%
¹ need >= 6 samples for confidence interval at level 0.95

jpbetz · 2024-04-11T19:42:11Z

/hold
Thanks @apelisse! Holding based on those numbers to make sure we don't cause a performance regression.

inteon · 2024-04-12T14:27:47Z

I applied a few hacks and optimisations in 59b4860 and 4f4fbff. These changes improved the performance quite a bit compared to the initial version. However, the performance is still a bit worse than the current version on master (bench.old.txt is master and bench.v1.txt is the latest version of this PR):

$ benchstat bench.old.txt bench.v1.txt 
goos: linux
goarch: amd64
pkg: sigs.k8s.io/structured-merge-diff/v4/fieldpath
cpu: 12th Gen Intel(R) Core(TM) i7-12700H
                            │ bench.old.txt │             bench.v1.txt              │
                            │    sec/op     │    sec/op      vs base                │
FieldSet/serialize-20-6        3.526µ ± ∞ ¹    7.008µ ± ∞ ¹   +98.75% (p=0.029 n=4)
FieldSet/deserialize-20-6      9.038µ ± ∞ ¹   16.972µ ± ∞ ¹   +87.78% (p=0.029 n=4)
FieldSet/serialize-50-6        11.14µ ± ∞ ¹    19.83µ ± ∞ ¹   +78.06% (p=0.029 n=4)
FieldSet/deserialize-50-6      25.43µ ± ∞ ¹    52.32µ ± ∞ ¹  +105.80% (p=0.029 n=4)
FieldSet/serialize-100-6       39.24µ ± ∞ ¹    68.09µ ± ∞ ¹   +73.52% (p=0.029 n=4)
FieldSet/deserialize-100-6     86.16µ ± ∞ ¹   181.32µ ± ∞ ¹  +110.44% (p=0.029 n=4)
FieldSet/serialize-500-6       202.0µ ± ∞ ¹    358.5µ ± ∞ ¹   +77.47% (p=0.029 n=4)
FieldSet/deserialize-500-6     412.8µ ± ∞ ¹    868.4µ ± ∞ ¹  +110.33% (p=0.029 n=4)
FieldSet/serialize-1000-6      473.1µ ± ∞ ¹    860.7µ ± ∞ ¹   +81.93% (p=0.029 n=4)
FieldSet/deserialize-1000-6    822.5µ ± ∞ ¹   1984.8µ ± ∞ ¹  +141.31% (p=0.029 n=4)
geomean                        63.03µ          123.3µ         +95.57%
¹ need >= 6 samples for confidence interval at level 0.95

                            │ bench.old.txt │              bench.v1.txt              │
                            │     B/op      │      B/op       vs base                │
FieldSet/serialize-20-6       1.536Ki ± ∞ ¹    1.621Ki ± ∞ ¹    +5.53% (p=0.029 n=4)
FieldSet/deserialize-20-6     9.745Ki ± ∞ ¹   15.893Ki ± ∞ ¹   +63.09% (p=0.029 n=4)
FieldSet/serialize-50-6       4.463Ki ± ∞ ¹    4.282Ki ± ∞ ¹    -4.05% (p=0.029 n=4)
FieldSet/deserialize-50-6     20.02Ki ± ∞ ¹    46.73Ki ± ∞ ¹  +133.43% (p=0.029 n=4)
FieldSet/serialize-100-6      15.71Ki ± ∞ ¹    16.22Ki ± ∞ ¹    +3.28% (p=0.029 n=4)
FieldSet/deserialize-100-6    58.98Ki ± ∞ ¹   164.79Ki ± ∞ ¹  +179.39% (p=0.029 n=4)
FieldSet/serialize-500-6      72.84Ki ± ∞ ¹    64.28Ki ± ∞ ¹   -11.75% (p=0.029 n=4)
FieldSet/deserialize-500-6    270.1Ki ± ∞ ¹    842.0Ki ± ∞ ¹  +211.73% (p=0.029 n=4)
FieldSet/serialize-1000-6     154.4Ki ± ∞ ¹    128.3Ki ± ∞ ¹   -16.91% (p=0.029 n=4)
FieldSet/deserialize-1000-6   593.1Ki ± ∞ ¹   1805.6Ki ± ∞ ¹  +204.42% (p=0.029 n=4)
geomean                       34.27Ki          52.94Ki         +54.48%
¹ need >= 6 samples for confidence interval at level 0.95

                            │ bench.old.txt │            bench.v1.txt             │
                            │   allocs/op   │  allocs/op    vs base               │
FieldSet/serialize-20-6         8.000 ± ∞ ¹    7.000 ± ∞ ¹  -12.50% (p=0.029 n=4)
FieldSet/deserialize-20-6       231.0 ± ∞ ¹    253.0 ± ∞ ¹   +9.52% (p=0.029 n=4)
FieldSet/serialize-50-6        14.000 ± ∞ ¹    9.000 ± ∞ ¹  -35.71% (p=0.029 n=4)
FieldSet/deserialize-50-6       661.0 ± ∞ ¹    743.5 ± ∞ ¹  +12.48% (p=0.029 n=4)
FieldSet/serialize-100-6        40.00 ± ∞ ¹    10.00 ± ∞ ¹  -75.00% (p=0.029 n=4)
FieldSet/deserialize-100-6     2.296k ± ∞ ¹   2.611k ± ∞ ¹  +13.74% (p=0.029 n=4)
FieldSet/serialize-500-6       184.00 ± ∞ ¹    13.00 ± ∞ ¹  -92.93% (p=0.029 n=4)
FieldSet/deserialize-500-6     11.48k ± ∞ ¹   13.07k ± ∞ ¹  +13.85% (p=0.029 n=4)
FieldSet/serialize-1000-6      392.50 ± ∞ ¹    14.00 ± ∞ ¹  -96.43% (p=0.029 n=4)
FieldSet/deserialize-1000-6    24.99k ± ∞ ¹   28.52k ± ∞ ¹  +14.12% (p=0.029 n=4)
geomean                         355.8          170.7        -52.03%
¹ need >= 6 samples for confidence interval at level 0.95

jpbetz · 2024-04-12T14:34:15Z

For context, there were quite a few months of engineering effort to optimize the performance of SSA, and in particular, managed fields. I'm supportive of making this code better, but I feel responsible to ensure that we don't regress significantly here.

inteon · 2024-04-12T14:41:23Z

For context, there were quite a few months of engineering effort to optimize the performance of SSA, and in particular, managed fields. I'm supportive of making this code better, but I feel responsible to ensure that we don't regress significantly here.

This PR aims to be the very best attempt at making #202 work. If we do achieve the required performance, this PR will be the proof that it is not possible to switch to the "encoding/json" package unless the performance of that package is improved.

k8s-ci-robot · 2024-04-12T14:56:47Z

@inteon: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-structured-merge-diff-vet	`26b5747`	link	true	`/test pull-structured-merge-diff-vet`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

dominiquelefevre · 2024-05-23T19:34:53Z

For context, there were quite a few months of engineering effort

For more context, modern-go/reflect2 and json-iterator/go are both unmaintained and broken. #202 documents one problem with reflect2. There are more. For example, it uses some braindead constructs that make it impossible for golang to apply dead code elimination to anything that depends on reflect2, even transitively. Since structured-merge-diff and k8s.io/apimachinery are dependencies of everything that uses k8s, reflect2 hurts everyone by inhibiting the DCE.

reflect2 is the cause of more trouble. For example, see golang/go@ef225d1. I very much regret it that golang authors chose to fix reflect2 when releasing go 1.18, and keep adding kludges.

Suck it up and be correct first.

apelisse · 2024-05-23T19:56:55Z

Suck it up and be correct first.

I don't get it, what are we supposed to suck up? And who is even supposed to?

Are we supposed to suck up the performance impact, or are we supposed to get it done somehow?

apelisse · 2024-05-23T20:06:24Z

@liggitt In 2024, what's the best json library in Go to build a json part by part and stream it directly rather than serialize everything in a big string?

apelisse · 2024-05-23T20:36:19Z

Gojay seems like a good alternative, it seems to have good performance, custom encoding/decoding, and a streaming API: https://github.com/francoispqt/gojay

liggitt · 2024-05-25T17:15:11Z

if we were going to pull in another json library, I'd aim for the same one we used in kube-openapi (https://github.com/kubernetes/kube-openapi/tree/master/pkg/internal/third_party/go-json-experiment) since that's where it looks like the stdlib is headed (golang/go#63397) and supports streaming / custom encoding/decoding, etc

liggitt · 2024-05-25T17:16:15Z

internal/builder/builder.go

+	"bytes"
+	gojson "encoding/json"
+	"runtime"
+	"unsafe"


really not a fan of adding unsafe usages

liggitt · 2024-05-25T17:16:36Z

internal/builder/builder.go

+//go:nosplit
+//go:nocheckptr


really not a fan of point-in-time copies of runtime code that could become unsafe in the future

migrate github.com/json-iterator/go to sigs.k8s.io/json

d23b92a

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 11, 2024

k8s-ci-robot requested review from apelisse and jpbetz April 11, 2024 16:58

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 11, 2024

inteon mentioned this pull request Apr 11, 2024

reconsider use of json-iterator / reflect2 #202

Open

k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 11, 2024

inteon force-pushed the use_json_std branch 2 times, most recently from b7e17b3 to b8f304c Compare April 11, 2024 23:27

inteon added 2 commits April 12, 2024 11:33

Reduce number of allocations during serializing

59b4860

Reduce number of allocations during deserialize

4f4fbff

inteon force-pushed the use_json_std branch from b8f304c to 4f4fbff Compare April 12, 2024 14:23

wip: restore API

26b5747

liggitt reviewed May 25, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Migrate github.com/json-iterator/go to sigs.k8s.io/json #257

Migrate github.com/json-iterator/go to sigs.k8s.io/json #257

inteon commented Apr 11, 2024

k8s-ci-robot commented Apr 11, 2024

apelisse commented Apr 11, 2024

jpbetz commented Apr 11, 2024

apelisse commented Apr 11, 2024

jpbetz commented Apr 11, 2024

inteon commented Apr 12, 2024 •

edited

jpbetz commented Apr 12, 2024

inteon commented Apr 12, 2024

k8s-ci-robot commented Apr 12, 2024

dominiquelefevre commented May 23, 2024

apelisse commented May 23, 2024

apelisse commented May 23, 2024

apelisse commented May 23, 2024

liggitt commented May 25, 2024

liggitt May 25, 2024

liggitt May 25, 2024

		//go:nosplit
		//go:nocheckptr

Migrate github.com/json-iterator/go to sigs.k8s.io/json #257

Are you sure you want to change the base?

Migrate github.com/json-iterator/go to sigs.k8s.io/json #257

Conversation

inteon commented Apr 11, 2024

k8s-ci-robot commented Apr 11, 2024

apelisse commented Apr 11, 2024

jpbetz commented Apr 11, 2024

apelisse commented Apr 11, 2024

jpbetz commented Apr 11, 2024

inteon commented Apr 12, 2024 • edited

jpbetz commented Apr 12, 2024

inteon commented Apr 12, 2024

k8s-ci-robot commented Apr 12, 2024

dominiquelefevre commented May 23, 2024

apelisse commented May 23, 2024

apelisse commented May 23, 2024

apelisse commented May 23, 2024

liggitt commented May 25, 2024

liggitt May 25, 2024

Choose a reason for hiding this comment

liggitt May 25, 2024

Choose a reason for hiding this comment

inteon commented Apr 12, 2024 •

edited