
Propose: use a fast int-int map to cache type information #194

Open
jxskiss opened this issue Apr 28, 2021 · 7 comments
Labels: enhancement (New feature or request), performance

Comments

jxskiss (Contributor) commented Apr 28, 2021

Hi, this json package is awesome; the performance is very impressive.
I saw that the idea of "Dispatch by typeptr from map to slice" is limited by the type slice size, and I have an idea that may help with this.

Some time ago, I discovered a very fast int-key to int-value map implementation here: https://github.com/brentp/intintmap/.
My recent benchmark shows that a further optimized copy-on-write version of the int-int map can be nearly as fast as slice indexing (without unsafe slice bounds-check elimination). I think it may be a better choice than the current implementation, and it doesn't waste memory.

cpu: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
BenchmarkConcurrentStdMapGet_NoLock-12                  72221664                15.79 ns/op
BenchmarkConcurrentStdMapGet_RWMutex-12                  3255450               369.01 ns/op
BenchmarkConcurrentSyncMapGet-12                        27300724                44.59 ns/op
BenchmarkConcurrentCOWMapGet-12                        344179628                 3.487 ns/op
BenchmarkConcurrentSliceIndex-12                       908571164                 1.213 ns/op

The benchmark code is here:
https://github.com/jxskiss/gopkg/blob/master/intintmap/cow_test.go
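
For context, the pattern is the usual b.RunParallel concurrent-read benchmark. Below is a simplified, self-contained version comparing sync.Map against a plain slice index; the full suite, including the COW map variant, is at the link above, and names like nKeys and the package name are just placeholders.

```go
package cache_test

import (
	"sync"
	"testing"
)

const nKeys = 1024

func BenchmarkConcurrentSyncMapGet(b *testing.B) {
	var m sync.Map
	for i := 0; i < nKeys; i++ {
		m.Store(i, i)
	}
	b.RunParallel(func(pb *testing.PB) {
		k := 0
		for pb.Next() {
			if _, ok := m.Load(k); !ok {
				panic("missing key")
			}
			k = (k + 1) % nKeys
		}
	})
}

func BenchmarkConcurrentSliceIndex(b *testing.B) {
	s := make([]int, nKeys)
	for i := range s {
		s[i] = i
	}
	b.RunParallel(func(pb *testing.PB) {
		k, sink := 0, 0
		for pb.Next() {
			sink += s[k] // read-only shared slice: a single bounds-checked index
			k = (k + 1) % nKeys
		}
		_ = sink
	})
}
```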

Do you think it's a good idea to use the int-int map for the codec cache?
I can send a PR if it's welcome.

(P.S. By "the int-int map" I don't mean introducing an external dependency; we can implement just the functionality we need for the type information cache.)
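
To make the idea concrete, here is a minimal sketch (not the actual intintmap code; all names are made up) of a copy-on-write uintptr-to-uintptr map: readers do a lock-free atomic load of an immutable open-addressing table, while writers copy the table under a mutex and swap the copy in.

```go
package cowmap

import (
	"sync"
	"sync/atomic"
)

// slot is one open-addressing cell; key 0 marks an empty cell, so callers
// must not store key 0.
type slot struct{ key, val uintptr }

type table struct {
	mask  uintptr // len(slots)-1, where len(slots) is a power of two
	slots []slot
}

// Map is a copy-on-write uintptr->uintptr map: Get does a lock-free atomic
// load of an immutable table; Set copies the table under a mutex and
// publishes the copy. Resizing is omitted, so New must be sized generously.
type Map struct {
	mu  sync.Mutex   // serializes writers only
	tab atomic.Value // holds *table; readers never take a lock
}

func New(capHint int) *Map {
	n := 8
	for n < capHint*2 {
		n <<= 1
	}
	m := &Map{}
	m.tab.Store(&table{mask: uintptr(n - 1), slots: make([]slot, n)})
	return m
}

// Get looks up key by linear probing on the current immutable table.
func (m *Map) Get(key uintptr) (uintptr, bool) {
	t := m.tab.Load().(*table)
	for i := key & t.mask; ; i = (i + 1) & t.mask {
		s := t.slots[i]
		if s.key == key {
			return s.val, true
		}
		if s.key == 0 {
			return 0, false
		}
	}
}

// Set copies the current table, inserts the pair, and swaps the copy in.
func (m *Map) Set(key, val uintptr) {
	m.mu.Lock()
	defer m.mu.Unlock()
	old := m.tab.Load().(*table)
	nt := &table{mask: old.mask, slots: append([]slot(nil), old.slots...)}
	for i := key & nt.mask; ; i = (i + 1) & nt.mask {
		if nt.slots[i].key == 0 || nt.slots[i].key == key {
			nt.slots[i] = slot{key: key, val: val}
			break
		}
	}
	m.tab.Store(nt)
}
```

Resizing and deletion are left out; the point is only that the read path is a couple of array probes with no locking, which is why it benchmarks close to a bare slice index.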

goccy (Owner) commented Apr 29, 2021

Thank you for the report!
Interesting. However, in terms of speed, the slice approach is faster, so we should consider switching the implementation of the fallback path instead. I would like to see the benchmark results when using intintmap.
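
To illustrate what switching only the fallback could look like, here is a rough sketch (all identifiers are hypothetical, not go-json's actual ones): keep the slice as the fast path and consult the slower cache only when the type address falls outside the range the slice covers.

```go
package cache

import "sync"

// codec stands in for a compiled encoder/decoder for a single type.
type codec struct{ name string }

// All of these names and values are hypothetical, not go-json's real identifiers.
var (
	baseTypeAddr  uintptr      // lowest type address covered by the slice
	typeAddrShift uintptr = 5  // assumed alignment shift of type descriptors
	sliceCache    []*codec     // fast path, indexed by (typeptr-base)>>shift
	fallbackCache sync.Map     // slow path for out-of-range type addresses
)

// codecForTypeptr keeps the slice as the fast path; only type addresses
// outside the covered range hit the fallback cache, which is where an
// int-int style map could replace sync.Map.
func codecForTypeptr(typeptr uintptr) *codec {
	idx := (typeptr - baseTypeAddr) >> typeAddrShift
	if idx < uintptr(len(sliceCache)) {
		return sliceCache[idx] // one bounds-checked slice index
	}
	if v, ok := fallbackCache.Load(typeptr); ok {
		return v.(*codec)
	}
	return nil // not cached yet: the caller compiles and stores the codec
}
```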

jxskiss (Contributor, Author) commented Apr 29, 2021

I will draft a change and check the benchmark results ~

goccy added the enhancement (New feature or request) label on Apr 29, 2021
(A comment from jxskiss has been minimized.)

jxskiss (Contributor, Author) commented Apr 30, 2021

Well, the previous comment is incorrect; I'll post a new benchmark result soon.

jxskiss (Contributor, Author) commented May 2, 2021

Here is the updated benchmark result:

name                                              old time/op    new time/op    delta
_EncodeBigData_GoJson-12                             621µs ± 1%     627µs ± 1%   +0.96%  (p=0.021 n=8+8)
_MarshalBigData_GoJson-12                            689µs ± 3%     693µs ± 3%     ~     (p=0.400 n=9+10)
_MarshalBytes_GoJson/32-12                           118ns ± 2%     124ns ± 2%   +4.75%  (p=0.000 n=9+10)
_MarshalBytes_GoJson/256-12                          371ns ± 1%     392ns ± 1%   +5.72%  (p=0.000 n=8+8)
_MarshalBytes_GoJson/4096-12                        4.67µs ± 2%    4.86µs ± 0%   +4.13%  (p=0.000 n=9+8)
_EncodeRawMessage_GoJson-12                         26.7ns ± 4%    28.4ns ± 4%   +6.65%  (p=0.000 n=9+10)
_MarshalString_GoJson-12                            34.3ns ± 5%    36.0ns ± 4%   +5.02%  (p=0.001 n=9+9)
_Compact_GoJson-12                                  6.78ms ± 3%    6.74ms ± 1%     ~     (p=0.888 n=9+8)
_Indent_GoJson-12                                   15.8ms ± 3%    16.0ms ± 3%     ~     (p=0.063 n=9+9)
_Decode_SmallStruct_Unmarshal_GoJson-12              384ns ± 0%     412ns ± 2%   +7.10%  (p=0.000 n=7+9)
_Decode_SmallStruct_Unmarshal_GoJsonNoEscape-12      324ns ± 4%     346ns ± 1%   +6.88%  (p=0.000 n=9+8)
_Decode_SmallStruct_Stream_GoJson-12                 718ns ± 1%     779ns ± 5%   +8.39%  (p=0.000 n=8+9)
_Decode_MediumStruct_Unmarshal_GoJson-12            2.67µs ± 2%    2.79µs ± 1%   +4.60%  (p=0.000 n=8+8)
_Decode_MediumStruct_Unmarshal_GoJsonNoEscape-12    2.63µs ± 3%    2.75µs ± 3%   +4.54%  (p=0.000 n=9+10)
_Decode_MediumStruct_Stream_GoJson-12               3.37µs ± 1%    3.60µs ± 0%   +6.76%  (p=0.000 n=8+7)
_Decode_LargeStruct_Unmarshal_GoJson-12             34.6µs ± 3%    36.5µs ± 3%   +5.49%  (p=0.000 n=9+10)
_Decode_LargeStruct_Unmarshal_GoJsonNoEscape-12     34.3µs ± 1%    36.1µs ± 1%   +5.17%  (p=0.000 n=8+10)
_Decode_LargeStruct_Stream_GoJson-12                48.9µs ± 5%    50.7µs ± 1%   +3.57%  (p=0.004 n=9+9)
_Encode_SmallStruct_GoJson-12                       366ns ±104%     284ns ± 1%     ~     (p=0.051 n=9+10)
_Encode_SmallStruct_GoJsonNoEscape-12                217ns ±17%     216ns ± 1%   -0.21%  (p=0.015 n=8+9)
_Encode_SmallStructCached_GoJson-12                  214ns ± 9%     219ns ± 3%   +2.51%  (p=0.008 n=8+9)
_Encode_SmallStructCached_GoJsonNoEscape-12          213ns ±17%     214ns ± 0%   +0.11%  (p=0.013 n=8+7)
_Encode_MediumStruct_GoJson-12                       976ns ±40%     929ns ± 0%     ~     (p=0.105 n=8+8)
_Encode_MediumStruct_GoJsonNoEscape-12               517ns ±11%     530ns ± 1%   +2.49%  (p=0.043 n=8+10)
_Encode_MediumStructCached_GoJson-12                 506ns ± 4%     519ns ± 1%   +2.50%  (p=0.015 n=8+9)
_Encode_MediumStructCached_GoJsonNoEscape-12         501ns ± 5%     517ns ± 1%   +3.25%  (p=0.006 n=8+10)
_Encode_LargeStruct_GoJson-12                       18.5µs ± 3%    19.2µs ± 2%   +4.07%  (p=0.000 n=8+10)
_Encode_LargeStruct_GoJsonNoEscape-12               18.5µs ± 3%    19.3µs ± 2%   +4.02%  (p=0.000 n=9+9)
_Encode_LargeStructCached_GoJson-12                 7.09µs ± 1%    7.27µs ± 2%   +2.64%  (p=0.000 n=8+10)
_Encode_LargeStructCached_GoJsonNoEscape-12         7.08µs ± 3%    7.28µs ± 2%   +2.82%  (p=0.000 n=9+9)
_Encode_MapInterface_GoJson-12                       754ns ± 3%     777ns ± 2%   +3.05%  (p=0.001 n=9+10)
_Encode_Interface_GoJson-12                          118ns ± 4%     126ns ± 1%   +6.82%  (p=0.000 n=8+7)
_Encode_Bool_GoJson-12                              45.3ns ± 4%    46.8ns ± 5%   +3.18%  (p=0.012 n=8+10)
_Marshal_Bool_GoJson-12                             50.1ns ± 4%    51.8ns ± 2%   +3.43%  (p=0.002 n=8+9)
_Encode_Int_GoJson-12                               49.5ns ± 4%    51.7ns ± 5%   +4.42%  (p=0.002 n=8+10)
_Encode_MarshalJSON_GoJson-12                        123ns ± 1%     128ns ± 3%   +3.86%  (p=0.000 n=8+10)

name                                              old speed      new speed      delta
_EncodeBigData_GoJson-12                          3.13GB/s ± 1%  3.10GB/s ± 1%   -0.95%  (p=0.021 n=8+8)
_MarshalBigData_GoJson-12                         2.82GB/s ± 3%  2.80GB/s ± 3%     ~     (p=0.400 n=9+10)

jxskiss closed this as completed on May 2, 2021
jxskiss (Contributor, Author) commented May 2, 2021

Oops... I closed the issue by mistake; how can I reopen it?

goccy reopened this on May 2, 2021
jxskiss added a commit to jxskiss/go-json that referenced this issue May 2, 2021
Change-Id: Icb767d22b6c1b2cad991c53f3f450f5eb3f773b1
jxskiss added a commit to jxskiss/go-json that referenced this issue May 4, 2021
Change-Id: Icb767d22b6c1b2cad991c53f3f450f5eb3f773b1
jxskiss (Contributor, Author) commented May 4, 2021

I have sent another pull request, #213, which enables the slice cache for a 32x larger type address range and may make this more complex optimization unnecessary for most applications 😃

I guess we can leave this PR pending until we find that it is really worth the complexity.
