Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding ints to strings #137

Open
marten-seemann opened this issue Jan 19, 2020 · 1 comment
Open

encoding ints to strings #137

marten-seemann opened this issue Jan 19, 2020 · 1 comment

Comments

@marten-seemann
Copy link

First of all, thank you a lot for this awesome library. I just ran a few benchmarks on my marshalling code, and I'm getting a speedup of 6x (!) and a reduction of 25x (!) on allocated memory compared to the Go standard library. That's a crazy difference!

Trying to optimize things even further (just trying to push things to the limit ;)), I've encountered the following problem: In the JSON format that I'm supposed to output, integers need to be encoded as strings (because many JSON implementations only support numbers up to 2^53 or so).
With the Go standard library this is pretty straightforward. I'd define a value like this

type frame struct {
    Offset int64 `json:"offset,string"`
}

This would automatically output the int64 (in decimal representation) as a string.

As far as I've found out so far, the way to do this using gojay would be the following:

enc.StringKey("offset", strconv.FormatInt(int64(f.Offset)))

The problem is that strconv.FormatInt has to allocate the string it is returning, just to have gojay copy the string contents to the JSON output being written, and garbage collected afterwards.

I can see two ways how to optimize this:

  1. I could allocate a byte slice (or string) as a buffer, encode my int64 into that string, and then use StringKey with this buffer, reusing it every time I need to write a number as a string. This solution would not require any changes to gojay.
  2. gojay could offer an API to do exactly this, e.g. a function Encoder.Int64AsString. Considering that my use case is apparently pretty common and even supported by the standard library, maybe this is the cleaner solution?

Does this make sense? Or am I missing something here?

@marcsantiago
Copy link

So I took a look at the standard libraries implementation
In decode.go line 867 the function called literalStore handles the string to numeric conversion.

In around 879 you can that fromQuoted variable is whats used to determine if the value has the string tag to conversions.

Later on on line 1013 you can see that within the switch block that a type conversion is being done

n, err := strconv.ParseInt(s, 10, 64)
if err != nil || v.OverflowInt(n) {
	d.saveError(&UnmarshalTypeError{Value: "number " + s, Type: v.Type(), Offset: int64(d.readIndex())})
	break
}
v.SetInt(n)

The encoding.go package in the std lib does not does not use the string tag. It uses the field type directly not the tag itself unless the tag is omitempty and in that case it skips the field is empty.

In that sense

enc.StringKey("offset", strconv.FormatInt(int64(f.Offset)))

it is still better as it does not rely on reflection to cast that value :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants