Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support @JsonFormat for String, numbers, using String.format() #1114

Open
cowtowncoder opened this issue Feb 2, 2016 · 22 comments · May be fixed by #4273
Open

Support @JsonFormat for String, numbers, using String.format() #1114

cowtowncoder opened this issue Feb 2, 2016 · 22 comments · May be fixed by #4273

Comments

@cowtowncoder
Copy link
Member

Seems like it would be nice to just support formats available via format Strings that String.format() uses.
Note that existing support for date/time values use different formatting; this should be fine, and if we want to for some reason support something from java.text.Format it should be possible to use heuristics to determine intended type of format. However that may not be necessary.

Initially we should just support simple types:

  • java.lang.String
  • int, long, short, byte, float, double and matching wrapper types
  • BigDecimal, BigInteger
@cowtowncoder cowtowncoder changed the title (2.8) Support @JsonFormat for String, numbers, using String.format() Support @JsonFormat for String, numbers, using String.format() May 4, 2016
@cowtowncoder cowtowncoder added 2.9 and removed 2.8 labels Jul 22, 2016
@cowtowncoder cowtowncoder added 2.10 and removed 2.9 labels Jul 17, 2018
@mjustin
Copy link

mjustin commented Mar 27, 2019

I was about to open an issue suggesting adding support for DecimalFormat as the format for numbers, but I guess String.format gets the job done too. I assume I shouldn't open a separate issue for that?

@cowtowncoder
Copy link
Member Author

cowtowncoder commented Mar 27, 2019

@mjustin Probably not. The main question is, I think, two-fold:

  • Which formatter should be used (as is there is no way to select different ones) for each type
  • For numbers, use of formatting requires ability to write numbers passing String -- works for (most?) textual formats, but will not work for binary formats -- what to do? NOTE: Jackson 2.8 added JsonGenerator.canWriteFormattedNumbers() which DOES let databind know this, maybe just ignore

@cowtowncoder
Copy link
Member Author

Note: @JsonFormat(shape = Shape.STRING) already works for all JDK numbers, so this is just for extending formatting capabilities.

@cowtowncoder cowtowncoder added the good first issue Issue that seems easy to resolve and is likely a good candidate for contributors new to project label Oct 6, 2019
@mincong-h
Copy link

Hi @cowtowncoder , does this issue mean supporting the use-case below, where @JsonFormat is used to contextualize the serialization?

   static class IntAsThousandSeparatedString {
       @JsonFormat(shape=Shape.STRING, pattern="%,d")
       @JsonProperty("value")
       public int foo = 3_000;
   }

And it should be serialized as follow:

{ "value" : "3,000" }

I understand that it should support other numeric types and different locales as well. I would like to contribute on this one. If you could give me some hints about where to get started, that will be great.

@cowtowncoder
Copy link
Member Author

@mincong-h Yes, this is correct.

As to supported types: it is ok to implement support for smaller set of types, for example just for String, or, just for various numbers.

On hints, here are some thoughts:

  1. Ideally support should be both for serializers and deserializers... but former is much easier. Not sure if latter can be supported generally, probably need to consider on case by case basis. May need to start by only support serialization
  2. Detection of custom Shape (and other JsonFormat overrides) is typically handled from createContextual() method of JsonSerializer (and JsonDeserializer) -- see f.ex EnumSerializer for best access (annotation is accessed through AnnotationIntrospector but there are other complications and that is why helper method findFormatOverrides() is useful
  3. createContextual() MUST create a new instance, can not modify state of (de)serializer since instances are shared and should be immutable

Handling of various JsonFormat settings gets pretty complicated unfortunately, so changes should be well tested, using various ways to change settings, including "config overrides" (ObjectMapper.configOverride(String.class).setFormat(....)).

@mincong-h
Copy link

Thanks for your hints, @cowtowncoder. I had a draft here to modify the ToStringSerializer master...mincong-h:iss1114 But more I think about this issue, more I feel this feature might not be that easy. Serialization is completely fine, using String.format provides rich choices for users to format their objects. However, deserialization is a nightmare—there is no way to parse strings easily. Otherwise Java's would have done that already.

At a high-level overview, I believe users expect Jackson annotations to work for both serialization and deserialization. Meaning that when declaring a @JsonFormat(shape=Shape.STRING, pattern="%,d"), they will expect Jackson to serialize and deserialize correctly. This is very hard to support and can also have impact on performance. I guess for the deserialization, we need to first parse the format string, e.g. "%,d", to understand it syntax, then use it to deserialize the value. This is hard because:

I think it's better to have the following alternatives:

  1. Introduce @JsonSerializeFormat
  2. Introduce @JsonNumberFormat

Annotation @JsonSerializeFormat means that we only support the serialization, this is a one direction configuration. It's not super cool, but should work for some particular cases. This could be nice as the complementary of @JsonFormat. And more importantly, it states explicitly that the annotation only works in serialization. It's self-documented.

Another alternative I see is @JsonNumberFormat (or @DecimalFormat as @mjustin mentioned). I think people are interested when working with locale-specific numbers. Because their thousands-separator, decimal point, etc are different. This annotation can be particularly useful when the serialized data comes from external system and has a fixed format. What I imagine is something like Python Pandas' DataFrame, where user can control explicitly the options themselves, e.g. thousands-separator, decimal point:

@JsonNumberFormat(thousands=" ", decimal=",")
public double frenchValue = 1_234.56;  // "1 234,56"

In this way, each parameter seems easier to parse compared to the "generic" approach of String.format(). Also supporting the deserialization could be easier than @JsonFormat. WDYT?

@cowtowncoder
Copy link
Member Author

I don't like the idea of datatype-specific formatters, mostly because that leads to complexity on handling side as well as on number of special-purpose annotations. One really nice feature of @JsonFormat itself is that initial handling as well as overrides (per-type config overrides, per-property) is automated and (de)serializers need not worry about details of where things come from.
Adding new format annotations and logic would be significantly more work, and take formatting into directions that may not be something Jackson should provide.

At the same time it is true that there are issues with simple textual format definition: not so much performance since we never process format String more than once (it is done when constructing (de)serializer and resulting format object used afterwards), but it may be confusing to try to figure supported formats.

As to serialization-only vs read-write: it is difficult to say what users expect, in a way, since existing format string is only used with date/time values... and yes, they can be read and written.
But number formatting is different as fundamentally alternate number serialization would make resulting JSON content non-compliant (but not necessarily XML or CSV content).

So... I don't know what would be a good way forward. Here are some thoughts:

  1. I agree that number-formatting looks like different case from any other choices (date/time, "standard" String serialization formatting)
  2. "standard" (or maybe just "general purpose") formatting for String-values (including not just Java Strings but "stringified" Numbers and other types) could fall into "easier" category: and one where we could perhaps support deserialization too (depending on what kind of notation was chosen)
  3. For either of above it might be easier to implement advanced functionality as add-on module and NOT try to build it in jackson-databind. Existing support for @JsonFormat could be used (or not), depending on needs.

From this, I guess earlier comment by @mjustin -- about whether to split issue(s) -- makes sense again: yes, I think there are separate concerns of number-specific possibility, and then more general-purpose "String decoration" capability.

@cowtowncoder cowtowncoder removed 2.10 good first issue Issue that seems easy to resolve and is likely a good candidate for contributors new to project labels Apr 12, 2020
@Ikki-Dai
Copy link

Hope this feature can be implement, spring framework has a @NumberFormat annotation provide similar feature!

@hurelhuyag
Copy link

My suggestion: #4273

@pjfanning
Copy link
Member

pjfanning commented Dec 20, 2023

@hurelhuyag #4273 doesn't appear to be anything like what is described here. Try writing a test with a JsonFormat annotation in it and showing how that the test can be made to work with the changes you are suggesting. Adding new constructors doesn't help when the code that calls the constructors is done deep within the jackson-databind code.

Any PR that doesn't attempt to prove that the code does something useful with a test case - this is not a good start.

@hurelhuyag
Copy link

hurelhuyag commented Dec 20, 2023

@pjfanning Everyone here wants to have number formatting instead of just calling toString() method which ToStringSerializer does.

There are 2 ways to do it.

  1. String.format()
  2. NumberFormat.format()

I suggest using the second method directly in NumberSerializer.

It is not a time to write tests. It is time, we have to decide which method to use and where to put the feature. After that, tests can be written.

@pjfanning
Copy link
Member

@pjfanning Everyone here wants to have number formatting instead of just calling toString() method which ToStringSerializer does.

There are 2 ways to do it.

  1. String.format()
  2. NumberFormat.format()

I suggest using the second method directly in NumberSerializer.

It is not a time to write tests. It is time, we have to decide which method to use and where to put the feature. After that, tests can be written.

We'll need deserializer support too. Serializer support on its own is not enough.

@yihtserns
Copy link
Contributor

yihtserns commented Dec 20, 2023

It is not a time to write tests. It is time, we have to decide which method to use and where to put the feature. After that, tests can be written.

@hurelhuyag do you come from a coding culture where devs only write tests to capture regression?

The tests you're being asked here is to demonstrate usage so people can easily see (from a user's perspective) what you have proposed/enabled/implemented/fixed.

@cowtowncoder cowtowncoder linked a pull request Dec 20, 2023 that will close this issue
@cowtowncoder
Copy link
Member Author

cowtowncoder commented Dec 20, 2023

I 100% think that it is exactly time to write tests to show how things work (and IF they work) :)

I added some notes on PR itself; but there are fairly big questions to resolve:

  1. Format notation to use (DecimalFormat vs StringFormat)
  2. Output: JSON Number (JsonGenerator.writeNumber(String)) vs JSON String? (not all Strings are valid JSON Numbers of course)
  3. Only for serialization? Probably has to be... but how would they be read back in?

On (2), @JsonFormat also has shape property which could be combined to choose between methods to call; so the question would be that of what to default to (I am guessing NUMBER since these are Number (or subtype) fields?

@cowtowncoder
Copy link
Member Author

@yihtserns @pjfanning You may want to have a look at PR. I think we can make this work, but I have one design concern wrt JsonFormat.shape. Basically, there are 2 ways to write formatted numbers:

  1. JsonGenerator.writeString(String), to output JSON String
  2. JsonGenerator.writeNumber(String), to try to output JSON Number -- not all backends support it, and for others (JSON esp.) it could produce invalid content

I can see both use cases useful for some usage. To me, use of JsonFormat.shape makes sense; and Shape.STRING vs Shape.NUMBER is simple.
But what about the default case (Shape.ANY)? One possibility would be to always require specifying shape, otherwise failing: that is, not guessing but requiring explicit choice.

Other would be to default to one style or other. My initial leaning was that since we are dealing with Numbers, default should be to writeNumber(String). But then again, alternative of "as text unless told otherwise" is safer and avoids producing invalid content.

WDYT?

@pjfanning
Copy link
Member

Formatted numbers should be rendered as JSON strings. Spec compliant JSON parsers will not parse the formatted numbers if they are rendered as JSON numbers (without quotes).

If we really want to support rendering them as JSON numbers (and users are happy that the consumers have JSON parsers that will accept them), then maybe we could add an extra param on @JsonFormat - maybe named renderFormattedNumbersAsStrings, defaulting to true.

@cowtowncoder
Copy link
Member Author

Thank you @pjfanning.

I think it depends: I think it'd be possible someone wanted to specify precision and produce compliant number output. In that case it'd seem wrong to quote value. But yes, it depends on intent which is difficult to deduce.

However, I realized that maybe DecimalFormat doesn't have that option, compared to String.format() string?
If so, n/m, and we should default to String.

As to overriding number, no need for a new setting, just use shape like I explained:

@JsonFormat(format="...." shape=JsonFormat.Shape.NUMBER)

But I guess assuming shape = JsonFormat.Shape.ANY to be same as Shape.STRING makes sense.

One final possibility is that there is new NumberInput.looksLikeNumber(String) (or similar) that I added which we also could call to determine. But maybe that's too complicated heuristic and more confusing than helpful.

@yihtserns
Copy link
Contributor

yihtserns commented Dec 22, 2023

Overview of current Pattern & Shape support by various data types

This is just so we're on the same page regarding the current state (actually mainly for my personal reference, but I thought I should just share here). Be warned that this was made from a quick check so it is likely not exhaustive - I'm sure I missed some things.

  Pattern (Ser) Pattern (Deser) STRING ARRAY BINARY BOOLEAN NATURAL NUMBER NUMBER_FLOAT NUMBER_INT OBJECT SCALAR
String                        
Boolean                
Calendar,Date SimpleDateFormat SimpleDateFormat            
Enum          
Collection                      
Map                      
Map.Entry                      
Integer                      
Long                      
Byte                      
Short                      
Float                      
Double                      
BigDecimal,BigInteger,Number subclasses                  
AtomicXXX                        
Bean              
UUID                  
InetAddress                
JSR310 DateTime DateTimeFormatter DateTimeFormatter          
JSR310 Duration ChronoUnit ChronoUnit          

@JooHyukKim
Copy link
Member

@yihtserns is there written test code used to validate the research above? Asking this so that we may have process of expanding support/non-support

@yihtserns
Copy link
Contributor

@yihtserns is there written test code used to validate the research above? Asking this so that we may have process of expanding support/non-support

Nope: it was generated using the help of IDE+👀. 😅

@yihtserns
Copy link
Contributor

yihtserns commented Dec 23, 2023

Output: JSON Number (JsonGenerator.writeNumber(String)) vs JSON String? (not all Strings are valid JSON Numbers of course)
I think it depends: I think it'd be possible someone wanted to specify precision and produce compliant number output. In that case it'd seem wrong to quote value. But yes, it depends on intent which is difficult to deduce.

I was under the impression that this is about @JsonFormat(shape=STRING), and also because there's no precedence for pattern being used to generate anything other than String output (because there's no such need before, of course).

Perhaps question about generating non-String formatted value should be separated into another issue/ticket.

Only for serialization? Probably has to be... but how would they be read back in?

So far everything that supports pattern, supports it for both serialization & deserialization. Although there may eventually be cases where deserialization can't be supported, I think there won't be any technical blocker for deserializing a number-string using the same pattern used to serialize it...

...waitaminute is there an API to parse a number-string using String.format pattern?..

Format notation to use (DecimalFormat vs StringFormat)

I'm ashamed to say that I'm not terribly familiar with DecimalFormat (I don't remember needing to use it ever for my day jobs), but from a quick scan of the javadoc, I have few thoughts:

  • Unlike the other formatters Jackson has used (SimpleDateFormat, DateTimeFormatter), DecimalFormat seems to have an awful lot more configuration options (so many setters e.g. setCurrency, setDecimalSeparatorAlwaysShown etc) - I guess we can ignore most of them? Or maybe they may eventually go to JsonFormat.Feature when requested?
  • Should @JsonFormat.lenient be supported when parsing numbers?
  • Should @JsonFormat.locale be supported? Seems like DecimalFormat supports locale, but not directly instead via NumberFormat's factory methods (what weird design is this???).

@cowtowncoder
Copy link
Member Author

@yihtserns Right, most things try to support serialization and deserialization. But there is no hard limit of this having to be the case. In case of Numbers I suspect supporting round-tripping may well be difficult.
Which is why I would suggest that we defer deserialization support.

But if we are to support deserialization, I think that:

  1. For "serialize as actual Number token" (JsonFormat.Shape.NUMBER and subtypes) it would have to rely on default number decoding by format backend -- no attempt made for other handling, as parsers do not necessarily expose Strings to databind. Or more importantly, they will need to be able to decode number token without knowledge of @JsonFormat info (given that it's databind level, not available for streaming core)
  2. For "serialize as String token" we would/could/(probably) should support decoding. Eventually.

As to configuration of DecimalFormat, yeah, I don't think it's quite worth the effort to try to expose anything that does not with with existing JsonFormat facets.
But anything that is mappable (i.e. there's JsonFormat property AND DecimalFormat can be configured to use it).

Still, I think there are many "write-only" use cases where either:

  1. Value is not read by Jackson (but something else)
  2. Output is actually valid number (truncating precision); or need not be read back as Number

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants