Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix using optional between numbers #362

Merged
merged 3 commits into from
Apr 29, 2024
Merged

Conversation

dkhalanskyjb
Copy link
Contributor

See the commit message for details of the bug.

@dkhalanskyjb dkhalanskyjb requested a review from ilya-g March 5, 2024 11:05
@dkhalanskyjb
Copy link
Contributor Author

There is still a way to make parsers fail when their specification is unambiguous, for example, with something like

dayOfMonth(Padding.NONE)
monthName(MonthNames("01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"))

as the trie doesn't get merged with the neighboring number consumers. Fixing it is completely doable, which is why we don't restrict this usage, but it would severely overcomplicate the code and would probably not help anyone at all.

@@ -129,6 +129,8 @@ public sealed interface DateTimeFormatBuilder {
*
* [am] is used for the AM marker (0-11 hours), [pm] is used for the PM marker (12-23 hours).
*
* Empty strings can not be used as markers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we check this requirement in the amPmMarker call, or it fails the operation (the parsing I suppose) later?
Do we need the requirement to have distinct strings here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The amPmMarker call checks this requirement. Documented this and added the tests.
Also, added the requirement for distinct strings for month days, days of the week, and AM/PM.

@@ -21,6 +23,14 @@ public class MonthNames(
) {
init {
require(names.size == 12) { "Month names must contain exactly 12 elements" }
names.indices.forEach { ix ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the second thought about duplicates, this way users would not be able to use single-letter month names for formatting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After an internal discussion, we decided that it's not an issue with the typical use cases for single-letter month names: they are mostly used independently from the other formatting, so you probably won't see things like 2024-J-06 in the wild. Instead, you can find a row of J, F, M, A, etc., as labels on a chart or buttons in a calendar app, but then, there is no reason to build a custom format, you can just do something like date.month.name[0].

The reproducer was:
```
offsetHours(Padding.NONE)
optional {
    optional { char(':') }
    offsetMinutesOfHour()
}
```

This showed us one bug and one inefficiency, both of which are
fixed here.

Inefficiency: the `optional { char(':') }` should for all intents
and purposes be equivalent to
`alternativeParsing({ char(':') }) { }`: both the empty string and
the `:` should be parsed, and, according to `optional`'s
definition, if none of the fields mentioned in the block are
non-zero, nothing should be output. However, such `optional` still
created a fake parser element that notified all the fields that
they are zero when an empty string is parsed, even though there are
no fields.

Bug: the fake parser element that notifies the fields that they
are zero even though nothing of note was parsed interfered with the
mechanism supporting compact formats. Number parsers are greedy,
so `number(hh); number(mm)` gets merged into `number(hhmm)`.
If there is something between them, this merge can't happen:
`number(hh); string("x"); number(mm)`.
For this reason, parsers that don't accept any strings are
forbidden (or just very unlikely to happen)--except for the
zero-width unconditional modification parser. This bug is fixed by
moving such parsers to the end of the parser:
`number(hh); modification(); number(mm); string("x")` will get
transformed to `number(hhmm); string("x"); modification()`.
@dkhalanskyjb dkhalanskyjb merged commit e721269 into master Apr 29, 2024
1 check passed
@dkhalanskyjb dkhalanskyjb deleted the fix-optional-between-numbers branch April 29, 2024 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants