Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Json round-tripping changes leap seconds #944

Closed
Ekleog opened this issue Jan 23, 2023 · 12 comments
Closed

Json round-tripping changes leap seconds #944

Ekleog opened this issue Jan 23, 2023 · 12 comments

Comments

@Ekleog
Copy link
Contributor

Ekleog commented Jan 23, 2023

The following code triggers the second assert_eq! with a confusing error message:

fn main() {
    let time = chrono::NaiveTime::from_num_seconds_from_midnight_opt(0, 1_500_000_000).unwrap();
    assert_eq!(time, time, "time was equal to itself");
    let time2 = serde_json::from_slice(&serde_json::to_vec(&time).unwrap()).unwrap();
    assert_eq!(time, time2, "time was not equal to its json-roundtripped self"); // triggers with confusing error message
}

(playground)

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `00:00:01.500`,
 right: `00:00:01.500`: time was not equal to its json-roundtripped self', src/main.rs:5:5

This issue was again found by my fuzzer… and TBH I would have rather not had to care 😅

My thoughts are:

  • Probably we should change the Debug repr of leap seconds, so it's obvious there's a leap second at play in this error message? I'm willing to do a PR but not sure what the output should be. Maybe 00:00:00+1.500? Or 00:00:00(leap).500?
  • Defaulting to TAI (UTC seems to be treated as canonical; hazardous in the presence of leap seconds #21) would have avoided this issue in the first place
  • I have literally no idea how to deal with json roundtripping of the value. Either we introduce a specific notation for leap seconds time in the JSON, but then that'd probably break literally all the other json parsers that wouldn't be ready for this kind of input. Or we just acknowledge that json serialization of non-TAI is lossy, but then it'd really be great if we had a way of serializing to TAI.

What do you think?

@djc
Copy link
Contributor

djc commented Jan 24, 2023

Changing the Debug representation makes sense to me, although I'm not sure how we should represent them -- something like +1.500 seems easily confused with time zones so I feel like it should be more explicit though maybe less verbose than (leap) -- just enough to make it clear that something rare is going on.

We can discuss TAI in the linked issue. I think we'll have to accept that JSON serialization of non-TAI time is lossy.

@Ekleog
Copy link
Contributor Author

Ekleog commented Jan 28, 2023

Hmm… maybe 00:00:00.leap.500 then, for the time that is currently being serialized as 00:00:01.500?

@ijackson
Copy link

If you are going to try to encode leap seconds at all, and print it in this (without-date) ISO8601-like format, you should print the leap second as :60. That will round trip, although of course whether this will work with any protocol peers will depend de jure on whether the protocol specifies ISO8601 (with, perhaps implicitly, leap seconds) as the format.

De facto the peer probably won't cope unless the protocol spec (that imported JSON) explicitly says that leap seconds must be supported.

I think the call from_num_seconds_from_midnight_opt(0, 1_500_000_000) is erroneous and ought to be rejected. Leap seconds form part of the previous day. So (assuming we want to keep this IMO-ill-advised API) the test case ought to be from_num_seconds_from_midnight_opt(86399, 1_500_000_000).

@ijackson
Copy link

Certainly, leap seconds always appear at the end of a minute, in any timezone. So when passing from_num_seconds_from_midnight_opt nano>1, the only permissible values of secs are i*60 + 59 for some (sensible) integer i. That seems to work correctly: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f4784dd9c7350cbcad1f75f3d5545e36

@djc
Copy link
Contributor

djc commented Jan 31, 2023

@ijackson if this API is ill-advised, what would you propose instead? I agree that it might make sense to reject 0, 1_500_000_000 in this case.

Maybe we should reject serializing a leap second unless through a particular opt-in API?

@ijackson
Copy link

@ijackson if this API is ill-advised, what would you propose instead? I agree that it might make sense to reject 0, 1_500_000_000 in this case.

I think it's probably too late to rework this particular bit of the API now. My suggestion would be to document the semantics more explicitly, and in particular to document explicitly the restriction that leap seconds are only allowed at the end of a whole minute.

Maybe we should reject serializing a leap second unless through a particular opt-in API?

I don't see a compelling reason not to serialise it to :60 and hope that whoever receives it can cope.

An alternative would be to rip out leap seconds support altogether but that doesn't seem like a very good option at this stage.

@Ekleog
Copy link
Contributor Author

Ekleog commented Feb 1, 2023

I think changing the API is related to #21. The API would ideally need to change whatever is decided for 0.5, but will be dependent on the choice there (whether chrono should focus on having second-exact durations or second-exact naive-times without support from the os, basically).

TBH I can't see better behavior than current for json round-tripping with the current API: it's afaict already encoding as :60 and decoding properly.

The problem is really just for ill-formed leap seconds (in the middle of a minute). I think it makes sense for json round-tripping to not keep that information right, as the leap second was ill-formed in the first place. So this just leaves the debug representation.

Maybe having Debug be like 00:00:01.5(invalid leap second) whenever the leap second is invalid, and keeping the current 00:00:60.5 behavior when the leap second is actually valid would make sense? Sure it's verbose but it's behavior that should have been prevented by the API anyway, no correct code will find it (but fuzzers definitely will)

@djc
Copy link
Contributor

djc commented Feb 1, 2023

I mean, we could instead make from_num_seconds_from_midnight_opt() panic for invalid leap second input?

@ijackson
Copy link

ijackson commented Feb 1, 2023

It doesn't need to panic. It can just return None. Per the docs:

Returns None on invalid number of seconds and/or nanosecond.

@djc
Copy link
Contributor

djc commented Feb 1, 2023

Ahh yeah, sorry.

@pitdicker
Copy link
Collaborator

We have a couple of issues about leap seconds. I would like to make this one about the ability of types to represent leap seconds, and leave the clock source and the ability to do calculations involving leap seconds to other issues.

I am just writing down some of my thoughts on the issue here.

Chrono defines a point in time with components: types such as NaiveDate, NaiveTime and FixedOffset. This allows for combinations to arise that don't exist in reality. Hifitime is the notable rust crate with leap-second support, and avoids a design with components in favor of well-defined time scales.

Offset from UTC
From my reading of the time zone database most countries started with a standard time that was based on the geographical center of the country/capital. The offset from UTC was defined with hours, minutes, seconds, and sometimes even one or two digits after the comma. Eventually this was standardized to nice round values that are multiples of 10 minutes. From a quick look Suriname seems to be the latest to switch in 1945.

In order to work with historical values FixedOffset must be able to represent seconds. And like basically all software we ignore the digits behind the comma.

Leap seconds
The first leap seconds was inserted on June 30th 1972. According to the time zone database it was initially not clear whether the leap second would occur simultaneously worldwide, or if some countries would schedule it at local midnight. It turns out to happen simultaneously (and I am glad, this already cause for enough mess 😄).

Because by 1972 the offset from UTC is everywhere specified to be a round number of minutes, everyone can use ":60" to designate the extra second. It does not have to happen within the 59th minute however. In Nepal the offset in 1972 was +05:30, so the first leap second was at 1972-07-01 5:29:60. And when the offset was changed to +05:45 in 1986 the next leap second occured on 1988-01-01 5:44:60 (let's hope I got the numbers right).

So there are ca. 26 years between the first leap second, and when the last offset from UTC was in use that did not have a round number of minutes, and could mess up the ":60" designation.

Components
Only the DateTime<Tz> type has enough information to know if a value corresponds a real leap second (insert disclaimer about future leap seconds). The NaiveTime and NaiveDateTime types can do nothing better than represent it.

What should happen if you have a DateTime<Tz> with a valid leap second, say 2016-12-31 23:59:60 UTC, and a user combines this with an offset of -03:40:36? Chrono currently can just do as you ask and preserve all information. Only when the time comes to format the value it will no longer be possible to discern it from a value one second later.
This seems like a pretty reasonable design to me.

Make it harder to create invalid leap seconds
Because there has never been a leap second that was not at the end of a minute, I think it is very reasonable to make such values difficult to construct.

We can change NaiveTime::from_hms_nano_opt and NaiveTime::from_num_seconds_from_midnight_opt to error when the nanoseconds exceed 1,000,000,000 while the second is not 59.
This will also affect all other constructors on NaiveTime and NaiveDate::and_* that wrap NaiveTime::from_hms_nano_opt.

We should also adjust the Arbitrary code to not try match.

Change Debug representation?
Since #1035 we no longer use the Debug representation when serializing, but an RFC 3339 formatter. I am not sure we should adjust the Debug format, as I can't see a format that makes obvious sense.

@pitdicker
Copy link
Collaborator

Closing this issue.

#1283 changed almost all methods to no longer allow creating a leap second that is not on a minute boundary. In can still be created from an existing value by using a DateTime with a fractional minute offset, and by changing a time with Timelike::with_nanosecond.

I don't think we should change the Debug format. It has long been documented to be an ISO 8601 string. And also the leap second issue has been documented for a long time (https://docs.rs/chrono/latest/chrono/naive/struct.NaiveTime.html#impl-Debug-for-NaiveTime):

The Debug output of the naive time t is the same as t.format("%H:%M:%S%.f").

The string printed can be readily parsed via the parse method on str.

It should be noted that, for leap seconds not on the minute boundary, it may print a representation not distinguishable from non-leap seconds. This doesn't matter in practice, since such leap seconds never happened. (By the time of the first leap second on 1972-06-30, every time zone offset around the world has standardized to the 5-minute alignment.)

#954 is for further discussion of the leap-second support in chrono.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants