Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use JVM time zone rules in Joda #11891

Merged
merged 1 commit into from
Dec 17, 2018
Merged

Use JVM time zone rules in Joda #11891

merged 1 commit into from
Dec 17, 2018

Conversation

haozhun
Copy link
Contributor

@haozhun haozhun commented Nov 10, 2018

Fixes #8233, fixes #11855 depends on #11890

@haozhun
Copy link
Contributor Author

haozhun commented Nov 10, 2018

Posted JodaOrg/joda-time#491 to request opinion from joda author

Copy link
Contributor

@hellium01 hellium01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really go into the conversion logics yet (needed to read a little about the implementation on joda/jdk code), I added some comments for other parts.

@@ -2186,13 +2186,13 @@
2177 CST6CDT
2178 Cuba
2179 EET
2180 EST
# 2180 EST # not in java.time
Copy link
Contributor

@hellium01 hellium01 Nov 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should removed it probably from a while back. Currently, when we run following query in presto-cli:

SELECT CAST('2018-10-28 14:44:59.000 EST' AS TIMESTAMP WITH TIME ZONE)

we got error:

Error fetching next at https://[2401:db00:1010:811a:face:0:1:0]:7778/v1/statement/20181110_024907_49716_ysdca/2 returned an invalid response: JsonResponse
java.time.zone.ZoneRulesException: Unknown time-zone ID: EST

How do you find out which timezone to be removed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "2040 Canada/East-Saskatchewan" also got removed in 2017
http://mm.icann.org/pipermail/tz-announce/2017-October/000047.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume TestTimeZoneUtils won't be very useful anymore since joda/jdk timezones are unified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a test for zone-index contents?

Copy link
Contributor Author

@haozhun haozhun Nov 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a test for zone-index contents?

We already have one in TestTimeZoneUtils. I updated it.

think "2040 Canada/East-Saskatchewan" also got removed in 2017

Updated in the preceding commit.

// * static initializer of DateTimeZoneIndex

@BeforeClass
protected void registerFunctions()
Copy link
Contributor

@hellium01 hellium01 Nov 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above, we probably don't need to rely on the system property. Joda allows you to override the default provider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the bad name.

This is not really setting it. It's more like validating it (as evidenced by the error message in catch). However, when you run this in IntelliJ, it does provide the convenience of setting it for you (only when doing so is safe).

@@ -407,6 +407,9 @@
<excludes>
<exclude>**/TestLocalExecutionPlanner.java</exclude>
</excludes>
<systemPropertyVariables>
<org.joda.time.DateTimeZone.Provider>com.facebook.presto.tz.JdkBasedZoneInfoProvider</org.joda.time.DateTimeZone.Provider>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does joda support service loader? Or any other way to make this be loaded more reliably?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Override
public int hashCode()
{
return Objects.hash(super.hashCode(), zoneRules);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is hashcode compatible with equals? hashcode consults super's fields and equals does not (it would be correct for equals to do this)

Copy link
Contributor Author

@haozhun haozhun Nov 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no field in the super class. Making things even more interesting, super does have an implementation for hashCode, but not equals. This is very uncommon.

The equals and hashCode implementation here are generated by IntelliJ.

I think the implementation is correct as is. However, it indeed looks very suspicious.

Any concrete suggestions? @findepi @dain

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the classes it appears that subclasses are not equals to each other, and instead you need an exact type match. Providing a default implementation for hashCode is a bit wonky and some of the subclasses override that also. I suggest we simply implement both equals and hashCode and we base them solely on getID().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naive question -- do we need hashCode/equals at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The super class declared equals as abstract so it must be implemented.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets make this just return getID().hashCode();

@@ -2186,13 +2186,13 @@
2177 CST6CDT
2178 Cuba
2179 EET
2180 EST
# 2180 EST # not in java.time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a test for zone-index contents?

// Otherwise, the zone-internal cache won't be useful.
// Joda's default Provider implementation, ZoneInfoProvider, caches with SoftReference.
// This implementation didn't use SoftReference for simplicity.
zoneCache = CacheBuilder.newBuilder().build(new CacheLoader<String, DateTimeZone>()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it take a long time to load every time zone? If not, why not just load everything ahead of time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran a quick test. On my mac from cold boot, this takes 250 ms, so I would just load everything up front:

    @Test
    public void test()
    {
        JdkBasedZoneInfoProvider jdkBasedZoneInfoProvider = new JdkBasedZoneInfoProvider();
        Set<String> availableIDs = jdkBasedZoneInfoProvider.getAvailableIDs();
        for (String availableID : availableIDs) {
            jdkBasedZoneInfoProvider.getZone(availableID);
        }
    }

@dain
Copy link
Contributor

dain commented Nov 12, 2018

One more thing. The DateTimeZone class is Serializable, so I would either make that fail or test that it works as expected.

@haozhun
Copy link
Contributor Author

haozhun commented Nov 13, 2018

One more thing. The DateTimeZone class is Serializable, so I would either make that fail or test that it works as expected.

Can you tell me how to do this right?

@dain
Copy link
Contributor

dain commented Nov 13, 2018

One more thing. The DateTimeZone class is Serializable, so I would either make that fail or test that it works as expected.

Can you tell me how to do this right?

I just read through the existing serialization code, and it appears to be doing the correct thing. When a DateTimeZone is serialized it is replaced with a class Stub, which contains the String id of the time zone, and when Stub is deserialized it replaces itself with DateTimeZone.forID(iId). So we are good.

Copy link
Contributor

@dain dain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some follow up comments, but this looking good to me.

@Override
public int hashCode()
{
return Objects.hash(super.hashCode(), zoneRules);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The super class declared equals as abstract so it must be implemented.

public class JdkBasedZoneInfoProvider
implements Provider
{
private final Map<String, DateTimeZone> zones;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to not make this static?

Copy link
Contributor Author

@haozhun haozhun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments addressed, @dain

// this is before the first transition
return instant;
}
return previousTransition.toEpochSecond() * 1000 - 1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to addressing your comments, I added + 1 and - 1 here. I also added a bunch of tests. Please take a look.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment here.

Copy link
Contributor

@dain dain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments. There are still build failures.

// this is before the first transition
return instant;
}
return previousTransition.toEpochSecond() * 1000 - 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment here.

return false;
}
JdkBasedDateTimeZone that = (JdkBasedDateTimeZone) o;
return Objects.equals(zoneRules, that.zoneRules);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets make this just return getID().equals(that.getId());

@Override
public int hashCode()
{
return Objects.hash(super.hashCode(), zoneRules);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets make this just return getID().hashCode();

@haozhun
Copy link
Contributor Author

haozhun commented Nov 16, 2018

Not ready to merge due to issues related to spi and class loader.

@haozhun
Copy link
Contributor Author

haozhun commented Nov 19, 2018

I talked to @dain and @electrum separately offline about this. To summarize,

Proposed Goals:

  1. Presto will continue to support usage of both joda and java.time in connectors. (We recommend joda, but we don't want to force everyone to make the switch.)
  2. Presto will continue to support usage of any reasonable joda version in connectors.
  3. Incorrect results due to inconsistency of tzdata between joda and java.time needs to be eliminated.

Proposed Approach:

  • The supporting classes in this PR will be put in a library in airlift.
    • The library will depend on joda as a "provided" dependency.
    • The library will presumably work with any reasonable version of joda.
  • All connectors that uses joda must depend on this library as a "runtime" dependency.
    • Otherwise, connector will see a different tzdata, potentially violating goal 3.
    • Presto-main will depend on the library and set the JVM system property. As a result, if a connector doesn't depend on it, it will fail loudly (albeit with a rather unhelpful error message about class loading failure).

Question:

  • What should the library be named?

@dain
Copy link
Contributor

dain commented Nov 19, 2018

  • What should the library be named?

JodaToJavaTimeBridge?

@findepi
Copy link
Contributor

findepi commented Nov 19, 2018

The supporting classes in this PR will be put in a library in airlift.

What is the benefit of putting this in airlift-joda-javatime-bridge vs presto-joda-javatime-bridge?

@haozhun
Copy link
Contributor Author

haozhun commented Nov 19, 2018

@dain airlift/airlift#698

@findepi I'm ok either way. @electrum may want to chime in.

@haozhun
Copy link
Contributor Author

haozhun commented Dec 13, 2018

@dain I made significant change to this PR. Please take another look. cc @findepi

Copy link
Contributor

@dain dain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@electrum do you have any comments on the documentation change?

registers a custom Joda-Time zone provider at startup time, which makes
Joda-Time delegate to JVM's built-in zone rules. The zone provider
can be found in the above package. If the dependency is not declared,
a class loading error will occur.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @electrum ^^^

Copy link
Contributor

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation looks good


.. note::

For any connector that uses Joda-Time, it must declare a runtime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use code quotes for "runtime"

For any connector that uses Joda-Time, it must declare a runtime
dependency on `joda-to-java-time-bridge <https://github.com/airlift/joda-to-java-time-bridge/blob/master/README.md>`_.
Presto internally uses ``java.time`` for certain date/time
computations. As a result, to avoid incorrect results, Presto
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a result -> Thus

(avoid using "result" twice in slightly different ways)

Presto internally uses ``java.time`` for certain date/time
computations. As a result, to avoid incorrect results, Presto
registers a custom Joda-Time zone provider at startup time, which makes
Joda-Time delegate to JVM's built-in zone rules. The zone provider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to JVM -> to the JVM

Mixed use of joda and java.time results in two different versions of
tzdata being used at the same time for a query. This has lead to
inconsistencies that had resulted in incorrect query results, and
query execution failure.

This commit makes joda and java.time use the same tzdata by making
joda delegate to java.time for all time zone rules.
@haozhun haozhun merged commit 8ac4b84 into prestodb:master Dec 17, 2018
@haozhun haozhun mentioned this pull request Dec 17, 2018
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants