Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible timestamp issue #45

Open
Zarathustra2 opened this issue Apr 20, 2023 · 22 comments
Open

Possible timestamp issue #45

Zarathustra2 opened this issue Apr 20, 2023 · 22 comments

Comments

@Zarathustra2
Copy link
Contributor

Given the following test for timestamps_test.exs

  test "filtering datetimes" do
    # iso8601 type
    time = ~U[2023-04-20 17:00:25Z]

    account =
      %Account{}
      |> Account.changeset(%{name: "Test"})
      |> TestRepo.insert!()

    %Product{}
    |> Product.changeset(%{
      account_id: account.id,
      name: "Qux1",
      approved_at: time
    })
    |> TestRepo.insert!()

    from(
      u in Product,
      where: u.approved_at == ^time
    )
    |> TestRepo.exists?()
    |> IO.inspect()

    from(
      u in Product,
      where: fragment("? = ?", u.approved_at, ^time)
    )
    |> TestRepo.exists?()
    |> IO.inspect()

    from(
      u in Product,
      where: fragment("? = ?", u.approved_at, ^DateTime.to_unix(time, :millisecond))
    )
    |> TestRepo.exists?()
    |> IO.inspect()
  end

All 3 cases will print false for me.

Is this expected and if so why?

@ruslandoga
Copy link
Contributor

ruslandoga commented Apr 21, 2023

It's probably your non-UTC ClickHouse server timezone issue. It happens due to timestamps in params being encoded as text and in rows -- as ints. I think I have "fixed" it in Ch a few days ago: https://github.com/plausible/ch#select-rows

But even with that change you'd still encounter issues with naive timestamp unless you switch to a UTC server.

@Zarathustra2
Copy link
Contributor Author

But in my test I am not using a native timestamp I am using a timestamp with an UTC offset. So if I insert that timestamp and reuse to filter for that row I should get that inserted row since ~U[2023-04-20 17:00:25Z] has a timezone.

So I kinda expect the tests to pass so I assume there is still something going wrong or am I misunderstanding something?

@ruslandoga
Copy link
Contributor

For ~U[..] it should already be fixed in Ch master: https://github.com/plausible/ch#select-rows

@Zarathustra2
Copy link
Contributor Author

I updated my local chto to latest ch (commit d305fa092515a106a20f517e642228f75bbe2001) and now the test code above fails.

    from(
      u in Product,
      where: u.approved_at == ^time
    )
    |> TestRepo.exists?()
    |> IO.inspect()

^ This still yields false

    from(
      u in Product,
      where: fragment("? = ?", u.approved_at, ^time)
    )
    |> TestRepo.exists?()
    |> IO.inspect()

^ This now throws an error:

  1) test filtering datetimes (Ecto.Integration.TimestampsTest)
     stacktrace:
       (ch 0.1.0) lib/ch/query.ex:197: DBConnection.Query.Ch.Query.encode_param/1
       (ch 0.1.0) lib/ch/query.ex:182: anonymous fn/1 in DBConnection.Query.Ch.Query.query_params/1
       (elixir 1.14.3) lib/enum.ex:1658: Enum."-map/2-lists^map/1-0-"/2
       (ch 0.1.0) lib/ch/query.ex:100: DBConnection.Query.Ch.Query.encode/3
       (db_connection 2.5.0) lib/db_connection.ex:1336: DBConnection.maybe_encode/4
       (db_connection 2.5.0) lib/db_connection.ex:707: DBConnection.execute/4
       (ch 0.1.0) lib/ch.ex:51: Ch.query/4
       (ecto_sql 3.10.1) lib/ecto/adapters/sql.ex:438: Ecto.Adapters.SQL.query!/4
       (chto 0.1.0) lib/ecto/adapters/clickhouse.ex:235: Ecto.Adapters.ClickHouse.execute/5
       (ecto 3.10.1) lib/ecto/repo/queryable.ex:229: Ecto.Repo.Queryable.execute/4
       (ecto 3.10.1) lib/ecto/repo/queryable.ex:19: Ecto.Repo.Queryable.all/3
       (ecto 3.10.1) lib/ecto/repo/queryable.ex:134: Ecto.Repo.Queryable.exists?/3
       test/ecto/integration/timestamps_test.exs:71: (test)

@ruslandoga
Copy link
Contributor

I see. Checking

@ruslandoga
Copy link
Contributor

ruslandoga commented Apr 21, 2023

@Zarathustra2 I've added the test and it seems to be passing for me (UTC timezone). I misunderstood 🙈

  test "filtering datetimes" do
    # iso8601 type
    time = ~U[2023-04-20 17:00:25Z]

    account =
      %Account{}
      |> Account.changeset(%{name: "Test"})
      |> TestRepo.insert!()

    %Product{}
    |> Product.changeset(%{
      account_id: account.id,
      name: "Qux1",
      approved_at: time
    })
    |> TestRepo.insert!()

    assert TestRepo.exists?(
             from u in Product,
               where: u.approved_at == ^time
           )

    assert TestRepo.exists?(
             from u in Product,
               where: fragment("? = ?", u.approved_at, ^time)
           )

    # `refute` because `DateTime` is stored as seconds, for milliseconds precision use `DateTime64(3)`
    refute TestRepo.exists?(
             from u in Product,
               where: fragment("? = ?", u.approved_at, ^DateTime.to_unix(time, :millisecond))
           )
  end

@Zarathustra2
Copy link
Contributor Author

Hmmm, if you run select timezone(); what do you get?

I get Europe/Berlin

@ruslandoga
Copy link
Contributor

UTC

@Zarathustra2
Copy link
Contributor Author

What if you change ur database to Europe/Berlin or anything that is not on UTC timezone. I think then the test will fail

@Zarathustra2
Copy link
Contributor Author

at least that is the only difference I can now spot between your env and my as why the test fails

@ruslandoga
Copy link
Contributor

It seems to be more complicated than I thought. Simply encoding UTC datetimes to unix timestamps in query strings is not enough.

@ruslandoga
Copy link
Contributor

#46 still fails on Europe/Berlin timezone.

@ruslandoga
Copy link
Contributor

Seems like the timestamp inserted in RowBinary is also shifted by ClickHouse.

@Zarathustra2
Copy link
Contributor Author

Simply encoding UTC datetimes to unix timestamps in query strings is not enough.

That is kinda weird because technically according to their documentation that should do it

(Dealing with time & timezones must be the most annoying domain problem I encounter :D )

@Zarathustra2
Copy link
Contributor Author

Should the CI maybe also run with a different clickhouse timezone setting so that in the future we can catch future timezone issues?

@ruslandoga
Copy link
Contributor

For reference, this is what Ch is doing now: plausible/ch@cea3726

@ruslandoga
Copy link
Contributor

Should the CI maybe also run with a different clickhouse timezone setting so that in the future we can catch future timezone issues?

Sure, but that'd would require reworking the tests a bit since naive datetimes would still fail. Probably a tag for each naive test like @tag :naive and ExUnit.start(except: [:naive]) in test_helper.exs based on some env var would be enough. But adding that right now is low priority since Plausible is all UTC :)

@ruslandoga
Copy link
Contributor

I'll return to this issue after finishing #43

@Zarathustra2
Copy link
Contributor Author

Zarathustra2 commented Apr 21, 2023

Thanks my friend. I am currently slammed with work (postgres -> clickhouse migration :D ) but looking to do #37 over the weekend.

@Zarathustra2
Copy link
Contributor Author

I have taken another look, look at the queries:

false
16:18:06.013 [debug] QUERY OK source="products" db=2.9ms idle=191.2ms
SELECT true FROM "products" AS p0 WHERE (p0."approved_at" = {$0:DateTime}) LIMIT 1 [~N[2023-04-20 17:00:25]]

true
16:18:06.019 [debug] QUERY OK source="products" db=2.8ms idle=195.7ms
SELECT true FROM "products" AS p0 WHERE (p0."approved_at" = {$0:DateTime}) LIMIT 1 [~U[2023-04-20 17:00:25Z]]

for some reason the DateTime gets casted to a NaiveDateTime. I ahve not been successful yet to locate where this happens and why but this is the issue. If it would use not cast it it would behave the same as the fragment query

@ruslandoga
Copy link
Contributor

Nice find, Ecto schemas use naive datetimes by default.

@ruslandoga ruslandoga mentioned this issue Nov 11, 2023
@ruslandoga
Copy link
Contributor

ruslandoga commented Nov 11, 2023

@Zarathustra2 I've added Europe/Berlin to CI in #46 and adapted tests to pass. I've also added a modified version of your test to highlight the difference in "product" vs Product Ecto Query sources. The former doesn't cast ~U[] to ~N[] and the latter does. That affects how these datetimes are encoded in SELECT query params.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants