Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSV import should support a timezone override #4421

Open
javier opened this issue Apr 18, 2024 · 3 comments
Open

CSV import should support a timezone override #4421

javier opened this issue Apr 18, 2024 · 3 comments
Labels
Friction New feature Feature requests

Comments

@javier
Copy link
Contributor

javier commented Apr 18, 2024

Is your feature request related to a problem?

If you have a CSV with timestamps (specially for designated, but not only) QuestDB should allow to force a timezone override, so UTC is not assumed.

For example, imagine a CSV (original at https://data.cityofchicago.org/api/views/qmqz-2xku/rows.csv) with rows like these:

Beach Name,Measurement Timestamp,Water Temperature,Turbidity,Transducer Depth,Wave Height,Wave Period,Battery Life,Measurement Timestamp Label,Measurement ID
Calumet Beach,04/07/2018 12:30:00 AM,21.7,3.15,,0.389,3,0,04/07/2018 12:30 AM,CalumetBeach201804072430
Ohio Street Beach,06/14/2019 07:00:00 PM,16,2.18,,0.137,3,11.8,06/14/2019 7:00 PM,OhioStreetBeach201906141900
Ohio Street Beach,05/16/2018 01:00:00 PM,21.7,1.92,,0.023,1,10.3,05/16/2018 1:00 PM,OhioStreetBeach201805161300
Ohio Street Beach,05/16/2018 02:00:00 PM,21.9,1.81,,0.026,6,10.3,05/16/2018 2:00 PM,OhioStreetBeach201805161400
Ohio Street Beach,06/04/2018 09:00:00 AM,23,0.03,,0.021,8,12.4,06/04/2018 9:00 AM,OhioStreetBeach201806040900
Ohio Street Beach,06/04/2018 10:00:00 AM,17.6,3.09,,0.159,4,12.6,06/04/2018 10:00 AM,OhioStreetBeach201806041000

I know the date is CST, because this is from Chicago, but timezone is not present on the dataset. Right now I have to do a transformation (my solution is importing into a temp table and then doing something like INSERT INTO table SELECT to_timestamp(CONCAT(MeasurementTimestamp,' CST'), 'MM/dd/yyyy kk:mm:ss a z') as MeasurementTimestamp,.

Describe the solution you'd like.

I'd rather pass a format for the timestamp when importing the CSV. I am not sure if we want to do this as an extra parameter on import, or if we want to make this generic into questdb, so we can also use it at any moment with functions like to_timestamp. In that case we could have something like 'MM/dd/yyyy kk:mm:ss a !CST'), where the ! would indicate we are forcing the timezone and the date should be interpreted as being natively on that one.

The import param is probably easier, but the second option is probably more versatile.

Describe alternatives you've considered.

No response

Full Name:

Javier Ramirez

Affiliation:

QuestDB

Additional context

No response

@javier javier added New feature Feature requests Friction labels Apr 18, 2024
@javier
Copy link
Contributor Author

javier commented Apr 18, 2024

Or maybe if we want to make the pattern more understandable, rather than !CST we could use something like z!CST to reuse the z existing token

@jerrinot
Copy link
Contributor

something like !CST looks ok first. but if you think about it:

  1. it makes ! a special character - it means compatibility issues.
  2. the same patterns is used for formatting. one has to guess how a formatted is going to act - should it force formatting in the given timezone? probably yes. should it print the timezone too? probably now. but I am less sure.

while you can specify how it should behave this specification will be specific to QuestDB and our parsers/formatters will move further from what other systems do. (or are aware of any system supporting this?)

for the reasons above I believe the simple option with forcing a specific timezone via an import config is probably wiser.

@javier
Copy link
Contributor Author

javier commented May 17, 2024

That makes sense. To the best of my knowledge, no other database allows to force a timezone when not present, so I was imagining if we could have a way of making that part of our parser adding a new token that would force the timezone. But I was probably overthinking XD

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Friction New feature Feature requests
Projects
None yet
Development

No branches or pull requests

2 participants