Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timestamps: use ISO 8601 format #1536

Open
wants to merge 45 commits into
base: master
Choose a base branch
from

Conversation

pstorz
Copy link
Member

@pstorz pstorz commented Aug 24, 2023

This PR introduces the usage of ISO8601 format to format timestamps and adds
timezone support both in formatting and for storing in the Database.

  • default format used mostly: ISO8601 format: 2023-08-24T17:49:24+0200
  • scheduler preview Thu 05-Oct-2023 02:05+0200
  • to be used as part of a filename: 2023-10-01T17.16.25+0200
  • as debug output timestamp, i.e. with microseconds 2023-10-01T17:16:53.123456+0200
  • The database output for timestamps also has been altered so that the timestamps now are output in the ISO Format 2023-10-05T08:57:39+0200

Breaking changes:

  • The existing rarely used directives log_timestamp_format have been deprecated and are not used anymore.
  • The timesamps are now enabled by default for debug messages.

Other changes:

  • Renamed the gettext string translation _() into T_() Translate to avoid problems.
  • All 'bstrf*' functions now have an std::string interface, and all calls have been adapted.
  • import https://github.com/HowardHinnant/date/tree/v3.0.1 into third-party/date

Important

During the catalog database upgrade, the TIMESTAMP WITHOUT TIME ZONE columns will be updated to TIMESTAMP WITH TIME ZONE. It is mandatory that the Postgresql has the correct TIMEZONE setting before the update, so that Postgresql can update the values correctly.

Please check

  • Short description and the purpose of this PR is present above this paragraph
  • Your name is present in the AUTHORS file (optional)

If you have any questions or problems, please give a comment in the PR.

Helpful documentation and best practices

Checklist for the reviewer of the PR (will be processed by the Bareos team)

Make sure you check/merge the PR using devtools/pr-tool to have some simple automated checks run and a proper changelog record added.

General
  • Is the PR title usable as CHANGELOG entry?
  • Purpose of the PR is understood
  • Commit descriptions are understandable and well formatted
  • Check backport line
  • Required backport PRs have been created
Source code quality
  • Source code changes are understandable
  • Variable and function names are meaningful
  • Code comments are correct (logically and spelling)
  • Required documentation changes are present and part of the PR
  • Verify that WebUI still works correctly
Tests
  • Decision taken that a test is required (if not, then remove this paragraph)
  • The choice of the type of test (unit test or systemtest) is reasonable
  • Testname matches exactly what is being tested
  • On a fail, output of the test leads quickly to the origin of the fault

@pstorz pstorz force-pushed the dev/pstorz/master/timestamp-format branch 2 times, most recently from a0ff4d2 to 5e1bdf6 Compare August 24, 2023 16:30
@bruno-at-bareos bruno-at-bareos self-assigned this Aug 31, 2023
Copy link
Contributor

@bruno-at-bareos bruno-at-bareos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bareos-check-sources doesn't agree with your changes, please fix them.

We should decide what we want to achieve with the TimeStampFormat and LogTimeStampFormat.

Do we want to adapt in btime.cc bstrftimes to also take care of the format, our code is calling both of them, maybe one function would be better?

core/src/lib/btime.cc Show resolved Hide resolved
core/src/dird/dird_conf.cc Outdated Show resolved Hide resolved
core/src/dird/dird_conf.cc Outdated Show resolved Hide resolved
core/src/dird/dird_conf.cc Outdated Show resolved Hide resolved
core/src/dird/dird_conf.cc Outdated Show resolved Hide resolved
core/src/filed/filed_conf.cc Outdated Show resolved Hide resolved
core/src/stored/stored_conf.cc Outdated Show resolved Hide resolved
@pstorz pstorz force-pushed the dev/pstorz/master/timestamp-format branch from 0ab35c2 to 6faa586 Compare September 7, 2023 12:45
@bruno-at-bareos
Copy link
Contributor

The job report still contain wrong date (at least for backup) see result of bareos-basic tests.
Scheduled time: 1970-01-01T00:00:00+0000
Start time: 1970-01-01T00:00:00+0000
End time: 1970-01-01T00:00:00+0000

@pstorz pstorz force-pushed the dev/pstorz/master/timestamp-format branch 3 times, most recently from 7ee54a5 to d34bd16 Compare September 19, 2023 14:54
@pstorz pstorz force-pushed the dev/pstorz/master/timestamp-format branch 2 times, most recently from 6d3f2b2 to 6970f56 Compare September 20, 2023 15:09
@pstorz pstorz force-pushed the dev/pstorz/master/timestamp-format branch 4 times, most recently from 5f3a2e7 to 89b5e9f Compare September 22, 2023 13:30
Copy link
Member

@arogge arogge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides my other comments, we should pick up the whole repository and not just one file if we want to have it in third-party.
Also, we should probably discuss the usage of some package manager instead of shipping all dependencies.

Comment on lines 465 to 468
"EXTRACT('EPOCH' FROM FirstWritten),"
"EXTRACT('EPOCH' FROM LastWritten),"
"EXTRACT('EPOCH' FROM LabelDate),"
"EXTRACT('EPOCH' FROM InitialWrite) "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is good practice to name these fields, even if we just access them via index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 506 to 509
"EXTRACT('EPOCH' FROM FirstWritten),"
"EXTRACT('EPOCH' FROM LastWritten),"
"EXTRACT('EPOCH' FROM LabelDate),"
"EXTRACT('EPOCH' FROM InitialWrite) "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is good practice to name these fields, even if we just access them via index.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, as the text-variants of these fields are not needed anymore it may be a good idea to replace FirstWritten with EXTRACT('EPOCH' FROM FirstWritten) AS FirstWritten in place and keep the original offset-index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 585 to 586
bstrncpy(mr->cFirstWritten, (row[20] != NULL) ? row[20] : "",
sizeof(mr->cFirstWritten));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cFirstWritten is never used (and can be removed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 587 to 588
bstrncpy(mr->cLastWritten, (row[21] != NULL) ? row[21] : "",
sizeof(mr->cLastWritten));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cLastWritten is never used (and can be removed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 593 to 594
bstrncpy(mr->cLabelDate, (row[26] != NULL) ? row[26] : "",
sizeof(mr->cLabelDate));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cLabelDate is never used (and can be removed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -56,6 +56,22 @@
# define ioctl_req_t int
#endif

#if defined(HAVE_WIN32)
static constexpr char* kBareosDefaultTimestampFormat
= (char*)"%Y-%m-%dT%H:%M:%S";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This casts away the constness of the constant, which will lead to undefined behaviour.
The safe approach is to change the interface of the receiving function to accept a const char*

Comment on lines 59 to 74
#if defined(HAVE_WIN32)
static constexpr char* kBareosDefaultTimestampFormat
= (char*)"%Y-%m-%dT%H:%M:%S";
// %z is missing because it is not correctly implemented in windows :(
#else
static constexpr char* kBareosDefaultTimestampFormat = (char*)"%Y-%m-%dT%T%z";
#endif

// for use in TO_CHAR database queries
static constexpr char* kBareosDatabaseDefaultTimestampFormat
= (char*)"YYYY-MM-DD\"T\"HH24:MI:SSTZH:TZM";

// to create filenames, so only what is allowed in filenames also on windows
static constexpr char* kBareosFilenameTimestampFormat
= (char*)"%Y-%m-%dT%H.%M.%S";

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it probably makes sense to group these into a class or namespace, so we end up with TimestampFormat::Default, TimestampFormat::Database and TimestampFormat::Filename.

Also I'm not 100% sure that this is the right place for these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 124 to 106
# ifndef NO_UNDERSCORE_MACRO
# ifndef _
# define _(s) (s)
# endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That workaround is not going to cut it.
We need some solution to completely get rid of that _() macro. It currently breaks things and it will continue to produce weird issues if we don't remove it somehow.

#include "include/bareos.h"
#include "lib/btime.h"
#include <math.h>
#include "../../../third-party/date/include/date/date.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that should be #include <date/date.h>. include-path setup should be done by CMake.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 34 to 35
char* bstrftime(char* dt, int maxlen, utime_t tim);
char* bstrftime_filename(char* dt, int maxlen, utime_t tim);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these interfaces still feel like they belong to the 90's.
I guess we should have overloads that just return a std::string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@bruno-at-bareos
Copy link
Contributor

I've two remarks, what about if I don't want to use the hardcoded timestamp format ? if log_format_timestamp is deprecated and not used how can I change it.
In our initial task we were also talking about unifying the message timestamp format.

@pstorz pstorz force-pushed the dev/pstorz/master/timestamp-format branch 3 times, most recently from 3080ea4 to 075f3ea Compare October 1, 2023 14:45
@bruno-at-bareos
Copy link
Contributor

bruno-at-bareos commented Oct 2, 2023

Open question, what about the format of restore when or restore find job before date

11: Enter a list of directories to restore for found JobIds
12: Select full restore to a specified Job date
13: Cancel
Select item:  (1-13): 10
The restored files will the most current backup
BEFORE the date you specify below.

Enter date as YYYY-MM-DD HH:MM:SS :

We have also the case in query module with choice 4
4: List all backups for a Client after a specified time

the returned array show TZ in 2 digits while it is 4 elsewhere and the date is formatted without the T
We should be consistent in the output.

Choose a query (1-22): 4
Enter Client Name: bareos-fd
Enter time in YYYY-MM-DD HH:MM:SS format: 2023-10-01 23:59:00
+-------+-----------+-------+------------------------+----------+-------+---------+
| jobid | client    | level | starttime              | jobfiles | gb    | volumes |
+-------+-----------+-------+------------------------+----------+-------+---------+
|     1 | bareos-fd | F     | 2023-10-02 11:21:34+00 |       23 | 0.020 |       1 |
|     2 | bareos-fd | F     | 2023-10-02 11:22:23+00 |       83 | 0.000 |       1 |
+-------+-----------+-------+------------------------+----------+-------+---------+

@bruno-at-bareos
Copy link
Contributor

bruno-at-bareos commented Oct 2, 2023

pr-tool still has some issues to be fixed

 ✗  Commit checks failed:
        32ee18862 fixup  time command: headline starts with 'fixup'
        10f37bf14 fixup: include gcc12: headline starts with 'fixup'
        6f68a2d1d fixup ua_status.cc: headline starts with 'fixup'
        3ca31122b fixup sql_create.cc: headline starts with 'fixup'
        410388afa remove char* bstrftime_filename(char* dt, int maxlen, utime_t utime);: headline too long
        55d825542 btime: use ISO8601 timestamps everywhere: body contains line longer 72 chars
 ✗  bareos-check-sources --since=46617892a8153d5a0b7fb24538fdb2e9e0994008 reported:
        Plugin 'cmake format' would modify 'core/src/lib/CMakeLists.txt'

@pstorz
Copy link
Member Author

pstorz commented Oct 4, 2023

Open question, what about the format of restore when or restore find job before date

11: Enter a list of directories to restore for found JobIds
12: Select full restore to a specified Job date
13: Cancel
Select item:  (1-13): 10
The restored files will the most current backup
BEFORE the date you specify below.

Enter date as YYYY-MM-DD HH:MM:SS :

We have also the case in query module with choice 4 4: List all backups for a Client after a specified time

When bareos asks for a date, the format is still the same as before. Internally, the current timezone offset will be appended to the input and used to calculate the correct timestamp.

the returned array show TZ in 2 digits while it is 4 elsewhere and the date is formatted without the T We should be consistent in the output.

Choose a query (1-22): 4
Enter Client Name: bareos-fd
Enter time in YYYY-MM-DD HH:MM:SS format: 2023-10-01 23:59:00
+-------+-----------+-------+------------------------+----------+-------+---------+
| jobid | client    | level | starttime              | jobfiles | gb    | volumes |
+-------+-----------+-------+------------------------+----------+-------+---------+
|     1 | bareos-fd | F     | 2023-10-02 11:21:34+00 |       23 | 0.020 |       1 |
|     2 | bareos-fd | F     | 2023-10-02 11:22:23+00 |       83 | 0.000 |       1 |
+-------+-----------+-------+------------------------+----------+-------+---------+

The string displayed here is the way that postgresql formats timestamps with timezone.

I have only changed the format string for queries where the database string is used as input into our date parsing functions.

All queries have been adapted, so that the timestamps are correctly formatted by the database.

@pstorz pstorz force-pushed the dev/pstorz/master/timestamp-format branch 2 times, most recently from 4cb56f4 to 0168331 Compare October 4, 2023 14:41
@bruno-at-bareos
Copy link
Contributor

bruno-at-bareos commented Oct 5, 2023

Generic comments after rough testing
the table in status command is misaligned now

status client
Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name
======================================================================
     1  Full         23    22.34 M  OK       2023-10-05T13:21:46+0000 backup-bareos-fd
     2  Incr          0         0   OK       2023-10-05T13:21:52+0000 backup-bareos-fd

similar for storage

*status storage=File
Connecting to Storage daemon File at 3f81b031f968:9103
 Encryption: TLS_CHACHA20_POLY1305_SHA256 TLSv1.3

bareos-sd Version: 23.0.0~pre996.3c0f6dd44 (05 October 2023) openSUSE Leap 15.4
Daemon started 2023-10-05T13:21:30+0000. Jobs: run=2, running=0, self-compiled binary
 Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 bwlimit=0kB/s

Job inventory:

Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name
===================================================================
     1  Full         23    22.34 M  OK       2023-10-05T13:21:46+0000 backup-bareos-fd
     2  Incr   4,294,967,295         0   OK       2023-10-05T13:21:52+0000 backup-bareos-fd
====

In restore preview the When is shown with actual timestamp+timezone, but the input can't have the timezone ...
Do we want to change that ?

Run Restore job
JobName:         RestoreFiles
Bootstrap:       /usr/local/var/lib/bareos/bareos-dir.restore.1.bsr
Where:           /tmp/bareos-restores
Replace:         Always
FileSet:         LinuxAll
Backup Client:   bareos-fd
Restore Client:  bareos-fd
Format:          Native
Storage:         File
When:            2023-10-05T13:31:36+0000
Catalog:         MyCatalog
Priority:        10
Plugin Options:  *None*
Select parameter to modify (1-14): 7
Please enter desired start time as YYYY-MM-DD HH:MM:SS (return for now):

@arogge arogge force-pushed the dev/pstorz/master/timestamp-format branch from 64d3cbd to 2161063 Compare October 31, 2023 14:43
change most uses of std::string's data() to use c_str() where
a const char* is sufficient.
So uses of _ will not be broken by the preprocessor.
@arogge arogge force-pushed the dev/pstorz/master/timestamp-format branch from 2161063 to 9efbeb0 Compare October 31, 2023 14:46
@arogge arogge changed the title timestamps: set default to iso 8601 format. timestamps: use ISO 8601 format Oct 31, 2023
Copy link
Member

@arogge arogge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I found out today we have a few outstanding issues with this PR left:

  • requires at least PostgreSQL 11 due to TO_CHAR()
  • at least one query getting Job.StartTime passes it unconverted to StrToUtime() which may lead to miscalculated since time, there may be more occurences of that though

Previously this was done with strings, so we passed the raw timestamp
from the database around and parsed it where needed. Now the database
returns a utime_t and we can format it if desired.
@arogge arogge added this to the 24.0.0 milestone Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants