DD/MM/YY 12:59:59 format #103

Closed
alch76 opened this issue Aug 4, 2017 · 12 comments · May be fixed by #106

alch76 commented Aug 4, 2017

The DD/MM/YY 12:59:59 format is not supported.
See the sample below, which was captured on 04-August-2017:

Linux 2.6.32-504.el6.x86_64 (hidden) 	04/08/17 	_x86_64_	(2 CPU)

00:00:01        CPU      %usr     %nice      %sys   %iowait    %steal      %irq     %soft    %guest     %idle
00:10:01        all      1.30      0.00      0.50      0.06      0.00      0.00      0.01      0.00     98.13
Pitterling (Collaborator) commented

It's nearly impossible to distinguish between MM/DD/YY and DD/MM/YY, so automatic detection will also fail.

As a workaround: change 04/08/17 to 08/04/17 and use the supported format.

vlsi (Owner) commented Aug 4, 2017

Well, the other options are:

  1. Add DD/MM/YY explicitly
  2. Exclude those options that are known to fail for the given input.
  3. Add a free-text format field
  4. Win the worst date format input challenge somehow else :)
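
Option 2 could be sketched roughly as follows. This is only an illustration: the list of masks and the helper name `viable_masks` are assumptions made up for this example, not kSar's actual detector. It keeps the masks under which the header date is a valid calendar date and drops the rest.

```ruby
require 'date'

# Hypothetical candidate masks; the real list of supported formats may differ.
CANDIDATE_MASKS = ["%m/%d/%y", "%d/%m/%y", "%y/%m/%d"].freeze

# Return the masks under which str parses to a valid calendar date.
def viable_masks(str, masks = CANDIDATE_MASKS)
  masks.select do |mask|
    begin
      Date.strptime(str, mask)
      true
    rescue ArgumentError
      # Raised for impossible readings, e.g. month 13.
      false
    end
  end
end

p viable_masks("01/13/17")
p viable_masks("04/08/17")
```

A date such as 01/13/17 leaves a single viable mask, while 04/08/17 stays valid under every mask, so excluding failing options narrows the choice but does not resolve it.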

vlsi (Owner) commented Aug 4, 2017

For fun's sake, I tried a quick test to estimate the odds that we could parse the date properly.

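require 'date'

# Window of plausible capture dates; move ST back to widen it.
ST = Date.strptime("04/01/2017", "%m/%d/%Y")
EN = Date.strptime("08/01/2017", "%m/%d/%Y")

# True if str parses under mask to a different date that still lies in ST..EN,
# i.e. the mask gives a plausible but wrong reading.
def check(str, mask)
  d = Date.strptime(str, mask) rescue nil
  return false if d.nil?
  str != d.strftime("%y/%m/%d") && ST <= d && d <= EN
end

# A date is "ko" if any field order produces a plausible wrong reading.
dates = (ST..EN).group_by do |i|
  str = i.strftime("%y/%m/%d")
  if check(str, "%y/%d/%m") || check(str, "%d/%m/%y") || check(str, "%m/%d/%y") then "ko" else "ok" end
end

# fetch with a default avoids a nil error when one group is empty.
ok = dates.fetch("ok", []).count
ko = dates.fetch("ko", []).count

puts "probability of proper parsing is #{(100.0 * ok / (ok + ko)).round(1)} (dates in range #{ST..EN})"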

The results are as follows:

probability of proper parsing is 53.9 (dates in range 2013-01-01..2017-08-01)
probability of proper parsing is 82.5 (dates in range 2016-08-01..2017-08-01)
probability of proper parsing is 90.2 (dates in range 2017-04-01..2017-08-01)

That means that for recent sar files, the odds are quite high that we could identify the proper date format.

elkrieg added a commit to elkrieg/ksar that referenced this issue Aug 7, 2017
elkrieg (Contributor) commented Aug 7, 2017

I edited the regexps used for automatic detection. It may still determine the date incorrectly in cases like 01/01/17, but at least it will not fail, and it correctly determines dates like 13/01/17 and 01/13/17.

elkrieg (Contributor) commented Aug 7, 2017

To increase accuracy, the code can refer to the current date when parsing something like 05/01/17, assuming that the sar file was most probably captured recently =)
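
A rough sketch of that idea, under stated assumptions: the mask list and the helper `most_recent_reading` are hypothetical names for illustration, and the rule used here (discard readings later than a reference date, then take the latest remaining one) is just one way to express "captured recently".

```ruby
require 'date'

# Hypothetical candidate masks, for illustration only.
MASKS = ["%m/%d/%y", "%d/%m/%y", "%y/%m/%d"].freeze

# Parse str under every mask, drop readings later than the reference date,
# and return the [date, mask] pair closest to it (nil if nothing qualifies).
def most_recent_reading(str, reference = Date.today, masks = MASKS)
  readings = masks.map do |mask|
    begin
      [Date.strptime(str, mask), mask]
    rescue ArgumentError
      nil # this mask cannot produce a valid date
    end
  end.compact
  readings.reject { |date, _| date > reference }.max_by { |date, _| date }
end

date, mask = most_recent_reading("05/01/17", Date.new(2017, 8, 7))
puts "#{date} via #{mask}"
```

The result depends on the reference date: the same header can resolve to a different mask when the file is opened earlier or later, so this is a heuristic guess rather than a reliable detection.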

vlsi (Owner) commented Aug 16, 2017

Just in case:

  1. If the environment variable S_TIME_FORMAT is set to ISO, sar prints the date in ISO format %Y-%m-%d
  2. By default, sar (sysstat) uses strftime("%x") to format the date, which is locale-dependent
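
For the ISO case in point 1, parsing is unambiguous. A minimal sketch, assuming the header keeps the same layout as the sample at the top of this issue with only the date field changed (the header string below is constructed for illustration, and `iso_header_date` is a hypothetical helper name):

```ruby
require 'date'

# Extract an ISO date (%Y-%m-%d) from a sar header line, or return nil.
def iso_header_date(line)
  m = line.match(/\b(\d{4}-\d{2}-\d{2})\b/)
  return nil unless m
  Date.strptime(m[1], "%Y-%m-%d")
rescue ArgumentError
  nil # looked like an ISO date but is not a valid calendar date
end

# Assumed header shape: the sample from this issue with an ISO date substituted.
header = "Linux 2.6.32-504.el6.x86_64 (hidden) \t2017-08-04 \t_x86_64_\t(2 CPU)"
puts iso_header_date(header)
```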

I tried multiple variations such as DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).withLocale(locale), and none of them reproduces the sar output.

For instance:

$ sar -V
sysstat version 7.0.2
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
$ LC_ALL=hi_IN sar -w -s 17:00:00
Linux 2.6.18-194.el5 (devsp007.netcracker.com) 	बुधवार  16 अगस्त 2017

05:00:01  MSK   cswch/s
05:10:01  MSK  11716.06
05:20:01  MSK  11691.14
05:30:01  MSK   8847.26
05:40:01  MSK   9590.91
05:50:01  MSK  12005.88
Average:     10769.53

jdk1.8.0_102:

बुधवार, 16 अगस्त, 2017, locale: hi_IN, style: FULL
16 अगस्त, 2017, locale: hi_IN, style: LONG
16 अगस्त, 2017, locale: hi_IN, style: MEDIUM
16/8/17, locale: hi_IN, style: SHORT

The "FULL" format fails to parse the date, with the following exception:

java.time.format.DateTimeParseException: Text 'बुधवार  16 अगस्त 2017' could not be parsed at index 6
	at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1949)
	at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1777)

That means there is no way to parse the date header automatically.

vlsi added a commit that referenced this issue Aug 18, 2017
Add more patterns to automatic format detector, pick a format that produces maximum date

fixes #103
delta160 commented

It would be nice to use the ISO format as the default.
See https://xkcd.com/1179/ for some arguments ;-)

Pitterling (Collaborator) commented

@delta160 what do you mean by "use as default"?

delta160 commented Oct 20, 2017 via email

Pitterling (Collaborator) commented

@delta160 Well, the ISO format has been supported for a while. You can also stick to this format in the options dialog.

image

ksar2-0.0.2?? Could you point me to this version?

Thx.

delta160 commented Oct 20, 2017 via email

Pitterling (Collaborator) commented

This thread has drifted away from its original topic in the meantime. The original issue has been fixed in #108.

issue #113 "improve automatic format detection" has been created for further discussion

@Pitterling Pitterling added this to the 5.2.4 milestone Dec 23, 2021