analyze.py: improve stargazer resampler
Make it so that the right-most data point
never goes into the future even if the
resample interval gets large.

The resample interval in the test suite is
~240 hours, and it became a little more than
a cosmetic problem: it looked like inventing
a data point from the (near) future :).
jgehrcke committed Sep 30, 2023
1 parent 4dd2e9e commit cdd5b52
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion analyze.py
@@ -1573,7 +1573,9 @@ def downsample_series_to_N_points(df, column):
# up-sampled data points (so that each data point still reflects an actual
# event or a group of events, but when there was no event within a bin then
# that bin does not appear with a data point in the resulting plot).
-s = s.resample(f"{bin_width_hours}h").max().dropna()
+# The resample operation might put the last data point into the future,
+# Let's correct for that by putting origin="end".
+s = s.resample(f"{bin_width_hours}h", origin="end").max().dropna()

log.info("len(series): %s", len(s))
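
To illustrate the effect of the change, here is a minimal standalone sketch, not taken from analyze.py. The sample timestamps and values are hypothetical, and it assumes a pandas version that supports the `origin="end"` argument of `Series.resample` (1.3 or later, as far as I know). With `origin="end"`, the bins are anchored at the last timestamp of the series, so the right-most resampled point should carry that timestamp and not a later one.

```python
import pandas as pd

# Hypothetical sample data: cumulative stargazer counts at irregular times.
timestamps = pd.to_datetime(
    [
        "2023-01-03 08:15",
        "2023-01-09 17:40",
        "2023-01-21 02:05",
        "2023-02-02 11:30",
        "2023-02-14 19:55",
    ]
)
s = pd.Series([1, 2, 3, 4, 5], index=timestamps)

# Roughly the bin width mentioned in the commit message (an assumption here).
bin_width_hours = 240

# Same call chain as in the diff: anchor the bins at the end of the series,
# take the maximum per bin, and drop bins that contain no events.
resampled = s.resample(f"{bin_width_hours}h", origin="end").max().dropna()

last_label = resampled.index[-1]
last_observation = s.index[-1]

# The right-most resampled point should coincide with the last observation.
print("last resampled label:", last_label)
print("last observation:    ", last_observation)
```

In this sketch the last bin ends exactly at the final observation, which is the property the commit relies on: the plotted series does not extend past the newest real data point, however wide the bins get.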
