model: make next_offset saturate rather than overflow #18308

Merged

nvartolomei
Contributor

model::offset::max() is often used to indicate "no upper bound" on operations, e.g. for tiered storage uploads [1], for reading from local storage [2], etc.

We also often convert between closed and open offset interval representations, e.g. committed offset to LSO and the other way around.

When combined, these can result in unexpected behavior. In particular, if on a read path the max offset is specified as model::offset::max() but at a lower level it is converted into an exclusive offset by calling next_offset(model::offset::max()), the result is model::offset::min(), aka -2^63.
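To make the wraparound concrete, here is a minimal sketch. `naive_next` is a hypothetical stand-in for an unguarded `o + 1` on a 64-bit signed offset; it is not Redpanda code. Signed overflow is undefined behavior in C++, so the sketch models the wrap with unsigned arithmetic, which is well defined.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for an unguarded "o + 1" on a 64-bit signed offset.
// The addition is done in unsigned arithmetic (wraps modulo 2^64); the cast
// back to signed is modular since C++20 and behaves the same way on
// mainstream compilers before that.
constexpr std::int64_t naive_next(std::int64_t o) {
    return static_cast<std::int64_t>(static_cast<std::uint64_t>(o) + 1u);
}

constexpr std::int64_t int64_max = std::numeric_limits<std::int64_t>::max();
constexpr std::int64_t int64_min = std::numeric_limits<std::int64_t>::min();

// "No upper bound" turns into the smallest possible exclusive bound.
static_assert(naive_next(int64_max) == int64_min);

// A half-open range [0, end) with that bound contains nothing.
static_assert(!(0 < naive_next(int64_max)));
```

Under these assumptions, a reader that starts at any non-negative offset would see an empty range instead of an unbounded one.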

This is dangerous. Let's instead saturate the offset similar to how we saturate prev_offset.
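A minimal sketch of what a saturating `next_offset` could look like. The `sketch::offset` type below is a hypothetical stand-in for `model::offset`, and the handling of negative offsets (returning 0) is an assumption based on the `next_offset(model::offset{}) == 0` behavior mentioned later in this thread. The actual change in this PR may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

namespace sketch {

// Hypothetical stand-in for model::offset (a strongly typed int64_t).
struct offset {
    std::int64_t v;

    static constexpr offset min() {
        return offset{std::numeric_limits<std::int64_t>::min()};
    }
    static constexpr offset max() {
        return offset{std::numeric_limits<std::int64_t>::max()};
    }
    friend constexpr bool operator==(offset a, offset b) { return a.v == b.v; }
};

// Saturating successor: never wraps around to min().
constexpr offset next_offset(offset o) {
    if (o.v < 0) {
        // Assumption: any negative offset means "empty", so the next one is 0.
        return offset{0};
    }
    if (o == offset::max()) {
        // Saturate instead of overflowing.
        return offset::max();
    }
    return offset{o.v + 1};
}

static_assert(next_offset(offset::max()) == offset::max());
static_assert(next_offset(offset::min()) == offset{0});
static_assert(next_offset(offset{41}) == offset{42});

} // namespace sketch
```

With this shape, "no upper bound" stays "no upper bound" after the conversion to an exclusive offset.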

We also have a few cases where we just do o + model::offset(1). These should be refactored to use next_offset too.
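As an illustration of such a refactor, the sketch below routes a closed-to-half-open conversion through a saturating helper instead of a raw `+ 1`. All names here (`offset_t`, `half_open`, `to_half_open`, `contains`) are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

namespace sketch {

using offset_t = std::int64_t; // hypothetical stand-in for model::offset
constexpr offset_t offset_max = std::numeric_limits<offset_t>::max();

// Saturating successor (assumed shape: negatives map to 0, max() stays max()).
constexpr offset_t next_offset(offset_t o) {
    if (o < 0) {
        return 0;
    }
    if (o == offset_max) {
        return offset_max;
    }
    return o + 1;
}

// Half-open interval [start, end).
struct half_open {
    offset_t start;
    offset_t end;
};

// Convert the closed interval [first, last] to half-open form.
// Before the refactor this would have been "last + 1".
constexpr half_open to_half_open(offset_t first, offset_t last) {
    return half_open{first, next_offset(last)};
}

constexpr bool contains(half_open r, offset_t o) {
    return o >= r.start && o < r.end;
}

static_assert(contains(to_half_open(0, 9), 9));
static_assert(!contains(to_half_open(0, 9), 10));
// "No upper bound" still covers ordinary offsets after conversion.
static_assert(contains(to_half_open(0, offset_max), 42));

} // namespace sketch
```

One consequence of saturating in this sketch: the half-open form cannot include offset max() itself, which seems acceptable when max() is only used as a "no upper bound" sentinel.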

This isn't fixing any existing known bug. Discovered this while trying to rewrite some logic related to tiered storage uploads.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

Footnotes

  1. https://github.com/redpanda-data/redpanda/blob/79bf7eed6e04da1d0987b5abd719c4b289dde761/src/v/archival/ntp_archiver_service.cc#L1656

  2. https://github.com/redpanda-data/redpanda/blob/79bf7eed6e04da1d0987b5abd719c4b289dde761/src/v/cluster/migrations/tx_manager_migrator.cc#L219

@nvartolomei nvartolomei marked this pull request as ready for review May 8, 2024 20:07
@nvartolomei
Contributor Author

nvartolomei commented May 8, 2024

I believe the model::offset::max() uses are wrong when used to indicate "no upper bound" for an inclusive offset. Instead, model::offset{} should be used. But it isn't easy to follow that rule or spot these mistakes in code reviews...

That's wrong! next_offset(model::offset{}) == 0 <=> "model::offset{} means empty"

@andrwng andrwng left a comment

A little bit scary but I think it makes sense. It's also similar to how prev_offset handles anything below 0 as min()

@nvartolomei nvartolomei merged commit a1ab255 into redpanda-data:dev May 13, 2024
18 checks passed
@nvartolomei nvartolomei deleted the nv/saturating-next-offset branch May 13, 2024 08:40
@dotnwat
Member

dotnwat commented May 15, 2024

> A little bit scary but I think it makes sense. It's also similar to how prev_offset handles anything below 0 as min()

@nvartolomei @andrwng Should we log an ERROR message for this case so that ducktape will reveal any cases that we might not know about? Then, after some time, we can remove it (or not).


5 participants