-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Instrument upsert points, optimization and flushing operations #4158
base: dev
Are you sure you want to change the base?
Conversation
lib/collection/src/collection_manager/holders/segment_holder.rs
Outdated
Show resolved
Hide resolved
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
Outdated
Show resolved
Hide resolved
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
Outdated
Show resolved
Hide resolved
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
Outdated
Show resolved
Hide resolved
Additionally: - Remove *all* trailing commas in `tracing` macro calls
341a846
to
ef73378
Compare
…erations if tracing is disabled BTW, neither work :/
fn tmp_path(&self) -> PathBuf { | ||
self.write_segment.get().read().data_path() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result of this function is not stored anywhere and is used immediately. Would it make sense to return a non-owning type here to avoid clones? E.g. something like this: fn tmp_path(&self) -> MappedRwLockReadGuard<'_, Path<'_>>
. It would require a similar change for fn data_path()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think I'll make just that. Initially this was made for debug session, and I didn't care to refactor SegmentEntry::data_path
, so I just rolled with PathBuf
s everywhere.
But if we gonna merge this, I don't want to have allocating calls in the upsert_points
path. I've tried to implement a check with tracing
if particular span/event would be logged, and if not, avoid calling SegmentEntry::id
... but neither of my approaches worked, so far. :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, having a tracing::info!("{}", something())
is not enough to not call the something()
if it is disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I haven't strictly tested all of this, but:
- for message format arguments, it might be enough
- but I'm not using
SegmentEntry::id
as a format argument, I use it as a "field" - and
tracing
allows using fields for filtering - so, I assume, that fields have to always be evaluated (so that they can be used as for filtering)
- maybe, except when the span/event is disabled by the plain log-level
- but I explicitly use fields to filter tracing data in this PR
- so I assume
segment.id = segment.id()
in the instrumentation is always evaluated
To make it worse:
- I tried a few approaches to pre-filter instrumentation
- e.g., query "will this span/event be enabled given it has this log-level and these fields"
- and don't call
SegmentEntry::id
if it's disabled
- and don't call
- but in all cases
tracing
always reported instrumentation as enabled- even though instrumentation was not logged, i.e., it was filtered out
- so, I'd assume, that with how PR is structured right now,
tracing
evaluates all expressions first- and filtering applied later, once everything is already evaluated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partially fixed this. SegmentEntry::id
is cheap to call now, though, I assume segment.id = %segment.id()
in instrumentation still allocates.
This PR adds
tracing
instrumentation to upsert points operations, optimizer and flush workers.Intended use is to enable this additional logs on chaos-testing, so that we can debug consistency failures.
tracing
filtering is a clusterfuck, so I've added a custominner = true
field to all added spans and events, so that we can turn them on/off easily (and independently of other log messages).All added spans and events are disabled by default (so they would not spam the console; this logging is extensive).
They can be enabled with the following filters:
[{inner}]=info
- enable all added instrumentation[upsert_points/insert]=info,[upsert_points/update]=info,[upsert_points/move]=info
- enable instrumentation related to upsert points operations[optimize]=info
- enable instrumentation related to optimization[flush_all]=info,[flush]=info
- enable instrumentation related to flushingQDRANT__LOG_LEVEL=error,[upsert_points/insert]=info,[optimize]=info,[flush]=info
All Submissions:
dev
branch. Did you create your branch fromdev
?New Feature Submissions:
cargo +nightly fmt --all
command prior to submission?cargo clippy --all --all-features
command?Changes to Core Features: