Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create historical versions of source statistics built in views #27135

Merged
merged 6 commits into from
May 16, 2024

Conversation

SangJunBak
Copy link
Contributor

@SangJunBak SangJunBak commented May 16, 2024

Due to an issue with latency and high CPU usage, we create a non-historical index + view on top of mz_source_statistics such that Console's source lists efficiently query on it. The latency and high CPU usage is related to us retaining history, causing the number of updates in the index to be large.

From 5 cases, there seems to be a stark improvement with these new indexes:

Screenshot 2024-05-16 at 2 16 11 PM Screenshot 2024-05-16 at 2 19 18 PM

Motivation

Tips for reviewer

Checklist

@SangJunBak SangJunBak force-pushed the jun/add-source-statistics-history-tables branch from b975634 to 41371dc Compare May 16, 2024 16:35
Due to an issue with latency and high CPU usage, we create a non-historical index + view on top of mz_source_statistics such that Console's source lists efficiently query on it. The latency and high CPU usage is related to us retaining history, causing the number of updates in the index to be large.
@SangJunBak SangJunBak force-pushed the jun/add-source-statistics-history-tables branch from 41371dc to d254907 Compare May 16, 2024 17:38
Copy link
Contributor

@RobinClowers RobinClowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this works, but does renaming those indexes mean we will loose all the existing historical data?

Copy link
Contributor

@maddyblue maddyblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need signoff from the epd-mz-introspection-council due to a new index.

@@ -674,3 +674,5 @@ pub const INDEX_MZ_KAFKA_SOURCES_IND_OID: u32 = 16950;
pub const INDEX_MZ_WEBHOOK_SOURCES_IND_OID: u32 = 16951;
pub const TABLE_MZ_HISTORY_RETENTION_STRATEGIES_OID: u32 = 16952;
pub const SOURCE_MZ_MATERIALIZED_VIEW_REFRESHES_OID: u32 = 16953;
pub const VIEW_MZ_SOURCE_STATISTICS_OID: u32 = 16954;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this name is staying, it should keep its old oid. Give VIEW_MZ_SOURCE_STATISTICS_WITH_HISTORY_OID and the other new thing new oids instead.

@SangJunBak
Copy link
Contributor Author

SangJunBak commented May 16, 2024

I'm not sure how this works, but does renaming those indexes mean we will loose all the existing historical data?

@RobinClowers My first guess would be no given the historical view derives from MZ_SOURCE_STATISTICS_RAW that has the existing historical data so this new index would hydrate with the same state. I'll let @maddyblue weigh in tho

@maddyblue
Copy link
Contributor

You can run REWRITE=true bin/cargo-test -p mz-environmentd --test server test_http_sql to fix the failing tests in mz-environmentd::server test_http_sql.

To fix the sqllogictests run ./bin/sqllogictest -- --postgres-url $COCKROACH_URL test/sqllogictest/cluster.slt test/sqllogictest/information_schema_tables.slt test/sqllogictest/mz_introspection_index_accounting.slt test/sqllogictest/oid.slt --rewrite-results

The testdrive test will need manual rewriting, but its diff is printed in the test failure output so should be applied by hand. I'm not going to push up a fix for these because I think changing the oids as requested will result in more changes, but these may be enough to unblock you.

@chaas chaas added the release-blocker Critical issue that should block *any* release if not fixed label May 16, 2024
@SangJunBak SangJunBak marked this pull request as ready for review May 16, 2024 19:01
@SangJunBak SangJunBak requested a review from a team as a code owner May 16, 2024 19:01
@SangJunBak SangJunBak requested review from ParkMyCar, maddyblue and RobinClowers and removed request for ParkMyCar May 16, 2024 19:01
@chaas
Copy link
Contributor

chaas commented May 16, 2024

LGTM from mz_introspection council. The new index has only a very small number of records, so the memory impact should be minimal.

Copy link
Contributor

@maddyblue maddyblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good chance that the autogenerated/docs tests will fail due to the new things not being documented. ./ci/test/lint-docs-catalog.sh --rewrite may help there, and probably need to add docs.

@SangJunBak SangJunBak force-pushed the jun/add-source-statistics-history-tables branch from d0f9731 to c418769 Compare May 16, 2024 19:27
such that existing schema doesn't need to change
@SangJunBak SangJunBak force-pushed the jun/add-source-statistics-history-tables branch from c418769 to 4140341 Compare May 16, 2024 19:28
@SangJunBak SangJunBak enabled auto-merge May 16, 2024 19:42
@SangJunBak SangJunBak merged commit 2710e23 into main May 16, 2024
75 checks passed
@SangJunBak SangJunBak deleted the jun/add-source-statistics-history-tables branch May 16, 2024 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-blocker Critical issue that should block *any* release if not fixed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants