Convert github_content registries including Standard Registry to SQLite for performance when installing them #2520

Open
suzuki-shunsuke opened this issue Nov 24, 2023 · 3 comments
Labels
enhancement (New feature or request), performance

Comments

@suzuki-shunsuke
Member

Feature Overview

Convert github_content registries including Standard Registry to SQLite when installing them.

Why is the feature needed?

Similar to #2517.

To improve the performance of reading the Standard Registry.
The Standard Registry is a huge YAML file of over 30,000 lines, and aqua needs to read the entire file, so reading it has some overhead.
By converting the YAML to SQLite, aqua doesn't need to read all of it.
And registry maintainers don't need to do the conversion themselves, because aqua converts registries internally.

How to reproduce the issue

No response

Workaround

No response

Example Code

aqua converts registry.yaml to registry.yaml.sqlite3 when it installs registries.
When aqua reads a registry, it tries to read registry.yaml.sqlite3 first.
If registry.yaml.sqlite3 isn't found, aqua looks for registry.yaml.
If registry.yaml is found, aqua creates registry.yaml.sqlite3 from it.
If registry.yaml isn't found either, aqua installs it.

Reference

suzuki-shunsuke added the enhancement and performance labels on Nov 24, 2023
@suzuki-shunsuke
Member Author

I'm not sure aqua would really get faster with SQLite3.
I'm not familiar with SQLite3, but a relational database has overhead of its own.
The Standard Registry is a huge YAML file, so SQLite3 may be useful there, but almost all local and github_content registries are small, so SQLite3 might make aqua slower for them.
So aqua may need to support both SQLite3 and YAML.
SQLite3 support would make aqua more complicated.
Unlike the JSON conversion, I expect SQLite3 would require changing a lot of code.

Anyway, when it comes to performance, we should measure, not guess.

@sheldonhull

I recall we chatted about this in a past discussion. Are you beginning to see performance impact?

Also, if the size of a single file is the problem, I'm curious whether you've thought about keeping the registry as the actual split YAML files, without merging them into a single file, and just having Go load them from the directory.

Just curious, so no rush on a response. There are a few cool local storage packages, and I'm interested to see how this works for you.

@suzuki-shunsuke
Member Author

> Are you beginning to see performance impact?

I don't think so.
I saw a minor complaint about aqua's performance on X (formerly Twitter), but I think aqua is fast enough.
So I have no motivation to change the code drastically for performance.
But if we can improve performance with small changes, that would be great.

> Also, if the size of a single file is the problem, I'm curious whether you've thought about keeping the registry as the actual split YAML files, without merging them into a single file, and just having Go load them from the directory.

I hadn't thought of that.
Indeed, the Standard Registry is split by package name (e.g. cli/cli => pkgs/cli/cli/registry.yaml), so aqua could read only the necessary files.
It's an interesting idea.
One concern is that when aliases are used, aqua can't find registry.yaml from the name alone, but this is an edge case.
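
As a rough sketch of the split-file idea, a package name maps directly to a file path, following the cli/cli example above. The function names here are hypothetical, and the alias map is purely an assumption for illustration: the thread does not say how aliases would be resolved, and `example/alias` is a made-up alias name.

```go
package main

import (
	"fmt"
	"path"
)

// registryPath maps a package name to its split registry file,
// following the layout pkgs/<package name>/registry.yaml.
func registryPath(pkgName string) string {
	return path.Join("pkgs", pkgName, "registry.yaml")
}

// resolveAlias returns the canonical package name for an alias.
// The aliases map is a hypothetical index; without something like it,
// an alias would produce a path that does not exist.
func resolveAlias(name string, aliases map[string]string) string {
	if canonical, ok := aliases[name]; ok {
		return canonical
	}
	return name
}

func main() {
	fmt.Println(registryPath("cli/cli")) // pkgs/cli/cli/registry.yaml

	aliases := map[string]string{"example/alias": "cli/cli"}
	fmt.Println(registryPath(resolveAlias("example/alias", aliases))) // pkgs/cli/cli/registry.yaml
}
```

The trade-off is that any such alias index would itself have to be read before the per-package file, which is part of why aliases complicate this approach.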

Projects
Status: Backlog
Development

No branches or pull requests

2 participants