Limiting the maximum disk usage #1899

Open
delthas opened this issue Oct 20, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@delthas

delthas commented Oct 20, 2023

I'd like to install a small package cache on a VM with limited storage, to speed up go get in my Go CI jobs. The VM storage space is very limited: ~10GB. I'd like to tell Athens to use disk storage, up to that limit. Once the limit is reached, Athens would delete existing packages to make space for new ones (e.g. least recently used, least frequently used, ...).

I'd typically see this as a config option next to the disk storage path.

I suppose I could also clean existing packages manually with a cron job, but that is more cumbersome: Athens knows how much it stores and can clean packages just in time to make space for a new one.

@matt0x6F
Contributor

Athens would have to do a lot of work here that's kind of out of the scope of a Go Proxy. Would it be reasonable to suggest writing a daemon or CronJob that monitors that disk and removes files in the order you desire?

@delthas
Author

delthas commented Feb 10, 2024

It could work, but ideally the cache should be LRU (least recently used), like most HTTP caches. That way, when space needs to be cleared to save a new package, the package that gets deleted is the one whose last request (not its save) was the longest time ago. With an external script, I would probably only be able to delete based on file mtime, that is, the least recently fetched package. This means that a package that is requested often would still get cleared regularly, after each full cache rotation.

While Athens is a "Go Proxy", it's also really a cache for Go packages, and having a fixed cache size, with a small amount of logic to delete the LRU entries, is quite standard for caches.

As a first step, an external script could work, but I think that it would really make sense for Athens to have this kind of logic.
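To illustrate the difference from mtime-based cleanup, here is a minimal sketch of a size-bounded LRU index, assuming the proxy can call Touch on every request and Add on every save. The lruIndex type and its methods are hypothetical and do not exist in Athens. The index only tracks keys and sizes; the caller would delete the evicted entries from storage.

```go
package main

import "container/list"

// entry is one cached module version and its size on disk.
type entry struct {
	key  string
	size int64
}

// lruIndex tracks which entries were requested most recently.
type lruIndex struct {
	maxBytes int64
	used     int64
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newLRUIndex(maxBytes int64) *lruIndex {
	return &lruIndex{
		maxBytes: maxBytes,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Touch records a request for key and reports whether key is cached.
func (l *lruIndex) Touch(key string) bool {
	el, ok := l.items[key]
	if ok {
		l.order.MoveToFront(el)
	}
	return ok
}

// Add records a newly stored entry and returns the keys that should be
// deleted from storage to get back under maxBytes. The entry just added
// is never evicted, even if it is larger than maxBytes on its own.
func (l *lruIndex) Add(key string, size int64) []string {
	if el, ok := l.items[key]; ok {
		e := el.Value.(*entry)
		l.used += size - e.size
		e.size = size
		l.order.MoveToFront(el)
	} else {
		l.items[key] = l.order.PushFront(&entry{key: key, size: size})
		l.used += size
	}
	var evicted []string
	for l.used > l.maxBytes && l.order.Len() > 1 {
		oldest := l.order.Back()
		e := oldest.Value.(*entry)
		l.order.Remove(oldest)
		delete(l.items, e.key)
		l.used -= e.size
		evicted = append(evicted, e.key)
	}
	return evicted
}
```

With a 100-byte limit, adding a and b (40 bytes each), touching a, then adding c (40 bytes) evicts b. Cleanup by mtime would have removed a, even though it was requested more recently.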

@matt0x6F
Contributor

Ah, these are good points. Theoretically we could probably attach some "access" metadata to the indexer. Then I could see having a subprocess that runs continually and removes entries from the index as well as the filesystem based on some threshold criteria.

@matt0x6F matt0x6F added the enhancement New feature or request label Feb 19, 2024
@ionrover2

What you’re suggesting, I think, could be accomplished with an nginx proxy. I’m not the authority, but I think the intention behind Athens is to ensure the longevity of modules rather than to act as an LRU cache. Purging unused modules seems counter to the intention of the product. To me at least; I could be very wrong.

Why do you need Athens for this over something like nginx?

@delthas
Author

delthas commented Feb 28, 2024

For public packages, setting up an nginx HTTP reverse proxy in front of https://proxy.golang.org would probably work after some tweaks. Additional configuration would probably be needed to cache only source files (so basically, .zip) but not metadata, and to store responses on disk with a maximum size. At that point nginx would be a working alternative, although configuring it would require some thought.

However, for my use case, I would also like to cache packages from a large internal GitLab (unreachable over the Internet). So I can't just proxy HTTP requests to proxy.golang.org and have proxy.golang.org fetch the packages for me over Git. I really need a tool that fetches the packages over Git itself. Hence, Athens.

@ionrover2

I also work in an airgapped lab environment and have this same use case!

I have an instance of Athens running on the public internet that I use as a GOPROXY in order to get the packages I need.

In my airgapped environment I run a deployment built from the same Dockerfile. With decent regularity, I manually tar up the Athens storage directory onto a flash drive and unpack it to the same location on the airgapped side.

This has been great for my team, as previously they were vendoring dependencies into dummy projects and then copying them by hand into the project they actually wanted. If you set the GOPROXY variable with the direct clause at the end, you can access your internal Go projects through the proxy in your airgapped environment or directly from their Git repo.

3 participants