
Enforce minimum cache size for transit backend #12418

Merged

divyapola5 merged 13 commits into main from cacheSizeFix on Sep 13, 2021

Conversation

divyapola5 (Contributor)

https://hashicorp.atlassian.net/browse/VAULT-2914

This PR enforces a minimum cache size of 10 for the transit backend.
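For illustration, a minimal sketch in Go of the clamp described above (the helper normalizeCacheSize and the constant minCacheSize are hypothetical names, not the code in this PR): any requested size between 1 and 9 is raised to 10 with a warning, while 0 is left alone because it selects the default cache size.

```go
package main

import "fmt"

// Hypothetical names for illustration; the real check lives in the transit backend.
const minCacheSize = 10

// normalizeCacheSize clamps a requested cache size to the minimum and returns
// a warning when it does so. A size of 0 is passed through unchanged, since it
// means "use the default cache size".
func normalizeCacheSize(requested int) (int, string) {
	if requested > 0 && requested < minCacheSize {
		return minCacheSize, fmt.Sprintf(
			"size %d is less than default minimum %d. Cache size is set to %d",
			requested, minCacheSize, minCacheSize)
	}
	return requested, ""
}

func main() {
	// Mirrors the manual tests below: 6 and 1 are clamped, 0 and 24 are kept.
	for _, size := range []int{0, 1, 6, 24} {
		effective, warning := normalizeCacheSize(size)
		fmt.Printf("requested=%d effective=%d warning=%q\n", size, effective, warning)
	}
}
```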

Testing done:

Default size:

divyapola@divyapola-C02CF744MD6R vault % vault secrets enable transit
Success! Enabled the transit secrets engine at: transit/
divyapola@divyapola-C02CF744MD6R vault % vault read transit/cache-config
Key     Value
---     -----
size    0


Set the cache size to a value less than 10:

divyapola@divyapola-C02CF744MD6R vault % vault write transit/cache-config size=6
WARNING! The following warnings were returned from Vault:

  • size 6 is less than default minimum 10. Cache size is set to 10

  • cache configurations will be applied when this backend is restarted

divyapola@divyapola-C02CF744MD6R vault % vault read transit/cache-config
Key     Value
---     -----
size    10


Set the cache size to 1:

divyapola@divyapola-C02CF744MD6R vault % vault write transit/cache-config size=1
WARNING! The following warnings were returned from Vault:

  • size 1 is less than default minimum 10. Cache size is set to 10

  • cache configurations will be applied when this backend is restarted

divyapola@divyapola-C02CF744MD6R vault % vault read transit/cache-config
Key     Value
---     -----
size    10


Successfully restarted Vault:

divyapola@divyapola-C02CF744MD6R vault % vault server -config=config.hcl
You cannot specify a custom root token ID outside of "dev" mode. Your request
has been ignored.
==> Vault server configuration:

         Api Address: http://127.0.0.1:8200
                 Cgo: disabled
     Cluster Address: https://127.0.0.1:8201
          Go Version: go1.16.6
          Listener 1: tcp (addr: "0.0.0.0:8200", cluster address: "0.0.0.0:8201", max_request_duration: "1m30s", max_request_size: "33554432", tls: "disabled")
           Log Level: info
               Mlock: supported: false, enabled: false
       Recovery Mode: false
             Storage: file
             Version: Vault v1.8.0-dev
         Version Sha: 910da26a2b67dff7727fc01a56392bf4affaa2bb

==> Vault server started! Log data will stream in below:

2021-08-24T10:47:42.258-0500 [INFO] proxy environment: http_proxy="" https_proxy="" no_proxy=""
2021-08-24T10:47:47.681-0500 [INFO] core.cluster-listener.tcp: starting listener: listener_address=0.0.0.0:8201
2021-08-24T10:47:47.681-0500 [INFO] core.cluster-listener: serving cluster requests: cluster_listen_address=[::]:8201
2021-08-24T10:47:47.682-0500 [INFO] core: post-unseal setup starting
2021-08-24T10:47:47.685-0500 [INFO] core: loaded wrapping token key
2021-08-24T10:47:47.685-0500 [INFO] core: successfully setup plugin catalog: plugin-directory=""
2021-08-24T10:47:47.690-0500 [INFO] core: successfully mounted backend: type=system path=sys/
2021-08-24T10:47:47.692-0500 [INFO] core: successfully mounted backend: type=identity path=identity/
2021-08-24T10:47:47.693-0500 [INFO] core: successfully mounted backend: type=transit path=transit/
2021-08-24T10:47:47.693-0500 [INFO] core: successfully mounted backend: type=cubbyhole path=cubbyhole/
2021-08-24T10:47:47.700-0500 [INFO] core: successfully enabled credential backend: type=token path=token/
2021-08-24T10:47:47.701-0500 [INFO] rollback: starting rollback manager
2021-08-24T10:47:47.701-0500 [INFO] core: restoring leases
2021-08-24T10:47:47.703-0500 [INFO] identity: entities restored
2021-08-24T10:47:47.703-0500 [INFO] identity: groups restored
2021-08-24T10:47:47.704-0500 [INFO] expiration: lease restore complete
2021-08-24T10:47:47.706-0500 [INFO] core: post-unseal setup complete
2021-08-24T10:47:47.706-0500 [INFO] core: vault is unsealed


Verify that the cache size is still 10 after restarting Vault:

divyapola@divyapola-C02CF744MD6R vault % vault operator unseal
Unseal Key (will be hidden):
Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    1
Threshold       1
Version         1.8.0-dev
Storage Type    file
Cluster Name    vault-cluster-74d7615e
Cluster ID      9a05ee54-e2a3-f709-b33e-c18f27513c4a
HA Enabled      false

divyapola@divyapola-C02CF744MD6R vault % vault read transit/cache-config
Key     Value
---     -----
size    10


Set the cache size to 0:

divyapola@divyapola-C02CF744MD6R vault % vault write transit/cache-config size=0
WARNING! The following warnings were returned from Vault:

  • cache configurations will be applied when this backend is restarted

divyapola@divyapola-C02CF744MD6R vault % vault read transit/cache-config
Key     Value
---     -----
size    0


Set the cache size to a value greater than 10:

divyapola@divyapola-C02CF744MD6R vault % vault write transit/cache-config size=24
WARNING! The following warnings were returned from Vault:

  • cache configurations will be applied when this backend is restarted

divyapola@divyapola-C02CF744MD6R vault % vault read transit/cache-config
Key     Value
---     -----
size    24


@hashicorp-cla commented Aug 24, 2021

CLA assistant check
All committers have signed the CLA.

@divyapola5 changed the title from "Cache size fix" to "Enforce minimum cache size for transit backend" on Aug 24, 2021
@sgmiller (Contributor) left a comment

Couple small points only.

builtin/logical/transit/backend.go (outdated; resolved)
builtin/logical/transit/backend.go (outdated; resolved)
builtin/logical/transit/path_cache_config.go (outdated; resolved)
@sgmiller (Contributor)

And you should add a unit test for this behavior too. You can add it to the existing TestTransit_CacheConfig function.
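A hedged sketch of what such a test case could look like, assuming the transit test helper createBackendWithStorage and the cache-config response shape seen in the manual testing above; this is illustrative, not the test that was actually added in this PR.

```go
// Sketch only: createBackendWithStorage and the response fields are
// assumptions based on the existing transit tests and the CLI output above.
func TestTransit_CacheConfigMinimum(t *testing.T) {
	b, storage := createBackendWithStorage(t)

	// Writing a size below the minimum should succeed but return a warning.
	writeReq := &logical.Request{
		Storage:   storage,
		Operation: logical.UpdateOperation,
		Path:      "cache-config",
		Data:      map[string]interface{}{"size": 5},
	}
	resp, err := b.HandleRequest(context.Background(), writeReq)
	if err != nil {
		t.Fatal(err)
	}
	if resp == nil || len(resp.Warnings) == 0 {
		t.Fatal("expected a warning about the minimum cache size")
	}

	// Reading the config back should report the clamped minimum of 10.
	readReq := &logical.Request{
		Storage:   storage,
		Operation: logical.ReadOperation,
		Path:      "cache-config",
	}
	resp, err = b.HandleRequest(context.Background(), readReq)
	if err != nil {
		t.Fatal(err)
	}
	if size, ok := resp.Data["size"].(int); !ok || size != 10 {
		t.Fatalf("expected size 10, got %v", resp.Data["size"])
	}
}
```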

@victorr (Contributor) left a comment

I added a small suggestion.

Good job!

@mladlow added this to the 1.8.3 milestone on Aug 24, 2021

@mladlow (Contributor) commented Aug 24, 2021

@divyapola5 we're aiming to include this in Vault 1.8.3, so once this PR has merged we'll need to backport it to the release/1.8.x branch.

@ncabatoff (Contributor)

Fixes #12064.

@ncabatoff (Contributor) commented Aug 25, 2021

This change solves the immediate issue. From what I can see, that was:

  • pathCacheConfigWrite stores the new cache size without making use of it
  • at startup we read the non-zero configured cache size from storage and pass it to NewTransitLRU
  • NewTransitLRU calls lru.New2Q(size), which uses int(size/2) for one of its internal LRU caches; that works out to 0 when size==1, which is invalid

There's still the bug that pathCacheConfigWrite isn't making use of the updated cache size, so it only takes effect after a restart. It feels like we should fix that too; otherwise some future change to the lru library could conceivably re-introduce a variant of this bug. In other words, failing to apply the change before storing the config opens the door to another bug that prevents Vault from coming up. It's also bad UX: we don't normally require a Vault restart for changes to be applied.
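As a quick sanity check of that explanation, a tiny size does make golang-lru's 2Q constructor return an error. This standalone repro is an assumption about the library's behavior as described above, not code from this PR; at mount time that error is what keeps the transit backend from coming up.

```go
package main

import (
	"fmt"

	lru "github.com/hashicorp/golang-lru"
)

func main() {
	// With size 1 the derived sub-cache size rounds down to 0, so New2Q
	// returns an error instead of a cache; during unseal that surfaces as a
	// failed transit mount.
	if _, err := lru.New2Q(1); err != nil {
		fmt.Println("New2Q(1) failed:", err)
	}
}
```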

@divyapola5 closed this on Aug 25, 2021
@divyapola5 reopened this on Aug 25, 2021
@divyapola5 (Contributor, Author)

> This change solves the immediate issue. From what I can see, that was:
>
>   • pathCacheConfigWrite stores the new cache size without making use of it
>   • at startup we read the non-zero configured cache size from storage and pass it to NewTransitLRU
>   • NewTransitLRU calls lru.New2Q(size), which uses int(size/2) for one of its internal LRU caches; that works out to 0 when size==1, which is invalid
>
> There's still the bug that pathCacheConfigWrite isn't making use of the updated cache size, so it only takes effect after a restart. It feels like we should fix that too; otherwise some future change to the lru library could conceivably re-introduce a variant of this bug. In other words, failing to apply the change before storing the config opens the door to another bug that prevents Vault from coming up. It's also bad UX: we don't normally require a Vault restart for changes to be applied.

pathCacheConfigWrite isn't making use of the updated cache size even before this change. Are you saying that we should change that behavior as part of this PR?

@ncabatoff (Contributor)

> pathCacheConfigWrite isn't making use of the updated cache size even before this change. Are you saying that we should change that behavior as part of this PR?

I'm saying we should try, yeah. It's possible that will turn out to be a bigger change than we want to tackle as part of this PR, but I'd like us to make the attempt.
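To make that concrete, a rough sketch of the shape such a fix could take: apply the new size to the in-memory cache before persisting it, so a bad value fails the write request rather than the next unseal. The names minCacheSize, b.lm, InitCache, configCache, and the "config/cache" storage path are assumptions for illustration, not the code that ultimately landed.

```go
// Sketch under assumed names; the apply-before-store ordering is the point.
func (b *backend) pathCacheConfigWrite(ctx context.Context, req *logical.Request, d *framework.FieldData) (*logical.Response, error) {
	cacheSize := d.Get("size").(int)

	var resp logical.Response
	if cacheSize != 0 && cacheSize < minCacheSize {
		resp.AddWarning(fmt.Sprintf("size %d is less than default minimum %d. Cache size is set to %d",
			cacheSize, minCacheSize, minCacheSize))
		cacheSize = minCacheSize
	}

	// Rebuild the cache with the new size first; if that fails, return the
	// error now rather than persisting a config that breaks startup.
	if err := b.lm.InitCache(cacheSize); err != nil {
		return nil, err
	}

	// Only then persist the (possibly clamped) size.
	entry, err := logical.StorageEntryJSON("config/cache", &configCache{Size: cacheSize})
	if err != nil {
		return nil, err
	}
	if err := req.Storage.Put(ctx, entry); err != nil {
		return nil, err
	}

	return &resp, nil
}
```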

@sgmiller (Contributor) left a comment

A couple of requests plus this needs a Changelog entry, but otherwise looks good!

builtin/logical/transit/backend.go (resolved)
builtin/logical/transit/backend.go (resolved)
@@ -101,6 +101,24 @@ func (lm *LockManager) InvalidatePolicy(name string) {
}
}

func (lm *LockManager) RefreshCache(cacheSize int) error {
Contributor:

Can we call this InitCache or something? Refresh kind of implies you're refreshing the cached data somehow.

@divyapola5 (Contributor, Author):

Yes, that makes sense.
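For context, a rough sketch of what the renamed InitCache could look like, assuming the LockManager guards its cache with an RWMutex and rebuilds it through NewTransitLRU; the field names are assumptions, not the merged code.

```go
// Sketch with assumed fields (useCache, cacheMutex, cache); the real
// keysutil.LockManager may differ.
func (lm *LockManager) InitCache(cacheSize int) error {
	if !lm.useCache {
		return nil
	}

	newCache, err := NewTransitLRU(cacheSize)
	if err != nil {
		return fmt.Errorf("failed to create cache: %w", err)
	}

	// Swap in the new cache under the write lock so concurrent readers never
	// see a partially initialized cache.
	lm.cacheMutex.Lock()
	defer lm.cacheMutex.Unlock()
	lm.cache = newCache

	return nil
}
```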

@divyapola5 merged commit 94d4fdb into main on Sep 13, 2021
@divyapola5 deleted the cacheSizeFix branch on September 13, 2021
divyapola5 added a commit that referenced this pull request Sep 14, 2021
* Enforce Minimum cache size for transit backend

* Enforce minimum cache size and log a warning during backend construction

* Update documentation for transit backend cache configuration

* Added changelog

* Addressed review feedback and added unit test

* Modify code in pathCacheConfigWrite to make use of the updated cache size

* Updated code to refresh cache size on transit backend without restart

* Update code to acquire read and write locks appropriately
divyapola5 added a commit that referenced this pull request Sep 15, 2021
* Enforce Minimum cache size for transit backend

* Enforce minimum cache size and log a warning during backend construction

* Update documentation for transit backend cache configuration

* Added changelog

* Addressed review feedback and added unit test

* Modify code in pathCacheConfigWrite to make use of the updated cache size

* Updated code to refresh cache size on transit backend without restart

* Update code to acquire read and write locks appropriately