# help
hi. I'm using the Postgres DB for storage. I'm finding that if we disable a policy and the policy is already in cache, any resource checks will still ALLOW until I manually send a Store Reload request to purge the cache. I would have thought that Cerbos would automatically invalidate its own cache when a policy changes. Can someone confirm?
Do you have multiple Cerbos instances? The cache should be purged on the instance that handled the disable request but the other instances would still have the policy in their cache.
@Charith (Cerbos) how would it work with k8s autoscaling then? the same instance is not guaranteed there as it’s load-balanced, right?
Yes. Cerbos instances are not aware of each other, so they can't communicate amongst themselves to clear the cache across the fleet. If you're using a database store, you have to either disable caching (and take a small perf hit) or call the Admin API reload endpoint on all the instances after a policy update.
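A sketch of the "call the Admin API on every instance" approach. The endpoint path, port, and default credentials below follow the Cerbos Admin API docs, but treat them as assumptions and verify against your deployment; the pod IPs are hypothetical.

```python
# Sketch: purge the compiled-policy cache on every Cerbos replica after a
# policy update. You must address each pod directly (e.g. via the Kubernetes
# Endpoints API), not the load-balanced Service, or only one instance reloads.
import base64
import urllib.request

ADMIN_USER = "cerbos"        # assumption: default Admin API credentials
ADMIN_PASS = "cerbosAdmin"


def reload_request(host: str, port: int = 3592) -> urllib.request.Request:
    """Build a store-reload request for one Cerbos instance."""
    url = f"http://{host}:{port}/admin/store/reload?wait=true"
    token = base64.b64encode(f"{ADMIN_USER}:{ADMIN_PASS}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def reload_all(hosts: list[str]) -> None:
    """Fire the reload request at every replica, one by one."""
    for host in hosts:
        urllib.request.urlopen(reload_request(host))


# Example with hypothetical pod IPs:
# reload_all(["10.0.1.12", "10.0.1.13"])
```

`wait=true` asks the server to block until the reload completes, which avoids the race mentioned later in this thread where a check request lands before the async reload has been scheduled.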
If you are OK with eventual consistency, you can tweak the cache expiry time to a value that suits you and the instances would eventually refresh themselves.
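The eventual-consistency tweak might look like this in the Cerbos config file. The `compile.cacheDuration` key is taken from the Cerbos configuration docs but should be treated as an assumption; check the reference for your version.

```yaml
# Shorten how long compiled policies stay cached so instances
# pick up disabled/updated policies sooner (default is longer).
compile:
  cacheDuration: 10s
```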
@Charith (Cerbos) The type of cache Cerbos instances use is in-memory, right? If so, could we have Redis caching as an option? I know this is not an easy addition, but this solves the above-mentioned issue and allows for better scalability and lower resource (memory) usage.
So in the config we could specify a
We can consider it, but this cache stores compiled policies that on average take only a few milliseconds to compile anyway. Having to hit Redis over the network might actually defeat the purpose of having the cache in the first place. It also complicates the setup of Cerbos and introduces an external dependency outside our control that we'd still have to troubleshoot. So it's not a simple, straightforward decision to make.
@Charith (Cerbos)
The cache should be purged on the instance that handled the disable request
This has not been my experience with the tests I have set up.
See the summary results of this behavioral test that I whipped up.
Hmm.. that shouldn't happen unless your second request was right on the back of the reload call. It's an async process so it might not have been scheduled yet by the time your request comes through.
Actually, looks like it's a recently introduced bug. We're looking into it.
got it. thanks.