# help
b
Hi everyone, I had a question about the Reload API. I'm noticing a scenario in my integration tests where a newly added policy does not seem to be accessible. However, this occurs only when many policies are being added in sequence (from previously run tests). It smells like a race condition... Calling reload with wait=true solves the issue, but I think it's because of the wait rather than the reloading itself. (Note: we have not set the polling config value, which means the polling is set to 0.) I'm wondering if there is a way to confirm whether this scenario is indeed a race condition. Do we have any load test benchmarking, etc.?
o
Hi @Billy Bolton, let me ask some questions to understand the problem better.
• What storage driver are you using?
• While these policies are added during the integration tests, do you exercise these policies via CheckResources or PlanResources RPC calls?
• Could "newly added policy" also mean policies overridden along the way? (Meaning a specific policy is added in some step and later updated.)
b
• What storage driver are you using?
Postgres
• While these policies are added during the integration tests, do you exercise these policies via CheckResources or PlanResources RPC calls?
Yes to both, but mostly PlanResources, which yields KIND_ALWAYS_DENIED because the policy is not found despite just being added.
• Could "newly added policy" also mean policies overridden along the way? (Meaning a specific policy is added in some step and later updated.)
Yes, also overwriting along the way (at times; I've tried to avoid this). Again, adding the Reload RPC call after adding the policy seems to fix things, but I'm not sure if it's because of setting wait=true.
Also, running the failing test in isolation without the Reload call passes no problem -- should have mentioned this
o
Given this information, and assuming there is only a single Cerbos instance running in the test environment, I'd say that if the tests add/update policies and then immediately exercise them, it could be a race condition.
> Calling reload with wait=true solves the issue, but I think it's because of the wait rather than the reloading itself.
You could try adding a sleep after policies are added/updated to validate that this is the case. I will also try to reproduce it locally, but it might take me some time.
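To make the sleep experiment a bit more systematic, here is a minimal sketch in Python (for illustration only). It adds a fresh policy, waits a given delay, checks whether the policy is visible, and reports the miss rate per delay. `add_policy` and `policy_visible` are hypothetical callables you would wire to your own test client (for example, your add-policy call and a PlanResources check); they are not Cerbos SDK functions. `LaggyStore` is only a made-up stand-in that simulates a store whose writes become visible after a short lag, so the sketch runs on its own.

```python
import time
from typing import Callable, Dict, Iterable


def miss_rate_by_delay(
    add_policy: Callable[[int], None],      # hypothetical: adds policy number n
    policy_visible: Callable[[int], bool],  # hypothetical: True if policy n is usable
    delays: Iterable[float],
    trials: int = 20,
) -> Dict[float, float]:
    """For each delay (seconds): add a fresh policy, sleep, then check it.

    Returns the fraction of trials in which the policy was NOT visible.
    """
    results: Dict[float, float] = {}
    policy_id = 0
    for delay in delays:
        misses = 0
        for _ in range(trials):
            policy_id += 1  # always use a new policy so trials don't interfere
            add_policy(policy_id)
            time.sleep(delay)
            if not policy_visible(policy_id):
                misses += 1
        results[delay] = misses / trials
    return results


class LaggyStore:
    """Stand-in (assumption): writes become visible only after `lag` seconds."""

    def __init__(self, lag: float) -> None:
        self.lag = lag
        self.added: Dict[int, float] = {}

    def add(self, policy_id: int) -> None:
        self.added[policy_id] = time.monotonic()

    def visible(self, policy_id: int) -> bool:
        added_at = self.added.get(policy_id)
        return added_at is not None and time.monotonic() - added_at >= self.lag
```

If the miss rate falls to zero as the delay grows, without any Reload call, that points to timing rather than the reload itself. If misses persist even with long delays, something other than a simple race is probably going on.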
b
yea been meaning to try the sleep too. Will let you know when I'm back in it again, maybe Monday. Thanks for taking a look into this. (And yes, it is a single Cerbos instance) @oguzhan 🙏