Hey folks, quick question regarding auxData <https...
# help
j
Hey folks, quick question regarding auxData: https://docs.cerbos.dev/cerbos/latest/configuration/auxdata.html. The documentation only mentions JWT as a source of external data. Does Cerbos support referencing other data sources, like external APIs or DBs, or does it have a data store where we could keep data required for policy evaluation?
a
Hey there!
Currently the auxData block only supports JWTs. One of our key early design decisions was to make Cerbos stateless: it does not call out to external services or databases, and instead requires the calling application (the Policy Enforcement Point, or PEP) to fetch the data up front and pass it along in the request to Cerbos (the Policy Decision Point, or PDP). The reasons behind this come from some war stories at previous companies, where we had to build AuthZ for large-scale, high-throughput systems (30bn+ requests a day). Having the AuthZ layer call out to other parts of the infrastructure to fetch state didn't seem like a sensible solution to us, because what may be a simple condition in a policy could put unexpected load on those other components. That means either your entire stack needs to be scaled, or the PDP needs to handle caching, which then opens up a whole can of worms around cache busting and eventual consistency. In most cases we see, the PEP is the owner of a lot of the data required, has already loaded it to handle the request anyway, and knows best what can be cached or not. That leaves the PDP purely making an AuthZ decision based on the inputs it receives, evaluating everything in memory without requiring any disk or network I/O. This ensures a timely response, which is key because AuthZ is in the blocking path of every single request to an application.
Also, from a deployment perspective, the PDP being stateless means you can run it right alongside your application (as a k8s sidecar, for example) and scale it without having to worry about any dependent services, eliminating any unnecessary network hops.
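To make the "PEP fetches, PDP decides" split concrete, here is a minimal sketch of a calling application assembling a check request from data it has already loaded. The field names (`requestId`, `principal`, `resources`, `auxData.jwt.token`) are modeled on my understanding of the Cerbos check request shape and should be verified against the API reference; the `invoice` and `user` records and the `build_check_request` helper are hypothetical.

```python
import json
import uuid


def build_check_request(principal, resource, actions, jwt_token=None):
    """Assemble a PDP check request from data the PEP has already loaded.

    The body layout is an assumption modeled on the Cerbos check API;
    confirm the exact field names against the official API reference.
    """
    body = {
        "requestId": str(uuid.uuid4()),
        "principal": {
            "id": principal["id"],
            "roles": principal["roles"],
            "attr": principal.get("attr", {}),
        },
        "resources": [
            {
                "actions": list(actions),
                "resource": {
                    "kind": resource["kind"],
                    "id": resource["id"],
                    "attr": resource.get("attr", {}),
                },
            }
        ],
    }
    # auxData is optional and, per the thread, only carries a JWT.
    if jwt_token is not None:
        body["auxData"] = {"jwt": {"token": jwt_token}}
    return body


# Hypothetical records the handling service already fetched to serve the request.
invoice = {"kind": "invoice", "id": "inv-42", "attr": {"owner": "alice", "status": "draft"}}
user = {"id": "alice", "roles": ["user"], "attr": {"department": "finance"}}

payload = build_check_request(user, invoice, ["view", "edit"])
print(json.dumps(payload, indent=2))
```

Everything the policy needs travels in the request body, so the PDP can evaluate it without any lookups of its own.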
j
Thanks for the quick response. IMO, if we overload the request to the PDP with all the necessary data, we're essentially transferring the data aggregation from the PDP to the PEP. The overall system load would remain constant, just shifted to a different component. Relying more on data from JWT claims might not necessarily address complex authorization requirements effectively. What do you think about these scenarios?
a
JWT claims only get you so far, as you pointed out, and can quickly lead to token bloat. In a 'perfect' scenario, if a user is interacting with a resource, the handling service is most likely the one 'owning' that data, so it will be pulling it out of the database anyway and can pass it along to the PDP. If the PDP were also going out to fetch state, that would be two separate systems querying for the data rather than one. I appreciate that isn't always the case, depending on your architecture, but we see the benefits of the PDP being stateless and a known quantity in terms of how it handles data as a reasonable tradeoff. If you would like, we can jump on a call and go into your specific use case?
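As a rough illustration of the "one query rather than two" point, the sketch below has a handler load a record once and reuse it for both the authorization check and the response. `FakeDB`, `decide`, and `handle_view` are all hypothetical stand-ins; `decide` is a pure in-memory function standing in for the PDP, not the Cerbos API.

```python
class FakeDB:
    """In-memory stand-in for the service's own datastore; counts queries."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def get(self, doc_id):
        self.queries += 1
        return self.rows[doc_id]


def decide(principal, resource, action):
    """Stand-in for the PDP: a pure function of its inputs, no disk or network I/O."""
    if action == "view":
        is_owner = resource["attr"]["owner"] == principal["id"]
        return is_owner or "admin" in principal["roles"]
    return False


def handle_view(db, principal, doc_id):
    """PEP side: fetch once, then reuse the record for the decision and the response."""
    doc = db.get(doc_id)
    if not decide(principal, doc, "view"):
        return {"status": 403}
    return {"status": 200, "body": doc["attr"]["title"]}


db = FakeDB({"doc-1": {"kind": "document", "id": "doc-1",
                       "attr": {"owner": "alice", "title": "Q3 plan"}}})
alice = {"id": "alice", "roles": ["user"]}
bob = {"id": "bob", "roles": ["user"]}

allowed = handle_view(db, alice, "doc-1")
denied = handle_view(db, bob, "doc-1")
print(allowed, denied, db.queries)
```

Each request costs exactly one datastore read here; if the decision function fetched the document itself, the same request would cost two.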