# help
Hey, sometimes I have a problem with the response time of cerbos requests. IDK why sometimes it can even take 20-25 seconds, but usually it's only 0.005 seconds. No errors in the logs of cerbos container. What can be the cause? This is the request that I am trying:
CERBOS_CLIENT = CerbosClient(host="<>", request_retries=10)
sessionList = crud.session.get_active_by_owner_sub(db=db, owner_id=user_id)
if sessionList:
    principal = Principal(
        roles=roles,  # type: ignore
        # ... (other arguments not shown)
    )
    resource = Resource(
        sessionList[0].id.__str__(),  # type: ignore
        # ... (other arguments not shown)
    )
    action = "read"
    allowed = CERBOS_CLIENT.is_allowed(action, principal, resource)
    if not allowed:
        raise HTTPException(
            status_code=403, detail="Unauthorized"
        )
And these are the logs of last two requests, last one took around 16 seconds:
{"log.level":"info","@timestamp":"2023-04-19T14:08:32.043Z","log.logger":"cerbos.grpc","message":"Handled request","grpc.start_time":"2023-04-19T14:08:32Z","system":"grpc","span.kind":"server","grpc.service":"cerbos.svc.v1.CerbosService","grpc.method":"CheckResources","peer.address":"","http":{"x_forwarded_for":[""],"x_forwarded_host":[""]},"cerbos":{"call_id":"01GYCXGWSBJR2P1WZ4HMZTG58W"},"grpc.code":"OK","grpc.time_ms":0.182}

{"log.level":"info","@timestamp":"2023-04-19T14:08:48.494Z","log.logger":"cerbos.grpc","message":"Handled request","grpc.start_time":"2023-04-19T14:08:48Z","system":"grpc","span.kind":"server","grpc.service":"cerbos.svc.v1.CerbosService","grpc.method":"CheckResources","http":{"x_forwarded_for":[""],"x_forwarded_host":[""]},"cerbos":{"call_id":"01GYCXHCVEJM0PE5Z3SB002T7B"},"peer.address":"","grpc.code":"OK","grpc.time_ms":0.291}
Hmm, that's interesting. You can see in the log line that the evaluation itself took less than a millisecond ("grpc.time_ms":0.291). So that suggests the problem is elsewhere in the request chain. Do you have a proxy or a load balancer in front of Cerbos? How many instances are you running, and were they healthy during the times that you noticed these spikes?
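To check this across more than two requests, the server-side handling time can be pulled out of the Cerbos logs. This is a minimal sketch that only relies on the fields visible in the log lines above ("message", "cerbos.call_id", "grpc.time_ms"); the helper name server_times_ms and the trimmed sample lines are made up for illustration.

```python
import json


def server_times_ms(log_lines):
    """Return (call_id, grpc.time_ms) for each 'Handled request' log entry."""
    results = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        entry = json.loads(line)
        if entry.get("message") != "Handled request":
            continue
        results.append((entry["cerbos"]["call_id"], entry["grpc.time_ms"]))
    return results


# Trimmed copies of the two log lines quoted in this thread
SAMPLE_LOGS = [
    '{"message":"Handled request","cerbos":{"call_id":"01GYCXGWSBJR2P1WZ4HMZTG58W"},"grpc.code":"OK","grpc.time_ms":0.182}',
    '{"message":"Handled request","cerbos":{"call_id":"01GYCXHCVEJM0PE5Z3SB002T7B"},"grpc.code":"OK","grpc.time_ms":0.291}',
]

for call_id, ms in server_times_ms(SAMPLE_LOGS):
    print(f"{call_id}: {ms} ms")
```

If every entry stays well under a millisecond while the client still sees multi-second waits, the time is being spent before the request reaches Cerbos or after the response leaves it.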
I'm not a Node expert but I'd imagine that if you have lots of tasks in the event loop, that could add some overhead as well.
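The same idea applies to a Python service, assuming it runs on an asyncio event loop (the thread does not say whether it does). A rough way to see whether the loop is being held up is to measure how late a short sleep wakes up. The function names below are hypothetical, and blocking_task is only a stand-in for whatever synchronous work might be blocking the loop.

```python
import asyncio
import time


async def measure_loop_lag(interval_s, samples):
    """Sleep for interval_s repeatedly and record how late each wake-up is."""
    lags = []
    for _ in range(samples):
        start = time.perf_counter()
        await asyncio.sleep(interval_s)
        lags.append(max(0.0, time.perf_counter() - start - interval_s))
    return lags


async def blocking_task():
    # Deliberately blocks the event loop; stands in for heavy synchronous work
    time.sleep(0.05)


async def main():
    monitor = asyncio.create_task(measure_loop_lag(0.01, 3))
    await asyncio.sleep(0)  # yield once so the monitor starts its first sleep
    await blocking_task()
    return await monitor


lags = asyncio.run(main())
print([f"{lag * 1000:.1f} ms" for lag in lags])
```

A first sample that is far larger than the others means something held the loop while the monitor was waiting. Lag on the order of seconds in a real service would point at the application rather than at Cerbos.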
For now we are not using any load balancer in front of Cerbos. We only connect to Cerbos via HTTP requests. Is it better to use a load balancer? Could that be the cause of the slow responses?
No, you don't have to use a load balancer. (However, it might be better to have more than one Cerbos instance running in production.) I was just trying to eliminate potential reasons for the occasional spikes you mentioned. How busy is your service when you notice them? Are you able to provide us with a trace from Jaeger (or any other sink) from one of these times?
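Until a trace is available, timing the call on the client side can at least show when the spikes happen and how long they last, which can then be lined up with the "@timestamp" values in the Cerbos logs. This is a minimal sketch using only the standard library; timed and fake_check are hypothetical names, and fake_check stands in for the real CERBOS_CLIENT.is_allowed(action, principal, resource) call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, threshold_s=1.0, sink=print):
    """Report the wall-clock duration of the wrapped block if it is slow."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if elapsed >= threshold_s:
            sink(f"SLOW {label}: {elapsed:.3f}s")


def fake_check():
    # Hypothetical stand-in for CERBOS_CLIENT.is_allowed(action, principal, resource)
    time.sleep(0.02)
    return True


messages = []
with timed("is_allowed", threshold_s=0.01, sink=messages.append):
    allowed = fake_check()

print(allowed, messages)
```

Note that the client in the original snippet is created with request_retries=10, so the measured time covers any retries the client performs, not just a single round trip.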