Hi All I m experiencing a fairly slow startup of cerbos dock Cerbos Community #help

Hi All, I'm experiencing a fairly slow startup of ...

Rafael Szuminski

02/06/2024, 11:17 PM

Hi All, I'm experiencing a fairly slow startup of cerbos docker containers. It takes about 60 seconds to transition from (health: starting) to {healthy). Is this normal?

Dennis (Cerbos)

02/06/2024, 11:42 PM

No, it is not. Could you let me know if you are using the default health-check setup? If not, please share its configuration.

Rafael Szuminski

02/06/2024, 11:44 PM

Copy code

server:
  httpListenAddr: ":3592"
  grpcListenAddr: ":3593"

  cors: # CORS defines the CORS configuration for the server.
    allowedHeaders: [ 'content-type', 'user-agent' ] # AllowedHeaders is the contents of the allowed-headers header.
    allowedOrigins: [ '*' ] # AllowedOrigins is the contents of the allowed-origins header.
    disabled: false # Disabled sets whether CORS is disabled.
    maxAge: 10s # MaxAge is the max age of the CORS preflight check.

engine:
  defaultPolicyVersion: "default"

auxData:
  jwt:
    disableVerification: true

storage:
  driver: "disk"
  disk:
    directory: "/policies"
    watchForChanges: true



telemetry:
  disabled: true

Rafael Szuminski

02/06/2024, 11:46 PM

container is started with docker compose:

Copy code

cerbos:
    image: ghcr.io/cerbos/cerbos:latest
    command: "server --config=/conf/config.yaml"
    ports:
      - "3592:3592"
      - "3593:3593"
    volumes:
      - ../cerbos/conf:/conf
      - ../cerbos/policies:/policies
    networks:
      - xxxx_network
    restart: unless-stopped
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.cerbos.rule=Host(`xxx.xxx.com`)"
      - "traefik.http.routers.cerbos.entrypoints=websecure"
      - "traefik.http.routers.cerbos.tls.certresolver=myresolver"
      - "traefik.http.services.cerbos.loadbalancer.server.port=3592"

Rafael Szuminski

02/06/2024, 11:54 PM

here is a log snippet:

Copy code

{"log.level":"info","@timestamp":"2024-02-06T23:52:08.661Z","log.logger":"cerbos.server","message":"Shutting down"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:08.661Z","log.logger":"cerbos.dir.watch","message":"Stopped watching directory for changes","dir":"/policies"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:08.661Z","log.logger":"cerbos.http","message":"HTTP server stopped"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:08.661Z","log.logger":"cerbos.grpc","message":"gRPC server stopped"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:08.662Z","log.logger":"cerbos.server","message":"Shutdown complete"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:08.662Z","log.logger":"cerbos.server","message":"maxprocs: No GOMAXPROCS change to reset"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:22.030Z","log.logger":"cerbos.server","message":"maxprocs: Leaving GOMAXPROCS=20: CPU quota undefined"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:22.030Z","log.logger":"cerbos.server","message":"Loading configuration from /conf/config.yaml"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:22.032Z","log.logger":"cerbos.disk.store","message":"Initializing disk store from /policies"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:22.107Z","log.logger":"cerbos.index","message":"Found 50 executable policies"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:22.108Z","log.logger":"cerbos.telemetry","message":"Telemetry disabled"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:22.108Z","log.logger":"cerbos.dir.watch","message":"Watching directory for changes","dir":"/policies"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:22.108Z","log.logger":"cerbos.grpc","message":"Starting gRPC server at :3593"}
{"log.level":"info","@timestamp":"2024-02-06T23:52:22.109Z","log.logger":"cerbos.http","message":"Starting HTTP server at :3592"}
{"log.level":"info","@timestamp":"2024-02-06T23:53:38.309Z","log.logger":"cerbos.grpc","message":"Handled request","protocol":"grpc","grpc.component":"server","grpc.service":"cerbos.svc.v1.CerbosService","grpc.method":"CheckResources","grpc.method_type":"unary","cerbos":{"call_id":"01HP0DEV85Y1PPYA3Y74EWVTP1"},"http":{"x_forwarded_for":["172.19.0.5"]

Rafael Szuminski

02/06/2024, 11:55 PM

so it took a little bit over a minute to accept a first connection after a restart of container

Rafael Szuminski

02/07/2024, 12:01 AM

and it's not traefik cause there are other services going through traefik that are up instantly

Dennis (Cerbos)

02/07/2024, 12:02 AM

So, the PDP served the check request as soon as it got healthy. Right?

Rafael Szuminski

02/07/2024, 12:02 AM

yes

Dennis (Cerbos)

02/07/2024, 12:04 AM

That’s interesting. The config looks pretty standard.

Rafael Szuminski

02/07/2024, 12:04 AM

one thing: other services in the compose file depend on cerbos, traefik and cerbos are the only ones configured without dependencies

Rafael Szuminski

02/07/2024, 12:05 AM

not sure if this plays a role...

Dennis (Cerbos)

02/07/2024, 12:06 AM

if you remove Cerbos as a dependency for other services, will those get ready quickly?

Dennis (Cerbos)

02/07/2024, 12:08 AM

The idea is there might be a hidden dependency for all services, but Cerbos is the first one to require it, so it waits. Once the dependency is ready, everyone starts quickly.

Rafael Szuminski

02/07/2024, 12:16 AM

so every other container starts VERY quickly. There are 7 containers defined, all of them start within 2-5 seconds

Rafael Szuminski

02/07/2024, 12:18 AM

but, yes, let me try removing dependency definitions...

Rafael Szuminski

02/07/2024, 12:20 AM

did not help

Rafael Szuminski

02/07/2024, 12:21 AM

Dennis (Cerbos)

02/07/2024, 12:29 AM

So you removed dependency on cerbos for other services and they still start quickly. Cerbos should also get ready quickly, but let’s wait to hear what others say about the delay. The rest of our team will be online in a few hours.

Rafael Szuminski

02/07/2024, 12:29 AM

Thanks!

Rafael Szuminski

02/07/2024, 12:41 AM

I get same behavior when using docker run:

Copy code

docker run --name cerbos-test -d -v ./conf:/conf -v ./policies:/policies -p 3592:3592 -p 3593:3593 cerbos/cerbos:latest server --config=/conf/config.yaml

and result is:

Copy code

419776b826b7   cerbos/cerbos:latest   "/cerbos server --co…"   37 seconds ago   Up 37 seconds (health: starting)   0.0.0.0:3592-3593->3592-3593/tcp   cerbos-test

Rafael Szuminski

02/07/2024, 12:42 AM

so it has to be my conf or my policies? (I used this on multiple hosts and they all exhibit same behavior so not sure if host factors in)

Dennis (Cerbos)

02/07/2024, 12:48 AM

Did you try it with some basic policies?

Rafael Szuminski

02/07/2024, 2:27 AM

I only have 50 policies and they are all super basic. But, no, have not tried it with any sample policies

Dennis (Cerbos)

02/07/2024, 2:53 AM

50 policies is a small number. My next guess is that Cerbos has a bug revealed by a particular expression in your policies. You can get a basic policy here.

Rafael Szuminski

02/07/2024, 6:20 AM

same issue with the demo policies. Command:

Copy code

docker run --rm --name cerbos-test -d -v ./conf:/conf -v ./test_policies:/policies -p 3592:3592 -p 3593:3593 cerbos/cerbos:latest server --config=/conf/config.yaml

Rafael Szuminski

02/07/2024, 6:20 AM

test_policies only has the two yaml files from github

Rafael Szuminski

02/07/2024, 6:25 AM

I tried attaching to the container to see if there are any additional clues but bash/sh is not available:

Copy code

docker exec -it cerbos-test /bin/sh

Do you have any other means of attaching to the container?

Dennis (Cerbos)

02/07/2024, 6:49 AM

There’s no shell in there. You can only run cerbos itself. For example, a health-check.

Dennis (Cerbos)

02/07/2024, 7:10 AM

I also see it takes 60 seconds for the Cerbos container to start. Looks like a timeout.

Charith (Cerbos)

02/07/2024, 8:10 AM

Hey @Rafael Szuminski, this is probably due to the default Docker healthcheck interval defined in the Cerbos container which has an interval of 60s. So, it'd take 60s for Docker to run the first healthcheck and mark the container as healthy. You can override the healthcheck interval in the compose file like this:

Copy code

cerbos:
    image: ghcr.io/cerbos/cerbos:latest
    healthcheck:
      test: ["CMD", "/cerbos", "healthcheck"]
      interval: 5s

Cerbos itself starts almost instantly. You can confirm this with

curl localhost:3592/_cerbos/health

. It's the Docker Compose orchestrator that's adding the extra delay until all healthchecks are accounted.

Rafael Szuminski

02/07/2024, 6:28 PM

setting the healthcheck interval as above solves the issue. Thanks you for your help!

🙇 1

4 Views

Open in Slack

Previous Next