# help
d
hi cerbos running into some odd helm chart deployment issues - details in thread
so while I can run the helm command ok
helm upgrade --install cerbos cerbos/cerbos --namespace cerbos-dev --version=0.29.0 --values=./cerbos_config.yaml --kubeconfig /tmp/kube_config.yaml
  shell: sh -e {0}
"cerbos" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "cerbos" chart repository
Update Complete. ⎈Happy Helming!⎈
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /tmp/kube_config.yaml
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /tmp/kube_config.yaml
NAME	NAMESPACE	REVISION	UPDATED	STATUS	CHART	APP VERSION
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /tmp/kube_config.yaml
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /tmp/kube_config.yaml
Release "cerbos" does not exist. Installing it now.
NAME: cerbos
LAST DEPLOYED: Wed Jul 26 17:23:13 2023
NAMESPACE: cerbos-dev
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully deployed Cerbos.
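(As an aside, the group-/world-readable warnings above are only about the kubeconfig file permissions and can be silenced by tightening them, e.g.:)
chmod 600 /tmp/kube_config.yaml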
however... the app is stuck in the 'pending-install' state
or more to the point
dmeyerson@C02G73VMMD6P cerbos-ABAC % helm status cerbos -n cerbos-dev 
NAME: cerbos
LAST DEPLOYED: Wed Jul 26 17:23:13 2023
NAMESPACE: cerbos-dev
STATUS: pending-install
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully deployed Cerbos.
I do notice this at deploy time - might this be what keeps the helm chart stuck in 'pending-install'?
50s         Warning   Unhealthy           pod/cerbos-996bd55cb-dtbws    Readiness probe failed: Get "http://10.32.5.123:3592/_cerbos/health": dial tcp 10.32.5.123:3592: connect: connection refused
c
That warning is normal while the pods are starting. Are they healthy now?
kubectl get deploy cerbos -n cerbos-dev
I think getting stuck on pending-install is a known issue with Helm. You can try doing a helm rollback to the previous release to "reset" the state, or just uninstall and reinstall the chart.
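A rough sketch of those two options, reusing the release name, namespace, and kubeconfig from the command above:
# Option 1: roll back to the previous revision to reset the release state
helm rollback cerbos -n cerbos-dev --kubeconfig /tmp/kube_config.yaml
# Option 2: uninstall and reinstall from scratch
helm uninstall cerbos -n cerbos-dev --kubeconfig /tmp/kube_config.yaml
helm upgrade --install cerbos cerbos/cerbos --namespace cerbos-dev --version=0.29.0 --values=./cerbos_config.yaml --kubeconfig /tmp/kube_config.yaml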
d
I uninstalled and reinstalled a few times with the same end result
I will try to hit the _cerbos/health endpoint manually
Is there a way to set the readiness probe timeout value? I don't see anything to control readiness probe config in the helm chart schema - https://artifacthub.io/packages/helm/cerbos/cerbos?modal=values
6m8s        Warning   Unhealthy           pod/cerbos-5859d99ff8-8r9t8    Readiness probe failed: Get "http://10.32.5.129:3592/_cerbos/health": dial tcp 10.32.5.129:3592: connect: connection refused
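(One way to at least see what probe settings the chart rendered - assuming the deployment is named cerbos and runs a single container - is:)
kubectl get deploy cerbos -n cerbos-dev -o jsonpath='{.spec.template.spec.containers[0].readinessProbe}'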
c
If the pod is not ready for so long, then that suggests something wrong with the config that prevents Cerbos from starting. Please check the logs of the pod.
d
What's odd is that I can kubectl port-forward and curl the pod directly... it seems to be up and 'SERVING'
dmeyerson@my_laptop cerbos-ABAC % curl http://localhost:3592/_cerbos/health
{"status":"SERVING"}
here are the pod logs 😕 - they seem ok
%  kubectl logs cerbos-5859d99ff8-8r9t8  -n cerbos-dev
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.242Z","log.logger":"cerbos.server","message":"maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined"}
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.243Z","log.logger":"cerbos.server","message":"Loading configuration from /config/config.yaml"}
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.244Z","log.logger":"cerbos.git.store","message":"Cloning git repo from <https://git.viasat.com/OPS-ML-Engineering/cerbos-ABAC.git>","dir":"/work"}
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.581Z","log.logger":"cerbos.git.store","message":"Opening git repo","dir":"/work"}
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.592Z","log.logger":"cerbos.index","message":"Found 2 executable policies"}
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.593Z","log.logger":"cerbos.telemetry","message":"Telemetry disabled"}
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.593Z","log.logger":"cerbos.git.store","message":"Polling for updates every 1m0s","dir":"/work"}
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.595Z","log.logger":"cerbos.grpc","message":"Starting gRPC server at :3593"}
{"log.level":"info","@timestamp":"2023-07-27T16:09:20.598Z","log.logger":"cerbos.http","message":"Starting HTTP server at :3592"}
c
So what makes you think it's the health check? What's the output of:
kubectl get deploy cerbos -n cerbos-dev
d
Well, I don't think it is the health check, but it does seem like the readiness check fails once (as seen in events) and the helm chart stays stuck in 'pending-install' even though the workload is healthy. It's more like ~ 'why does the helm chart think the readiness probe is failing, or why does it fail for the helm chart?'
c
The Helm chart doesn't do a health check. It waits for all the deployed resources to become available. Try running Helm with verbose logging to see if that gives you a clue as to why it gets stuck in the pending-install state.
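For example, reusing the flags from the original command (the --wait/--timeout values here are just a suggestion):
helm upgrade --install cerbos cerbos/cerbos --namespace cerbos-dev --version=0.29.0 --values=./cerbos_config.yaml --kubeconfig /tmp/kube_config.yaml --debug --wait --timeout 5m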
d
I just ran it with --wait - turns out the service account I used to run 'helm ...' needed more verbs+resources, but it's still stuck in 'pending-install' - will dump updates here
ok got it fixed - here are some observations:
• helm issue: helm as a CLI has a race condition - on the client end one may see 'STATUS: deployed' while in reality, on the server side, it may still get stuck
• permissions and service accounts: one needs to run helm ... with an account that has sufficient permissions. Because I was using some automation with service account creds rather than running helm myself, helm got stuck (silently) in pending-install due to an insufficient set of verbs+resources associated with the account running the helm command, rather than raising an error complaining that ~ "service account X doesn't get to perform Y on resource Z" (see the check below)
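A quick way to verify what the automation's service account is actually allowed to do (the service account name helm-deployer here is a placeholder) is kubectl auth can-i with impersonation; Helm 3 stores release state in a Secret by default, so creating secrets is one of the verbs it needs alongside the chart's own resources:
# Check a few of the verbs Helm needs, impersonating the automation's service account
kubectl auth can-i create secrets -n cerbos-dev --as=system:serviceaccount:cerbos-dev:helm-deployer
kubectl auth can-i create deployments -n cerbos-dev --as=system:serviceaccount:cerbos-dev:helm-deployer
kubectl auth can-i list pods -n cerbos-dev --as=system:serviceaccount:cerbos-dev:helm-deployer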
but all good now
anyway, short version - a helm + service account issue, unrelated to Cerbos itself