# help
n
Hello, I am trying to deploy Cerbos. The issue I am facing is that while our pod is showing as healthy without any errors, we are having problems when trying to expose it via an ingress. Specifically, the target group associated with the load balancer is not reaching a healthy state. Can you please help with this?
c
Hey. Some cloud load balancers do their own health checks by connecting directly to the pods and you need to add their IP range to the firewall allow list. Sometimes there are mismatches in TLS and HTTP/2 settings between the load balancer and the deployment as well. You have to consult your ingress provider documentation and add the appropriate annotations to the ingress to get them in sync.
n
Thanks for responding @Charith (Cerbos). @prathmesh 1, please ask any follow-up questions if you have them.
p
@Charith (Cerbos) Our EKS cluster is configured with IPv6. Port forwarding to the service works, but we are having issues with ingress—neither ALB nor NLB seems to be functioning properly. Do you have any insights into this issue?
c
I haven't worked on AWS for a long time so I don't know exactly which settings need to be tweaked. According to https://aws.github.io/aws-eks-best-practices/networking/loadbalancing/loadbalancing/#choosing-load-balancer-target-type the default target type is `instance` and it seems to require a NodePort service. So, my hunch is that you might need to change the Cerbos service type to NodePort to get the ALB working. However, since you're on IPv6, it's probably better to switch the load balancer to `ip` mode instead by adding the `alb.ingress.kubernetes.io/target-type: ip` annotation.
p
Yes, we are currently using IP as the target type. We also attempted the NodePort type for the ALB, but requests are not reaching the target pod because the target group is unhealthy.
c
Do you have the AWS Load Balancer Controller installed?
Also, have you changed the listener address for Cerbos in the configuration?
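For reference, a minimal sketch of the listen-address settings in a Cerbos configuration file, assuming the documented defaults; the actual file name and layout depend on how the config is mounted:
Copy code
server:
  # Default bind addresses: HTTP API on 3592, gRPC on 3593, all interfaces
  httpListenAddr: ":3592"
  grpcListenAddr: ":3593"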
n
@Charith (Cerbos) we have the ALB ingress controller installed in the EKS cluster and we exposed Cerbos via an ALB ingress. The issue is that the health check does not pass inside the container as well, and our EKS cluster is IPv6.
p
This is the ingress YAML we are using -
Copy code
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dev-cerbos-ingress
  annotations:
    alb.ingress.kubernetes.io/actions.ssl-redirect: '{"Type":"redirect","RedirectConfig":{"Protocol":"HTTPS","Port":"443","StatusCode":"HTTP_301"}}'
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/certificate-arn: arn
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/ip-address-type: dualstack
    alb.ingress.kubernetes.io/target-type: ip
spec:
  rules:
    - http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: cerbos
                port:
                  number: 80
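As an aside, the AWS Load Balancer Controller can be pointed at the Cerbos health endpoint for the target-group health check via ingress annotations. A hedged sketch, assuming the controller's standard annotation names; verify them against your controller version:
Copy code
    alb.ingress.kubernetes.io/healthcheck-path: /_cerbos/health   # Cerbos health endpoint
    alb.ingress.kubernetes.io/healthcheck-port: traffic-port      # probe the same port traffic is sent to
    alb.ingress.kubernetes.io/backend-protocol: HTTP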
c
Sorry, I am confused. @nishant gupta said that the deployment is unhealthy as well? I assumed that the problem was that ALB couldn't route to Cerbos but Cerbos itself was healthy.
n
@Charith (Cerbos) for IPv6, the deployment is unhealthy when we add a health check.
When the ingress was not working, we dug into it more and found this.
For example, when you just install Cerbos using Helm, we also see this.
Let me quickly do it and paste the issue here.
c
Have you modified the Cerbos service to listen on port 80? By default it listens on 3592 but your ALB definition is trying to connect to it on port 80.
n
We have a service which maps port 80 to pod port 3592, and the ingress is pointing to that service.
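A minimal sketch of such a Service, using the names that appear elsewhere in this thread; the actual manifest may differ:
Copy code
apiVersion: v1
kind: Service
metadata:
  name: cerbos
  namespace: dev-cerbos-namespace-test
spec:
  selector:
    app.kubernetes.io/instance: cerbos
    app.kubernetes.io/name: cerbos
  ports:
    - name: http
      port: 80          # port the ingress points at
      targetPort: 3592  # Cerbos HTTP listen port in the pod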
This is the error in cerbos pod
nishantgupta@ip-192-168-104-45 cars24-service-configs % kubectl describe pod cerbos-6d8f8c9484-qnhv5
Name:             cerbos-6d8f8c9484-qnhv5
Namespace:        default
Priority:         0
Service Account:  cerbos
Node:             ip-10-0-24-229.ap-south-1.compute.internal/2406:da1a:66e:5915::31b7
Start Time:       Tue, 20 Aug 2024 13:51:06 +0530
Labels:           app.kubernetes.io/instance=cerbos
                  app.kubernetes.io/name=cerbos
                  pod-template-hash=6d8f8c9484
Annotations:      checksum/config: a973ef0eb705dd2d436aeddacd9d1ba9567eba7981f0f14d4e546c6fe4429729
                  prometheus.io/path: /_cerbos/metrics
                  prometheus.io/port: 3592
                  prometheus.io/scheme: http
                  prometheus.io/scrape: true
Status:           Running
IP:               2406:da1a:66e:5915:3572::5
IPs:
  IP:           2406:da1a:66e:5915:3572::5
Controlled By:  ReplicaSet/cerbos-6d8f8c9484
Containers:
  cerbos:
    Container ID:  containerd://50e502d7b36bd2be1f8dfca9af2d022123b2db2172d3e6c5d5ae5f0d29467781
    Image:         ghcr.io/cerbos/cerbos:0.38.1
    Image ID:      ghcr.io/cerbos/cerbos@sha256:c3c8736f08f07705ebd6bfa4ae8ede870c68a3f1f1d0ca8526fb456aef1bc20a
    Ports:         3592/TCP, 3593/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      server
      --config=/config/.cerbos.yaml
      --log-level=INFO
    State:          Running
      Started:      Tue, 20 Aug 2024 13:52:07 +0530
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 20 Aug 2024 13:51:37 +0530
      Finished:     Tue, 20 Aug 2024 13:52:07 +0530
    Ready:          False
    Restart Count:  2
    Liveness:       http-get http://:http/_cerbos/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:http/_cerbos/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /config from config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wgjcm (ro)
      /work from work (rw)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cerbos
    Optional:  false
  work:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-wgjcm:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  76s                 default-scheduler  Successfully assigned default/cerbos-6d8f8c9484-qnhv5 to ip-10-0-24-229.ap-south-1.compute.internal
  Normal   Pulling    74s                 kubelet            Pulling image "ghcr.io/cerbos/cerbos:0.38.1"
  Normal   Pulled     69s                 kubelet            Successfully pulled image "ghcr.io/cerbos/cerbos:0.38.1" in 4.287s (4.287s including waiting). Image size: 25824503 bytes.
  Normal   Killing    46s                 kubelet            Container cerbos failed liveness probe, will be restarted
  Normal   Created    45s (x2 over 69s)   kubelet            Created container cerbos
  Normal   Started    45s (x2 over 69s)   kubelet            Started container cerbos
  Normal   Pulled     45s                 kubelet            Container image "ghcr.io/cerbos/cerbos:0.38.1" already present on machine
  Warning  Unhealthy  16s (x11 over 68s)  kubelet            Readiness probe failed: Get "http://[2406:da1a:66e:5915:3572::5]:3592/_cerbos/health": dial tcp [2406:da1a:66e:5915:3572::5]:3592: connect: connection refused
  Warning  Unhealthy  16s (x6 over 66s)   kubelet            Liveness probe failed: Get "http://[2406:da1a:66e:5915:3572::5]:3592/_cerbos/health": dial tcp [2406:da1a:66e:5915:3572::5]:3592: connect: connection refused
Since our EKS cluster is an IPv6 cluster, the health check inside the pod is failing.
I just ran `helm install cerbos cerbos/cerbos --version=0.38.1`
in our IPv6 EKS cluster.
@Charith (Cerbos)
c
Right. I don't think the Helm chart enables dual stack by default. I think dual stack might be the default setting in newer k8s versions but I don't know how EKS deals with that. So, you may need to update the Cerbos manifests to enable dual stack networking.
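For illustration, a hedged sketch of the Service-level fields that control IP families in Kubernetes; whether you need dual stack or a single IPv6 family depends on your cluster configuration:
Copy code
apiVersion: v1
kind: Service
metadata:
  name: cerbos
spec:
  ipFamilyPolicy: PreferDualStack   # or SingleStack / RequireDualStack
  ipFamilies:
    - IPv6
    - IPv4
  ports:
    - name: http
      port: 3592
      targetPort: http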
n
@Charith (Cerbos) can you help with the same?
Is there any flag which we need to set?
c
Depending on your cluster configuration, you may need to set `spec.ipFamilyPolicy` or `spec.ipFamilies` on the service. Out of curiosity, what's the output of `kubectl describe svc cerbos`? The Cerbos Helm chart currently doesn't have a setting to define the IP family so you'd need to patch it with Kustomize as described in https://docs.cerbos.dev/cerbos/latest/installation/helm#_customizing_the_manifests.
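A hedged sketch of what such a Kustomize patch might look like, assuming the chart output has been rendered to a file first (for example with helm template); the file name and patch values are illustrative:
Copy code
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - cerbos-rendered.yaml   # output of `helm template cerbos cerbos/cerbos ...`
patches:
  - target:
      kind: Service
      name: cerbos
    patch: |-
      - op: add
        path: /spec/ipFamilyPolicy
        value: PreferDualStack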
n
when we expose the svc as type LoadBalancer
@Charith (Cerbos) please be more specific. We have tried all these things and it does not work in an IPv6 EKS cluster, as your pod itself is not running in the IPv6 EKS cluster.
c
We have other users running Cerbos on IPv6 clusters so it's incorrect to say that Cerbos is not working. There's some configuration issue here and that's what we are trying to figure out. I can't be more specific because I am trying to debug your issue without full visibility into your system. I can only go by the bits and pieces of information you tell me.
n
So @Charith (Cerbos), please suggest what you need from our end.
Since the pod itself is not coming up in IPv6, we see this as the cause.
Sure, please suggest what we can try to solve this.
@Charith (Cerbos) I meant that this is not working for us and hence we need your support.
c
If you port forward to the pod and run `curl localhost:3592/_cerbos/health`, does that work?
What are the first few lines of the pod logs when Cerbos is starting?
p
Yes, port forwarding works, but not via the ingress.
Copy code
ec2-user@ip-10-0-0-244:~$ kubectl port-forward svc/cerbos --address 0.0.0.0 8080:80 -n dev-cerbos-namespace-test
Forwarding from 0.0.0.0:8080 -> 3592
Handling connection for 8080
Copy code
ec2-user@ip-10-0-0-244:~$ curl http://localhost:8080/_cerbos/health
{"status":"SERVING"}
Logs -
Copy code
k logs -f cerbos-7bdbd89ddd-rw7n4 -n dev-cerbos-namespace-test
{"log.level":"info","@timestamp":"2024-08-20T10:26:14.075Z","log.logger":"cerbos.server","message":"maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined"}
{"log.level":"info","@timestamp":"2024-08-20T10:26:14.075Z","log.logger":"cerbos.server","message":"Loading configuration from /config/config.yaml"}
{"log.level":"warn","@timestamp":"2024-08-20T10:26:14.076Z","log.logger":"cerbos.otel","message":"Disabling OTLP traces because neither OTEL_EXPORTER_OTLP_ENDPOINT nor OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is defined"}
{"log.level":"info","@timestamp":"2024-08-20T10:26:14.076Z","log.logger":"cerbos.postgres","message":"Initializing Postgres storage","host":"dev-rds","database":"pdb"}
{"log.level":"info","@timestamp":"2024-08-20T10:26:15.126Z","log.logger":"cerbos.db","message":"Checking database schema. Set skipSchemaCheck to true to disable."}
{"log.level":"info","@timestamp":"2024-08-20T10:26:15.144Z","log.logger":"cerbos.db","message":"Database schema check completed"}
{"log.level":"info","@timestamp":"2024-08-20T10:26:15.145Z","log.logger":"cerbos.telemetry","message":"Anonymous telemetry enabled. Disable via the config file or by setting the CERBOS_NO_TELEMETRY=1 environment variable"}
{"log.level":"info","@timestamp":"2024-08-20T10:26:15.145Z","log.logger":"cerbos.grpc","message":"Starting admin service"}
{"log.level":"info","@timestamp":"2024-08-20T10:26:15.145Z","log.logger":"cerbos.grpc","message":"Starting gRPC server at :3593"}
{"log.level":"info","@timestamp":"2024-08-20T10:26:15.181Z","log.logger":"cerbos.http","message":"Starting HTTP server at :3592"}
c
OK, so it's the kubelet having trouble connecting to the pod and running the health check. Can you post the output of `k describe pod cerbos-7bdbd89ddd-rw7n4 -n dev-cerbos-namespace-test` as well?
n
So @Charith (Cerbos), if we don't put the liveness and readiness probe health checks in our deployment then it comes up healthy, but if we put them in then the health check starts failing with IPv6.
@prathmesh 1 please post the output of describe pod
p
Copy code
k describe pod cerbos-7bdbd89ddd-rw7n4 -n dev-cerbos-namespace-test
Name:             cerbos-7bdbd89ddd-rw7n4
Namespace:        dev-cerbos-namespace-test
Priority:         0
Service Account:  dev-cerbos-sa
Node:             ip-10-0-24-229.ap-south-1.compute.internal/2406:da1a:66e:5915::31b7
Start Time:       Tue, 20 Aug 2024 10:26:11 +0000
Labels:           <http://app.kubernetes.io/instance=cerbos|app.kubernetes.io/instance=cerbos>
                  <http://app.kubernetes.io/name=cerbos|app.kubernetes.io/name=cerbos>
                  pod-template-hash=7bdbd89ddd
Annotations:      <none>
Status:           Running
IP:               2406:da1a:66e:5915:3572::d
IPs:
  IP:           2406:da1a:66e:5915:3572::d
Controlled By:  ReplicaSet/cerbos-7bdbd89ddd
Containers:
  cerbos:
    Container ID:  containerd://66cf60394703dea739581b377e7d38d75baa6d2909d11c7b0870c907f8f82735
    Image:         ghcr.io/cerbos/cerbos:0.38.1
    Image ID:      ghcr.io/cerbos/cerbos@sha256:c3c8736f08f07705ebd6bfa4ae8ede870c68a3f1f1d0ca8526fb456aef1bc20a
    Port:          <none>
    Host Port:     <none>
    Args:
      server
      --config=/config/config.yaml
      --log-level=INFO
    State:          Running
      Started:      Tue, 20 Aug 2024 10:26:13 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      ENVIRONMENT:                  dev
      PGHOST:                       10.0.6.116
      CERBOS_CONFIG:                /config/config.yaml
      CERBOS_PASSWORD_HASH:         <set to the key 'passwordHash' in secret 'cerbos-secret'>      Optional: false
      CERBOS_USERNAME:              <set to the key 'username' in secret 'cerbos-secret'>          Optional: false
      POSTGRES_USERNAME:            <set to the key 'postgresUser' in secret 'cerbos-secret'>      Optional: false
      POSTGRES_PASSWORD:            <set to the key 'postgresPassword' in secret 'cerbos-secret'>  Optional: false
      AWS_STS_REGIONAL_ENDPOINTS:   regional
      AWS_DEFAULT_REGION:           ap-south-1
      AWS_REGION:                   ap-south-1
      AWS_ROLE_ARN:                 arn:aws:iam::339712719004:role/dev-cerbos-sa-role
      AWS_WEB_IDENTITY_TOKEN_FILE:  /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /config from config-volume (ro)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qkzmp (ro)
      /work from work (rw)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       True 
  ContainersReady             True 
  PodScheduled                True 
Volumes:
  aws-iam-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  86400
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cerbos-config
    Optional:  false
  work:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-qkzmp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  16m   default-scheduler  Successfully assigned dev-cerbos-namespace-test/cerbos-7bdbd89ddd-rw7n4 to ip-10-0-24-229.ap-south-1.compute.internal
  Normal  Pulled     16m   kubelet            Container image "ghcr.io/cerbos/cerbos:0.38.1" already present on machine
  Normal  Created    16m   kubelet            Created container cerbos
  Normal  Started    16m   kubelet            Started container cerbos
c
This is interesting...
Copy code
    Port:          <none>
    Host Port:     <none>
It seems there's no port exposed from the container. That could be why the health check fails.
What's the output of `k get deploy cerbos -o yaml -n dev-cerbos-namespace-test`?
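For comparison, these are the container ports and probes the Helm chart normally defines, which the custom manifest appears to be missing; this snippet is adapted from the chart-generated Deployment shown later in this thread:
Copy code
        ports:
          - containerPort: 3592
            name: http
            protocol: TCP
          - containerPort: 3593
            name: grpc
            protocol: TCP
        livenessProbe:
          httpGet:
            path: /_cerbos/health
            port: http
        readinessProbe:
          httpGet:
            path: /_cerbos/health
            port: http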
p
Copy code
k get deploy cerbos -o yaml -n dev-cerbos-namespace-test
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "4"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"cerbos","app.kubernetes.io/name":"cerbos","company":"cars24"},"name":"cerbos","namespace":"dev-cerbos-namespace-test"},"spec":{"replicas":1,"selector":{"matchLabels":{"app.kubernetes.io/instance":"cerbos","app.kubernetes.io/name":"cerbos"}},"template":{"metadata":{"labels":{"app.kubernetes.io/instance":"cerbos","app.kubernetes.io/name":"cerbos"}},"spec":{"containers":[{"args":["server","--config=/config/config.yaml","--log-level=INFO"],"env":[{"name":"ENVIRONMENT","value":"dev"},{"name":"PGHOST","value":"10.0.6.116"},{"name":"CERBOS_CONFIG","value":"/config/config.yaml"},{"name":"CERBOS_PASSWORD_HASH","valueFrom":{"secretKeyRef":{"key":"passwordHash","name":"cerbos-secret"}}},{"name":"CERBOS_USERNAME","valueFrom":{"secretKeyRef":{"key":"username","name":"cerbos-secret"}}},{"name":"POSTGRES_USERNAME","valueFrom":{"secretKeyRef":{"key":"postgresUser","name":"cerbos-secret"}}},{"name":"POSTGRES_PASSWORD","valueFrom":{"secretKeyRef":{"key":"postgresPassword","name":"cerbos-secret"}}}],"image":"ghcr.io/cerbos/cerbos:0.38.1","imagePullPolicy":"IfNotPresent","name":"cerbos","volumeMounts":[{"mountPath":"/config","name":"config-volume","readOnly":true},{"mountPath":"/work","name":"work"}]}],"securityContext":{},"serviceAccountName":"dev-cerbos-sa","volumes":[{"configMap":{"name":"cerbos-config"},"name":"config-volume"},{"emptyDir":{},"name":"work"}]}}}}
  creationTimestamp: "2024-08-20T09:55:48Z"
  generation: 4
  labels:
    app.kubernetes.io/instance: cerbos
    app.kubernetes.io/name: cerbos
    company: cars24
  name: cerbos
  namespace: dev-cerbos-namespace-test
  resourceVersion: "36391550"
  uid: 49009cc9-86b5-41b7-8238-b353bdfb4394
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: cerbos
      app.kubernetes.io/name: cerbos
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: cerbos
        app.kubernetes.io/name: cerbos
    spec:
      containers:
      - args:
        - server
        - --config=/config/config.yaml
        - --log-level=INFO
        env:
        - name: ENVIRONMENT
          value: dev
        - name: PGHOST
          value: 10.0.6.116
        - name: CERBOS_CONFIG
          value: /config/config.yaml
        - name: CERBOS_PASSWORD_HASH
          valueFrom:
            secretKeyRef:
              key: passwordHash
              name: cerbos-secret
        - name: CERBOS_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: cerbos-secret
        - name: POSTGRES_USERNAME
          valueFrom:
            secretKeyRef:
              key: postgresUser
              name: cerbos-secret
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              key: postgresPassword
              name: cerbos-secret
        image: ghcr.io/cerbos/cerbos:0.38.1
        imagePullPolicy: IfNotPresent
        name: cerbos
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /config
          name: config-volume
          readOnly: true
        - mountPath: /work
          name: work
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: dev-cerbos-sa
      serviceAccountName: dev-cerbos-sa
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: cerbos-config
        name: config-volume
      - emptyDir: {}
        name: work
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2024-08-20T10:26:14Z"
    lastUpdateTime: "2024-08-20T10:26:14Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2024-08-20T10:26:11Z"
    lastUpdateTime: "2024-08-20T10:26:14Z"
    message: ReplicaSet "cerbos-7bdbd89ddd" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 4
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
c
Are you using a custom manifest to deploy Cerbos? This doesn't look like it came from the Helm chart.
p
Yes, this is a custom manifest, but we tried with Helm before as well and it didn't work.
n
@Charith (Cerbos) This is from the helm
nishantgupta@ip-192-168-104-45 cars24-service-configs % kubectl describe svc cerbos
Name:              cerbos
Namespace:         default
Labels:            app.kubernetes.io/instance=cerbos
                   app.kubernetes.io/managed-by=Helm
                   app.kubernetes.io/name=cerbos
                   app.kubernetes.io/version=0.38.1
                   helm.sh/chart=cerbos-0.38.1
Annotations:       meta.helm.sh/release-name: cerbos
                   meta.helm.sh/release-namespace: default
Selector:          app.kubernetes.io/instance=cerbos,app.kubernetes.io/name=cerbos
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv6
IP:                fde4:911c:453b::aa3a
IPs:               fde4:911c:453b::aa3a
Port:              http  3592/TCP
TargetPort:        http/TCP
Endpoints:
Port:              grpc  3593/TCP
TargetPort:        grpc/TCP
Endpoints:
Session Affinity:  None
Events:            <none>
nishantgupta@ip-192-168-104-45 cars24-service-configs % k get deploy cerbos -o yaml
zsh: command not found: k
nishantgupta@ip-192-168-104-45 cars24-service-configs % kubectl get deploy cerbos -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: cerbos
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2024-08-20T08:21:06Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: cerbos
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cerbos
    app.kubernetes.io/version: 0.38.1
    helm.sh/chart: cerbos-0.38.1
  name: cerbos
  namespace: default
  resourceVersion: "36339988"
  uid: d16d970b-b9c0-4c88-9226-0cee98b100db
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: cerbos
      app.kubernetes.io/name: cerbos
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        checksum/config: a973ef0eb705dd2d436aeddacd9d1ba9567eba7981f0f14d4e546c6fe4429729
        prometheus.io/path: /_cerbos/metrics
        prometheus.io/port: "3592"
        prometheus.io/scheme: http
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: cerbos
        app.kubernetes.io/name: cerbos
    spec:
      containers:
      - args:
        - server
        - --config=/config/.cerbos.yaml
        - --log-level=INFO
        image: ghcr.io/cerbos/cerbos:0.38.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /_cerbos/health
            port: http
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: cerbos
        ports:
        - containerPort: 3592
          name: http
          protocol: TCP
        - containerPort: 3593
          name: grpc
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /_cerbos/health
            port: http
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /config
          name: config
          readOnly: true
        - mountPath: /work
          name: work
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: cerbos
      serviceAccountName: cerbos
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: cerbos
        name: config
      - emptyDir: {}
        name: work
status:
  conditions:
  - lastTransitionTime: "2024-08-20T08:21:06Z"
    lastUpdateTime: "2024-08-20T08:21:06Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2024-08-20T08:31:07Z"
    lastUpdateTime: "2024-08-20T08:31:07Z"
    message: ReplicaSet "cerbos-6d8f8c9484" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 1
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1
Above is the deployment YAML.
t
The "connection refused" in the pod description suggests that cerbos is not actually binding to all addresses, possibly not bound to the IPv6 address.
c
You could try setting `server.httpListenAddr` in the Cerbos configuration file to `[::]:3592` to see if that helps. Then `k edit cm cerbos` and `k rollout restart deploy/cerbos`.
👍 1
t
@Charith (Cerbos) testing locally, that seems to work. With `:3592` I cannot netcat my IPv6 link address, but with `[::]:3592` I can.
👍 1
p
Added `server.httpListenAddr` in config.yaml (ConfigMap) -
Copy code
apiVersion: v1
kind: ConfigMap
metadata:
  name: cerbos-config
  namespace: dev-cerbos-namespace-test
data:
  config.yaml: |
    storage:
      driver: "postgres"
      postgres:
        url: "postgres://${POSTGRES_USERNAME}:${POSTGRES_PASSWORD}@dev-db-host:5432/postgres?sslmode=allow&search_path=cerbos"
    server:
      adminAPI:
        enabled: true
        adminCredentials:
          username: ${CERBOS_USERNAME}
          passwordHash: ${CERBOS_PASSWORD_HASH}
      httpListenAddr: "[::]:3592"
but after the rollout restart, the new pod keeps restarting again and again with these logs -
Copy code
k logs -f cerbos-59494bc6f9-wvnpr -n dev-cerbos-namespace-test
{"log.level":"info","@timestamp":"2024-08-20T12:09:14.700Z","log.logger":"cerbos.server","message":"maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined"}
{"log.level":"info","@timestamp":"2024-08-20T12:09:14.700Z","log.logger":"cerbos.server","message":"Loading configuration from /config/config.yaml"}
{"log.level":"warn","@timestamp":"2024-08-20T12:09:14.700Z","log.logger":"cerbos.otel","message":"Disabling OTLP traces because neither OTEL_EXPORTER_OTLP_ENDPOINT nor OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is defined"}
{"log.level":"info","@timestamp":"2024-08-20T12:09:14.700Z","log.logger":"cerbos.postgres","message":"Initializing Postgres storage","host":"dev-cars24-cf-common-db.chuskyscy5za.ap-south-1.rds.amazonaws.com","database":"postgres"}
{"log.level":"info","@timestamp":"2024-08-20T12:09:15.787Z","log.logger":"cerbos.db","message":"Checking database schema. Set skipSchemaCheck to true to disable."}
{"log.level":"info","@timestamp":"2024-08-20T12:09:15.794Z","log.logger":"cerbos.db","message":"Database schema check completed"}
{"log.level":"info","@timestamp":"2024-08-20T12:09:15.794Z","log.logger":"cerbos.telemetry","message":"Anonymous telemetry enabled. Disable via the config file or by setting the CERBOS_NO_TELEMETRY=1 environment variable"}
{"log.level":"info","@timestamp":"2024-08-20T12:09:15.795Z","log.logger":"cerbos.grpc","message":"Starting admin service"}
{"log.level":"info","@timestamp":"2024-08-20T12:09:15.795Z","log.logger":"cerbos.grpc","message":"Starting gRPC server at :3593"}
{"log.level":"info","@timestamp":"2024-08-20T12:09:15.796Z","log.logger":"cerbos.http","message":"Starting HTTP server at :3592"}
{"log.level":"info","@timestamp":"2024-08-20T12:10:12.445Z","log.logger":"cerbos.server","message":"Shutting down"}
{"log.level":"info","@timestamp":"2024-08-20T12:10:12.446Z","log.logger":"cerbos.http","message":"HTTP server stopped"}
{"log.level":"info","@timestamp":"2024-08-20T12:10:12.446Z","log.logger":"cerbos.grpc","message":"gRPC server stopped"}
{"log.level":"info","@timestamp":"2024-08-20T12:10:12.446Z","log.logger":"cerbos.server","message":"Shutdown complete"}
{"log.level":"info","@timestamp":"2024-08-20T12:10:12.855Z","log.logger":"cerbos.server","message":"maxprocs: No GOMAXPROCS change to reset"}
t
@prathmesh 1 if you describe one of the pods, I suspect it's saying that the liveness or readiness probe is failing?
p
Yes @Tristan Colgate-McFarlane -
Copy code
Warning  Unhealthy  18m (x6 over 19m)   kubelet            Liveness probe failed: Get "http://[2406:da1a:66e:5915:23b0::d]:3592/_cerbos/health": dial tcp [2406:da1a:66e:5915:23b0::d]:3592: connect: connection refused
  Warning  Unhealthy  14m (x31 over 19m)  kubelet            Readiness probe failed: Get "http://[2406:da1a:66e:5915:23b0::d]:3592/_cerbos/health": dial tcp [2406:da1a:66e:5915:23b0::d]:3592: connect: connection refused
  Warning  BackOff    12s (x52 over 14m)  kubelet            Back-off restarting failed container cerbos in pod cerbos-59494bc6f9-wvnpr_dev-cerbos-namespace-test(fb34ac55-0194-42b8-baac-e7175f
c
I don't think it has picked up the configuration change because it's still listening on `:3592`: "Starting HTTP server at :3592"
t
My testing locally indicates that `[::]:3592` does serve both IPv6 and IPv4, whereas `0.0.0.0` or plain `:3592` will only do v4, so as long as the config is being read correctly, it /should/ be OK.
n
Thanks @Tristan Colgate-McFarlane, this should help us.
We will check and update.