I have a working EKS cluster. It is using a ALB for ingress.
When I apply a service and then an ingress most of these work as expected. However some target groups eventually have no registered targets. If I get the service IP address kubectl describe svc my-service-name and manually register the EndPoints in the target group the pods are reachable again but that's not a sustainable process.
Any ideas on what might be happening? Why doesn't EKS find the target groups as pods cycle?
Each service (secrets, deployment, service and ingress consists of a set of .yaml files applied like:
deploy.sh
#!/bin/bash
set -e
kubectl apply -f ./secretsMap.yaml
kubectl apply -f ./configMap.yaml
kubectl apply -f ./deployment.yaml
kubectl apply -f ./service.yaml
kubectl apply -f ./ingress.yaml
service.yaml
apiVersion: v1
kind: Service
metadata:
name: "site-bob"
namespace: "next-sites"
spec:
ports:
- port: 80
targetPort: 3000
protocol: TCP
type: NodePort
selector:
app: "site-bob"
ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: "site-bob"
namespace: "next-sites"
annotations:
kubernetes.io/ingress.class: alb
alb.ingress.kubernetes.io/tags: Environment=Production,Group=api
alb.ingress.kubernetes.io/backend-protocol: HTTP
alb.ingress.kubernetes.io/ip-address-type: ipv4
alb.ingress.kubernetes.io/target-type: ip
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80},{"HTTPS":443}]'
alb.ingress.kubernetes.io/load-balancer-name: eks-ingress-1
alb.ingress.kubernetes.io/group.name: eks-ingress-1
alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-2:402995436123:certificate/9db9dce3-055d-4655-842e-xxxxx
alb.ingress.kubernetes.io/healthcheck-port: traffic-port
alb.ingress.kubernetes.io/healthcheck-path: /
alb.ingress.kubernetes.io/healthcheck-interval-seconds: '30'
alb.ingress.kubernetes.io/healthcheck-timeout-seconds: '16'
alb.ingress.kubernetes.io/success-codes: 200,201
alb.ingress.kubernetes.io/healthy-threshold-count: '2'
alb.ingress.kubernetes.io/unhealthy-threshold-count: '2'
alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=60
alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds=30
alb.ingress.kubernetes.io/actions.ssl-redirect: >
{
"type": "redirect",
"redirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}
}
alb.ingress.kubernetes.io/actions.svc-host: >
{
"type":"forward",
"forwardConfig":{
"targetGroups":[
{
"serviceName":"site-bob",
"servicePort": 80,"weight":20}
],
"targetGroupStickinessConfig":{"enabled":true,"durationSeconds":200}
}
}
labels:
app: site-bob
spec:
rules:
- host: "staging-bob.imgeinc.net"
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: ssl-redirect
port:
name: use-annotation
- backend:
service:
name: svc-host
port:
name: use-annotation
pathType: ImplementationSpecific
Something in my configuration added tagged two security groups as being owned by the cluster. When I checked the load balancer controller logs:
kubectl logs -n kube-system aws-load-balancer-controller-677c7998bb-l7mwb
I saw many lines like:
{"level":"error","ts":1641996465.6707578,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Reconciler error","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","name":"k8s-nextsite-sitefest-89a6f0ff0a","namespace":"next-sites","error":"expect exactly one securityGroup tagged with kubernetes.io/cluster/imageinc-next-eks-4KN4v6EX for eni eni-0c5555fb9a87e93ad, got: [sg-04b2754f1c85ac8b9 sg-07b026b037dd4d6a4]"}
sg-07b026b037dd4d6a4 has description: EKS created security group applied to ENI that is attached to EKS Control Plane master nodes, as well as any managed workloads.
sg-04b2754f1c85ac8b9 has description: Security group for all nodes in the cluster.
I removed the tag:
{
Key: 'kubernetes.io/cluster/_cluster name_',
value:'owned'
}
from sg-04b2754f1c85ac8b9
and the TargetGroups started to fill in and everything is now working. Both groups were created and tagged by Terraform. I suspect my worker group configuration is off.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With