Airflow on Kubernetes cannot fetch logs

My Airflow service runs as a Kubernetes deployment and has two containers, one for the webserver and one for the scheduler. I'm running a task using a KubernetesPodOperator with the in_cluster=True parameter, and it runs well; I can even run kubectl logs pod-name and all the logs show up.

However, the airflow-webserver is unable to fetch the logs:

*** Log file does not exist: /tmp/logs/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log
*** Fetching from: http://pod-name-7dffbdf877-6mhrn:8793/log/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='pod-name-7dffbdf877-6mhrn', port=8793): Max retries exceeded with url: /log/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fef6e00df10>: Failed to establish a new connection: [Errno 111] Connection refused'))

It seems that the webserver is unable to connect to the Airflow log-serving service on port 8793. If I kubectl exec a bash shell into the container, I can curl localhost on port 8080, but not on ports 80 or 8793.
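For context, the task is defined roughly like this (a minimal sketch with placeholder names; the DAG id, schedule and image are illustrative, not the real values):

from datetime import datetime

from airflow import DAG
# Airflow 1.10.x import path for the operator
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

with DAG("dag_name", start_date=datetime(2020, 5, 1), schedule_interval=None) as dag:
    task = KubernetesPodOperator(
        task_id="task_name",
        name="task-name",
        namespace="airflow",
        image="registry.personal.io:5000/image/path",  # placeholder image
        in_cluster=True,  # use the in-cluster service account instead of a kubeconfig
        get_logs=True,    # stream the pod's stdout into the task log
    )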

Kubernetes deployment:

# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-name
  namespace: airflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod-name
  template:
    metadata:
      labels:
        app: pod-name
    spec:
      restartPolicy: Always
      volumes:
        - name: airflow-cfg
          configMap:
            name: airflow.cfg
        - name: dags
          emptyDir: {}
      containers:
      - name: airflow-scheduler
        args:
        - airflow
        - scheduler
        image: registry.personal.io:5000/image/path
        imagePullPolicy: Always
        volumeMounts:
        - name: dags
          mountPath: /airflow_dags
        - name: airflow-cfg
          mountPath: /home/airflow/airflow.cfg
          subPath: airflow.cfg
        env:
        - name: EXECUTOR
          value: Local
        - name: LOAD_EX
          value: "n"
        - name: FORWARDED_ALLOW_IPS
          value: "*"
        ports:
          - containerPort: 8793
          - containerPort: 8080
      - name: airflow-webserver
        args:
        - airflow
        - webserver
        - --pid
        - /tmp/airflow-webserver.pid
        image: registry.personal.io:5000/image/path
        imagePullPolicy: Always
        volumeMounts:
        - name: dags
          mountPath: /airflow_dags
        - name: airflow-cfg
          mountPath: /home/airflow/airflow.cfg
          subPath: airflow.cfg
        ports:
        - containerPort: 8793
        - containerPort: 8080
        env:
        - name: EXECUTOR
          value: Local
        - name: LOAD_EX
          value: "n"
        - name: FORWARDED_ALLOW_IPS
          value: "*"

Note: if Airflow is run in a dev environment (locally instead of on Kubernetes), it all works perfectly.

asked by Yuzobra


2 Answers

Creating a PersistentVolume and storing the logs on it might help.

---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: testlog-volume
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 2Gi
  hostPath:
    path: /opt/airflow/logs/
  storageClassName: standard
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: testlog-volume
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: standard
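If you are not deploying with the Helm chart, the claim also has to be mounted at the log directory in every Airflow container (scheduler, webserver, workers), so that the webserver reads the same files the tasks write. A minimal sketch against the deployment above; the mountPath is an assumption and must match base_log_folder in airflow.cfg (/tmp/logs in the question):

      volumes:
        - name: logs
          persistentVolumeClaim:
            claimName: testlog-volume
      containers:
      - name: airflow-webserver
        volumeMounts:
        - name: logs
          mountPath: /tmp/logs  # must match base_log_folder in airflow.cfg
      # repeat the same volumeMount in the airflow-scheduler container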

If you are using the Helm chart to deploy Airflow, you can use:

 --set executor=KubernetesExecutor --set logs.persistence.enabled=true --set logs.persistence.existingClaim=testlog-volume
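With the official apache-airflow/airflow chart, for example, the full command would look roughly like this (the release name and namespace are placeholders):

helm upgrade --install airflow apache-airflow/airflow \
  --namespace airflow \
  --set executor=KubernetesExecutor \
  --set logs.persistence.enabled=true \
  --set logs.persistence.existingClaim=testlog-volume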
answered by Programmer007


The problem turned out to be a bug in how the KubernetesPodOperator in Airflow v1.10.10 launched the pods. Upgrading to Airflow 2.0 solved the issue.
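Note that in Airflow 2.x the KubernetesPodOperator no longer lives in contrib but ships with the cncf.kubernetes provider package, so the import path changes as well (names below are as of the provider releases current around Airflow 2.0):

# install the provider that now ships the operator
pip install apache-airflow-providers-cncf-kubernetes

# new import path in Airflow 2.x DAG files
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator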

answered by Yuzobra