In this talk Jarek and Kaxil will talk about the official community support for running Airflow in the Kubernetes environment. Full support for Kubernetes deployments was in development by the community for quite a while, and in the past users of Airflow had to rely on 3rd-party images and Helm charts to run Airflow on Kubernetes. Over the last year community members made an enormous effort to provide robust, simple and versatile support for those deployments that would respond to all kinds of Airflow users: starting from the official container image, through the quick-start docker-compose configuration, and culminating in April with the release of the official Helm Chart for Airflow. This talk is aimed at Airflow users who would like to make use of all that effort.

This two-part article will demonstrate how to deploy and configure Apache Airflow on the Google Kubernetes Engine on GCP using the official Helm chart.

In part one we will:

- Deploy and configure Airflow using Helm and the values.yaml file.
- Expose the Airflow web server on GKE through a GCP LoadBalancer.

At the end of this part we will have an Airflow deployment running the LocalExecutor and an Airflow web server accessible through a GCP LoadBalancer.

In part two we will:

- Manage Airflow Connections using GKE Secrets.
- Install Airflow dependencies and custom operators via a Docker image loaded from the Artifact Registry.
- Automatically pull Airflow DAGs from a private GitHub repository with the git-sync feature.
- Integrate other GCP services such as Google Cloud Storage.

After part two we will have extended our Airflow deployment with a DAG that writes a daily batch of data to a Google Cloud Storage bucket.

GCP is an excellent cloud provider choice for Airflow. The apache-airflow-providers-google Python package provides a large number of Airflow operators, hooks and sensors. This makes integrating Airflow with the many GCP services such as BigQuery and GCS a breeze.

It's worth noting that GCP offers its own managed deployment of Airflow called Cloud Composer. However, by managing our own deployment on Kubernetes we maintain more granular control over the underlying infrastructure. This allows us to optimize the infrastructure for our specific use case and lower the cost.

This article assumes that the prerequisites have been met on your workstation:

- A GCP project named 'airflow-gke' with an active billing account (potentially with free trial credit).
- The CLI tools gcloud, kubectl and helm.

If you need a quick introduction to Kubernetes watch this light-hearted video.

1. Creating a Kubernetes cluster on GKE

Before we can initialize a Kubernetes cluster on GKE we must first set the project in the gcloud CLI using its Project ID:

    gcloud config set project airflow-gke-338120

The Project ID can be found in the Project info panel on the GCP dashboard. Your GCP project will have a different Project ID than the one in this article.

Now we can create a cluster named airflow-cluster with a public endpoint. You are free to choose a different geographical region.

    gcloud container clusters create airflow-cluster \

We will use the kubectl CLI to interact with our newly deployed Kubernetes cluster on GKE. Authenticate kubectl against this cluster with the following command:

    gcloud container clusters get-credentials airflow-cluster --region "europe-west4"

Finally, we will create a Kubernetes namespace called airflow for this deployment using the kubectl CLI. This is not strictly necessary but it's worthwhile to learn how this feature works.

    kubectl create namespace airflow

Don't forget to pass the namespace airflow to the --namespace or -n flag in the following kubectl commands.

Browse to the Clusters tab on Kubernetes Engine to view the newly created cluster.

2. Deploying the official Apache Airflow Helm chart

Now that the cluster is up and running we can install Airflow with Helm. Helm is a package manager that bundles Kubernetes applications into so-called charts. Apache Airflow released the official Helm chart for Airflow in July 2021. With this chart we can bootstrap Airflow on our newly created Kubernetes cluster with relative ease.

First add the official Helm chart for Apache Airflow to your local Helm repository:

    helm repo add apache-airflow https://

Verify that the chart is in your local repository:

    helm repo list

Now Airflow can be deployed on GKE with just one command:

    helm upgrade --install airflow apache-airflow/airflow -n airflow --debug

- The first airflow argument is the name we give to the release.
- apache-airflow/airflow is the Helm chart that we deploy.
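The commands from this article can be collected into one small script. What follows is a minimal sketch rather than a definitive setup: the run helper, the deploy_airflow function and the environment variable names (PROJECT_ID, CLUSTER, REGION, NAMESPACE, RELEASE, HELM_REPO_URL, DRY_RUN) are illustrative names that do not come from the article. The script assumes the cluster already exists, because the full cluster creation command is not reproduced here. The Helm repository URL must be supplied through HELM_REPO_URL, and the repo add step is skipped when it is empty. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: replay the article's commands in order.
# All variable and function names below are illustrative assumptions.
set -eu

PROJECT_ID="${PROJECT_ID:-airflow-gke-338120}"  # the article's example ID; yours will differ
CLUSTER="${CLUSTER:-airflow-cluster}"
REGION="${REGION:-europe-west4}"
NAMESPACE="${NAMESPACE:-airflow}"
RELEASE="${RELEASE:-airflow}"
HELM_REPO_URL="${HELM_REPO_URL:-}"              # supply the chart repository URL yourself
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

deploy_airflow() {
  # Point gcloud at the project and fetch kubectl credentials for the cluster.
  run gcloud config set project "$PROJECT_ID"
  run gcloud container clusters get-credentials "$CLUSTER" --region "$REGION"

  # Create the namespace (this fails if the namespace already exists).
  run kubectl create namespace "$NAMESPACE"

  # Register the chart repository only when a URL was provided.
  if [ -n "$HELM_REPO_URL" ]; then
    run helm repo add apache-airflow "$HELM_REPO_URL"
  fi
  run helm repo list

  # Install or upgrade the release from the official chart.
  run helm upgrade --install "$RELEASE" apache-airflow/airflow -n "$NAMESPACE" --debug
}

deploy_airflow
```

Run it once as-is to review the printed commands, then set DRY_RUN=0 (together with your own PROJECT_ID and HELM_REPO_URL) to execute them. Because the script uses set -e, a second real run stops at the namespace step if the namespace already exists.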