Working with Composer using Airflow

  • 31st May, 2021
  • by Prwatech

Cloud Composer and Airflow Integration

Prerequisites

GCP account

Open Console

Open Menu > Cloud Storage > Browser

Click on Create Bucket

Create a bucket with the same name as the project ID, then click Create.

The bucket will be created.
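If you prefer the command line, the bucket can also be created from Cloud Shell. A minimal sketch, using the project ID as the bucket name and us-central1 as an assumed region (adjust the region to match your environment):

$ gsutil mb -l us-central1 gs://<project-ID>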

In Composer, click on the Airflow link.

Choose the account to log in with.

The Airflow web UI will open.

Go to Menu > Kubernetes Engine > Clusters

You will see the cluster that Composer created.

In Airflow, Go to Admin > Variables

Click on Create.

Key            Value

gcp_project    <project-ID>

gcs_bucket     gs://<bucket-name>

gce_zone       <zone of cluster>

Enter each key and value one by one, pressing Save and Add Another after each.

For the last one, press Save.

The keys and values will be added.
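The same variables can also be set from Cloud Shell instead of the UI. A sketch assuming an Airflow 1.10-based Composer environment (the subcommand syntax differs slightly on Airflow 2), with <environment-name> and <region> as placeholders for your environment:

$ gcloud composer environments run <environment-name> \
      --location <region> \
      variables -- --set gcp_project <project-ID>
$ gcloud composer environments run <environment-name> \
      --location <region> \
      variables -- --set gcs_bucket gs://<bucket-name>
$ gcloud composer environments run <environment-name> \
      --location <region> \
      variables -- --set gce_zone <zone of cluster>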

Open Composer.

Click on DAGs Folder

Copy the path.
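If you would rather fetch the DAGs folder path from the command line, the environment's dagGcsPrefix field holds the same gs:// path. A sketch, again with <environment-name> and <region> as placeholders:

$ gcloud composer environments describe <environment-name> \
      --location <region> \
      --format="get(config.dagGcsPrefix)"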

Click on Activate Cloud Shell

Paste the command below into the shell, replacing the DAG path with the one you copied, and press Enter.

$ gsutil cp gs://cloud-training/datawarehousing/lab_assets/hadoop_tutorial.py gs://<paste the DAG path>

This copies hadoop_tutorial.py into the environment's DAGs bucket.

The file will appear in the bucket, and after a short delay the DAG will also be displayed in Airflow.
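To confirm the copy succeeded, you can list the DAGs folder from Cloud Shell (the path below is the same one copied earlier):

$ gsutil ls gs://<paste the DAG path>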

In Airflow, click on the DAG. 

Click on Graph View. You will see the DAG's tasks laid out as a graph.

Hover the cursor over each task to see its details.

Click on any one of them.

Press View Log.

You can see the log for the execution.

Go to Cloud Storage and open the bucket we created.

The output file is saved there.

If the DAG has not executed yet, open Airflow > composer_hadoop_tutorial.

Click Trigger DAG, then press Trigger to confirm.

Open Graph View. Here you can see the execution.
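The DAG can also be triggered from Cloud Shell instead of the UI. A sketch assuming an Airflow 1.10-based environment (Airflow 2 uses dags trigger instead of trigger_dag):

$ gcloud composer environments run <environment-name> \
      --location <region> \
      trigger_dag -- composer_hadoop_tutorial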

The colors show the execution state of each task.

It is currently running create_dataproc_cluster.

Open Menu > Dataproc > Clusters

The cluster has been created.

Now the green border is on run_dataproc_hadoop, which is executing the Hadoop job.

Then it moves to delete_dataproc_cluster, which deletes the cluster.

Check the cluster in Dataproc; it has been deleted.
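You can also follow the cluster's lifecycle from Cloud Shell; once delete_dataproc_cluster finishes, the listing below will no longer show it (region is a placeholder):

$ gcloud dataproc clusters list --region <region>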

Open Dataproc > Jobs and open the job.

The job's output file is saved in the bucket we created.
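The job and its output can likewise be checked from Cloud Shell. The exact output path inside the bucket depends on the DAG, so the recursive listing below is simply a way to browse what was written:

$ gcloud dataproc jobs list --region <region>
$ gsutil ls -r gs://<bucket-name>/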

In Airflow, click on Code to view the DAG's source.

To delete the Composer environment, click on Delete.

Press Delete to confirm.
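The environment can also be deleted from Cloud Shell. Note that the DAGs bucket is not deleted automatically and has to be removed separately if you no longer need it:

$ gcloud composer environments delete <environment-name> --location <region>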

 
