Working with Cloud Composer using Airflow

  • date 31st May, 2021 |
  • by Prwatech |


GCP account

Open Console

Open Menu > Cloud Storage > Browser

Click on Create Bucket

Create a bucket with the same name as the project ID, then click Create.

The bucket will be created.
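The bucket can also be created from Cloud Shell; a minimal sketch, assuming the gcloud SDK is already authenticated and `us-central1` is an acceptable location:

```shell
# Name the bucket after the current project ID
# (assumption: this matches the tutorial's bucket-naming step above).
PROJECT_ID=$(gcloud config get-value project)
gsutil mb -l us-central1 gs://${PROJECT_ID}
```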

In Composer, click on the Airflow link.

Choose the account to log in with.

The Airflow DAGs page will open.

Go to Menu > Kubernetes Engine > Clusters

The cluster that Composer created for the environment will be listed here.
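You can also confirm the environment's cluster from Cloud Shell (a sketch, assuming the gcloud SDK is authenticated against the same project):

```shell
# Lists all GKE clusters in the project;
# the Composer-managed cluster should appear among them.
gcloud container clusters list
```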

In Airflow, Go to Admin > Variables

Click on Create.

Key                Val

gcp_project        <project-ID>

gcs_bucket         gs://<bucket-name>

gce_zone           <zone of cluster>

Enter these one by one in the Key and Val fields, pressing Save and Add Another after each.

For the last one, press Save.

The keys and values will be added.
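The same variables can be set from Cloud Shell instead of the Airflow UI; a sketch, where the environment name `composer-env` and location `us-central1` are assumptions to be replaced with your own:

```shell
# Assumed environment name and location; substitute your own values.
ENVIRONMENT=composer-env
LOCATION=us-central1

# Each call runs the Airflow CLI inside the Composer environment
# to set one of the three variables from the table above.
gcloud composer environments run ${ENVIRONMENT} --location ${LOCATION} \
    variables -- --set gcp_project <project-ID>
gcloud composer environments run ${ENVIRONMENT} --location ${LOCATION} \
    variables -- --set gcs_bucket gs://<bucket-name>
gcloud composer environments run ${ENVIRONMENT} --location ${LOCATION} \
    variables -- --set gce_zone <zone of cluster>
```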

Open Composer.

Click on DAGs Folder

Copy the path.
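The DAGs folder path can also be fetched from Cloud Shell; a sketch, again assuming an environment named `composer-env` in `us-central1`:

```shell
# Prints the gs:// path of the environment's DAGs folder.
gcloud composer environments describe composer-env \
    --location us-central1 --format="get(config.dagGcsPrefix)"
```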

Click on Activate Cloud Shell

Paste the command below into the shell, replacing the destination with the copied DAGs folder path, and press Enter:

$ gsutil cp -r gs://cloud-training/datawarehousing/lab_assets/ gs://<paste the DAG path>

It will copy the lab assets into the environment's bucket.

The file will be copied into the environment's bucket, and after a short delay the DAG will also be displayed in Airflow.

In Airflow, click on the DAG. It may already have been executed.

Click on Graph View. You can see the graph-like structure of the tasks.

Hover the cursor over each task to see its details.

Click on any one of them.

Press View Log.

You can see the log for the Execution.

Go to Cloud Storage and open the bucket we created.

The file will be saved in it.

If it has not been executed, open Airflow > composer_hadoop_tutorial.

Click on Trigger DAG

Click on Trigger
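The DAG can also be triggered from Cloud Shell; a sketch, with the environment name and location as assumptions and `composer_hadoop_tutorial` being the DAG ID from this tutorial:

```shell
# Triggers a run of the composer_hadoop_tutorial DAG
# via the Airflow CLI inside the Composer environment.
gcloud composer environments run composer-env \
    --location us-central1 trigger_dag -- composer_hadoop_tutorial
```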

Click on Graph view. Here you can see the execution.

The colors below show each task's execution state.

It is now running create_dataproc_cluster.

Open Menu > Dataproc > Clusters

Here you can see the cluster being created.

Now the green border is on run_dataproc_hadoop, which is executing the Hadoop job.

Then it changes to delete_dataproc_cluster. It will delete the cluster.

Check the cluster in Dataproc. The created cluster will have been deleted.

Open Dataproc > Jobs and open the job.

See where the output file is saved.

The output will be saved in the bucket.

In Airflow, click on Code.

You can see the DAG code that was used for execution.

To delete the Composer environment, click on Delete.

Press Delete to confirm.
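Deletion can also be done from Cloud Shell; a sketch, with the environment name and location as assumptions:

```shell
# Deletes the Composer environment; this is irreversible,
# so double-check the name and location before running.
gcloud composer environments delete composer-env --location us-central1
```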
