Open Menu > Cloud Storage > Browser
Click on Create Bucket
Create a bucket with the same name as the project ID, then click Create.
The bucket will be created.
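The bucket can also be created from Cloud Shell. A minimal sketch, assuming Cloud Shell's built-in $DEVSHELL_PROJECT_ID variable holds your project ID:
$ gsutil mb gs://$DEVSHELL_PROJECT_ID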
In Composer, click the Airflow link.
Choose the login account when prompted.
The Airflow web UI opens, showing the list of DAGs.
Go to Menu > Kubernetes Engine > Clusters.
The GKE cluster that backs the Composer environment is listed here once it has been created.
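To check from Cloud Shell instead, you could list the GKE clusters in the project (cluster names will vary per environment):
$ gcloud container clusters list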
In Airflow, go to Admin > Variables.
Click Create.
Enter each required variable as a Key/Value pair, for example:
gce_zone   <zone of the cluster>
Enter them one at a time, pressing Save and Add Another after each; for the last one, press Save.
The keys and values are now stored.
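Variables can also be set from Cloud Shell through gcloud using the Airflow 1.x CLI syntax. This sketch assumes a hypothetical environment named composer-env in us-central1 and the zone us-central1-a; substitute your own values:
$ gcloud composer environments run composer-env --location us-central1 variables -- --set gce_zone us-central1-a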
In Composer, click the DAGs Folder link.
Copy the path.
Click on Activate Cloud Shell
Paste the command below into the shell, substituting the DAGs folder path you copied, and press Enter:
$ gsutil cp gs://cloud-training/datawarehousing/lab_assets/hadoop_tutorial.py gs://<DAGs folder path>
This copies hadoop_tutorial.py into the environment's DAGs bucket.
After a short delay, the DAG also appears in the Airflow web UI.
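To confirm the copy, you can list the DAGs folder from Cloud Shell, using the same path you copied:
$ gsutil ls gs://<DAGs folder path>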
In Airflow, click the composer_hadoop_tutorial DAG. It may already have executed, since it is scheduled to run once loaded.
Click Graph View to see the DAG rendered as a graph.
Hover the cursor over each task to see its details.
Click any one of the tasks.
Press View Log.
The log for that task's execution is displayed.
Go back to Cloud Storage and open the bucket you created earlier.
The job's output file is saved in it.
If the DAG has not executed, open Airflow > composer_hadoop_tutorial.
Click Trigger DAG.
Click Trigger.
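The DAG can also be triggered from Cloud Shell; this sketch again assumes the hypothetical environment name composer-env in us-central1:
$ gcloud composer environments run composer-env --location us-central1 trigger_dag -- composer_hadoop_tutorial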
Click Graph View. Here you can watch the execution.
The colors show each task's execution state.
The run starts with create_dataproc_cluster, which has a green border while it is running.
Open Menu > Dataproc > Clusters.
Here you can see the cluster being created.
Next, the green border moves to run_dataproc_hadoop, which runs the Hadoop job on the cluster.
It then moves to delete_dataproc_cluster, which deletes the cluster.
Check Dataproc > Clusters again: the cluster created earlier has been deleted.
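You can confirm the deletion from Cloud Shell as well, assuming the cluster was created in us-central1:
$ gcloud dataproc clusters list --region us-central1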
Open Dataproc > Jobs and open the job.
The job details show where the output file is saved.
The output is saved in the bucket you created.
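To inspect the output from Cloud Shell, list the bucket's contents (again assuming the bucket name matches $DEVSHELL_PROJECT_ID):
$ gsutil ls gs://$DEVSHELL_PROJECT_ID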
In Airflow, click Code.
You can see the DAG source code that was used for the execution.
To delete the Composer environment, select it in the environments list and click Delete.
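The environment can also be deleted from Cloud Shell; the same assumed name and location apply:
$ gcloud composer environments delete composer-env --location us-central1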