Prerequisites
GCP account
Open the Google Cloud Console.
Click Activate Cloud Shell.
$ git clone https://github.com/GoogleCloudPlatform/training-data-analyst
$ ls
Create a bucket in the console. Use the project ID as the bucket name.
In Cloud Shell, run the commands below:
$ BUCKET="<bucket-name>"
$ echo $BUCKET
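Because the bucket shares its name with the project ID, the variable can also be derived from `DEVSHELL_PROJECT_ID`, which Cloud Shell sets for the active project. A minimal sketch, assuming straight ASCII quotes (curly quotes copied from a web page become part of the value) and a hypothetical fallback name `my-project-id` for use outside Cloud Shell:

```shell
# Use the Cloud Shell project ID as the bucket name.
# "my-project-id" is a hypothetical placeholder used only when
# DEVSHELL_PROJECT_ID is not set (for example, outside Cloud Shell).
BUCKET="${DEVSHELL_PROJECT_ID:-my-project-id}"

# Print the value so a typo is caught before later commands rely on it.
if [ -n "$BUCKET" ]; then
  echo "BUCKET=$BUCKET"
else
  echo "BUCKET is empty" >&2
fi
```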
Open Menu > APIs & Services > Library.
Search for Dataflow, then click Dataflow API.
Click Enable.
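The same API can usually be enabled from Cloud Shell instead of the console. A hedged sketch, assuming `gcloud` is installed and already pointed at the right project; the guard only keeps the snippet from failing where `gcloud` is unavailable:

```shell
# Service name of the Dataflow API.
API="dataflow.googleapis.com"

if command -v gcloud >/dev/null 2>&1; then
  # Enable the API for the currently configured project.
  gcloud services enable "$API" || echo "Could not enable $API" >&2
else
  echo "gcloud not found; enable $API in the console instead" >&2
fi
```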
$ cd training-data-analyst/courses/data_analysis/lab2/python
$ ls
The files in the directory are listed.
$ nano install_packages.sh   # Open install_packages.sh
The file contents are displayed. This script installs the required components.
$ sudo ./install_packages.sh
To check the pip version and the Python version it is tied to:
$ pip -V
$ pip3 -V
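If it is unclear whether the install script succeeded, the check can be wrapped so that a missing tool is reported instead of producing a "command not found" error. A small sketch; the function name `report_versions` is hypothetical:

```shell
# Report the version of each tool, or note that it is missing.
report_versions() {
  for tool in python3 pip3; do
    if command -v "$tool" >/dev/null 2>&1; then
      "$tool" --version || echo "$tool: version check failed"
    else
      echo "$tool: not installed"
    fi
  done
}

report_versions
```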
$ nano grep.py   # Open grep.py and review its contents
$ python3 grep.py
$ ls /tmp   # Check that the pipeline wrote its output files
$ cat /tmp/output-*   # Display the contents of the output
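To make the expected output concrete, here is a rough shell equivalent of what a grep-style pipeline produces. It is only a sketch under assumptions: the sample file `Demo.java`, the search term `import`, and the match-at-start-of-line rule are illustrative and are not taken from `grep.py` itself.

```shell
# Work in a throwaway directory so nothing in /tmp is overwritten.
workdir="$(mktemp -d)"

# Hypothetical input standing in for the Java sources the pipeline reads.
printf '%s\n' \
  'import java.util.List;' \
  'public class Demo {}' \
  'import java.io.File;' > "$workdir/Demo.java"

# Keep only lines that start with the search term, and write them to a
# file named like a single-shard pipeline output.
grep -h '^import' "$workdir"/*.java > "$workdir/output-00000-of-00001"

cat "$workdir/output-00000-of-00001"
```

The two `import` lines are kept and the class declaration is dropped, which is the kind of filtered listing to look for in `/tmp/output-*`.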
$ gsutil cp ../javahelp/src/main/java/com/google/cloud/training/dataanalyst/javahelp/*.java gs://$BUCKET/javahelp
Check that the files were saved:
Open Menu > Cloud Storage.
Open the bucket.
Confirm that the files were copied.
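The upload can also be checked from the shell rather than the console. A minimal sketch, assuming `gsutil` is available and `BUCKET` is set as earlier; `my-project-id` is a hypothetical fallback used only when the variable is missing:

```shell
# Fall back to a hypothetical bucket name if BUCKET is not set.
BUCKET="${BUCKET:-my-project-id}"
DEST="gs://${BUCKET}/javahelp"

if command -v gsutil >/dev/null 2>&1; then
  # List the uploaded objects under the javahelp/ prefix.
  gsutil ls "$DEST" || echo "Could not list $DEST" >&2
else
  echo "gsutil not found; check $DEST in the console instead" >&2
fi
```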
$ echo $DEVSHELL_PROJECT_ID
$ echo $BUCKET
$ nano grepc.py
Edit the file and set these values:
PROJECT='<project_ID>'
BUCKET='<bucket_name>'
Note: If the project ID and the bucket name are the same, use the same value for both.
To save and exit, press Ctrl + X, then Y, then Enter.
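As an alternative to editing by hand, the two assignments could be rewritten with `sed`. The sketch below works on a throwaway stand-in file, not the real `grepc.py`; the old and new values are hypothetical, and the patterns assume each assignment starts its line with no spaces around `=`.

```shell
# Throwaway copy standing in for grepc.py; the two values below are
# hypothetical placeholders, not the real file contents.
workdir="$(mktemp -d)"
printf '%s\n' "PROJECT='old-project'" "BUCKET='old-bucket'" > "$workdir/grepc.py"

# Hypothetical new values; in Cloud Shell these could come from
# $DEVSHELL_PROJECT_ID and $BUCKET.
NEW_PROJECT="my-project-id"
NEW_BUCKET="my-project-id"

# Rewrite both assignments, writing to a new file and then replacing
# the original so the command works with any POSIX sed.
sed -e "s/^PROJECT=.*/PROJECT='${NEW_PROJECT}'/" \
    -e "s/^BUCKET=.*/BUCKET='${NEW_BUCKET}'/" \
    "$workdir/grepc.py" > "$workdir/grepc.py.new"
mv "$workdir/grepc.py.new" "$workdir/grepc.py"

cat "$workdir/grepc.py"
```

Writing to a temporary file and moving it into place avoids `sed -i`, whose syntax differs between GNU and BSD versions.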
$ python3 grepc.py   # Run grepc.py
Open Console > Dataflow > Jobs.
Open the job that was just run.
Click Job Graph.
The graph is displayed.
On the right side of the Job Graph, you can see the job info and resource metrics.
Open Cloud Shell.
$ ls
$ nano is_popular.py
This opens the file is_popular.py.
$ python3 ./is_popular.py   # Run is_popular.py
$ cat /tmp/output-*   # Display the output
$ python3 ./is_popular.py --output_prefix=/tmp/myoutput
$ nano /tmp/myoutput-00000-of-00001
This opens the output file.
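The output file name follows a prefix-plus-shard pattern, as `/tmp/myoutput-00000-of-00001` shows. A small sketch that builds that name for a single shard, assuming the five-digit zero padding seen in that file name:

```shell
# Build the name of shard 0 of 1 for a given output prefix.
prefix="/tmp/myoutput"
shard_file="$(printf '%s-%05d-of-%05d' "$prefix" 0 1)"
echo "$shard_file"
```

With a different `--output_prefix`, only the `prefix` part changes, so a wildcard such as `/tmp/myoutput-*` matches every shard.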
Open Menu > Cloud Storage.
Open the bucket.
Open the javahelp/ folder.
The outputs are stored there.