Experience : Fresher - Core Development in Big Data - Hadoop
Required Certification : Hadoop
Job Responsibilities :
Loading data from different datasets and deciding which file format is most efficient for a given task.
Defining Hadoop Job Flows.
Having knowledge of distributed, reliable, and scalable data pipelines to ingest and process data in real time, fetching impression streams, transaction behaviours, clickstream data, and other unstructured data.
Managing Hadoop jobs using a scheduler.
Reviewing and managing Hadoop log files.
Designing and implementing column-family schemas for HBase and table schemas for Hive on HDFS; assigning schemas and creating Hive tables.
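As one hedged illustration of this responsibility, a Hive table can be mapped onto an HBase column family through Hive's HBase storage handler; the table, column-family, and column names below are hypothetical:

```sql
-- Hypothetical Hive table backed by an HBase table.
-- The row key maps to user_id; the 'clicks' column family holds the data columns.
CREATE TABLE user_clicks (
  user_id  STRING,
  page_url STRING,
  ts       BIGINT
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,clicks:page_url,clicks:ts')
TBLPROPERTIES ('hbase.table.name' = 'user_clicks');
```

Defining the schema this way lets Hive queries read and write HBase rows directly, so one column-family design serves both random access (HBase) and batch analytics (Hive).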
Knowledge of Pig and Hive scripts that join datasets using various techniques is a plus.
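As a sketch of one such join technique in Hive, a map-side (broadcast) join can be requested when one of the datasets is small; the table and column names here are assumed for illustration:

```sql
-- Hypothetical join of a large fact table with a small dimension table.
-- The MAPJOIN hint asks Hive to broadcast the small table to every mapper,
-- avoiding the shuffle of a reduce-side join.
SELECT /*+ MAPJOIN(d) */
       f.txn_id,
       f.amount,
       d.region_name
FROM transactions f
JOIN regions d
  ON f.region_id = d.region_id;
```

Other common techniques include bucketed map joins and skew joins, chosen based on the size and distribution of the datasets being combined.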
Applying different HDFS file formats and structures, such as Parquet and Avro, to speed up analytics.
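As a minimal sketch of how this works in practice (table and column names are hypothetical), a Hive table can be declared with a columnar storage format so that analytical scans read only the columns a query touches:

```sql
-- Hypothetical Parquet-backed table for analytics queries.
CREATE TABLE clickstream_parquet (
  user_id  STRING,
  event    STRING,
  event_ts BIGINT
)
STORED AS PARQUET;

-- Rewrite raw text-format data into the columnar layout.
INSERT OVERWRITE TABLE clickstream_parquet
SELECT user_id, event, event_ts
FROM clickstream_raw;
```

Parquet suits column-oriented analytics, while a row-oriented format like Avro is often the better fit for write-heavy ingestion and schema evolution.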