Hadoop Developer with Cloudera / Denver, CO (Onsite from day 1 or Remote for 1-2 months)
Software engineer with Hadoop
Location: Denver, CO (Onsite from day 1 or Remote for 1-2 months)
Customer: Dish Network
Candidates who can be onsite from day 1 are preferred; if a candidate asks to work remotely for the first 1-2 months, please present them as well.
· Working experience with Hadoop YARN clusters running on Cloudera CDH 5.12.x and EMR 5.x.
· Experience with Big Data infrastructure for both batch and real-time processing (lambda architecture). Responsible for building scalable distributed data solutions in Hadoop.
· Should have worked extensively with AWS cloud services such as storage, analytics, and management tools.
· Should have worked on remodeling and migrating existing applications from on-premises Cloudera to AWS cloud services.
· Must have developed and scheduled multiple AWS Data Pipeline workflows to orchestrate jobs on both transient and persistent clusters.
· Must have developed workflows in the Oozie scheduler with multiple actions, forks, and joins.
· Must have developed Lambda functions that use CloudFormation templates to spin up and terminate EMR clusters.
· Should have used AWS services such as EMR, S3, Glue Metastore, and Athena extensively to build data applications.
· Should have used CloudWatch to trigger Lambda functions that standardize the environment daily.
· Should have developed Hive HQL scripts for loading data into external and internal tables using Hive optimization techniques.
· Must have written Python scripts to manage AWS resources using standard packages such as boto3.
· Must have developed Spark jobs using PySpark and Spark SQL to write data to tables and S3.
· Must have tuned Spark jobs with a custom random-partitioning method that distributes data evenly on write to handle data skew.
Looking for a positive response and fruitful alliance :)
Naveen Pandey
Sr. Technical Recruiter
Cell: 512-384-5686
Email: Naveen.pandey@okayainc.com