August 2, 2021

Data Engineer (Hadoop/Spark/PySpark) with Dataiku Experience

Role: Data Engineer (Hadoop/Spark/PySpark) with Dataiku experience

Location: NYC or Alpharetta, GA

Duration: 6-month contract-to-hire

 

Job Description:

Skills required:

·      10-12 years of experience developing and implementing statistical models in a Big Data ecosystem, e.g., Hadoop, Spark, HBase, Hive/Impala, or similar distributed computing technologies

·      Proficiency with Python/R and core libraries for statistical/econometric modeling, such as scikit-learn and pandas

·      Experience with Hadoop, Spark, HDFS, Python, R, PySpark, and other leading technologies

·      Proficiency with Dataiku or similar tools

·      Proficiency in data analysis using complex, optimized SQL and/or the above-mentioned technologies

·      Understanding of data architecture, data structures, data modeling, database design, and performance management

·      Good written and verbal communication skills

 

Responsibilities include:

·      Work closely with members of the WM Strats and Modeling team on the design, development, and implementation of large statistical databases in a Dataiku/Hadoop environment

·      Work closely with members of the WM Strats and Modeling team on the implementation of statistical and econometric models in Python/PySpark/R on the Dataiku platform

·      Work closely with members of the WM Strats and Modeling team to facilitate the processing of large data sets in a Hadoop environment using Spark/PySpark/SparkR

·      Ensure data integrity through data quality, validation, governance, and transparency

·      Handle production deployment and model monitoring to ensure stable performance and adherence to standards

 

Proficiency / experience with the following is a plus:

o  In-depth understanding of statistics

o  Finance, Mortgages, Bank Deposit Products

Thanks & Regards!

Neha Bharti Chaudhary

Sr. Talent Acquisition Specialist

neha.bharti@cyberThink.com

Contact: 908 739 3810 Ext 166