Role: Data Engineer (Hadoop/Spark/PySpark) with Dataiku experience
Location: NYC or Alpharetta, GA
Duration: 6-month contract-to-hire

Job Description:

Skills required:
· Experienced professional with 10-12 years of experience developing and implementing statistical models in the Big Data ecosystem, i.e., Hadoop, Spark, HBase, Hive/Impala, or similar distributed computing technology
· Proficiency with Python/R and core libraries for statistical/econometric modeling, such as scikit-learn and pandas
· Experience with Hadoop, Spark, HDFS, Python, R, PySpark, and other leading technologies
· Proficiency with Dataiku or similar tools
· Proficiency in data analysis using complex, optimized SQL and/or the above-mentioned technologies
· Understanding of data architecture, data structures, data modeling, database design, and performance management
· Good written and verbal communication skills

Responsibilities include:
· Work closely with members of the WM Strats and Modeling team on the design, development, and implementation of large statistical databases in a Dataiku/Hadoop environment
· Work closely with members of the WM Strats and Modeling team on the implementation of statistical and econometric models in Python/PySpark/R on the Dataiku platform
· Work closely with members of the WM Strats and Modeling team to facilitate processing of large data in the Hadoop environment using Spark/PySpark/SparkR
· Ensure data integrity through data quality, validation, governance, and transparency
· Production deployment and model monitoring to ensure stable performance and adherence to standards

Proficiency/Experience with the following a plus:
o In-depth understanding of Statistics
o Finance, Mortgages, Bank Deposit Products