Sr. Data Engineer

Job Title: Sr. Data Engineer

Location: Tarrytown, NY

Type: CTH (Contract to Hire)

Overview:

The client is a growing technology company focusing on the retail, mortgage lending, and life sciences verticals. The client is looking for a Data Engineer who can design, architect, and guide technical solutions for its customers’ data processing requirements. The successful applicant will need to support pre-sales, architect solutions, develop data pipelines using open source technologies, guide the development team, and review the deliverables for client projects.

The applicant must be able to use a wide variety of technologies and be aware of the latest trends in data engineering, open source computing, cloud computing, and data warehousing.

The client operates as a startup, so it needs a critical thinker with strong attention to detail and the ability to be proactive. The data engineer will help develop solutions that meet project needs and will work closely with developers and stakeholders to ensure proper execution.

Responsibilities:

· Develop data pipelines using Airflow, Apache Spark, Apache NiFi, Hortonworks Data Platform, AWS Glue, Athena and other open source tools

· Model different layers of a data warehouse solution that meet the client/project needs

· Compare and contrast different open source tools and databases, including their function and application to data processing and analytics solutions

· Apply experience in Apache Spark and/or NoSQL environments, as well as traditional RDBMS, to architect best-in-class hybrid solutions

· Interact with both functional and technical groups to understand business needs, provide oversight, identify architectural options, evaluate those options against the overall enterprise information management architecture, and make recommendations that return analytic business value

· Design creative and original solutions to solve complex business problems

· Design and develop parallel and distributed data pipelines

· Analyze data, profile data, validate patterns, develop hypotheses about the data, and prove out those hypotheses

· Understand the industry’s competitive landscape and the strengths and weaknesses of commercial tools and emerging open source technologies

Requirements:

· Master’s or Bachelor’s degree in engineering science, computer science, or a related field

· Knowledge of data warehousing, big data, BI, and analytics solutions

· Expert knowledge of Linux

· Experience in developing highly performant data pipelines

· Experience in programming languages such as Scala/Java/Python

· 3+ years of experience in designing and running data pipelines

· Experience with open source technologies such as Apache Airflow, Apache NiFi, Hortonworks Data Platform, Apache Atlas, Apache Spark, Kafka, Flink, Cassandra

· Experience with public cloud services such as Microsoft Azure and Amazon Web Services (AWS)