Land top tech jobs in Silicon Valley! Find software, data, and AI roles at the biggest U.S. startups and tech giants.

Sr. Data Engineer

Location: Philadelphia, PA. The role will be remote to start due to COVID, but must later be onsite in PA.

Python, Spark, Databricks, Scala, and AWS are must-haves.

Requirements: 
·        Databricks has a ‘community’ (free) version that you can log into to play around with the environment that we use. Demonstrating some proficiency in navigating Databricks would be nice to see. To my knowledge, any coding exercise will take place in Databricks. No promises there, though; I’m just mentioning it because the team has used it for recent interviews.
·        Python is the primary language that’s been evaluated. My project relies mostly on Scala and R, but other projects the team is managing are heavier on Python, with Scala sprinkled in. Databricks is a ‘notebook’ environment that lets you write in a few different languages within one notebook, so in any given notebook you could see SQL, Python, Scala, and R (and we have used all four in a single notebook). We don’t do any Java that I’m aware of; if we do, it is very little.
·        Spark is absolutely critical. I do not know how they evaluate it in the interviews, but we are all in on Spark. We do a ton of data transformations on Spark DataFrames: deriving new columns from existing ones, user-defined functions, reading in TBs of data and filtering it, streaming, ETL, etc. Almost all of it is done in Spark. We use the serverless framework and AWS for building APIs.
·        AWS is vital to our operational scale. We use S3, Lambda, API Gateway, Kinesis, CloudWatch, Timestream (a new AWS service for large-scale time series data), Neptune (graph database), SNS, Step Functions, VPCs, EC2, RDS (MySQL), and probably a bunch of other things outside my project scope.
Thanks, 
Chandan Soni
Cyberthink Inc.  
a:  685, Route 202/206 Ste. 101, Bridgewater NJ 08807

Thanks

Gigagiglet
gigagiglet.blogspot.com