Lead Data Engineer
Regeneron Pharmaceutical Inc
Contract Tarrytown, New York, United States Posted 11 months ago
About Position
Lead Data Engineer (Contract)
$75.00 / Hourly
Tarrytown, New York, United States
Skills
PySpark, Redshift, Airflow, AWS, with lead experience
Description
· Candidate should have 12 years of experience in data engineering and strong work experience with the onshore-offshore model.
· Designing, creating, testing and maintaining the complete data management & processing systems.
· Candidate needs an in-depth understanding of how data pipelines are built:
o Typical challenges with fetching data from various sources, and how incremental/CDC data flows are handled.
o How to ensure data quality.
o How to do data profiling.
· Hands-on experience with PySpark, Redshift (SQL) and Airflow at minimum
· Strong hands-on command of the required tech skills, flexibility, and the right attitude to play the lead role.
· Should be able to design and document data models at various levels.
· Working closely with the stakeholders.
· Building highly scalable, robust & fault-tolerant systems.
· Knowledge of the Hadoop ecosystem and its frameworks: HDFS, YARN, MapReduce, Apache Pig, Hive, Flume, Sqoop, ZooKeeper, Oozie, Impala and Kafka.
· Must have experience with SQL-based technologies (e.g. MySQL/Oracle DB) and NoSQL technologies (e.g. Cassandra and MongoDB).
· Should have Python/Scala/Java Programming skills
· Discovering data acquisition opportunities.
· Finding ways to extract value from existing data.
· Improving data quality, reliability & efficiency of the individual components & the complete system.
· Problem-solving mindset, working in an agile environment.