Data Engineer Big Data, Spark, SQL, Python, AWS
Gannett Company Inc
Contract McLean, Virginia, United States Posted 4 years ago
About Position
Data Engineer Big Data, Spark, SQL, Python, AWS (Contract)
$90.00 / Hourly
McLean, Virginia, United States
Description
Job Description
Same as below, with less expectation of architectural work.
(Required) Strong understanding of Spark, with experience delivering large-scale data processing applications in Spark and related Big Data technologies.
(Required) Strong understanding of performance tuning for Spark (Scala), Hadoop, Presto and Hive. Performance tuning can relate to Hadoop data modeling, JVM tuning and Spark.
(Required) Strong understanding of data structures, along with the strengths and weaknesses of using them. One example is data structures for modeling a time series data set.
(Required) Strong in SQL, with an understanding of joins, windowing and OLAP.
(Required) Hands-on coding in Scala and Java, along with Python. We use Python for our operational and workflow management. We use Scala and Java for our data processing (over Spark & Hadoop).
(Optional) Experience working in Big Data with any cloud provider (AWS, Google). Experience with Google is a huge bonus.
(Required) Exceptional communication skills, with an ability to work in a team of senior and junior engineers. Must be able to articulate and explain code, architecture and data models to engineers and non-technical audiences.
(Optional) Experience with a NoSQL database such as HBase is a huge bonus.
To reiterate, the following are the key requirements for the position:
Rock solid in Spark / Scala.
Strong understanding of Hadoop and its internal architecture (strong enough to recognize the challenges associated with any solution in Hadoop).
Basic understanding of AWS (S3, EC2).
Working skills in Python (we don't expect Python expertise).
Strong SQL skills.
Ability to understand the challenges associated with SQL (joins, windowing, etc.) in a distributed environment (Hadoop/Spark/Hive) vs. a non-distributed environment (Oracle/MySQL).
Good communicator with good listening skills.
Ability to understand the problem and ask relevant questions about it.
Ability to articulate technical design to the appropriate audience via email, in person or to a group of people.
Strong learning ability. We work in a diverse technical environment, so the ability to pick things up quickly is a key trait. We do not expect a person to be proficient in every technology we use.
Basic understanding of data structures and their associated complexities.