Spark Developer

Roivant Sciences

Contract New York, New York, United States Posted 6 years ago

About Position

Spark Developer (Contract)

$85.00 / Hourly

New York, New York, United States

Description

Roivant is dedicated to transformative innovation in healthcare. Their mission is to systematically reduce the time and cost of the drug development process. They partner with innovative biopharmaceutical companies and academic institutions to ensure that important medicines are rapidly delivered to patients. Their goal is to serve their partners, contribute positively to the healthcare system, and improve the lives of patients around the world.

Roivant is the parent of a growing family of companies focused on diverse areas including women's health, rare diseases, neurology, cardiometabolic diseases, urology, dermatology, and healthcare data analytics. These subsidiaries, referred to internally as Vants, are Axovant, Enzyvant, Urovant, Metavant, Myovant, Dermavant, and Datavant. Roivant provides data and infrastructure as a service to the Vants. Within each Vant there are teams called Digital Innovators: data scientists who use the latest techniques to discover how to bring viable products to market faster.

Roivant is currently looking for a Python/Spark developer to help stand up a Spark cluster sandbox environment that the Digital Innovators can use for their data science, artificial intelligence, and machine learning work. The Data group is responsible for developing a set of standard tools that these innovators can use to facilitate their work. One of those tools is Spark. The Spark developer's responsibility will be to build the Spark prototype, test it, configure it, and then showcase it to the Digital Innovators so they know how to use the tool. The Spark developer will install, configure, code, and maintain the Spark solution. Candidates should know PySpark, Spark infrastructure, YARN, and cluster management solutions. This role will be a mix of coding and administration.
As the solution matures, less and less administration will be needed.

The data comes from many different sources. The first is internal enterprise-level systems such as HR, travel, finance, and legal. Another is bulk data from research data firms and claims data firms (up to 15 billion records). The last is data that Roivant produces itself, such as trial data, clinical data, and commercial data. 80 percent of the data is small, 10 percent is medium-sized data sets, and the remaining 10 percent is in the terabyte range. All of these sources feed into the firm-wide data lake that they are building out. The data is delivered via S3 buckets, Databricks with encryption keys, and traditional data ingestion. All of the data is stored in the Parquet format, so candidates need specific experience working with Parquet data.

- Data as a service in Spark
- Parquet formats for quick computing
- Building out a prototype which will become their Spark platform
- This will be the beginning of a long-term engagement
- Building models in Spark that will eventually feed into Kubernetes and Docker (DevOps)
- Candidate must be able to communicate technically
- Candidate must have PySpark and Python experience
- YARN or another cluster management tool
- 50 percent admin work and 50 percent development at first, moving toward more development

Job Summary

$85.00 / Hourly

Contract

New York, New York, United States

Experience Required : 7 years

Posted : 6 years ago

Deadline : September 22, 2018

Job ID : Job0000014371

Roivant Sciences

320 West 37th Street 5th Floor

WWW.Roivant.COM