Site Reliability Engineer
Paramount Pictures Motion Picture Group
Contract New York, New York, United States Posted 1 year ago
About Position
Site Reliability Engineer (Contract)
$0.00 / Hourly
New York, New York, United States
Skills
You have a passion for data and seek to quantify all things! You thrive on designing systems with scale, self-healing, and automation as your guiding principles. You love the challenges of monitoring at large scale and are compelled by problems of analysis and large-scale data collection. You are at home with systems-engineering challenges and service-based architecture. You have on-call experience and seek out ways to improve the on-call rotation for the team. You can plan project lifecycles and evangelize best practices across teams. You are passionate about mentorship and seek to promote a culture of collaboration.
Description
Paramount Streaming seeks a Senior Site Reliability Engineer for our online television and media-focused web properties. In this role, you will support the Kubernetes platform that serves our streaming products in the cloud. Our team seeks to produce Observability infrastructure that is fast, self-healing, and operates at a global scale. We aim to produce a platform that is opinionated about reliability best practices and that also provides best-in-class tooling for our engineering organization. This is a great opportunity for a seasoned Site Reliability Engineer to build systems that have global reach and impact millions of users.
Responsibilities
- Provide support and guidance for the Observability platform, its integrations, and best practices across multiple engineering teams.
- Build and manage Observability infrastructure at a global scale.
- Build self-healing and automated systems on Kubernetes.
- Design and build systems to collect, visualize, and store service health indicators.
- Design Observability tooling utilizing a hybrid of open-source and enterprise solutions.
- Other duties and responsibilities, as assigned.
Educational Requirements
- Implementing log collection and storage via Elasticsearch.
- Building visualizations for multiple services, utilizing different types of data sources.
- Working with Prometheus time-series data, producing metrics and integrations.
- Building and supporting robust event queues via Kafka.
- Working with our development teams to instrument their applications and capture events that support our global product deployment.
Experience Requirements
- Bachelor's degree or equivalent experience
- 5+ years managing and monitoring Linux systems
- 2+ years leading the design and implementation of cloud systems in AWS/GCP using tools like Terraform and Kubernetes
- Experience with CI/CD tooling such as Jenkins
- 4+ years’ experience working with monitoring, logging, and visualization tooling, such as Prometheus, Elasticsearch, and Grafana
- 2+ years’ experience programming in a language such as Java, Python, or Go
- On-call experience
- Ability to manage the lifecycle of multiple projects
- Ability to collaborate across teams
- Experience mentoring junior engineers and writing onboarding documentation