Data Engineer
State Street Corporation
Contract Quincy, Massachusetts, United States Posted 2 weeks ago
About Position
Data Engineer (Contract)
$70.00 / Hourly
Quincy, Massachusetts, United States
Skills
• Technical Expertise: 8+ years in data engineering with strong skills in Python, PySpark, and SQL, and extensive hands-on experience with Databricks and big data frameworks. Expertise in integrating data science workflows and deploying ML models for real-time and batch processing within a cybersecurity context.
• Cloud Proficiency: Advanced proficiency in AWS, including EC2, S3, Lambda, ELB, and container orchestration (Docker, Kubernetes). Experience managing large-scale data environments on AWS, optimizing for performance, security, and compliance.
• Security Integration: Proven experience implementing SCAS, SAST, DAST/WAS, and secure DevOps practices within an SDLC framework to ensure data security and compliance in a high-stakes cybersecurity environment.
• Data Architecture: Demonstrated ability to design and implement complex data architectures, including data lakes, data warehouses, and lakehouse solutions. Emphasis on secure, scalable, and highly available data structures that support ML-driven insights and real-time analytics.
• Data Quality & Governance: Hands-on experience with automated data quality checks, data lineage, and governance standards. Proficiency in Databricks DQM or similar tools to enforce data integrity and compliance across pipelines.
• Data Analytics & Visualization: Proficiency with analytics and visualization tools such as Databricks, Power BI, and Tableau to generate actionable insights into cybersecurity risks, threat patterns, and vulnerability trends. Skilled in translating complex data into accessible visuals and reports for cross-functional teams.
• CI/CD and Automation: Experience building CI/CD pipelines that automate testing, security scans, and deployment processes. Proficiency in deploying ML models and data processing workflows using CI/CD, ensuring consistent quality and streamlined delivery.
• Agile Experience: Deep experience in Agile/Scrum environments, with a thorough understanding of Agile core values and principles, effectively delivering complex projects with agility and cross-functional collaboration.
Description
Data Integration, API Development: Integrate diverse cybersecurity data sources using a variety of API mechanisms to standardize and streamline data across the data and user planes.
Build and maintain data APIs for seamless access to data pipelines, enabling real-time insights for applications, machine learning models, and analytical layers.
Data Pipeline Engineering & Optimization: Design, develop, and optimize large-scale ETL/ELT pipelines on Databricks to efficiently process and transform cybersecurity data. Use Python, PySpark, and Databricks to automate and standardize data workflows across stages (raw, cleaned, curated), ensuring scalability and high performance.
Data Quality & Governance: Implement automated data quality checks, leveraging Databricks DQM tools and CI/CD pipelines to uphold data integrity and governance standards. Ensure data lineage, metadata management, and compliance with cybersecurity and privacy regulations, applying rigorous quality standards across data ingestion and processing workflows.
Data Analytics & Visualization: Design centralized data models and perform in-depth data analysis to support cybersecurity and risk management objectives. Develop visualizations and dashboards using tools like Databricks, and expose the data to a React.js application layer, to provide stakeholders with actionable insights into threat landscapes, vulnerability trends, and performance metrics across the platform.
Scalable & Secure Data Architecture: Architect and manage secure, high-performance data environments on Databricks, utilizing AWS services such as S3, ELB, and Lambda. Ensure data availability, consistency, and security, aligning with AWS best practices and data encryption standards to safeguard sensitive cybersecurity data.
Agile Product & Engineering Continuous Delivery: Collaborate with cross-functional Agile Product & Engineering teams to deliver data-driven insights through analytics tools and custom visualizations that inform strategy and decision-making. Empower stakeholders with timely, actionable intelligence from complex data analyses, enhancing their ability to respond to evolving cybersecurity risks.
Data Science & ML Integration: Deploy machine learning models, including predictive analytics, anomaly detection, and risk scoring algorithms, into the CASM platform. Leverage Python and PySpark to enable real-time and batch processing of model outputs, enhancing the CASM platform’s proactive threat detection and response capabilities.
Mentorship & Best Practices Promotion: Mentor junior engineers, establishing best practices in data engineering, DevOps, data science, and analytics. Encourage high standards in model deployment, data security, performance optimization, and visualization practices, fostering a culture of innovation and excellence.
Responsibilities
• Advanced Data Modeling & Governance: Expertise in designing data models for cybersecurity data analytics, emphasizing data lineage, federation, governance, and compliance. Experience ensuring security and privacy within data architectures.
• Machine Learning & Predictive Analytics: Experience deploying ML algorithms, predictive models, and anomaly detection frameworks to bolster the CASM platform’s cybersecurity capabilities.
• High-Performance Engineering Culture: Background in mentoring engineers in data engineering best practices, promoting data science, ML, and analytics integration, and fostering a culture of collaboration and continuous improvement.