Engineering Manager - Datasets Enrichment

Wayve

Location:
United Kingdom, London

Contract Type:
Not provided

Salary:

Not provided

Job Description:

We are hiring an Engineering Manager (M4) to lead the team responsible for both the semantic enrichment pipelines and the final silver and gold layers of the Wayve Corpus. This team transforms multimodal driving data and perception model outputs into reliable, high-quality data products used across autonomy, evaluation, simulation, and research. The role combines high-scale, ML-in-the-loop enrichment pipelines (semantic segmentation, cuboid annotation, embeddings, and BC and ODD signals) with production-grade data engineering ownership, including schema governance, table interfaces, quality gates, lineage, and SLO-based operations. You will lead a team of up to 10 engineers across ML engineering, perception, and data engineering. You will own a multi-quarter roadmap that scales enrichment throughput, improves data quality, and hardens the corpus tables used across Wayve. You will partner with application, model training, and evaluation teams to ensure alignment on requirements and interfaces. This role requires a leader comfortable at the intersection of ML systems and data engineering who can provide clear direction, reliable delivery, and strong people leadership during a period of significant technical and organizational scaling.

Job Responsibility:

  • Lead, coach, and grow a team of up to 10 engineers across ML engineering, perception, and data engineering
  • Define team structure, roles, leveling, hiring needs, and long-term growth plans
  • Own and scale semantic enrichment pipelines, including semantic segmentation, cuboids, embeddings, and scenario and ODD classification
  • Integrate ML-assisted labeling, validation, and automated quality checks into enrichment workflows
  • Own the silver and gold layers of the Wayve Corpus, including schema evolution, versioning, documentation, lineage, observability, and SLO-backed operations
  • Establish data quality gates and quality metrics for enriched and corpus-level data
  • Deliver a multi-quarter roadmap spanning enrichment and corpus systems with predictable execution
  • Lead architecture decisions to improve efficiency, maintainability, and reliability
  • Partner with Data Platform on distributed compute systems including Spark, Databricks, Ray, and Flyte
  • Align with autonomy, evaluation, and research teams on corpus requirements, interfaces, and lifecycle

Requirements:

  • 2+ years managing engineering teams in ML systems, perception, or large-scale data infrastructure
  • Experience delivering ML in production or perception pipelines, or strong experience in production data engineering systems; ideally, exposure to both
  • Proven ownership of production data tables in systems such as Delta Lake, Spark, Hive, or BigQuery, including schema evolution and multi-team consumers
  • Experience with distributed compute systems such as Spark, Databricks, Ray, or Flyte
  • Experience building observable, high-throughput pipelines
  • Ability to lead multi-quarter delivery, manage dependencies, and align with multiple stakeholders
  • Strong communication and cross-functional collaboration skills

Nice to have:

  • Experience with multimodal perception data such as images, video, and LiDAR
  • Experience with annotation workflows or ML-assisted labeling systems
  • Experience with embeddings, feature stores, or ML data layers
  • Familiarity with data quality frameworks and operational analytics
  • Experience in autonomous vehicles, robotics, or large-scale computer vision systems

Additional Information:

Job Posted:
January 01, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Engineering Manager - Datasets Enrichment

Tech Lead - Pretraining Team, Wayve Foundation Model

This is a rare opportunity to lead foundational work at the intersection of larg...
Location:
United States, Sunnyvale
Salary:
Not provided
Wayve
Expiration Date:
Until further notice
Requirements:
  • Leadership in data-centric AI: Experience leading research or engineering teams focused on dataset curation, filtering, or enrichment at scale, particularly for large-scale model pretraining.
  • Contributions to data benchmarks or tools: Involvement in projects like DataComp, LAION, DINO, MOLMO, or equivalent initiatives that define or evaluate pretraining dataset quality.
  • Deep understanding of distributed data processing: Strong working knowledge of frameworks such as Ray, Spark, Dask, or equivalent, and designing scalable, fault-tolerant data pipelines.
  • Hands-on deep learning expertise: Strong proficiency in PyTorch and a solid grasp of how data quality, distribution, and structure impact training dynamics and model generalisation.
  • Experimental mindset: Demonstrated ability to run and interpret data-centric experiments (e.g., small-scale trials, ablations) to inform large-scale model training.
  • Collaboration with research: Experience working closely with ML researchers and contributing to experimental design, pretraining strategies, or evaluation design.
  • Minimum 5 years of relevant industry experience: Including at least several years in data-heavy, model-driven environments involving deep learning at scale.
Job Responsibility:
  • Lead data curation, enrichment, and filtering efforts for large-scale pretraining of embodied models
  • Build and manage distributed data processing and ingestion pipelines across modalities
  • Partner with research teams to run data-centric experiments and influence model training strategy
  • Identify, integrate, and leverage third-party datasets to enhance pretraining and evaluation
  • Manage and mentor a team of engineers and data scientists to deliver scientific and technical impact
What we offer:
  • Attractive compensation with salary and equity
  • Immersion in a team of world-class researchers, engineers and entrepreneurs
  • A unique position to shape the future of autonomy and tackle the biggest challenge of our time
  • Bespoke learning and development opportunities
  • Relocation support with visa sponsorship
  • Flexible working hours - we trust you to do your job well, at times that suit you and your team
  • Benefits such as an onsite chef, workplace nursery scheme, private health insurance, therapy, daily yoga, onsite bar, large social budgets, unlimited L&D requests, enhanced parental leave, and more!
Employment Type:
Full-time

Senior Platform Engineer, ML Data Systems

We’re looking for an ML Data Engineer to evolve our eval dataset tools to meet t...
Location:
United States, Mountain View
Salary:
137871.00 - 172339.00 USD / Year
Khan Academy
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
  • 5 years of Software Engineering experience with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face
  • Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
  • Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems
  • Familiarity with machine learning workflows — from training data preparation to evaluation
  • Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations
  • Attention to detail and an obsession with data quality and reproducibility
  • Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
  • Proven cross-cultural competency skills demonstrating self-awareness, awareness of others, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.
Job Responsibility:
  • Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets
  • Clean, normalize, and enrich data while preserving semantic meaning and consistency
  • Prepare and format datasets for human labeling, and integrate results into ML datasets
  • Develop and maintain scalable ETL pipelines using Airflow, DBT, Go, and Python running on GCP
  • Implement automated tests and validation to detect data drift or labeling inconsistencies
  • Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring
  • Contribute to shared tools and documentation for dataset management and AI evaluation
  • Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.
What we offer:
  • Competitive salaries
  • Ample paid time off as needed
  • 8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
  • Remote-first culture - that caters to your time zone, with open flexibility as needed, at times
  • Generous parental leave
  • An exceptional team that trusts you and gives you the freedom to do your best
  • The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
  • Opportunities to connect through affinity, ally, and social groups
  • 401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life.
Employment Type:
Full-time

Senior Technical Product Manager, Identity

tvScientific is looking for a Senior Product Manager to lead our identity graph ...
Location:
United States
Salary:
165000.00 - 180000.00 USD / Year
tvScientific
Expiration Date:
Until further notice
Requirements:
  • 6+ years in product management, solutions engineering, or technical partnerships focused on data-driven products, audience targeting, or martech
  • Deep experience working with Data Science and Engineering teams to build and launch data infrastructure
  • Expertise in identity resolution, segmentation, onboarding, and activation workflows
  • Proven ability to source, evaluate, and operationalize third-party data partnerships
  • Strong analytics mindset and comfort working with large datasets
  • Technical fluency across APIs, data pipelines, audience graphs, and privacy frameworks
  • Excellent communication skills that translate technical ideas into real business value
  • A solid foundation in AdTech
Job Responsibility:
  • Own the identity product strategy and lead the vision for tvScientific’s identity graph enabling persistent, multi-device recognition across CTV and digital
  • Partner with Data Engineering and Data Science to architect and optimize graph-based data models representing user identity, household relationships, and device linkages
  • Design APIs and services for real-time identity resolution, enrichment, and activation in programmatic ad workflows
  • Embed privacy-centric solutions like UID 2.0, RampID, and emerging standards into the graph infrastructure to ensure compliance and scalability
  • Source, evaluate, and onboard third-party identity and behavioral data providers to improve graph completeness and targeting precision
  • Lead technical integration and operationalization of identity and graph enrichment partners, managing ingestion, data mapping, and deployment
  • Collaborate with Legal, Security, and Data teams to ensure compliance with CCPA, GDPR, and global privacy regulations
  • Maintain a strategic view of the identity and data ecosystem to recommend build versus partner strategies that maximize value
  • Write detailed product requirements, data specifications, and user stories to guide Engineering and Infrastructure teams on performant graph storage, traversal, and querying
  • Define and monitor key metrics such as match rates, accuracy, persistence, identity coverage, and campaign performance impact
What we offer:
  • Full health, dental, and vision insurance - up to 95% funded by the company for employees
  • Employee stock option program
  • Company-sponsored retirement plan with a matching contribution program
  • 12 annual paid holidays (including 2 flexible days)
  • Generous PTO policy
  • A remote-first environment that allows employees flexibility to work from most places in the US
Employment Type:
Full-time

Intern, Insights and Analytics

The BioMarin Summer Internship Program will enable students to gain valuable exp...
Location:
United States, Novato
Salary:
24.00 - 32.00 USD / Hour
BioMarin Pharmaceutical
Expiration Date:
Until further notice
Requirements:
  • Strong communication and presentation skills
  • Comfort working with cross-functional teams and managing multiple inputs
  • Strong analytical and problem-solving abilities
  • Experience with data querying or manipulation (e.g., SQL)
  • Familiarity with data visualization tools (e.g., Power BI or similar)
  • Basic understanding of data modelling concepts
  • Ability to work with large datasets and maintain data accuracy
  • Clear written and verbal communication skills
  • Ability to collaborate effectively with cross-functional teams
  • Strong organizational skills and attention to detail
Job Responsibility:
  • Support the refresh of a supplier and raw material risk-prioritization framework
  • Extract, prepare, and integrate operational and procurement data from enterprise systems
  • Update and maintain a risk-scoring model that incorporates usage, criticality, and performance factors
  • Build and refine analytical dashboards to highlight high-risk suppliers and materials
  • Summarize findings and present insights to cross-functional stakeholders
  • Develop and enhance analytics dashboards using data from cross-functional workflow and tracking tools
  • Model activity, resource, and performance metrics to support operational planning and process improvement
  • Collaborate with business partners to validate data inputs and ensure accuracy of underlying datasets
  • Present dashboard updates and insights to end users to support decision-making
  • Assist in building semantically enriched data models designed for AI-assisted business analytics
What we offer:
  • Paid hourly wage, paid company holidays, and sick time
  • Apply skills and knowledge learned in the classroom to on-the-job experiences
  • Comprehensive, value-added project(s)
  • Develop skills specific to your major
  • Opportunities for professional development by building relationships and learning about other parts of the business
  • Participate in company all hands meetings, monthly community lunches
  • Corporate office amenities such as: 24/7 on-site gym, coffee truck, snacks
  • Access to Employee Resource Groups
Employment Type:
Full-time

Senior Java Developer

This role is part of an initiative to build a real-time data pipeline for proces...
Location:
Canada, Mississauga
Salary:
120800.00 - 170800.00 USD / Year
Citi
Expiration Date:
Until further notice
Requirements:
  • 6-10 years of professional experience in Java application development
  • Expertise in Spring Boot and microservices architecture
  • Strong experience with Elasticsearch (indexing, queries, aggregations)
  • Hands-on experience with Apache Kafka (publish/subscribe, streams, scalability)
  • Proficiency in Oracle Database (SQL, PL/SQL, optimization)
  • Extensive experience with Apache Spark for batch processing, including Spark SQL
  • Experience with big data ecosystems and cloud-based data platforms (e.g., Hadoop, Data Lakes, Snowflake, Databricks) is highly desirable
  • Experience with caching frameworks (Redis or equivalent)
  • Ability to effectively leverage Gen AI coding assistants for improved development productivity
  • Knowledge of real-time data processing and large-scale batch processing and data pipeline design
Job Responsibility:
  • Design, develop, and maintain high-performance Java applications for processing front-office chat data in real time
  • Design, develop, and optimize batch processing jobs using Apache Spark for large-scale data transformation and analysis
  • Implement config-driven, Spring-based components for data ingestion, transformation, and enrichment
  • Develop and optimize REST APIs for integration with NLP engines, internal systems, and external applications
  • Integrate and manage Apache Kafka for high-throughput, low-latency event streaming
  • Utilize Elasticsearch for efficient indexing and querying of large chat-derived datasets
  • Write optimized Oracle SQL and PL/SQL for configuration management
  • Leverage continuous integration pipelines to streamline development and deployment
  • Use Gen AI development tools (Copilot and DevinAI) to write, review, and optimize code efficiently
  • Collaborate with business analysts, product team and developers to ensure system reliability, scalability, and alignment with requirements
Employment Type:
Full-time

Data Product Specialist

Join our Global Digital & Loyalty team as a Data Product Specialist who lives an...
Location:
Poland, Warszawa
Salary:
Not provided
Circle K
Expiration Date:
Until further notice
Requirements:
  • Responsive to evolving business needs
  • Analytical thinker who balances creativity with pragmatism
  • Excellent communicator in English, able to translate complex concepts for technical and non-technical stakeholders
  • Customer-obsessed problem solver who collaborates across geographies and levels
  • Curious lifelong learner with a passion for retail, loyalty programs and emerging analytics techniques
  • Higher degree in Engineering, Data Science, Statistics or a related analytical discipline
  • 3+ years of experience in one or more of: data engineering, data analytics or data stewardship
  • Strong awareness of data governance frameworks and privacy regulations (e.g., GDPR)
  • Hands-on with analytical tools such as Power BI, Excel and cloud data warehouses
  • Familiarity with mobile analytics platforms (e.g., Google Analytics) and loyalty program data is a plus
Job Responsibility:
  • Partner with the Group Product Manager to maintain the data product vision, roadmap and backlog with clear acceptance criteria and measurable outcomes
  • Translate business questions into metric definitions, data models and scalable self-service data products
  • Collaborate with Digital Analytics to ensure accurate tagging of key user events in mobile apps (e.g., GA4)
  • Coordinate with Data Engineering to integrate mobile apps and retail loyalty datasets for a unified end-to-end customer journey
  • Monitor data pipelines, investigate quality issues and implement automated validation, alerting and remediation processes
  • Operationalize data governance, privacy and security requirements
  • Maintain the data catalog and lineage documentation to support discovery
  • Define and enforce data contracts, SLAs and schema-change processes with source teams
  • Liaise with Product Performance Analytics to deliver certified datasets on time for monthly leadership dashboards and ad-hoc reporting
  • Conduct stakeholder interviews to confirm data readiness for key insights and dashboards
What we offer:
  • Contract of employment
  • Annual bonus
  • Private medical care
  • Cafeteria Platform/Multisport
  • English lessons subsidized by the company
  • Group insurance
  • Attractive discounts for products and services at our stations
  • Employee stock purchase plan
  • Employee Assistance Program (Lyra)
  • Modern and convenient office
Employment Type:
Full-time

Data Analyst

Part of the Technology and Transformation Analytics team, the Data Analyst is an...
Location:
United Kingdom, Home based
Salary:
37000.00 GBP / Year
Migrant Help
Expiration Date:
February 20, 2026
Requirements:
  • Experience collating and analysing management information and performance metrics
  • Strong SQL skills with experience querying relational databases (e.g. SQL Server)
  • Exposure to data modelling, enrichment and transformation techniques
  • Ability to work with complex datasets and apply business logic to analytical outputs
  • Excellent communication skills with the ability to explain insights to technical and non‑technical audiences in clear written English
  • This post is subject to a Disclosure and Barring Service (DBS) check
  • This post is subject to a Counter Terrorism Check (CTC)
  • Be able to provide a valid passport, e.g. a 10-year full British passport, or an EU or non-EU passport with indefinite leave to remain
  • Be able to provide continuous UK address history for the previous 5 years
  • Provide full employment history for the previous 3 years and/or suitable documentation to cover any gaps in employment
Job Responsibility:
  • Design, build and maintain key performance indicators and management reports, delivering weekly and monthly insights, while ensuring accurate integration into strategic reporting cycles
  • Develop scalable Power BI dashboards and data visualisation outputs, underpinned by well governed models, and documented data pipelines
  • Partner with stakeholders across the organisation to define metrics and deliver analysis, translating findings into clear, actionable recommendations and compelling data stories
  • Engineer and manage robust ETL pipelines (including APIs) to integrate diverse data sources, enabling the creation of impactful, real-time reporting solutions
  • Design and maintain fit‑for‑purpose data models, views and transformations that enable reliable, near real‑time reporting solutions
  • Document data lineage and definitions to support transparency and reuse
  • Ensure data quality and integrity through rigorous validation, cleaning and reconciliation processes, fostering trust and reliability in all reporting outputs
  • Contribute to standardised metrics and data dictionaries to promote consistency and clarity across the organisation
  • Support the implementation of the Technology and Transformation Strategy, contributing to platform upgrades, migrations and scalable architecture improvements
  • Collaborate with teams and stakeholders across Migrant Help to identify strategic growth opportunities, understand analytical needs and translate requirements into deliverables
What we offer:
  • Our working week is 35 hours, offering flexibility and work-life balance
  • Enhanced family friendly provisions
  • Employees will gain an extra day annual leave per year to a maximum of 39 days, including bank holidays (pro-rata)
  • Option to buy or sell up to 5 days of annual leave
  • Access to Perkbox, an employee rewards and benefits platform with over 9,000 deals and discounts, a range of free perks, employee wellbeing support and other additional employee benefits and recognitions
  • Wellbeing support
  • Migrant Help offers employees a non-contributory pension scheme, paying the equivalent of 8% of the employee's salary into the scheme
Employment Type:
Full-time

Data Analyst

Part of the Technology and Transformation Analytics team, the Data Analyst is an...
Location:
United Kingdom
Salary:
37000.00 GBP / Year
360 Resourcing Solutions
Expiration Date:
February 20, 2026
Requirements:
  • Experience collating and analysing management information and performance metrics
  • Strong SQL skills with experience querying relational databases (e.g. SQL Server)
  • Exposure to data modelling, enrichment and transformation techniques
  • Ability to work with complex datasets and apply business logic to analytical outputs
  • Excellent communication skills with the ability to explain insights to technical and non‑technical audiences in clear written English
  • This post is subject to a Disclosure and Barring Service (DBS) check
  • This post is subject to a Counter Terrorism Check (CTC)
  • Be able to provide a valid passport, e.g. a 10-year full British passport, or an EU or non-EU passport with indefinite leave to remain
  • Be able to provide continuous UK address history for the previous 5 years
  • Provide full employment history for the previous 3 years and/or suitable documentation to cover any gaps in employment
Job Responsibility:
  • Design, build and maintain key performance indicators and management reports, delivering weekly and monthly insights, while ensuring accurate integration into strategic reporting cycles
  • Develop scalable Power BI dashboards and data visualisation outputs, underpinned by well governed models, and documented data pipelines
  • Partner with stakeholders across the organisation to define metrics and deliver analysis, translating findings into clear, actionable recommendations and compelling data stories
  • Engineer and manage robust ETL pipelines (including APIs) to integrate diverse data sources, enabling the creation of impactful, real-time reporting solutions
  • Design and maintain fit‑for‑purpose data models, views and transformations that enable reliable, near real‑time reporting solutions
  • Document data lineage and definitions to support transparency and reuse
  • Ensure data quality and integrity through rigorous validation, cleaning and reconciliation processes, fostering trust and reliability in all reporting outputs
  • Contribute to standardised metrics and data dictionaries to promote consistency and clarity across the organisation
What we offer:
  • Our working week is 35 hours, offering flexibility and work-life balance
  • Enhanced family friendly provisions
  • Employees will gain an extra day annual leave per year to a maximum of 39 days, including bank holidays (pro-rata)
  • Option to buy or sell up to 5 days of annual leave
  • Access to Perkbox, an employee rewards and benefits platform with over 9,000 deals and discounts, a range of free perks, employee wellbeing support and other additional employee benefits and recognitions
  • Wellbeing support
  • Migrant Help offers employees a non-contributory pension scheme, paying the equivalent of 8% of the employee's salary into the scheme
Employment Type:
Full-time