CrawlJobs Logo

Senior Platform Engineer, ML Data Systems

khanacademy.org Logo

Khan Academy

Location Icon

Location:
United States, Mountain View

Category Icon
Category:
IT - Software Development

Job Type Icon

Contract Type:
Not provided

Salary Icon

Salary:

137871.00 - 172339.00 USD / Year

Job Description:

We’re looking for an ML Data Engineer to evolve our eval dataset tools to meet the growing platform needs of AI-based tutoring at Khan Academy. We’re looking for someone who can gather internal requirements, design schema based on well-known dataset patterns, and deploy, document, and train people on an internal dataset management framework. The systems you design will need to integrate with trace management and human labeling APIs. You’ll work closely with other AI engineers, platform developers, and labeling teams to ensure our data is clean, representative, and ready for both human and automated evaluation. This role bridges ML operations, data engineering and data science— enabling our AI systems to learn from reliable, well-structured datasets that reflect the diversity and nuance of real learners.

Job Responsibility:

  • Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets
  • Clean, normalize, and enrich data while preserving semantic meaning and consistency
  • Prepare and format datasets for human labeling, and integrate results into ML datasets
  • Develop and maintain scalable ETL pipelines using Airflow, DBT, Go, and Python running on GCP
  • Implement automated tests and validation to detect data drift or labeling inconsistencies
  • Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring
  • Contribute to shared tools and documentation for dataset management and AI evaluation
  • Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.

Requirements:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
  • 5 years of Software Engineering experience with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face
  • Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
  • Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems
  • Familiarity with machine learning workflows — from training data preparation to evaluation
  • Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations
  • Attention to detail and an obsession with data quality and reproducibility
  • Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
  • Proven cross-cultural competency skills demonstrating self-awareness, awareness of other, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.

Nice to have:

  • Experience with labeling platforms (e.g., Label Studio, Scale AI, Toloka) or human-in-the-loop systems
  • Understanding of ML evaluation techniques, including prompt-based and generative model metrics
  • Exposure to MLOps practices such as model registry, feature store, or continuous evaluation
  • Background in education technology or other human-centered AI applications.
What we offer:
  • Competitive salaries
  • Ample paid time off as needed
  • 8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
  • Remote-first culture - that caters to your time zone, with open flexibility as needed, at times
  • Generous parental leave
  • An exceptional team that trusts you and gives you the freedom to do your best
  • The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
  • Opportunities to connect through affinity, ally, and social groups
  • 401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life.

Additional Information:

Job Posted:
December 09, 2025

Employment Type:
Fulltime
Work Type:
Remote work
Job Link Share:

Looking for more opportunities? Search for other job offers that match your skills and interests.

Briefcase Icon

Similar Jobs for Senior Platform Engineer, ML Data Systems

New

Senior Data Engineer

Senior Data Engineer – Dublin (Hybrid) Contract Role | 3 Days Onsite. We are see...
Location
Location
Ireland , Dublin
Salary
Salary:
Not provided
solasit.ie Logo
Solas IT Recruitment
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 7+ years of experience as a Data Engineer working with distributed data systems
  • 4+ years of deep Snowflake experience, including performance tuning, SQL optimization, and data modelling
  • Strong hands-on experience with the Hadoop ecosystem: HDFS, Hive, Impala, Spark (PySpark preferred)
  • Oozie, Airflow, or similar orchestration tools
  • Proven expertise with PySpark, Spark SQL, and large-scale data processing patterns
  • Experience with Databricks and Delta Lake (or equivalent big-data platforms)
  • Strong programming background in Python, Scala, or Java
  • Experience with cloud services (AWS preferred): S3, Glue, EMR, Redshift, Lambda, Athena, etc.
Job Responsibility
Job Responsibility
  • Build, enhance, and maintain large-scale ETL/ELT pipelines using Hadoop ecosystem tools including HDFS, Hive, Impala, and Oozie/Airflow
  • Develop distributed data processing solutions with PySpark, Spark SQL, Scala, or Python to support complex data transformations
  • Implement scalable and secure data ingestion frameworks to support both batch and streaming workloads
  • Work hands-on with Snowflake to design performant data models, optimize queries, and establish solid data governance practices
  • Collaborate on the migration and modernization of current big-data workloads to cloud-native platforms and Databricks
  • Tune Hadoop, Spark, and Snowflake systems for performance, storage efficiency, and reliability
  • Apply best practices in data modelling, partitioning strategies, and job orchestration for large datasets
  • Integrate metadata management, lineage tracking, and governance standards across the platform
  • Build automated validation frameworks to ensure accuracy, completeness, and reliability of data pipelines
  • Develop unit, integration, and end-to-end testing for ETL workflows using Python, Spark, and dbt testing where applicable
Read More
Arrow Right
New

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location
Location
United States , San Francisco
Salary
Salary:
180000.00 - 270000.00 USD / Year
plaid.com Logo
Plaid
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility
Job Responsibility
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer
What we offer
  • medical, dental, vision, and 401(k)
  • Fulltime
Read More
Arrow Right
New

Senior Software Engineer - Data Infrastructure

We build the data and machine learning infrastructure to enable Plaid engineers ...
Location
Location
United States , San Francisco
Salary
Salary:
180000.00 - 270000.00 USD / Year
plaid.com Logo
Plaid
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 5+ years of software engineering experience
  • Extensive hands-on software engineering experience, with a strong track record of delivering successful projects within the Data Infrastructure or Platform domain at similar or larger companies
  • Deep understanding of one of: ML Infrastructure systems, including Feature Stores, Training Infrastructure, Serving Infrastructure, and Model Monitoring OR Data Infrastructure systems, including Data Warehouses, Data Lakehouses, Apache Spark, Streaming Infrastructure, Workflow Orchestration
  • Strong cross-functional collaboration, communication, and project management skills, with proven ability to coordinate effectively
  • Proficiency in coding, testing, and system design, ensuring reliable and scalable solutions
  • Demonstrated leadership abilities, including experience mentoring and guiding junior engineers
Job Responsibility
Job Responsibility
  • Contribute towards the long-term technical roadmap for data-driven and machine learning iteration at Plaid
  • Leading key data infrastructure projects such as improving ML development golden paths, implementing offline streaming solutions for data freshness, building net new ETL pipeline infrastructure, and evolving data warehouse or data lakehouse capabilities
  • Working with stakeholders in other teams and functions to define technical roadmaps for key backend systems and abstractions across Plaid
  • Debugging, troubleshooting, and reducing operational burden for our Data Platform
  • Growing the team via mentorship and leadership, reviewing technical documents and code changes
What we offer
What we offer
  • medical, dental, vision, and 401(k)
  • equity and/or commission
  • Fulltime
Read More
Arrow Right
New

Senior Director - Data Engineering & Machine Learning

Lead the Data Revolution at Modus Create. At Modus Create, we empower the world’...
Location
Location
Canada
Salary
Salary:
Not provided
moduscreate.com Logo
Modus Create
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 15+ years in data or software engineering roles
  • 7+ years leading Data Engineering/ML teams—ideally at scale or in global consulting contexts
  • Hands-on experience with cloud data platforms, big data toolchains (Spark, Kafka), data transformation tools (Airflow, dbt), and ML platforms
  • Success in pre-sales, solutioning, and growing data/ML engagements within enterprise or mid-market accounts
  • Demonstrated ability to create enablement programs, uplift team capabilities, and grow inclusive, high-performing engineering cultures
  • Deep empathy for client pain points and the ability to craft and deliver impactful, measurable solutions
  • Excellent across technical, executive, and cross-functional settings, with an ability to successfully navigate diverse cultural differences
Job Responsibility
Job Responsibility
  • Build Modern Data Platforms
  • Design and oversee architecture for cloud-native data platforms, pipelines, and streaming systems on AWS, Azure, or GCP
  • Ensure robust solutions using platforms such as Databricks, Snowflake, Redshift, BigQuery, Spark, Kafka, Airflow, dbt, and Kubernetes
  • Deliver Products that are Intelligent
  • Define and drive responsible ML strategy, from model development to integration, using platforms like SageMaker, Azure ML, or TensorFlow
  • Enable smarter client experiences by embedding ML into applications, automation, and analytics
  • Lead & Grow Teams
  • Foster a culture of trust, continuous learning, and experimentation by hiring, mentoring, and empowering distributed teams of data and ML engineers
  • Define career paths, feedback frameworks, and learning programs grounded in a culture of experimentation,continuous growth, and a sense of belonging at every level
  • Shape Technical Strategy & Best Practices
What we offer
What we offer
  • Remote work with flexible working hours
  • Modus Global Office Programme:for when you want to get out of your home, we offer on-demand access to private offices, meeting rooms, coworking spaces and business lounges in locations in over 120 countries
  • Employee Referral Program
  • Client Referral Program
  • Travel according to client or team needs
  • The chance to work side-by-side with thought leaders in emerging tech
  • Access to more than 12,000 courses with a licensed Coursera account
  • Possibility to obtain paid certification/courses if they align with company goals and are relevant to the employee's role
  • Fulltime
Read More
Arrow Right
New

Senior Software Engineer, Data Products

As a Senior Software Engineer, you will play a pivotal role in the development o...
Location
Location
United States , Los Angeles
Salary
Salary:
143000.00 - 180000.00 USD / Year
Fox News Media
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Experience working in Software Engineering, Data Science, ML Engineering
  • Strong background in live media streaming and handling VOD content
  • Expertise in working with live media streaming
  • Experience working with Vector Database
  • Strong understanding of generative AI technologies and their underlying mechanisms
  • Good grasp of distributed system design
  • Experience with TensorFlow, PyTorch etc.
  • REST or GraphQL API Design Experience
  • Proficient with building batch and streaming data pipelines on cloud platforms
Job Responsibility
Job Responsibility
  • Design and implement novel and scalable AI solutions for real business problems
  • Design and implement workflows to generate and manage assets for live streaming and VOD
  • Build workflow orchestrations that can be readily extended to perform new analyses
  • Prototype new approaches and productionize solutions at scale for hundreds of millions of active users
  • Maintain high-level craftsmanship while delivering meaningful results
  • Mentor junior engineers on the team
  • Collaborate with peers, engineering leadership, and product management
What we offer
What we offer
  • Annual discretionary bonus
  • Medical/dental/vision insurance
  • 401(k) plan
  • Paid time off
  • Fulltime
Read More
Arrow Right
New

Senior Software Engineer, Data Products

As a Senior Software Engineer, you will play a pivotal role in the development o...
Location
Location
United States , Los Angeles
Salary
Salary:
143000.00 - 180000.00 USD / Year
Fox Corporation
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Experience working in Software Engineering, Data Science, ML Engineering
  • Strong background in live media streaming and handling VOD content
  • Expertise in working with live media streaming
  • Experience working with Vector Database
  • Strong understanding of generative AI technologies and their underlying mechanisms
  • Good grasp of distributed system design
  • Experience with TensorFlow, PyTorch etc.
  • REST or GraphQL API Design Experience
  • Proficient with building batch and streaming data pipelines on cloud platforms
Job Responsibility
Job Responsibility
  • Design and implement novel and scalable AI solutions for real business problems
  • Design and implement workflows to generate and manage assets for live streaming and VOD
  • Build workflow orchestrations that can be readily extended to perform new analyses
  • Prototype new approaches and productionize solutions at scale for hundreds of millions of active users
  • Maintain high-level craftsmanship while delivering meaningful results
  • Mentor junior engineers on the team
  • Collaborate with peers, engineering leadership, and product management
What we offer
What we offer
  • Annual discretionary bonus
  • Medical/dental/vision insurance
  • 401(k) plan
  • Paid time off
  • Fulltime
Read More
Arrow Right

Senior Data Engineer

This Senior Data Engineer position is a member of 3Cloud’s Managed Services team...
Location
Location
United States
Salary
Salary:
90200.00 - 130800.00 USD / Year
3cloudsolutions.com Logo
3Cloud
Expiration Date
December 22, 2025
Flip Icon
Requirements
Requirements
  • Bachelor’s degree in Computer Science, Mathematics, or related field, or equivalent certification and experience
  • Experience working in a support role and understanding standard support processes (i.e., ITIL)
  • 3+ years working with data and business intelligence solutions on the Microsoft tool stack for Business Intelligence (ADF, SQL DW, Synapse, Databricks, Power BI, etc.), or other vendor Business Intelligence tools (Business Objects, MicroStrategy, Qlik, SAP)
  • Strong verbal and written communication skills
  • A passion for problem solving and learning new technologies
  • Client relationship skills and experience managing vendors
  • Ability to work in a fast paced, rapidly changing environment
  • Strong desire for personal development and learning
  • Databricks: advanced experience with Delta Lake, Unity Catalog, jobs, clusters, notebook development
  • PySpark: strong hands-on development
Job Responsibility
Job Responsibility
  • Lead support of client’s Azure Data platform and Power BI Environment, including response to any escalations while helping to analyze and resolve incidents for customers environment
  • Consult, develop, and advise on solutions in Microsoft Azure with tools such as Synapse, Data Factory, Databricks, Azure ML, Data Lake, Data Warehouse, and Power BI
  • Consult, develop, and advise on Power BI, including governance, strategy, report development, performance
  • Consistently learn, apply, and refine skills around data engineering and data analytics
  • Proactively mentors junior team members and actively gives feedback on their work and performance
  • Design and deploy data pipelines, models, and AI-driven solutions for clients in various industries
  • Understand and work with customers on licensing involving Microsoft data solutions
  • Takes responsibility for the quality of deliverables and solutions
  • Supports and participates in the estimating of work to be done by self and others
  • Communicate ticket status information to all associated parties and escalate cases as appropriate
What we offer
What we offer
  • Flexible work location with a virtual first approach to work
  • 401(K) with match up to 50% of your 6% contributions of eligible pay
  • Generous PTO providing a minimum of 15 days in addition to 9 paid company holidays and 2 floating personal days
  • Two medical plan options
  • Option for vision and dental coverage
  • 100% employer paid coverage for life and disability insurance
  • Paid leave for birth parents and non-birth parents
  • Option for Healthcare FSA, HSA, and Dependent Care FSA
  • $67.00 monthly tech and home office allowance
  • Utilization and/or discretionary bonus eligibility based on role
  • Fulltime
Read More
Arrow Right

Senior Data Engineer

At Ingka Investments (Part of Ingka Group – the largest owner and operator of IK...
Location
Location
Netherlands , Leiden
Salary
Salary:
Not provided
https://www.ikea.com Logo
IKEA
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Formal qualifications (BSc, MSc, PhD) in computer science, software engineering, informatics or equivalent
  • Minimum 3 years of professional experience as a (Junior) Data Engineer
  • Strong knowledge in designing efficient, robust and automated data pipelines, ETL workflows, data warehousing and Big Data processing
  • Hands-on experience with Azure data services like Azure Databricks, Unity Catalog, Azure Data Lake Storage, Azure Data Factory, DBT and Power BI
  • Hands-on experience with data modeling for BI & ML for performance and efficiency
  • The ability to apply such methods to solve business problems using one or more Azure Data and Analytics services in combination with building data pipelines, data streams, and system integration
  • Experience in driving new data engineering developments (e.g. apply new cutting edge data engineering methods to improve performance of data integration, use new tools to improve data quality and etc.)
  • Knowledge of DevOps practices and tools including CI/CD pipelines and version control systems (e.g., Git)
  • Proficiency in programming languages such as Python, SQL, PySpark and others relevant to data engineering
  • Hands-on experience to deploy code artifacts into production
Job Responsibility
Job Responsibility
  • Contribute to the development of D&A platform and analytical tools, ensuring easy and standardized access and sharing of data
  • Subject matter expert for Azure Databrick, Azure Data factory and ADLS
  • Help design, build and maintain data pipelines (accelerators)
  • Document and make the relevant know-how & standard available
  • Ensure pipelines and consistency with relevant digital frameworks, principles, guidelines and standards
  • Support in understand needs of Data Product Teams and other stakeholders
  • Explore ways create better visibility on data quality and Data assets on the D&A platform
  • Identify opportunities for data assets and D&A platform toolchain
  • Work closely together with partners, peers and other relevant roles like data engineers, analysts or architects across IKEA as well as in your team
What we offer
What we offer
  • Opportunity to develop on a cutting-edge Data & Analytics platform
  • Opportunities to have a global impact on your work
  • A team of great colleagues to learn together with
  • An environment focused on driving business and personal growth together, with focus on continuous learning
  • Fulltime
Read More
Arrow Right
Welcome to CrawlJobs.com
Your Global Job Discovery Platform
At CrawlJobs.com, we simplify finding your next career opportunity by bringing job listings directly to you from all corners of the web. Using cutting-edge AI and web-crawling technologies, we gather and curate job offers from various sources across the globe, ensuring you have access to the most up-to-date job listings in one place.