Software Engineer, Data Infrastructure - Research

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

250000.00 - 380000.00 USD / Year

Job Description:

The Workload team is responsible for designing and running OpenAI’s LLM training and inference infrastructure that powers frontier models at massive scale. Our systems unify how researchers train and serve models, abstracting away the complexity of performance, parallelism, and execution across vast GPU/accelerator fleets. By providing this foundation, the Workload team ensures that researchers can focus on advancing model capabilities while we handle the scale, efficiency, and reliability required to bring those models to life.

Job Responsibility:

  • Design and implement the dataset infrastructure that powers OpenAI’s next-generation training stack
  • Design and maintain standardized dataset APIs, including for multimodal (MM) data that cannot fit in memory
  • Build proactive testing and scale validation pipelines for dataset loading at GPU scale
  • Collaborate with teammates to integrate datasets seamlessly into training and inference pipelines
  • Document and maintain dataset interfaces so they are discoverable, consistent, and easy for other teams to adopt
  • Establish safeguards and validation systems to ensure datasets remain reproducible and unchanged once standardized
  • Debug and resolve performance bottlenecks in distributed dataset loading
  • Provide visualization and inspection tools to surface errors, bugs, or bottlenecks in datasets

Requirements:

  • Strong engineering fundamentals with experience in distributed systems, data pipelines, or infrastructure
  • Experience building APIs, modular code, and scalable abstractions
  • Comfortable debugging bottlenecks across large fleets of machines
  • Pride in building infrastructure that 'just works'
  • Collaborative, humble, and excited to own a foundational part of the ML stack

Nice to have:

  • Background knowledge in data math, probability, or distributed data theory
  • Experience with GPU-scale distributed systems or dataset scaling for real-time data

What we offer:

  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends
  • Equity
  • Performance-related bonus(es) for eligible employees

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Software Engineer, Data Infrastructure - Research

Software Engineer, Data Infrastructure

The Data Infrastructure team at Figma builds and operates the foundational platf...
Location:
United States, San Francisco; New York
Salary:
149000.00 - 350000.00 USD / Year
Figma
Expiration Date:
Until further notice
Requirements:
  • 5+ years of Software Engineering experience, specifically in backend or infrastructure engineering
  • Experience designing and building distributed data infrastructure at scale
  • Strong expertise in batch and streaming data processing technologies such as Spark, Flink, Kafka, or Airflow/Dagster
  • A proven track record of impact-driven problem-solving in a fast-paced environment
  • A strong sense of engineering excellence, with a focus on high-quality, reliable, and performant systems
  • Excellent technical communication skills, with experience working across both technical and non-technical counterparts
  • Experience mentoring and supporting engineers, fostering a culture of learning and technical excellence
Job Responsibility:
  • Design and build large-scale distributed data systems that power analytics, AI/ML, and business intelligence
  • Develop batch and streaming solutions to ensure data is reliable, efficient, and scalable across the company
  • Manage data ingestion, movement, and processing through core platforms like Snowflake, our ML Datalake, and real-time streaming systems
  • Improve data reliability, consistency, and performance, ensuring high-quality data for engineering, research, and business stakeholders
  • Collaborate with AI researchers, data scientists, product engineers, and business teams to understand data needs and build scalable solutions
  • Drive technical decisions and best practices for data ingestion, orchestration, processing, and storage
What we offer:
  • equity
  • health, dental & vision
  • retirement with company contribution
  • parental leave & reproductive or family planning support
  • mental health & wellness benefits
  • generous PTO
  • company recharge days
  • a learning & development stipend
  • a work from home stipend
  • cell phone reimbursement
Employment Type:
Full-time

Software Engineer, Infrastructure

As a Software Engineer on our Infrastructure team, you will help design and buil...
Location:
United States, New York; San Mateo; Redwood City
Salary:
140000.00 - 150000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • Strong programming skills in Python, C++, or a similar language
  • Solid understanding of computer systems concepts such as networking, storage, and distributed computing
  • Familiarity with cloud platforms like AWS, GCP, or Azure, and containerization tools like Docker or Kubernetes
  • Knowledge and interest in cloud infrastructure, distributed systems, and machine learning
Job Responsibility:
  • Contribute to the design and development of scalable backend infrastructure that supports distributed training, inference, and data pipelines
  • Build and maintain core backend services such as job schedulers, autoscalers, resource managers, and model serving systems
  • Support performance optimization, cost efficiency, and reliability improvements across compute, storage, and networking layers
  • Collaborate with ML, DevOps, and product teams to translate research and product needs into infrastructure solutions
  • Learn and apply modern cloud technologies including Kubernetes, Ray, Kubeflow, and MLFlow
  • Participate in code reviews, technical discussions, and continuous integration and deployment processes
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary and comprehensive benefits package
Employment Type:
Full-time

Software Engineer, AI Infrastructure

As a Software Engineer on our AI Infrastructure team, you will help design the c...
Location:
United States, New York, NY; San Mateo, CA
Salary:
Not provided
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 3 years of experience in software engineering, with a focus on infrastructure or machine learning systems
  • Strong programming skills in Python, Go, or a similar language
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, MLflow, Vertex AI, SageMaker, Kubernetes)
  • Basic understanding of LLM concepts (e.g., context length, disaggregated prefill, KV cache memory estimation)
Job Responsibility:
  • Contribute to the design and development of scalable backend infrastructure that supports distributed training, inference, and data pipelines
  • Build and maintain core backend services such as LLM CI/CD pipeline, control plane, and model serving systems
  • Support performance optimization, cost efficiency, and reliability improvements across compute, storage, and networking layers
  • Build frameworks and safeguards to ensure Fireworks AI has the best model quality in the industry
  • Collaborate with performance, training, and product teams to translate research and product needs into infrastructure solutions
  • Participate in code reviews, technical discussions, and continuous integration and deployment processes
What we offer:
  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation
Employment Type:
Full-time

Platform Software Engineer

We reshaped bookkeeping to fit the e-comm needs: with Finaloop, customers get fl...
Location:
Israel, Tel Aviv
Salary:
Not provided
Finaloop
Expiration Date:
Until further notice
Requirements:
  • 6+ years of experience in server-side development and distributed systems
  • Excellent knowledge of software and application design and architecture
  • Experience with different backend architectures and approaches
  • Experience in working with different types of data storage technologies and approaches (e.g. OLAP, OLTP, relational, document, etc.)
  • Proven track record of working efficiently in a fast-paced intensive startup environment
  • Strong communication and collaboration skills
Job Responsibility:
  • Take responsibility for the most fundamental components of our system, whether the cloud infrastructure or the common software solutions used by developers
  • Develop system-wide solutions to be used by the product development teams
  • Develop scalable long-term solutions
  • Detect opportunities to improve R&D efficiency and the reliability of our systems by improving and expanding the common infrastructure of our product
  • Research new technologies and solutions with the potential to be incorporated into our product

Data Engineer

Barbaricum is seeking a Data Engineer to support an emerging capability ...
Location:
United States, Omaha
Salary:
Not provided
Barbaricum
Expiration Date:
Until further notice
Requirements:
  • Active DoD Top Secret/SCI clearance required
  • 8+ years of demonstrated experience in software engineering
  • Bachelor’s degree in computer science or a related field
  • 8+ years of experience working with AWS big data technologies (S3, EC2) and demonstrated experience in distributed data processing, data modeling, ETL development, and/or data warehousing
  • Demonstrated mid-level knowledge of software engineering best practices across the development lifecycle
  • 3+ years of experience using analytical concepts and statistical techniques
  • 8+ years of demonstrated experience across Mathematics, Applied Mathematics, Statistics, Applied Statistics, Machine Learning, Data Science, Operations Research, or Computer Science especially around software engineering and/or designing/implementing machine learning, data mining, advanced analytical algorithms, programming, data science, advanced statistical analysis, artificial intelligence
Job Responsibility:
  • Design, implement, and operate data management systems for intelligence needs
  • Use Python to automate data workflows
  • Design algorithms, databases, and pipelines to access and optimize data retrieval, storage, use, integration, and management by different data regimes and digital systems
  • Work with data users to determine, create, and populate optimal data architectures, structures, and systems, and to plan, design, and optimize data throughput and query performance
  • Participate in the selection of backend database technologies (e.g. SQL, NoSQL), their configuration and utilization, and the optimization of the full data pipeline infrastructure to support the actual content, volume, ETL, and periodicity of data to support the intended kinds of queries and analysis to match expected responsiveness
  • Assist and advise the Government with developing, constructing, and maintaining data architectures
  • Research, study, and present technical information, in the form of briefings or written papers, on relevant data engineering methodologies and technologies of interest to or as requested by the Government
  • Align data architecture, acquisition, and processes with intelligence and analytic requirements
  • Prepare data for predictive and prescriptive modeling, deploying analytics programs, machine learning, and statistical methods to find hidden patterns, discover tasks and processes that can be automated, and make recommendations to streamline data processes and visualizations

Data Engineer

As a Data Engineer, you’ll build and refine the pipelines, data models, and serv...
Location:
United States, Redmond
Salary:
155000.00 - 175000.00 USD / Year
2A Consulting
Expiration Date:
Until further notice
Requirements:
  • Proven ability to design and build end-to-end data systems, from ingestion through cleaning, structuring, storage, and serving
  • Experience building and shipping data products that deliver practical value
  • Demonstrated impact using AI models in data workflows (applied use, not ML research)
  • 5+ years of software or data engineering experience, including at least 2 years of hands-on work with data pipelines
  • Comfortable defining architecture and starting systems from scratch, working independently in a small cross-functional team
  • Proficiency in Python, SQL, or similar languages used in data engineering workflows
Job Responsibility:
  • Build and maintain core data pipelines
  • Build and maintain end-to-end ingestion pipelines for documents, datasets, code repositories, videos, transcripts, and internal knowledge sources
  • Clean, normalize, structure, and store data in formats that support both web applications and AI-driven use cases
  • Use “out of the box” Microsoft tools—such as Fabric, Azure services, Cosmos DB, or Copilot Studio—to create reliable, maintainable systems
  • Enrich and model research data
  • Use AI models to transform unstructured content into structured metadata and durable knowledge assets
  • Design the architecture and foundational data systems, establishing the patterns and infrastructure for a new, scalable environment
  • Develop and refine embeddings, vector indexes, and retrieval components to support semantic search and grounding scenarios
  • Build backend and data services
  • Build data services, APIs, and backend components that power internal applications and agent-supported workflows
What we offer:
  • Flexible time-off plan
  • 100% employer-paid medical, dental, and vision insurance
  • Employer-paid life insurance for those enrolled in medical coverage
  • 401(k) plan with company match
  • Fertility, surrogacy, and adoption benefits
  • Fitness and caregiver benefits
  • Employee Assistance Program
  • 100% employer-paid short- and long-term disability coverage
Employment Type:
Full-time

Senior Data Engineer

As a senior member of our engineering team, you will take ownership of critical ...
Location:
Poland
Salary:
Not provided
Userlane GmbH
Expiration Date:
Until further notice
Requirements:
  • Minimum of 5 years of hands-on experience in designing and developing data processing systems
  • Experience being part of a team of software engineers and helping establish processes from scratch
  • Familiarity with a DBMS such as ClickHouse or another SQL-based OLAP database
  • Experience with data engineering tools such as Airflow, Kafka, and dbt
  • Experience building and maintaining applications in Python, Golang, and TypeScript
  • Knowledge of container technologies like Docker and Kubernetes
  • Experience with CI/CD pipelines and automated testing
  • Ability to solve problems and balance structure with creativity
  • Ability to operate independently and apply strategic thinking with technical depth
  • Willingness to share information and skills with the team
Job Responsibility:
  • Shape and maintain our various data and backend components - DBs, APIs and services
  • Understand business requirements and analyze their impact on the design of our software services and tools
  • Identify architectural changes needed in our infrastructure to support a smooth process of adding new features
  • Research, propose, and deliver changes to our software architecture to address our engineering and product requirements
  • Design, develop, and maintain a solid and stable RESTful API based on industry standards and best practices
  • Collaborate with internal and external teams to deliver software that fits the overall ecosystem of our products
  • Stay up to date with the new trends and technologies that enable us to work smarter, not harder
What we offer:
  • Team & Culture: A high-performance culture with great leadership and a fun, engaged, motivated, and diverse team with people from over 20 countries
  • Market: Userlane is among the global leaders in the rapidly growing Digital Adoption industry
  • Growth: We take you and your development seriously. You can expect weekly 1:1s, a personalised skills assessment and development plan, on-the-job coaching, and a budget for events and training
  • Compensation: Significant financial upside with an attractive and incentivising package on a B2B basis
Employment Type:
Full-time

Software Engineer

Figma is growing our team of passionate creatives and builders on a mission to m...
Location:
United States, San Francisco; New York
Salary:
155000.00 USD / Year
Figma
Expiration Date:
Until further notice
Requirements:
  • Solid technical knowledge of CS fundamentals and data structures
  • Experience writing clean code in at least one general-purpose language (e.g. C++, JavaScript, Python, Java)
  • Experience working on projects through school, work, or personal exploration that required solving technical problems
  • Excitement to explore how systems are designed and operate, from infrastructure to UI
  • Ability to communicate well, ask great questions, and collaborate with others
  • A growth mindset with an eagerness to learn and contribute not only to your engineering team, but the entire organization
Job Responsibility:
  • Be paired with an onboarding buddy on your team who will help you ramp up in our codebase and get to know Figma
  • Move from starter tasks meant to familiarize you with our workflows, to joining your teammates on larger projects
  • Understand how we operate as a business through our comprehensive onboarding program, Figma Today
  • Get to know your Figmates across the company
  • Work closely with teammates and partners in Product, Design, Marketing, User Research, and Data Science to build new features and achieve roadmap goals
  • Craft performance objectives with your manager that align to company priorities and career development opportunities
  • Help grow Figma by interviewing candidates for teams across the Engineering organization
What we offer:
  • equity
  • health, dental & vision benefits
  • retirement with company contribution
  • parental leave & reproductive or family planning support
  • mental health & wellness benefits
  • generous PTO
  • company recharge days
  • a learning & development stipend
  • a work from home stipend
  • cell phone reimbursement
Employment Type:
Full-time