Research Engineer, Language Model Pre-Training

Zyphra

Location:
United States, Palo Alto

Contract Type:
Not provided

Salary:

Not provided

Job Description:

As a Research Engineer, Language Model Pre-Training, you'll shape our language model roadmap through end-to-end pretraining development. You will work closely with our pretraining team, who will integrate your insights into our next-generation models.

Job Responsibility:

  • Shape our language model roadmap through end-to-end pretraining development
  • Work across:
      • Large-scale training runs and model parallelization
      • Performance optimization of our pretraining stack
      • Dataset collection, processing, and evaluation
      • Architecture and methodology research, including optimizer ablations

Requirements:

  • Strong engineering aptitude for rapidly implementing reliable and robust systems
  • Ability to rapidly learn new fields and excitement about implementing new ideas
  • Excellent communication and collaboration skills, with the ability to work effectively on both research and engineering implementation at scale
  • Deep expertise and intuition for solving machine learning problems and training models
  • Experience with training on large-scale (multi-node) GPU clusters
  • Deep understanding of model training pipelines – including model/data parallelism, distributed optimizers, etc.
  • Strong grasp of proper experimental methodology for running rigorous ablations and other hypothesis testing
  • Understanding of large-scale, highly parallel data processing pipelines
  • High proficiency with PyTorch and Python
  • Strong ability to dive into large pre-existing codebases and rapidly get up to speed
  • Postgraduate degree in a scientific subject (Computer Science, EE/EECS, Math, Physics)

Nice to have:

  • Published machine learning research in well-respected venues

What we offer:

  • Comprehensive medical, dental, vision, and FSA plans
  • Competitive compensation and 401(k)
  • Relocation and immigration support on a case-by-case basis
  • On-site meals prepared by a dedicated culinary team
  • Thursday Happy Hours
  • In-person team in Palo Alto, CA, with a collaborative, high-energy environment

Additional Information:

Job Posted:
January 13, 2026

Employment Type:
Full-time

Work Type:
On-site work
