Research Engineer, Frontier Evals & Environments

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

205,000 - 380,000 USD / Year

Job Description:

The Frontier Evals & Environments team builds north star model environments to drive progress towards safe AGI/ASI. This team builds ambitious environments to measure and steer our models, and creates self-improvement loops to steer our training, safety, and launch decisions.

Job Responsibility:

  • Create ambitious RL environments to push our models to their limits
  • Work on measuring frontier model capabilities, skills, and behaviors
  • Develop new methodologies for automatically exploring the behavior of these models
  • Help steer training for our largest training runs, and see the future first
  • Design scalable systems and processes to support continuous evaluation
  • Build self-improvement loops to automate model understanding

Requirements:

  • Passionate and knowledgeable about AGI/ASI measurement
  • Strong engineering and statistical analysis skills
  • Able to think outside the box and have a robust “red-teaming mindset”
  • Experienced in ML research engineering, stochastic systems, observability and monitoring, LLM-enabled applications, and/or another technical domain applicable to AI evaluations
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

Nice to have:

  • First-hand experience in red-teaming systems—be it computer systems or otherwise
  • An ability to work cross-functionally
  • Excellent communication skills

What we offer:

  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided
  • Equity
  • Performance-related bonus(es) for eligible employees

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Fulltime

Similar Jobs for Research Engineer, Frontier Evals & Environments

Research Engineer, Frontier Evals & Environments - Finance

The Frontier Evals team builds north star model evaluations to drive progress to...
Location:
United States, San Francisco
Salary:
205,000 - 380,000 USD / Year
Company:
OpenAI
Expiration Date:
Until further notice
Requirements:
  • Strong engineering and statistical analysis skills (with at least 2-3 years of full-time technical experience)
  • Passionate about evals for real world applications and knowledge work
  • Detail-oriented and thorough
  • Team player / willing to do a variety of tasks to move the team forward
  • Passionate and knowledgeable about AGI/ASI measurement
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end
Job Responsibility:
  • Identify important model capabilities, skills, and behaviors that are crucial to financial workflows, and design methods to quantify performance in these areas
  • Own and pursue a research agenda to identify an important model capability (especially as it relates to financial reasoning) and build evals to measure it
  • Continuously refine evaluations of frontier AI models to assess the extent of frontier capabilities
What we offer:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
Employment Type:
Fulltime

AI Architect

We’re hiring an AI Architect to sit at the intersection of frontier AI research,...
Location:
United States, San Francisco; New York
Salary:
201,600 - 241,920 USD / Year
Company:
Scale
Expiration Date:
Until further notice
Requirements:
  • Deep technical background in applied AI/ML: 5–10+ years in research, engineering, solutions engineering, or technical product roles working on LLMs or multimodal systems, ideally in high-stakes, customer-facing environments
  • Hands-on experience with model improvement workflows: demonstrated experience with post-training techniques, evaluation design, benchmarking, and model quality iteration
  • Ability to work on hard, ambiguous technical problems: proven track record of partnering directly with advanced customers or research teams to scope, reason through, and execute on deep technical challenges involving frontier models
  • Strong technical fluency: you can read papers, interrogate metrics, write or review complex Python/SQL for analysis, and reason about model-data trade-offs
  • Executive presence with world-class researchers and enterprise leaders; excellent writing and storytelling
  • Bias to action: you ship, learn, and iterate.
Job Responsibility:
  • Translate research → product: work with client side researchers on post-training, evals, safety/alignment and build the primitives, data, and tooling they need
  • Partner deeply with core customers and frontier labs: work hands-on with leading AI teams and frontier research labs to tackle hard, open-ended technical problems related to frontier model improvement, performance, and deployment
  • Shape and propose model improvement work: translate customer and research objectives into clear, technically rigorous proposals—scoping post-training, evaluation, and safety work into well-defined statements of work and execution plans
  • Translate research into production impact: collaborate with customer-side researchers on post-training, evaluations, and alignment, and help design the data, primitives, and tooling required to improve frontier models in practice
  • Own the end-to-end lifecycle: lead discovery, write crisp PRDs and technical specs, prioritize trade-offs, run experiments, ship initial solutions, and scale successful pilots into durable, repeatable offerings
  • Lead complex, high-stakes engagements: independently run technical working sessions with senior customer stakeholders; define success metrics; surface risks early; and drive programs to measurable outcomes
  • Partner across Scale: collaborate closely with research (agents, browser/SWE agents), platform, operations, security, and finance to deliver reliable, production-grade results for demanding customers
What we offer:
  • Comprehensive health, dental and vision coverage
  • Retirement benefits
  • A learning and development stipend
  • Generous PTO
  • Commuter stipend
  • Equity-based compensation
Employment Type:
Fulltime

Researcher, Preparedness

The Preparedness team helps us prepare for the development of increasingly capab...
Location:
United States, San Francisco
Salary:
295,000 - 445,000 USD / Year
Company:
OpenAI
Expiration Date:
Until further notice
Requirements:
  • Passionate and knowledgeable about short-term and long-term AI safety risks
  • Ability to think outside the box and have a robust 'red-teaming mindset'
  • Experience in ML research engineering, ML observability and monitoring, creating large language model-enabled applications, and/or another technical domain applicable to AI risk
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end
Job Responsibility:
  • Own the scientific validity of frontier preparedness capability evaluations: designing new evals grounded in real threat models (including high-consequence domains like CBRN as well as cyber and other frontier-risk areas), and maintaining existing evals so they don't go stale or silently regress
  • Define datasets, graders, rubrics, and threshold guidance, and produce auditable artifacts (evaluation cards, capability reports, system-card inputs) that leadership can trust during high-stakes launches
  • Work on identifying emerging AI safety risks and new methodologies for exploring the impact of these risks
  • Build (and then continuously refine) evaluations of frontier AI models that assess the extent of identified risks
  • Design and build scalable systems and processes that can support these kinds of evaluations
  • Contribute to the refinement of risk management and the overall development of 'best practice' guidelines for AI safety evaluations
What we offer:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
Employment Type:
Fulltime

Research Engineering Manager - Model Training

Perplexity is seeking a Research Engineering Manager to lead the team of all-sta...
Location:
United States, San Francisco
Salary:
300,000 - 470,000 USD / Year
Company:
Perplexity
Expiration Date:
Until further notice
Requirements:
  • Proven experience with large-scale LLMs and Deep Learning systems
  • Strong Python and PyTorch skills
  • Experience leading or managing research or engineering teams working on large-scale AI model development, including driving complex projects from idea to production
  • Self‑starter with a willingness to take ownership of tasks and navigate ambiguity in a fast‑moving environment
  • Passion for tackling challenging problems in AI model quality, speed, safety, and reliability
  • 10+ years of technical experience, with at least 2 of those years as a manager and at least 4 of those years working on large-scale AI model development
Job Responsibility:
  • Lead a team of researchers and engineers focused on training SotA models for Perplexity-relevant use cases, leveraging the latest supervised and reinforcement learning techniques
  • Drive research and engineering efforts to develop production models through advanced model training and alignment techniques, including RL, SFT, and other approaches
  • Become deeply familiar with the team’s technical stack, leading from the front through hands-on technical contributions
  • Own the data, training, and eval pipelines required to train and continuously improve LLM models
  • Design and iterate on model training and finetuning algorithms (e.g., preference‑based methods, reinforcement learning from human or AI feedback) through an approach that balances scientific rigor and iteration velocity
  • Design evaluations and improve the production model training pipeline to reliably deliver models that lie on the Pareto frontier of speed and quality
  • Work closely with engineering teams to integrate in-house models into our product and rapidly iterate based on real‑world usage
  • Manage day‑to‑day execution, project planning, and prioritization for the model training team to hit ambitious quality and performance goals
What we offer:
  • Equity
  • Health
  • Dental
  • Vision
  • Retirement
  • Fitness
  • Commuter and dependent care accounts
Employment Type:
Fulltime

Head of Academic PE

Wetherby Senior School are looking to appoint a Head of Academic PE to start Sep...
Location:
United Kingdom, London
Salary:
Not provided
Company:
International School of Bergamo
Expiration Date:
March 16, 2026
Requirements:
  • The ability to teach PE to A Level
  • Outstanding subject knowledge and academic qualifications, and the ability to communicate enthusiasm for the subject
  • Be able to deliver dynamic and effective lessons to the full ability range of pupils at the School
  • Highly effective communication skills for interacting with all members of the School community
  • A genuine commitment to pastoral care and the welfare and safeguarding of pupils
  • Interests and abilities that can enhance the School’s co-curricular programme
  • Excellent inter-personal skills
  • Excellent administrative, organisational, and IT skills
Job Responsibility:
  • Lead and manage the Academic PE curriculum, including schemes of work, assessment, academic standards and enrichment
  • Keep curriculum content current by integrating external subject developments and promoting academic enquiry
  • Organise and lead educational trips to enhance pupils’ learning experiences
  • Manage and support departmental staff through induction, observation, feedback and professional development
  • Ensure consistent use of rewards, sanctions and effective behaviour management across the department
  • Monitor the quality of teaching, pupil tracking, written reports and scrutiny of pupil work
  • Support staff in meeting the needs of pupils with learning, medical, social or other difficulties
  • Oversee pupil progress, option choices and university applications, and develop links with university PE departments
  • Teach practical PE and Games across all Key Stages, with a preferred specialism in Football, Rugby or Cricket
  • Promote enthusiasm for sport and contribute to high-quality teaching resources and programmes of study
What we offer:
  • Work in state-of-the-art facilities alongside industry-renowned educators and leaders in some of the world's most desirable locations
  • Industry-leading professional development
  • Exceptional career opportunities
  • Mobility across our group
Employment Type:
Fulltime

Principal Engineer

Wells Fargo is seeking a Principal Engineer.
Location:
India, Bengaluru
Salary:
Not provided
Company:
Wells Fargo
Expiration Date:
March 03, 2026
Requirements:
  • 7+ years of Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 5+ years' hands on programming and/or scripting experience in one or more of the following: Java, Python, Shell scripting etc.
  • 5+ years’ experience with Infrastructure as Code (IaC) implementation using Terraform, Crossplane, or other industry-equivalent solutions.
  • 5+ years’ experience with OpenShift Container Platform and/or Google Cloud Platform, and/or Microsoft Azure hands on experience.
  • 5+ years’ experience with enterprise grade automation solutions design and implementation experience using tools such as Ansible, Harness CD, GitHub Actions, Playwright etc.
  • 2+ years of experience with AI, Gen AI, Agentic automation solutions design and development.
  • Excellent communication and stakeholder management skills.
  • Demonstrated ability to lead complex projects with limited supervision and high accountability.
  • Experience with Generative AI, RAG pipelines, agentic AI systems, sub-agents, Context graphs, Knowledge graph foundations, context engineering and Multi-agent orchestration.
  • Exposure to Google Cloud Platform: Vertex AI, Agentspace, MCP, A2A exposure.
Job Responsibility:
  • Act as an advisor to leadership to develop or influence applications, network, information security, database, operating systems, or web technologies for highly complex business and technical needs across multiple groups
  • Lead the strategy and resolution of highly complex and unique challenges requiring in-depth evaluation across multiple areas or the enterprise, delivering solutions that are long-term, large-scale and require vision, creativity, innovation, advanced analytical and inductive thinking
  • Translate advanced technology experience, an in-depth knowledge of the organization's tactical and strategic business objectives, the enterprise technological environment, the organization structure, and strategic technological opportunities and requirements into technical engineering solutions
  • Provide vision, direction and expertise to leadership on implementing innovative and significant business solutions
  • Maintain knowledge of industry best practices and new technologies and recommend innovations that enhance operations or provide a competitive advantage to the organization
  • Strategically engage with all levels of professionals and managers across the enterprise and serve as an expert advisor to leadership
  • Strong communication with the ability to communicate on all levels of the organization
  • Demonstrate knowledge/understanding of emerging technologies, industry trends, and outside perspectives, and communicate relevance to the organization's strategic and tactical goals
  • Ability to dynamically engage, attend high impact production incidents and troubleshoot to resolution and provide immediate incident analysis both written and spoken
  • Ensure adherence to security, compliance, and cost optimization standards in all implementations including GenAI.
Employment Type:
Fulltime

Summer intern

GSK Summer Internships for 2026 are open to current students and recent graduate...
Location:
United Kingdom, Barnard Castle; Stevenage; London
Salary:
Not provided
Company:
SRG
Expiration Date:
Until further notice
Requirements:
  • Current student or recent graduate (within last two years)
  • Written and spoken fluency in English
  • Must be able to commit to the specific site location
  • Must be able to commit to the full internship duration (12 weeks or less)
  • Eligible degrees vary per role (e.g., Life Sciences, Chemistry, Engineering, Computer Science, Data Science, Law, UX Design, etc.)
Job Responsibility:
  • Varies by role
  • Examples include: Supporting regulatory audit preparation and live audit support
  • Conducting market and trend analysis
  • Developing prototypes
  • Supporting campaigns and creating user-focused content
  • Learning core lab techniques and supporting data production
  • Supporting policy research and governance activities
  • Contributing to marketing projects and commercial execution
  • Delivering a full AI or machine learning project
  • Supporting OT security operations and risk assessments
What we offer:
  • Paid internship
  • Learn from experts
  • Develop skills
  • Gain valuable experience
Employment Type:
Fulltime

Pharmacy Intern - Grad

You’ve invested a lot of time and energy in your education. Now you want the cha...
Location:
United States, Waipahu
Salary:
20.25 - 42.00 USD / Hour
Company:
CVS Health
Expiration Date:
May 01, 2026
Requirements:
  • PharmD graduate of a U.S. accredited program prior to beginning the Post-Graduate Training Program at CVS Health
  • Ability to obtain required pharmacist licensure within the required timeframe, per state guidelines. Failure to obtain required Pharmacist licensure within 120 days of graduation will result in separation of employment.
  • Must possess, or be in the process of obtaining, valid intern and/or technician licensure as required
  • Regular and predictable attendance, including nights and weekends
  • Ability to complete required training within designated timeframe
  • Attention and Focus: Ability to concentrate on a task over a period of time
  • Ability to pivot quickly from one task to another to meet patient and business needs
  • Ability to confirm prescription information and label accuracy, ensuring patient safety
  • Customer Service and Team Orientation: Actively look for ways to help people, and do so in a friendly manner
  • Notice and understand patients’ reactions, and respond appropriately
Job Responsibility:
  • Living our purpose by following all company SOPs at each workstation to help our Pharmacists and Technicians manage and improve patient health
  • Following pharmacy workflow procedures at each pharmacy workstation (i.e., production, pick-up, drive-thru, and drop-off) for safe and accurate prescription fulfillment
  • Contributing to positive patient experiences by showing empathy and genuine care: creating heartfelt and personalized moments while serving patients at pick-up, drive-thru, and over the phone; keeping patients healthy by offering immunizations and other services at the register and over the phone; and demonstrating compassionate care by solving or escalating patient problems
  • Offering to counsel, fielding medical questions, and soliciting information on a patient’s medical history to provide optimal care, when appropriate under the direct supervision of a licensed pharmacist
  • Taking telephonic prescriptions from the prescriber, and calling the prescriber to clarify prescriptions or facilitate medication changes, where allowed by state regulation
  • Maintaining the highest level of self-awareness and providing in-the-moment coaching, training, and mentoring to pharmacy team members while sharing best practices
  • Completing basic inventory activities, as permitted by law, and as directed by the pharmacy leadership team, such as accurately putting away medication deliveries and completing cycle counts, returns-to-stocks, waiting bin inventories, etc.
  • Contributing to a high-performing team, embracing a growth mindset, and being receptive to feedback