Research Engineer, Frontier Evals & Environments Job at OpenAI (San Francisco)

Research Engineer, Frontier Evals & Environments - Finance

The Frontier Evals team builds north star model evaluations to drive progress to...

Location

United States, San Francisco

Salary:

205000.00 - 380000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

Strong engineering and statistical analysis skills (with at least 2-3 years of full-time technical experience)
Passionate about evals for real world applications and knowledge work
Detail-oriented and thorough
Team player / willing to do a variety of tasks to move the team forward
Passionate and knowledgeable about AGI/ASI measurement
Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

Job Responsibility

Identify important model capabilities, skills, and behaviors that are crucial to financial workflows, and design methods to quantify performance in these areas
Own and pursue a research agenda to identify an important model capability (especially as it relates to financial reasoning) and build evals to measure it
Continuously refine evaluations of frontier AI models to assess the extent of frontier capabilities

What we offer

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible

Fulltime

AI Architect

We’re hiring an AI Architect to sit at the intersection of frontier AI research,...

Location

United States, San Francisco; New York

Salary:

201600.00 - 241920.00 USD / Year

Scale

Expiration Date

Until further notice

Requirements

Deep technical background in applied AI/ML: 5–10+ years in research, engineering, solutions engineering, or technical product roles working on LLMs or multimodal systems, ideally in high-stakes, customer-facing environments
Hands-on experience with model improvement workflows: demonstrated experience with post-training techniques, evaluation design, benchmarking, and model quality iteration
Ability to work on hard, ambiguous technical problems: proven track record of partnering directly with advanced customers or research teams to scope, reason through, and execute on deep technical challenges involving frontier models
Strong technical fluency: you can read papers, interrogate metrics, write or review complex Python/SQL for analysis, and reason about model-data trade-offs
Executive presence with world-class researchers and enterprise leaders; excellent writing and storytelling
Bias to action: you ship, learn, and iterate.

Job Responsibility

Translate research → product: work with client side researchers on post-training, evals, safety/alignment and build the primitives, data, and tooling they need
Partner deeply with core customers and frontier labs: work hands-on with leading AI teams and frontier research labs to tackle hard, open-ended technical problems related to frontier model improvement, performance, and deployment
Shape and propose model improvement work: translate customer and research objectives into clear, technically rigorous proposals—scoping post-training, evaluation, and safety work into well-defined statements of work and execution plans
Translate research into production impact: collaborate with customer-side researchers on post-training, evaluations, and alignment, and help design the data, primitives, and tooling required to improve frontier models in practice
Own the end-to-end lifecycle: lead discovery, write crisp PRDs and technical specs, prioritize trade-offs, run experiments, ship initial solutions, and scale successful pilots into durable, repeatable offerings
Lead complex, high-stakes engagements: independently run technical working sessions with senior customer stakeholders; define success metrics; surface risks early; and drive programs to measurable outcomes
Partner across Scale: collaborate closely with research (agents, browser/SWE agents), platform, operations, security, and finance to deliver reliable, production-grade results for demanding customers

What we offer

Comprehensive health, dental, and vision coverage
Retirement benefits
Learning and development stipend
Generous PTO
Commuter stipend
Equity-based compensation

Fulltime

Researcher, Preparedness

The Preparedness team helps us prepare for the development of increasingly capab...

Location

United States, San Francisco

Salary:

295000.00 - 445000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

Passionate and knowledgeable about short-term and long-term AI safety risks
Ability to think outside the box and have a robust 'red-teaming mindset'
Experience in ML research engineering, ML observability and monitoring, creating large language model-enabled applications, and/or another technical domain applicable to AI risk
Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

Job Responsibility

Own the scientific validity of frontier preparedness capability evaluations by designing new evals grounded in real threat models (including high-consequence domains like CBRN as well as cyber and other frontier-risk areas), and maintaining existing evals so they don't go stale or silently regress
Define datasets, graders, rubrics, and threshold guidance, and produce auditable artifacts (evaluation cards, capability reports, system-card inputs) that leadership can trust during high-stakes launches
Work on identifying emerging AI safety risks and new methodologies for exploring the impact of these risks
Build (and then continuously refine) evaluations of frontier AI models that assess the extent of identified risks
Design and build scalable systems and processes that can support these kinds of evaluations
Contribute to the refinement of risk management and the overall development of 'best practice' guidelines for AI safety evaluations

What we offer

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible

Fulltime

Research Engineering Manager - Model Training

Perplexity is seeking a Research Engineering Manager to lead the team of all-sta...

Location

United States, San Francisco

Salary:

300000.00 - 470000.00 USD / Year

Perplexity

Expiration Date

Until further notice

Requirements

Proven experience with large-scale LLMs and Deep Learning systems
Strong Python and PyTorch skills
Experience leading or managing research or engineering teams working on large-scale AI model development, including driving complex projects from idea to production
Self‑starter with a willingness to take ownership of tasks and navigate ambiguity in a fast‑moving environment
Passion for tackling challenging problems in AI model quality, speed, safety, and reliability
10+ years of technical experience, with at least 2 of those years as a manager and at least 4 of those years working on large-scale AI model development

Job Responsibility

Lead a team of researchers and engineers focused on training SotA models for Perplexity-relevant use cases, leveraging the latest supervised and reinforcement learning techniques
Drive research and engineering efforts to develop production models through advanced model training and alignment techniques, including RL, SFT, and other approaches
Become deeply familiar with the team’s technical stack, leading from the front through hands-on technical contributions
Own the data, training, and eval pipelines required to train and continuously improve LLMs
Design and iterate on model training and finetuning algorithms (e.g., preference‑based methods, reinforcement learning from human or AI feedback) through an approach that balances scientific rigor and iteration velocity
Design evaluations and improve the production model training pipeline to reliably deliver models that lie on the Pareto frontier of speed and quality
Work closely with engineering teams to integrate in-house models into our product and rapidly iterate based on real‑world usage
Manage day‑to‑day execution, project planning, and prioritization for the model training team to hit ambitious quality and performance goals

What we offer

Equity
Health
Dental
Vision
Retirement
Fitness
Commuter and dependent care accounts

Fulltime

Head of Academic PE

Wetherby Senior School are looking to appoint a Head of Academic PE to start Sep...

Location

United Kingdom, London

Salary:

Not provided

International School of Bergamo

Expiration Date

March 16, 2026

Requirements

The ability to teach PE to A Level
Outstanding subject knowledge and academic qualifications, and the ability to communicate enthusiasm
Be able to deliver dynamic and effective lessons to the full ability range of pupils at the School
Highly effective communication skills for interacting with all members of the School community
A genuine commitment to pastoral care and the welfare and safeguarding of pupils
Interests and abilities that can enhance the School’s co-curricular programme
Excellent inter-personal skills
Excellent administrative, organisational, and IT skills

Job Responsibility

Lead and manage the Academic PE curriculum, including schemes of work, assessment, academic standards and enrichment
Keep curriculum content current by integrating external subject developments and promoting academic enquiry
Organise and lead educational trips to enhance pupils’ learning experiences
Manage and support departmental staff through induction, observation, feedback and professional development
Ensure consistent use of rewards, sanctions and effective behaviour management across the department
Monitor the quality of teaching, pupil tracking, written reports and scrutiny of pupil work
Support staff in meeting the needs of pupils with learning, medical, social or other difficulties
Oversee pupil progress, option choices and university applications, and develop links with university PE departments
Teach practical PE and Games across all Key Stages, with a preferred specialism in Football, Rugby or Cricket
Promote enthusiasm for sport and contribute to high-quality teaching resources and programmes of study

What we offer

Work in state-of-the-art facilities alongside industry-renowned educators and leaders in some of the world's most desirable locations
Industry-leading professional development
Exceptional career opportunities
Mobility across our group

Fulltime

Principal Engineer

Wells Fargo is seeking a Principal Engineer.

Location

India, Bengaluru

Salary:

Not provided

Wells Fargo

Expiration Date

March 03, 2026

Requirements

7+ years of Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
5+ years’ hands-on programming and/or scripting experience in one or more of the following: Java, Python, Shell scripting, etc.
5+ years’ experience implementing Infrastructure as Code (IaC) using Terraform, Crossplane, or an industry-equivalent solution.
5+ years’ hands-on experience with OpenShift Container Platform, Google Cloud Platform, and/or Microsoft Azure.
5+ years’ experience designing and implementing enterprise-grade automation solutions using tools such as Ansible, Harness CD, GitHub Actions, Playwright, etc.
2+ years’ experience designing and developing AI, Gen AI, and agentic automation solutions.
Excellent communication and stakeholder management skills.
Demonstrated ability to lead complex projects with limited supervision and high accountability.
Experience with Generative AI, RAG pipelines, agentic AI systems, sub-agents, context graphs, knowledge graph foundations, context engineering, and multi-agent orchestration.
Exposure to Google Cloud Platform: Vertex AI, Agentspace, MCP, A2A.

Job Responsibility

Act as an advisor to leadership to develop or influence applications, network, information security, database, operating systems, or web technologies for highly complex business and technical needs across multiple groups
Lead the strategy and resolution of highly complex and unique challenges requiring in-depth evaluation across multiple areas or the enterprise, delivering solutions that are long-term, large-scale and require vision, creativity, innovation, advanced analytical and inductive thinking
Translate advanced technology experience, an in-depth knowledge of the organization's tactical and strategic business objectives, the enterprise technological environment, the organization structure, and strategic technological opportunities and requirements into technical engineering solutions
Provide vision, direction and expertise to leadership on implementing innovative and significant business solutions
Maintain knowledge of industry best practices and new technologies, and recommend innovations that enhance operations or provide a competitive advantage to the organization
Strategically engage with all levels of professionals and managers across the enterprise and serve as an expert advisor to leadership
Strong communication with the ability to communicate on all levels of the organization
Demonstrate knowledge/understanding of emerging technologies, industry trends, and outside perspectives, and communicate relevance to the organization's strategic and tactical goals
Engage dynamically in high-impact production incidents, troubleshoot them to resolution, and provide immediate incident analysis, both written and spoken
Ensure adherence to security, compliance, and cost optimization standards in all implementations including GenAI.

Fulltime

Summer intern

GSK Summer Internships for 2026 are open to current students and recent graduate...

Location

United Kingdom, Barnard Castle; Stevenage; London

Salary:

Not provided

SRG

Expiration Date

Until further notice

Requirements

Current student or recent graduate (within last two years)
Written and spoken fluency in English
Must be able to commit to the specific site location
Must be able to commit to the full internship duration (12 weeks or less)
Eligible degrees vary per role (e.g., Life Sciences, Chemistry, Engineering, Computer Science, Data Science, Law, UX Design, etc.)

Job Responsibility

Varies by role
Examples include: Supporting regulatory audit preparation and live audit support
Conducting market and trend analysis
Developing prototypes
Supporting campaigns and creating user-focused content
Learning core lab techniques and supporting data production
Supporting policy research and governance activities
Contributing to marketing projects and commercial execution
Delivering a full AI or machine learning project
Supporting OT security operations and risk assessments

What we offer

Paid internship
Learn from experts
Develop skills
Gain valuable experience

Fulltime

Pharmacy Intern - Grad

You’ve invested a lot of time and energy in your education. Now you want the cha...

Location

United States, Waipahu

Salary:

20.25 - 42.00 USD / Hour

CVS Health

Expiration Date

May 01, 2026

Requirements

PharmD graduate of a U.S. accredited program prior to beginning the Post-Graduate Training Program at CVS Health
Ability to obtain required pharmacist licensure within the required timeframe, per state guidelines. Failure to obtain required Pharmacist licensure within 120 days of graduation will result in separation of employment.
Must possess, or be in the process of obtaining, valid intern and/or technician licensure as required
Regular and predictable attendance, including nights and weekends
Ability to complete required training within designated timeframe
Attention and Focus: Ability to concentrate on a task over a period of time
Ability to pivot quickly from one task to another to meet patient and business needs
Ability to confirm prescription information and label accuracy, ensuring patient safety
Customer Service and Team Orientation: Actively look for ways to help people, and do so in a friendly manner
Notice and understand patients’ reactions, and respond appropriately

Job Responsibility

Living our purpose by following all company SOPs at each workstation to help our Pharmacists and Technicians manage and improve patient health
Following pharmacy workflow procedures at each pharmacy workstation (i.e., production, pick-up, drive-thru, and drop-off) for safe and accurate prescription fulfillment
Contributing to positive patient experiences by showing empathy and genuine care: creating heartfelt and personalized moments while serving patients at pick-up, drive-thru, and over the phone; keeping patients healthy by offering immunizations and other services at the register and over the phone; and demonstrating compassionate care by solving or escalating patient problems
Offering to counsel, fielding medical questions, and soliciting information on a patient’s medical history to provide optimal care, when appropriate under the direct supervision of a licensed pharmacist
Taking telephonic prescriptions from the prescriber, and calling the prescriber to clarify prescriptions or facilitate medication changes, where allowed by state regulation
Maintaining the highest level of self-awareness and providing in-the-moment coaching, training, and mentoring to pharmacy team members while sharing best practices
Completing basic inventory activities, as permitted by law, and as directed by the pharmacy leadership team, such as accurately putting away medication deliveries and completing cycle counts, returns-to-stocks, waiting bin inventories, etc.
Contributing to a high-performing team, embracing a growth mindset, and being receptive to feedback

Research Engineer, Frontier Evals & Environments

OpenAI

Location:
United States, San Francisco

Category:
IT - Software Development

Contract Type:
Not provided

Job Posted:
February 21, 2026
