Software Engineer, Load Balancing - Inference

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:
293,000 - 490,000 USD / Year

Job Description:

We’re looking for a senior engineer to design and build the load balancer that will sit at the very front of our research inference stack - routing the world’s largest AI models with millisecond precision and bulletproof reliability. This system will serve research jobs where requests must stay “sticky” to the same model instance for hours or days and where even subtle errors can directly degrade model performance.

Job Responsibility:

  • Architect and build the gateway / network load balancer that fronts all research jobs, ensuring long-lived connections remain consistent and performant
  • Design traffic stickiness and routing strategies that optimize for both reliability and throughput
  • Instrument and debug complex distributed systems — with a focus on building world-class observability and debuggability tools (distributed tracing, logging, metrics)
  • Collaborate closely with researchers and ML engineers to understand how infrastructure decisions impact model performance and training dynamics
  • Own the end-to-end system lifecycle: from design and code to deploy, operate, and scale
  • Work in an outcome-oriented environment where everyone contributes across layers of the stack, from infra plumbing to performance tuning

Requirements:

  • Deep experience designing and operating large-scale distributed systems, particularly load balancers, service gateways, or traffic routing layers
  • 5+ years of experience with the algorithmic and systems challenges of consistent hashing, sticky routing, and low-latency connection management, both designing for them in theory and debugging them in practice
  • 5+ years of experience as a software engineer and systems architect working on high-scale, high-reliability infrastructure
  • A strong debugging mindset: you enjoy spending time in traces, logs, and metrics to untangle distributed failures
  • Comfortable writing and reviewing production code in Rust or similar systems languages (C/C++, Java, Go, Zig, etc.)
  • Operated in big tech or high-growth environments and are excited to apply that experience in a faster-moving setting
  • Take ownership of problems end-to-end and are excited to build something foundational to how our models interact with the world

Nice to have:

  • Experience with gateway or load balancing systems (e.g., Envoy, gRPC, custom LB implementations)
  • Familiarity with inference workloads (e.g., reinforcement learning, streaming inference, KV cache management)
  • Exposure to debugging and operational excellence practices in large production environments

What we offer:

  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided
  • Equity
  • Performance-related bonus(es) for eligible employees

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time

Work Type:
On-site work