LLM Inference Performance & Evals Engineer Job at Cerebras Systems (Toronto)

Principal AI Engineer

We are looking for a Principal AI Engineer to lead the design and deployment of ...

Location

United States

Salary:

200,000 - 300,000 USD / year

Apollo.io

Expiration Date

Until further notice

Requirements

10+ years of software engineering experience
at least 3 years in applied LLM or agentic AI systems (2023–present)
proven success in deploying LLM-powered products used by real users at scale
deep backend & systems engineering expertise with Python, distributed systems, and scalable APIs
familiarity with LangChain, LlamaIndex, or similar orchestration frameworks
experience with RAG pipelines, vector DBs, embedding models, and semantic search tuning
experience managing performance across cloud providers (e.g., AWS Bedrock, OpenAI, Anthropic, etc.)
demonstrated experience building multi-step agents, planning workflows, chaining reasoning steps, and integrating APIs with agent memory/state
comfort with advanced prompting strategies, few-shot and chain-of-thought reasoning, and embedding retrieval setups
strong understanding of AI system evaluation, human ratings, A/B experimentation, and feedback loop pipelines

Job Responsibility

Architect and lead the development of multi-agent systems capable of long-horizon planning, reasoning, and API orchestration
build reusable agentic components that integrate deeply into sales and marketing processes
own and evolve our in-house platform for scalable, low-latency, and cost-efficient LLM and agent deployments
lead design of interfaces powered by natural language understanding and retrieval-augmented generation (RAG)
build embedding-based, intent-aware search and personalization systems tuned to business user needs
drive innovation in personalized outreach generation using context-aware generation pipelines
tune inference pipelines, caching layers, and model selection logic for high-scale, cost-aware performance
define and drive robust offline and online testing methodologies (A/B, sandboxing, human evals) across agents and LLM flows
architect human-in-the-loop systems and telemetry to improve accuracy, UX, and explainability over time

What we offer

equity
company bonus or sales commissions/bonuses
401(k) plan
at least 10 paid holidays per year
flex PTO
parental leave
employee assistance program
wellbeing benefits
global travel coverage
life/AD&D/STD/LTD insurance

Full-time

Principal Engineer - Generative AI Infra Capabilities

Wells Fargo is seeking a Principal Engineer - Generative Gen AI GPU Infrastructu...

Location

India, Bengaluru

Salary:

Not provided

Wells Fargo

Expiration Date

February 20, 2026

Requirements

7+ years of Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
Design GPU cluster topologies (H100/H200, NVLink/NVSwitch), networking, and storage paths for high-throughput inferencing; document sizing and perf baselines.
Implement Run:ai constructs (Collections/Departments/Projects/workloads) for MDEV/MDEP/UCEP/MRM; codify quota, priority, and fair-share policies.
POC & benchmark disaggregated inferencing (prefill/decode) with vLLM/TensorRT-LLM; publish guidance for H100/H200 tuning (FP8/INT8/AWQ) and KV-transfer behavior over NVLink.
Operationalize OpenShift AI parity for GPU scheduling, time slicing/MIG profiles, and preemption; validate upgrade paths and helm/kustomize packaging.
Integrate Triton Inference Server for multi-model serving

Job Responsibility

Act as an advisor to leadership to develop or influence applications, network, information security, database, operating systems, or web technologies for highly complex business and technical needs across multiple groups
Lead the strategy and resolution of highly complex and unique challenges requiring in-depth evaluation across multiple areas or the enterprise, delivering solutions that are long-term, large-scale and require vision, creativity, innovation, advanced analytical and inductive thinking
Translate advanced technology experience and in-depth knowledge of the organization's tactical and strategic business objectives, the enterprise technological environment, the organization structure, and strategic technological opportunities and requirements into technical engineering solutions
Provide vision, direction and expertise to leadership on implementing innovative and significant business solutions
Maintain knowledge of industry best practices and new technologies, and recommend innovations that enhance operations or provide a competitive advantage to the organization
Strategically engage with all levels of professionals and managers across the enterprise and serve as an expert advisor to leadership

Full-time

AI Engineer

Our next frontier is a strategic shift: We're evolving beyond traditional analyt...

Location

United Kingdom, London

Salary:

Not provided

MVF

Expiration Date

Until further notice

Requirements

Python and service development: write clean, typed, production-ready code; comfortable with Pydantic, asyncio, and FastAPI; treat prompts as code: versioned, tested, and decoupled from business logic
Cloud-native experience: hands-on experience deploying and operating containerised services on AWS (or GCP/Azure) using CI/CD platforms (Jenkins, GitHub Actions, CircleCI, Buildkite), cloud monitoring tools (Datadog, Sumo Logic, New Relic), and container orchestrators (EKS, ECS); comfortable with Terraform for infrastructure as code
Hands-on LLM experience: built something real with language models, whether production systems, serious side projects, or internal tools; understand that prompting is engineering, not magic

Job Responsibility

Architect & Engineer Agentic Systems: build agents that act, not just answer; design agents that perform deterministic actions based on probabilistic reasoning; build systems that can reliably analyse data, execute function calls, and manage state across multi-step workflows without getting stuck in loops
Production-Grade RAG: go beyond basic vector search; implement hybrid search (keyword + semantic), re-ranking strategies, and metadata filtering
Structured Data Extraction: build pipelines that turn unstructured conversations into structured data that our downstream systems can use
Establish AI Engineering Foundations: Observability First: implement the "nervous system" of our AI; choose and set up tools (e.g., LangSmith, LangFuse, ADK, or custom) to trace execution chains
Evals as a Service: build the testing harness; create automated evaluation pipelines that test prompts against "Golden Datasets"

What we offer

Summer Fridays
Competitive holiday benefits - 25 days a year paid holiday, plus 8 bank holidays (increases 1 day a year up to 30 days)
Hybrid working - 3 days a week in the office
Closed for Christmas holidays - Extra days not taken from your annual holiday allowance
Work from anywhere for 2 weeks a year
Life Assurance and Income Protection to protect your loved ones
Benefits allowance for health, dental, and vision coverage
Six months paid maternity leave, and one month paid paternity leave (subject to qualifying conditions) inclusive of same-sex and adoptive parents
Defined Contribution Pension and Salary Sacrifice Scheme
Be Well: Our award-winning wellbeing and mental health programme to support all MVFers and their families

Full-time

Senior AI Software Developer

The Senior AI Engineer owns end-to-end delivery of AI features—from design to pr...

Location

Salary:

Not provided

Hewlett Packard Enterprise

Expiration Date

Until further notice

Requirements

Bachelor's or master’s degree in computer science, engineering, data science, machine learning, artificial intelligence, or closely related quantitative discipline
Typically, 7-10 years’ experience
LLMs & Agents: Prompt engineering, function/tool calling, orchestration frameworks, RAG
ML/DS: Evaluation metrics (precision/recall, BLEU/ROUGE where relevant), error analysis
Data/RAG: Embeddings, similarity (cosine/IP), chunking, rerankers, vector DB operations
Backend: Python (FastAPI/Flask), microservices patterns
MLOps/Infra: Docker, Kubernetes, CI/CD, artifact management, GPU scheduling
Observability: Metrics/logging/tracing, dashboards, automated evaluation pipelines
Frameworks: PyTorch/TensorFlow, Hugging Face, LangChain/LlamaIndex
Data: Pandas, SQL/NoSQL, Parquet/Arrow, Kafka/queues

Job Responsibility

Translate high-level designs into clear component contracts, APIs, and service boundaries
Implement LLM integrations, RAG pipelines, agents, tool/function calling, and prompt strategies
Own feature delivery for sprints/releases; maintain high code quality and documentation
Fine-tune models when needed; design evaluation harnesses and metrics
Build A/B testing setups; track accuracy, latency, robustness, and task success rates
Conduct error analysis; iterate using feedback efficacy loops and prompt refinement

What we offer

Health & Wellbeing
Personal & Professional Development
Unconditional Inclusion

Loss Prevention Supervisor - Security

Patrol all areas of the property; secure rooms; assist guests with room access. ...

Location

United Arab Emirates, Dubai

Salary:

Not provided

Marriott Bonvoy

Expiration Date

Until further notice

Requirements

High school diploma or G.E.D. equivalent
At least 2 years of related work experience
At least 1 year of supervisory experience

Job Responsibility

Patrol all areas of the property; secure rooms; assist guests with room access
Conduct emergency response drills, daily physical hazard/safety inspections, investigations, interviews, and key control audit
Monitor Closed Circuit Televisions and alarm systems
Authorize, monitor, and document access to secured areas
Assist guests/employees during emergency situations
Respond to accidents, contact EMS or administer first aid/CPR as required
Gather information and complete reports
Maintain confidentiality of reports/documents, release information to authorized individuals

Full-time

Senior Solutions Engineer

As a Solution Engineer, you’ll serve as the technical lead during the sales proc...

Location

United Kingdom, London

Salary:

Not provided

Heidi

Expiration Date

Until further notice

Requirements

5+ years of experience in Solution Engineering, preferably in SaaS, healthcare, or enterprise software
Excellent communication skills
Strong working knowledge of integration protocols (REST APIs, SAML/OIDC, SCIM), enterprise architecture, and security standards
Experience supporting sales cycles with large healthcare providers, health systems, or EMR vendors is highly valued (FHIR/HL7 familiarity a plus)
Ability to synthesize complexity and communicate clearly to both technical and non-technical audiences
Comfortable operating autonomously in a fast-paced, early-stage environment
A trusted partner to sales and a credible voice in the room with technical leaders on the customer side

Job Responsibility

Technical Discovery & Qualification: Lead deep technical discovery with enterprise and mid-market prospects to uncover integration, security, and compliance needs
Solution Design: Collaborate with product and engineering to design tailored solutions that meet customer requirements while aligning with Heidi’s platform roadmap
Product Demos & Technical Presentations: Deliver compelling demos and architecture walkthroughs to technical stakeholders including IT, InfoSec, and engineering teams
RFP & Security Review Support: Own technical responses for RFPs, security assessments, and due diligence requests with attention to detail and accuracy
Deal Acceleration & Objection Handling: Proactively surface and resolve technical concerns that slow down the sales process, acting as a trusted technical advisor to the customer
Post-Sales Handoff & Feedback Loop: Partner with implementation teams to ensure a smooth transition post-signature and provide feedback to product and engineering from the field

What we offer

Flexible work with a hybrid environment
Additional paid day off for your birthday and wellness days
Discounted corporate gym memberships
A generous personal development budget of $500 per annum
Learn from some of the best engineers and creatives, joining a diverse team
Become an owner, with shares (equity) in the company: if Heidi wins, we all win
The rare chance to create a global impact as you immerse yourself in one of Australia’s leading healthtech startups
The opportunity to fast-track your startup career if you have an impact quickly

Full-time

Workplace Services Coordinator II

The Workplace Services Coordinator II helps create a welcoming and positive offi...

Location

Poland, Warsaw

Salary:

Not provided

Exact Sciences

Expiration Date

Until further notice

Requirements

High School Diploma or General Education Degree (GED)
1 year of experience in an administrative, hospitality, or similar role within an office environment
Proficient with office equipment (e.g., fax machines and printers)
Basic computer skills including Internet navigation, email usage, and word processing
Proficient in Microsoft Outlook, Excel macros and pivot tables, and Word mail merge
Demonstrated ability to perform the Essential Duties of the position with or without accommodation
Applicants must be currently authorized to work, on a full- or part-time basis, in the country where the work will be performed. We are unable to sponsor or take over sponsorship of employment visas at this time

Job Responsibility

Establish offices as a pleasant and efficient work environment while ensuring smooth office operations
Welcome guests and address their needs promptly
Act as the internal and external primary point of contact as it relates to local office matters and supporting visitors
Maintain security and control access at the reception desk
Define processes to manage the day-to-day office life
Partner with HR and employees to help shape the local office culture in line with our corporate values
Assist with local events and celebrations in close collaboration with other team members
Assist new employees with their orientation to the organization by managing logistics and orchestrating the on-boarding process for newcomers. This will include training on processes and office systems
Support management team with requests relating to office space, office environment, and space allocation
Support general administration and provide guidance to administrative support for senior management/functions

Full-time

Customer Success Manager

Location

India, Hyderabad

Salary:

Not provided

Highspot

Expiration Date

Until further notice

Requirements

core customer success manager experience
experience in improving product adoption
experience in building success plans relating to customer business objectives
KPIs include upsell or cross-sell

Full-time

LLM Inference Performance & Evals Engineer

Cerebras Systems

Location:
Canada, Toronto

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Nice to have:

Additional Information:

Job Posted:
February 17, 2026

