CrawlJobs

Deployment Engineer, AI Inference

Cerebras Systems

Location:
Sunnyvale, United States; Canada

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and lets machine learning users run large-scale ML applications without the hassle of managing hundreds of GPUs or TPUs. Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of capacity, transforming key workloads with ultra-high-speed inference.

Thanks to this groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

Job Responsibility:

  • Deploy AI inference replicas and cluster software across multiple datacenters
  • Operate across heterogeneous datacenter environments undergoing rapid 10x growth
  • Maximize capacity allocation and optimize replica placement using constraint-solver algorithms
  • Operate bare-metal inference infrastructure while supporting transition to K8S-based platform
  • Develop and extend telemetry, observability and alerting solutions to ensure deployment reliability at scale
  • Develop and extend a fully automated deployment pipeline to support fast software updates and capacity reallocation at scale
  • Translate technical and customer needs into actionable requirements for the Dev Infra, Cluster, Platform and Core teams
  • Stay up to date with the latest advancements in AI compute infrastructure and related technologies

Requirements:

  • 2–5 years of experience operating on-prem compute infrastructure (ideally in Machine Learning or High-Performance Compute) or developing and managing complex AWS control-plane infrastructure for hybrid deployments
  • Strong proficiency in Python for automation, orchestration, and deployment tooling
  • Solid understanding of Linux-based systems and command-line tools
  • Extensive knowledge of Docker containers and container orchestration platforms like K8S
  • Familiarity with spine-leaf (Clos) networking architecture
  • Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB and Grafana
  • Strong ownership mindset and accountability for complex deployments
  • Ability to work effectively in a fast-paced environment

What we offer:
  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open-source cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • A simple, non-corporate work culture that respects individual beliefs

Additional Information:

Job Posted:
February 17, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Deployment Engineer, AI Inference

Director of AI Engineering

We are entering a hyper-growth phase of AI innovation and are hiring a Director ...
Location:
Canada; United States
Salary:
300000.00 - 450000.00 USD / Year
Apollo.io
Expiration Date:
Until further notice
Requirements:
  • 10–15+ years in software engineering, with significant leadership experience owning AI/ML or applied LLM systems at scale
  • Proven history shipping LLM-powered features, agentic workflows, or AI assistants used by real customers in production
  • Deep understanding of LLM orchestration frameworks (LangChain, LlamaIndex), RAG pipelines, vector search, embeddings, and prompt engineering
  • Expert in backend & distributed systems (Python strongly preferred) and cloud infrastructure (AWS/GCP)
  • Strong experience with telemetry, observability, and cost-aware real-time inference optimizations
  • Demonstrated ability to lead senior engineers, define technical roadmaps, and deliver outcomes aligned to business metrics
  • Experience building or scaling teams working on experimentation, optimization, personalization, or ML-powered growth systems
  • Exceptional ability to simplify complex problems, set clear standards, and drive alignment across Product, Data, Design, and Engineering
  • Strong product sense, ability to weigh novelty vs. impact, focus on user value, and prioritize speed with guardrails
  • Fluent in integrating AI tools into engineering workflows for code generation, debugging, delivery velocity, and operational efficiency
Job Responsibility:
  • Define the multi-year technical vision for Apollo’s AI stack, spanning agents, orchestration, inference, retrieval, and platformization
  • Prioritize high-impact AI investments by partnering with Product, Design, Research, and Data leaders to align engineering outcomes with business goals
  • Establish technical standards, evaluation criteria, and success metrics for every AI-powered feature shipped
  • Lead the architecture and deployment of long-horizon autonomous agents, multi-agent workflows, and API-driven orchestration frameworks
  • Build reusable, scalable agentic components that power GTM workflows like research, enrichment, sequencing, lead scoring, routing, and personalization
  • Own the evolution of Apollo’s internal LLM platform for high-scale, low-latency, cost-optimized inference
  • Oversee model-driven experiences for natural-language interfaces, RAG pipelines, semantic search, personalized recommendations, and email intelligence
  • Partner with Product & Design to build intuitive conversational UX that hides underlying complexity while elevating user productivity
  • Implement rigorous evaluation frameworks, including offline benchmarking, human-in-the-loop review, and online A/B experimentation
  • Ensure robust observability, monitoring, and safety guardrails for all AI systems in production
What we offer:
  • Equity
  • Company bonus or sales commissions/bonuses
  • 401(k) plan
  • At least 10 paid holidays per year
  • Flex PTO
  • Parental leave
  • Employee assistance program and wellbeing benefits
  • Global travel coverage
  • Life/AD&D/STD/LTD insurance
  • FSA/HSA
Employment Type: Full-time

Senior Software Engineer – AI

NStarX is seeking a highly skilled Senior Software Engineer – AI with a strong f...
Location:
Hyderabad, India
Salary:
Not provided
NStarX
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field (PhD is a plus)
  • 9+ years of experience in AI/ML engineering or related roles
  • 3+ years of experience in Generative AI with team leadership responsibilities
  • Proven track record of production-grade ML and GenAI model development and deployment
  • Programming: Python (preferred)
  • GenAI Frameworks: Hugging Face Transformers, Diffusers, LangChain, TGI
  • Serving & Inference: FastAPI, gRPC, NVIDIA Triton, TorchServe
  • Cloud Platforms: AWS (SageMaker, EKS), GCP (Vertex AI, GKE), Azure (Azure ML, AKS)
  • MLOps & DevOps: Kubeflow, MLflow, GitHub Actions, Jenkins, Helm, Terraform
  • Optimization Techniques: Model quantization, distillation, pipeline and tensor parallelism
Job Responsibility:
  • Design, develop, and deploy machine learning models and AI algorithms to address complex business challenges
  • Lead and mentor a team of AI/ML engineers, ensuring quality and scalability in solution design and implementation
  • Collaborate closely with cross-functional teams including data scientists, software engineers, product managers, and UX designers
  • Lead the development and deployment of Generative AI applications across text, code, image, and audio modalities using state-of-the-art LLMs
  • Design and implement CI/CD pipelines for the GenAI model lifecycle including training, validation, packaging, and deployment
  • Apply best practices for model performance tuning, cost optimization, and scalable deployment in cloud and hybrid environments
  • Develop prompt engineering, fine-tuning strategies (LoRA, QLoRA, PEFT), and evaluation protocols tailored to business use cases
  • Stay current with emerging trends in AI, ML, and Generative AI and drive adoption across teams
  • Document processes, model architectures, and deployment strategies for traceability and knowledge sharing
  • Work closely with cross-functional teams to gather requirements and deliver high-quality solutions
What we offer:
  • Competitive salary aligned with market standards
  • Opportunities for professional development and skill enhancement
  • A collaborative and innovative work environment
Employment Type: Full-time

AI Software Engineer

Join Qargo as an AI Software Engineer and help build intelligent, user-centric A...
Location:
Ghent, Belgium
Salary:
Not provided
Qargo
Expiration Date:
Until further notice
Requirements:
  • Min. 2 years of experience in software engineering, applied AI, or similar technical roles
  • Strong programming skills (preferably Python and/or modern backend languages)
  • Experience with AI/ML tools and frameworks such as PyTorch, Hugging Face, LangChain/LangGraph, vector databases, and inference tooling
  • Proven experience deploying and operating AI/ML systems in a production environment
  • Ability to experiment quickly, iterate fast, and validate assumptions
  • Strong problem-solving skills and the ability to work autonomously in a fast-paced environment
  • Clear communication skills and the ability to collaborate with engineers, product managers, and domain experts
Job Responsibility:
  • Evaluate and prototype with new AI models and techniques to solve document, workflow, and conversational tasks
  • Bring AI prototypes to production, ensuring quality, scalability, and observability
  • Monitor and maintain AI systems running in production, optimising cost, latency, and reliability
  • Collaborate with cross-functional teams to define clear AI tasks (e.g., document classification, summarisation, task prediction)
  • Develop and enhance AI-driven features such as document extraction, matching flows, quality checks, chatbots, and automated bookings
  • Stay up to date with advancements in AI and identify opportunities to improve the product
What we offer:
  • Real impact and ownership in a growing international scale-up
  • A supportive and collaborative team culture
  • Hybrid working setup with flexibility and trust
  • Opportunities to learn, grow, and expand your technical knowledge
  • Competitive salary and benefits package

AI Software Engineer III

Planet DDS is a leading provider of a platform of cloud-based solutions that emp...
Location:
Glasgow, United Kingdom
Salary:
Not provided
Planet DDS
Expiration Date:
Until further notice
Requirements:
  • 5-7 years of professional software engineering experience
  • At least 4 years in AI/ML-focused roles
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Artificial Intelligence, or related field
  • Experience working in a SaaS or enterprise software environment
  • Publications or contributions to open-source AI/ML projects
  • Exposure to reinforcement learning, generative AI (LLMs, diffusion models), or real-time inference systems
Job Responsibility:
  • Design, develop, and deploy AI and machine learning models in production environments
  • Architect scalable solutions that integrate AI capabilities into our products and services
  • Collaborate with data scientists, product managers, and backend/front-end engineers to translate prototypes into reliable, maintainable code
  • Own end-to-end development of AI systems, including data ingestion, model training, evaluation, and deployment
  • Implement best practices in model versioning, monitoring, and continuous improvement
  • Contribute to the evolution of our AI/ML infrastructure, including CI/CD pipelines and MLOps tools
  • Stay current on advancements in AI, ML, and deep learning and assess their applicability to business needs
  • Ensure AI solutions are ethical, interpretable, and aligned with regulatory requirements
Employment Type: Full-time

Senior Devops & AI Engineer

This role presents a unique opportunity to contribute to the future of impactful...
Location:
Hyderabad, India
Salary:
Not provided
Fission Labs
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Engineering, or related field
  • 6+ years of experience in infrastructure management roles, with a focus on cloud platforms (Azure and AWS preferred)
  • Hands-on experience with operations (DevSecOps) principles and best practices
  • Proficiency in scripting languages such as Python, PowerShell, or Bash
  • Excellent communication and collaboration skills
  • In-depth knowledge of Linux operating systems, including CentOS, Ubuntu, and Red Hat, with expertise in shell scripting, package management, and system administration
  • Hands-on experience with a wide range of AWS and Azure services
  • Experience developing and maintaining Infrastructure as Code (IaC) templates using tools such as Terraform or AWS CloudFormation
  • Experience setting up cloud infrastructure stacks, databases, service endpoints, and GPU and CPU resource scaling and optimization
  • Experience with AIOps/MLOps
Job Responsibility:
  • Configure and optimize Linux-based servers for performance, security, and resource utilization, including kernel tuning, file system management, and network configuration
  • Architect cloud solutions leveraging best practices and services offered by AWS and Azure, optimizing for scalability, reliability, and cost-effectiveness
  • Implement and manage hybrid cloud environments, facilitating seamless integration and interoperability between AWS and Azure services
  • Establish version control practices for IAC templates, ensuring traceability, auditability, and reproducibility of infrastructure changes
What we offer:
  • Opportunity to work on impactful technical challenges with global reach
  • Vast opportunities for self-development, including online university access and knowledge sharing opportunities
  • Sponsored Tech Talks & Hackathons to foster innovation and learning
  • Generous benefits packages including health insurance, retirement benefits, flexible work hours, and more
  • Supportive work environment with forums to explore passions beyond work
Employment Type: Full-time

Principal AI Engineer

We are looking for a Principal AI Engineer to lead the design and deployment of ...
Location:
United States
Salary:
200000.00 - 300000.00 USD / Year
Apollo.io
Expiration Date:
Until further notice
Requirements:
  • 10+ years of software engineering experience
  • At least 3 years in applied LLM or agentic AI systems (2023–present)
  • Proven success in deploying LLM-powered products used by real users at scale
  • Deep backend & systems engineering expertise with Python, distributed systems, and scalable APIs
  • Familiarity with LangChain, LlamaIndex, or similar orchestration frameworks
  • Experience with RAG pipelines, vector DBs, embedding models, and semantic search tuning
  • Experience managing performance across cloud providers (e.g., AWS Bedrock, OpenAI, Anthropic, etc.)
  • Demonstrated experience building multi-step agents, planning workflows, chaining reasoning steps, and integrating APIs with agent memory/state
  • Comfort with advanced prompting strategies, few-shot and chain-of-thought reasoning, and embedding retrieval setups
  • Strong understanding of AI system evaluation, human ratings, A/B experimentation, and feedback loop pipelines
Job Responsibility:
  • Architect and lead the development of multi-agent systems capable of long-horizon planning, reasoning, and API orchestration
  • Build reusable agentic components that integrate deeply into sales and marketing processes
  • Own and evolve our in-house platform for scalable, low-latency, and cost-efficient LLM and agent deployments
  • Lead design of interfaces powered by natural language understanding and retrieval-augmented generation (RAG)
  • Build embedding-based, intent-aware search and personalization systems tuned to business user needs
  • Drive innovation in personalized outreach generation using context-aware generation pipelines
  • Tune inference pipelines, caching layers, and model selection logic for high-scale, cost-aware performance
  • Define and drive robust offline and online testing methodologies (A/B, sandboxing, human evals) across agents and LLM flows
  • Architect human-in-the-loop systems and telemetry to improve accuracy, UX, and explainability over time
What we offer:
  • Equity
  • Company bonus or sales commissions/bonuses
  • 401(k) plan
  • At least 10 paid holidays per year
  • Flex PTO
  • Parental leave
  • Employee assistance program
  • Wellbeing benefits
  • Global travel coverage
  • Life/AD&D/STD/LTD insurance
Employment Type: Full-time

Research Engineer AI

The role involves conducting high-quality research in AI and HPC, shaping future...
Location:
Bristol, United Kingdom
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • A good working knowledge of AI/ML frameworks (at least TensorFlow and PyTorch), of data preparation, handling, and lineage control, and of model deployment, particularly in distributed environments
  • At least a B.Sc. equivalent in a Science, Technology, Engineering or Mathematical discipline
  • Development experience in compiled languages such as C, C++ or Fortran and experience with interpreted environments such as Python
  • Parallel programming experience, with relevant programming models such as OpenMP, MPI, CUDA, OpenACC, HIP, PGAS languages is highly desirable
Job Responsibility:
  • Perform world-class research while also shaping products of the future
  • Enable high performance AI software stacks on supercomputers
  • Provide new environments/abstractions to support application developers to build, deploy, and run AI applications taking advantage of leading-edge hardware at scale
  • Manage modern data-intensive AI training and inference workloads
  • Port and optimize workloads of key research centers like the AI safety institute
  • Support onboarding and scaling of domain-specific applications
  • Foster collaboration with the UK and European research community
What we offer:
  • Health & Wellbeing benefits that support physical, financial and emotional wellbeing
  • Career development programs catered to achieving career goals
  • Unconditional inclusion in the workplace
  • Flexibility to manage work and personal needs
Employment Type: Full-time

Artificial Intelligence (AI) Engineer

VELOX is hiring an AI Developer to help design and implement intelligent systems...
Location:
Boise, United States
Salary:
Not provided
VELOX Media
Expiration Date:
Until further notice
Requirements:
  • Strong proficiency in Python (Pandas, NumPy, scikit-learn, etc.)
  • Experience with deep learning frameworks such as TensorFlow or PyTorch
  • Hands-on experience with natural language processing, retrieval-augmented generation (RAG), or LLMs (e.g., OpenAI, Claude, Mistral)
  • Understanding of data pipelines, model deployment, and performance monitoring
  • Experience working with APIs and integrating ML models into production systems
  • Familiarity with vector databases (e.g., Pinecone, Weaviate, FAISS) and embedding generation
  • Comfort working in cloud environments (GCP, AWS, or Azure)
  • Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field
  • 3+ years of experience in applied AI/ML roles
  • Track record of launching AI tools or systems into production
Job Responsibility:
  • Research, design, and deploy AI/ML models that drive value across client-facing and internal applications
  • Build tools that support predictive analytics, natural language querying, and campaign automation
  • Collaborate with product and engineering teams to integrate AI functionality into web platforms
  • Integrate AI solutions with our PHP/Laravel backend and MySQL databases via REST APIs or microservices
  • Write clean, scalable code for inference pipelines, model training, and testing environments
  • Monitor model performance and retrain or refine when necessary
  • Stay ahead of LLMs, vector DBs, and open-source innovations to enhance our AI roadmap
  • Contribute to a long-term AI strategy that makes VELOX more automated, intelligent, and insightful
What we offer:
  • Competitive compensation and performance bonuses
  • Health insurance & 401k options
  • Paid vacation and holidays
  • Casual dress and regular team events
  • On-site gym and personal trainer access
  • Kombucha on tap
Employment Type: Full-time