Engineering Manager, Inference Platform Job at Cerebras Systems (Sunnyvale)

Engineering Manager, Inference Platform

Cerebras Systems

Location:
United States; Canada (Sunnyvale, Toronto)

Category:
IT - Software Development

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.

Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

Job Responsibilities:

Provide hands-on technical leadership, owning the technical vision and roadmap for the Cerebras Inference Platform, from internal scaling to on-prem customer solutions
Lead the end-to-end development of distributed inference systems, including request routing, autoscaling, and resource orchestration on Cerebras' unique hardware
Drive a culture of operational excellence, guaranteeing platform reliability (>99.9% uptime), performance, and efficiency
Lead, mentor, and grow a high-caliber team of engineers, fostering a culture of technical excellence and rapid execution
Productize the platform into an enterprise-ready, on-prem solution, collaborating closely with product, ops, and customer teams to ensure successful deployments

Requirements:

6+ years in high-scale software engineering
3+ years leading distributed systems or ML infra teams
Strong coding and review skills
Proven track record scaling LLM inference: optimizing latency (<100 ms P99), throughput, batching, memory/IO efficiency, and resource utilization
Expertise in distributed inference/training for modern LLMs
Understanding of AI/ML ecosystems, including public clouds (AWS/GCP/Azure)
Hands-on experience with model-serving frameworks (e.g., vLLM, TensorRT-LLM, Triton, or similar) and ML stacks (PyTorch, Hugging Face, SageMaker)
Deep experience with orchestration (Kubernetes/EKS, Slurm), large clusters, and low-latency networking
Strong background in monitoring and reliability engineering (Prometheus/Grafana, incident response, post-mortems)
Demonstrated ability to recruit and retain high-performing teams, mentor engineers, and partner cross-functionally to deliver customer-facing products

Nice to have:

Experience with on-prem/private cloud deployments
Background in edge or streaming inference, multi-region systems, or security/privacy in AI
Customer-facing experience with enterprise deployments

What we offer:

Build a breakthrough AI platform beyond the constraints of the GPU
Publish and open source your cutting-edge AI research
Work on one of the fastest AI supercomputers in the world
Enjoy job stability with startup vitality
Enjoy our simple, non-corporate work culture that respects individual beliefs

Additional Information:

Job Posted:
February 17, 2026

Employment Type:
Full-time

Work Type:
On-site work
