Senior Software Engineer, AI Inference Platform Job at Cerebras Systems (Sunnyvale)

Senior Software Engineer (TypeScript) - AI/ML

We are looking for a Senior Software Engineer to drive the development of AI/ML-...

Location

The Netherlands

Salary:

Not provided

ClickHouse

Expiration Date

Until further notice

Requirements

5+ years of software engineering experience in production environments
Exposure to working directly with AI/ML technologies
Strong frontend skills with TypeScript/JavaScript and React
Backend development experience in TypeScript or Python, with a focus on API design and service architecture
You have a high level of ownership and can drive features from concept to production with minimal supervision
You thrive in collaborative environments and can effectively communicate technical concepts to diverse stakeholders

Job Responsibility

Feature Development: Design and implement AI-powered features across the full stack, from backend inference services to intuitive frontend interfaces within the ClickHouse Cloud platform
API Architecture: Create robust, scalable APIs that connect ClickHouse's database capabilities with modern AI/ML inference systems and external/internal AI services
UI/UX Implementation: Build responsive, intuitive user interfaces that make complex AI functionalities accessible and valuable to users of all technical backgrounds
Ecosystem Integrations: Implement and maintain integrations with the broader AI/ML ecosystem and standards, ensuring that ClickHouse as a technology works seamlessly with popular frameworks and tools
Technical Integration: Integrate models into production systems with proper monitoring, versioning, observability, and evaluation

What we offer

Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
Healthcare - Employer contributions towards your healthcare
Equity in the company - Every new team member who joins our company receives stock options
Time off - Flexible time off in the US, generous entitlement in other countries
A $500 home office setup if you’re a remote employee
Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...

Location

United States, San Francisco

Salary:

180000.00 - 270000.00 USD / Year

Plaid

Expiration Date

Until further notice

Requirements

5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
Proven experience delivering reliable and scalable infrastructure in production
Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
Strong communication skills and ability to collaborate across teams

Job Responsibility

Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
Contribute to technical strategy and architecture discussions within the team
Mentor and support other engineers through code reviews, design discussions, and technical guidance

What we offer

medical, dental, vision, and 401(k)

Fulltime

Senior Software Engineer – AI

NStarX is seeking a highly skilled Senior Software Engineer – AI with a strong f...

Location

India, Hyderabad

Salary:

Not provided

NStarX

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field (PhD is a plus)
9+ years of experience in AI/ML engineering or related roles
3+ years of experience in Generative AI with team leadership responsibilities
Proven track record of production-grade ML and GenAI model development and deployment
Programming: Python (preferred)
GenAI Frameworks: Hugging Face Transformers, Diffusers, LangChain, TGI
Serving & Inference: FastAPI, gRPC, NVIDIA Triton, TorchServe
Cloud Platforms: AWS (SageMaker, EKS), GCP (Vertex AI, GKE), Azure (Azure ML, AKS)
MLOps & DevOps: Kubeflow, MLflow, GitHub Actions, Jenkins, Helm, Terraform
Optimization Techniques: Model quantization, distillation, pipeline and tensor parallelism

Job Responsibility

Design, develop, and deploy machine learning models and AI algorithms to address complex business challenges
Lead and mentor a team of AI/ML engineers, ensuring quality and scalability in solution design and implementation
Collaborate closely with cross-functional teams including data scientists, software engineers, product managers, and UX designers
Lead the development and deployment of Generative AI applications across text, code, image, and audio modalities using state-of-the-art LLMs
Design and implement CI/CD pipelines for the GenAI model lifecycle including training, validation, packaging, and deployment
Apply best practices for model performance tuning, cost optimization, and scalable deployment in cloud and hybrid environments
Develop prompt engineering, fine-tuning strategies (LoRA, QLoRA, PEFT), and evaluation protocols tailored to business use cases
Stay current with emerging trends in AI, ML, and Generative AI and drive adoption across teams
Document processes, model architectures, and deployment strategies for traceability and knowledge sharing
Work closely with cross-functional teams to gather requirements and deliver high-quality solutions

What we offer

Competitive salary aligned with market standards
Opportunities for professional development and skill enhancement
A collaborative and innovative work environment

Fulltime

Senior Software Engineer (TypeScript) - AI/ML

We are looking for a Senior Software Engineer to drive the development of AI/ML-...

Location

United States

Salary:

131000.00 - 185000.00 USD / Year

ClickHouse

Expiration Date

Until further notice

Requirements

5+ years of software engineering experience in production environments
Exposure to working directly with AI/ML technologies
Strong frontend skills with TypeScript/JavaScript and React
Backend development experience in TypeScript or Python, with a focus on API design and service architecture
You have a high level of ownership and can drive features from concept to production with minimal supervision
You thrive in collaborative environments and can effectively communicate technical concepts to diverse stakeholders

Job Responsibility

Feature Development: Design and implement AI-powered features across the full stack, from backend inference services to intuitive frontend interfaces within the ClickHouse Cloud platform
API Architecture: Create robust, scalable APIs that connect ClickHouse's database capabilities with modern AI/ML inference systems and external/internal AI services
UI/UX Implementation: Build responsive, intuitive user interfaces that make complex AI functionalities accessible and valuable to users of all technical backgrounds
Ecosystem Integrations: Implement and maintain integrations with the broader AI/ML ecosystem and standards, ensuring that ClickHouse as a technology works seamlessly with popular frameworks and tools
Technical Integration: Integrate models into production systems with proper monitoring, versioning, observability, and evaluation

What we offer

Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
Healthcare - Employer contributions towards your healthcare
Equity in the company - Every new team member who joins our company receives stock options
Time off - Flexible time off in the US, generous entitlement in other countries
A $500 home office setup if you’re a remote employee
Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Fulltime

Senior ML Platform Engineer

At WHOOP, we're on a mission to unlock human performance and healthspan. WHOOP e...

Location

United States, Boston

Salary:

150000.00 - 210000.00 USD / Year

Whoop

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s Degree in Computer Science, Engineering, or a related field, or equivalent practical experience
5+ years of experience in software engineering with a focus on ML infrastructure, cloud platforms, or MLOps
Strong programming skills in Python, with experience in building distributed systems and REST/gRPC APIs
Deep knowledge of cloud-native services and infrastructure-as-code (e.g., AWS CDK, Terraform, CloudFormation)
Hands-on experience with model deployment platforms such as AWS SageMaker, Vertex AI, or Kubernetes-based serving stacks
Proficiency in ML lifecycle tools (MLflow, Weights & Biases, BentoML) and containerization strategies (Docker, Kubernetes)
Understanding of data engineering and ingestion pipelines, with ability to interface with data lakes, feature stores, and streaming systems
Proven ability to work cross-functionally with Data Science, Data Platform, and Software Engineering teams, influencing decisions and driving alignment
Passion for AI and automation to solve real-world problems and improve operational workflows

Job Responsibility

Architect, build, own, and operate scalable ML infrastructure in cloud environments (e.g., AWS), optimizing for speed, observability, cost, and reproducibility
Create, support, and maintain core MLOps infrastructure (e.g., MLflow, feature store, experiment tracking, model registry), ensuring reliability, scalability, and long-term sustainability
Develop, evolve, and operate MLOps platforms and frameworks that standardize model deployment, versioning, drift detection, and lifecycle management at scale
Implement and continuously maintain end-to-end CI/CD pipelines for ML models using orchestration tools (e.g., Prefect, Airflow, Argo Workflows), ensuring robust testing, reproducibility, and traceability
Partner closely with Data Science, Sensor Intelligence, and Data Platform teams to operationalize and support model development, deployment, and monitoring workflows
Build, manage, and maintain both real-time and batch inference infrastructure, supporting diverse use cases from physiological analytics to personalized feedback loops for WHOOP members
Design, implement, and own automated observability tooling (e.g., for model latency, data drift, accuracy degradation), integrating metrics, logging, and alerting with existing platforms
Leverage AI-powered tools and automation to reduce operational overhead, enhance developer productivity, and accelerate model release cycles
Contribute to and maintain internal platform documentation, SDKs, and training materials, enabling self-service capabilities for model deployment and experimentation
Continuously evaluate and integrate emerging technologies and deployment strategies, influencing WHOOP’s roadmap for AI-driven platform efficiency, reliability, and scale

What we offer

equity
benefits

Fulltime

Director of AI Engineering

We are entering a hyper-growth phase of AI innovation and are hiring a Director ...

Location

Canada; United States

Salary:

300000.00 - 450000.00 USD / Year

Apollo.io

Expiration Date

Until further notice

Requirements

10–15+ years in software engineering, with significant leadership experience owning AI/ML or applied LLM systems at scale
Proven history shipping LLM-powered features, agentic workflows, or AI assistants used by real customers in production
Deep understanding of LLM orchestration frameworks (LangChain, LlamaIndex), RAG pipelines, vector search, embeddings, and prompt engineering
Expert in backend & distributed systems (Python strongly preferred) and cloud infrastructure (AWS/GCP)
Strong experience with telemetry, observability, and cost-aware real-time inference optimizations
Demonstrated ability to lead senior engineers, define technical roadmaps, and deliver outcomes aligned to business metrics
Experience building or scaling teams working on experimentation, optimization, personalization, or ML-powered growth systems
Exceptional ability to simplify complex problems, set clear standards, and drive alignment across Product, Data, Design, and Engineering
Strong product sense, ability to weigh novelty vs. impact, focus on user value, and prioritize speed with guardrails
Fluent in integrating AI tools into engineering workflows for code generation, debugging, delivery velocity, and operational efficiency

Job Responsibility

Define the multi-year technical vision for Apollo’s AI stack, spanning agents, orchestration, inference, retrieval, and platformization
Prioritize high-impact AI investments by partnering with Product, Design, Research, and Data leaders to align engineering outcomes with business goals
Establish technical standards, evaluation criteria, and success metrics for every AI-powered feature shipped
Lead the architecture and deployment of long-horizon autonomous agents, multi-agent workflows, and API-driven orchestration frameworks
Build reusable, scalable agentic components that power GTM workflows like research, enrichment, sequencing, lead scoring, routing, and personalization
Own the evolution of Apollo’s internal LLM platform for high-scale, low-latency, cost-optimized inference
Oversee model-driven experiences for natural-language interfaces, RAG pipelines, semantic search, personalized recommendations, and email intelligence
Partner with Product & Design to build intuitive conversational UX that hides underlying complexity while elevating user productivity
Implement rigorous evaluation frameworks, including offline benchmarking, human-in-the-loop review, and online A/B experimentation
Ensure robust observability, monitoring, and safety guardrails for all AI systems in production

What we offer

Equity
Company bonus or sales commissions/bonuses
401(k) plan
At least 10 paid holidays per year
Flex PTO
Parental leave
Employee assistance program and wellbeing benefits
Global travel coverage
Life/AD&D/STD/LTD insurance
FSA/HSA

Fulltime

Senior Staff Machine Learning Engineer

Help design our AI platform and develop our next generation of machine learning ...

Location

United States, San Francisco

Salary:

216500.00 - 324500.00 USD / Year

GoFundMe

Expiration Date

Until further notice

Requirements

9+ years of hands-on experience in machine learning engineering, AI development, software engineering, or related fields
Experience emphasizing secure, large-scale, distributed system design, AI/ML pipeline development, and implementation
Extensive experience designing, developing, and operating scalable backend systems
Experience applying software engineering best practices such as domain-driven design, event-driven architectures, and microservices
Deep expertise in agentic workflows, AI evaluation solutions, prompt management, and secure AI development and testing practices
Strong knowledge of relational and document-based databases, data storage paradigms, and efficient RESTful API design
Experience establishing robust CI/CD pipelines, automated testing (unit and integration), and deployment practices
Strong leadership skills, including effective planning and management of complex projects, mentoring of team members, and fostering a collaborative, high-performing engineering culture
Excellent communicator, able to articulate complex technical concepts clearly to both technical and non-technical stakeholders
Bachelor's degree in Computer Science, Software Engineering, or a related technical field (preferred)

Job Responsibility

Design and implement AI platforms to enable scalable and secure access to LLMs from multiple model providers for diverse use cases
Design and implement agentic workflows, agentic tool ecosystems, and LLM prompt management solutions
Design, build, and optimize scalable model training, fine tuning, and inference pipelines, ensuring robust integration with production systems
Influence technical strategy and approach to developing embedding stores, vector databases, and other reusable assets
Lead initiatives to streamline ML and AI workflows, improve operational efficiency, and establish standardized procedures to achieve consistent, high-quality results across our AI systems
Design and develop backend services and RESTful APIs using Python and FastAPI, integrating seamlessly with ML pipelines and services
Take operational responsibility for team-owned services, including performance monitoring, optimization, troubleshooting, and participation in an on-call rotation
Collaborate with both technical and non-technical colleagues, including data and applied scientists, software engineers, product managers, and business stakeholders, to deliver reliable and scalable ML-driven products
Coach and mentor fellow ML engineers, promoting a culture of collaboration, continuous improvement, and engineering excellence within the team
Employ a diverse set of tools and platforms including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, Coralogix, and GitHub to build, deploy, and maintain scalable, highly available machine learning infrastructure

What we offer

Competitive pay
Comprehensive healthcare benefits
Financial assistance for things like hybrid work, family planning
Generous parental leave
Flexible time-off policies
Mental health and wellness resources
Learning, development, and recognition programs

Fulltime

Senior Principal Technical Program Manager - ML Platform

Location

Salary:

231300.00 - 301975.00 USD / Year

Atlassian

Expiration Date

Until further notice

Requirements

8+ years of experience on software teams as a Development Manager, Technical Product Manager, or TPM leading technical platform areas
Deep domain experience in AI and/or Search. Example: Model Inference, Model Evaluation, Model Training, LLM Ops, Semantic Search, Search Relevance, etc.
Partner with Engineering in defining direction, strategy and execution at Platform level
Strategic thinking and ability to understand business objectives to translate them into technical problems and programs.
Technical understanding of systems involved. Willingness to develop domain expertise in the area they operate - storage, networking, authentication, capacity management, service deployments, etc.
TPMs are not expected to write or read code, but are expected to understand system flows, block architectures, APIs and such.
Experience defining and running end-to-end complex technical programs
Strong leadership, organizational, and communication skills

Job Responsibility

Understand and stay up-to-date on latest innovations in AI and Search. Partner closely with engineering teams to translate these into practical platform evolution for Atlassian bringing value to our customers.
Analyze business objectives, customer needs, product adoption inhibitors and opportunities, industry trends, and based on these, in close collaboration with your stakeholders, define a long-term strategy and roadmap for your platform and product components.
Understand business objectives and translate them into technical systems problems that need to be prioritized and solved in the current business environment.
Define specific systems programs and create a plan of action for realizing those programs. Such programs could be around capacity planning, migration efforts, high availability, network architecture, performance optimization, reliability improvements and more.
Use your technical understanding of Atlassian and related systems to partner with and influence engineers and architects in making progress on these problems.
Responsible for taking a systematic approach to engineering problems. This includes: prioritizing tasks, scoping out the project, defining objectives, and making consistent progress against each of these.
Be accountable for the success of these technical programs by managing the entire lifecycle from initiation to forecasting, budgeting, scheduling, etc.
Manage complex dependencies and projects with a broad scope across the company

What we offer

health and wellbeing resources
paid volunteer days

Senior Software Engineer, AI Inference Platform

Cerebras Systems

Location:
United States; Canada, Sunnyvale; Toronto

Category:
IT - Software Development

Contract Type:
Not provided

Job Posted:
February 17, 2026

Similar Jobs for Senior Software Engineer, AI Inference Platform

Senior Software Engineer (TypeScript) - AI/ML