Senior Full Stack LLM Engineer - Training

Cerebras Systems

Location:
United States; Canada , Sunnyvale

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

We are seeking a versatile and experienced engineer to join our SOTA Training Platform team. This team is responsible for rapidly bringing up state-of-the-art open-source models (such as LLaMA and Qwen) or customer-provided proprietary models on our Cerebras CSX systems. Success in this role requires a system-minded generalist who thrives in fast-paced bring-up environments and is comfortable working across the entire Cerebras software stack. Your work will play a critical role in achieving unprecedented levels of performance, efficiency, and scalability for AI applications.

Job Responsibility:

  • Contribute to the end-to-end bring up of ML models on Cerebras CSX systems
  • Work across the stack: model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning
  • Debug performance and correctness issues spanning model code, compiler IRs, runtime behavior, and hardware utilization
  • Propose and prototype improvements across tools, APIs, or automation flows to accelerate future bring ups

Requirements:

  • Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field
  • 5+ years of relevant industry experience (internship/co-op experience included)
  • Comfort navigating the full AI toolchain: Python modeling code, compiler IRs, performance profiling, etc.
  • Strong debugging skills across performance, numerical accuracy, and runtime integration
  • Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and familiarity with model internals (e.g., attention, MoE, diffusion)
  • Proficiency in C/C++ programming and experience with low-level optimization
  • Proven experience in compiler development, particularly with LLVM and/or MLIR
  • Strong background in optimization techniques, particularly those involving NP-hard problems

What we offer:

  • Competitive salary and benefits package
  • Opportunities for professional growth and career advancement
  • A dynamic and innovative work environment
  • The chance to work on cutting-edge technologies and make a significant impact on the future of AI

Additional Information:

Job Posted:
February 17, 2026

Similar Jobs for Senior Full Stack LLM Engineer - Training

Senior Machine Learning Engineering Manager, Gen AI

We're seeking a Senior Machine Learning Manager (M60) to lead a cross-functional...
Location
United States
Salary:
193500.00 - 303150.00 USD / Year
Atlassian
Expiration Date
Until further notice
Requirements
  • 8+ years in ML, search, or backend engineering roles, with 3+ years leading teams
  • Strong track record of shipping ML-powered or LLM-integrated user-facing products
  • Experience with RAG systems (vector search, hybrid retrieval, LLM orchestration)
  • Deep experience in either modeling (e.g., LLMs, search, NLP) or engineering (e.g., backend infra, full-stack), with the ability to lead end-to-end
  • Deep understanding of LLM ecosystems (OpenAI, Claude, Mistral, OSS), orchestration frameworks (LangChain, LlamaIndex), and vector databases (Weaviate, Pinecone, FAISS, etc.)
  • Strong product intuition and ability to translate complex tech into valuable user features
  • Familiarity with GenAI evaluation methods: hallucination detection, groundedness scoring, and human-in-the-loop feedback loops
  • Master’s or PhD in Computer Science, Machine Learning, or related field preferred—or equivalent practical experience
Job Responsibility
  • Lead the vision, design, and execution of LLM-powered AI products, leveraging advanced AI modeling (e.g., SLM post-training/fine-tuning), RAG architectures, and hybrid ranking systems
  • Define system architecture across retrievers, rankers, orchestration layers, prompt templates, and feedback mechanisms
  • Work closely with product and design teams to ensure delightful, fast, and grounded user experiences
  • Build and manage a cross-disciplinary team including ML engineers, backend/frontend engineers, and applied scientists
  • Foster a culture of E2E ownership — empowering the team to move from prototype to production quickly and iteratively
  • Mentor individuals to grow in both technical depth and product acumen
  • Shape the technical roadmap and long-term strategy for GenAI search across Atlassian’s product suite
  • Partner with platform and infra teams to scale inference, evaluate performance, and integrate usage signals for continuous improvement
  • Champion data quality, grounding, and responsible AI practices in all deployed features
What we offer
  • health and wellbeing resources
  • paid volunteer days
  • Fulltime

Senior Product AI Engineer

You’ll join us to bring this framework to the next level, leading projects (AI, ...
Salary:
Not provided
WeTravel
Expiration Date
Until further notice
Requirements
  • 7+ years of software engineering experience (ideally full-stack)
  • Strong engineering skills and proven experience in GenAI / LLM applications (i.e. built and launched LLM-enabled products to customers)
  • Proficiency with Ruby on Rails, or with at least two other languages (Python, Go, Java, Kotlin, Node.js, or .NET) and a desire to learn Ruby
  • Have experience and the desire to build user experiences (e.g. web front-ends)
  • Have experience building and working with distributed systems, microservices, and event-driven architectures; demonstrate strong systems thinking and the ability to design for scalability
  • Have experience with production systems, monitoring, and on-call responsibilities
  • Have excellent communication skills and experience working in multicultural, distributed teams
  • Take a pragmatic approach to using AI tools to improve productivity
  • Have experience leading projects
  • Proficiency with LLM providers and SDKs (e.g., OpenAI, Anthropic, Google, Meta)
Job Responsibility
  • Lead and build features end-to-end: from reviewing user interviews and product design, through architecting and building systems, to deployment and monitoring in production
  • Partner closely with our product team to discover user problems and shape their solutions - creating a world class experience for our organizers and travelers
  • Write high-quality, maintainable code across both backend (Ruby on Rails), and frontend (TypeScript/React)
  • Ensure our services are always on by building resilient applications, ensuring they are well monitored and mitigating incidents as an on-call/incident responder
  • Mentor teammates, grow the team's AI capacity, and contribute to WeTravel’s engineering practices and excellence
What we offer
  • Generous "Time to Recharge" policy — enjoy unlimited paid time off to rest, recharge, and show up as your best self
  • Amsterdam Program – visit us in Amsterdam (HQ) for 2-4 weeks every year, staying in one of our WeTravel apartments
  • Work remotely for a maximum of 4 weeks per calendar year
  • Extensive paid family leave
  • Three paid volunteer days per year — take time to give back to causes you care about, on us
  • 2-week cross-functional onboarding program
  • Cutting-edge equipment and tools to set you up for success. Coverage for certain work-from-home (WFH) equipment
  • Cambly for colleagues for whom English is not their first language
  • Join an international, travel-loving team with a passion for adventure and innovation
  • Fulltime

Senior Machine Learning Engineer

LMArena is seeking a Senior Machine Learning Engineer to help scale and strength...
Location
United States , Bay Area
Salary:
Not provided
Arena Intelligence, Inc.
Expiration Date
Until further notice
Requirements
  • Strong programming skills with the ability to work across the stack in a typical recommendation system or LLM stack
  • Experience in deep learning, language models or reward model training
  • Experience working with LLMs, including fine-tuning, prompt engineering, and function calling
  • Self-motivated with a willingness to take ownership of tasks
  • A passion for shipping quality products
  • 4+ years of industry experience or relevant projects
  • Solid understanding of statistics, and various tools and methodologies for evaluating uncertainty in a way that is specific to the given product being shipped
Job Responsibility
  • Architect and build what will become our core modeling for data and evaluation products
  • Own the full-stack data, model training, and evaluation pipelines
  • Help grow a culture of feedback and rapid product iteration as we build new features as a tight-knit team
  • Conduct research into state-of-the-art evaluation methods and contribute to the long-term vision for a centralized, scalable evaluation platform
What we offer
  • Comprehensive health and wellness benefits, including medical, dental, vision, and additional support programs
  • The opportunity to work on cutting-edge AI with a small, mission-driven team
  • A culture that values transparency, trust, and community impact
  • Fulltime

Senior Solutions Engineer - Full Stack Developer

The role is critical to the firm’s success. You design and deliver enterprise so...
Location
United States , Cleveland
Salary:
140000.00 - 150000.00 USD / Year
Signify Technology
Expiration Date
Until further notice
Requirements
  • 5+ years of experience as a Solutions Engineer or similar role
  • Bachelor’s degree in Computer Science, Information Systems, or related field - or equivalent hands-on experience
  • Strong experience configuring and integrating enterprise applications such as Salesforce, Workday, and Microsoft Power Platform
  • Proficiency with APIs, middleware, and integration frameworks to connect systems and enable data flow
  • Familiarity with AI/ML concepts and experience embedding AI services into enterprise workflows (e.g., Copilot, Azure AI, LLM APIs)
  • Demonstrated ability to translate business needs into technical solutions from ideation through delivery
  • Strong understanding of data modeling, reporting tools, workflow automation, and security configuration
  • Experience working in Agile/Scrum delivery frameworks, with a focus on adaptability and continuous improvement
  • Excellent problem-solving, collaboration, and communication skills
  • Ability to work in a fast-paced Agile/Scrum delivery framework
Job Responsibility
  • Extend and customize enterprise platforms (e.g., Salesforce, Workday, Microsoft Power Platform) through APIs, code, and integrations while evaluating AI-powered development tools and automation opportunities
  • Design, develop, and maintain full-stack applications and automations that improve internal operations and workflows with focus on scalable, maintainable code and user-centered design
  • Collaborate with business stakeholders to capture requirements, translate them into technical solutions, and deliver clear documentation and training
  • Participate in code reviews, documentation, and internal demos to ensure transparency and maintainability following established quality standards and best practices
  • Cross-train with peers across languages and platforms to support a flexible and adaptable team including exposure to AI development tools and low-code/no-code platforms
  • Maintain and optimize system integrations across cloud and on-prem applications with focus on performance, reliability, and security
  • Partner with “citizen developers” to guide low-code/no-code solutions, ensuring maintainability, governance, and alignment with enterprise standards
  • Integrate AI services (Azure AI, OpenAI APIs, Microsoft Copilot, and other third-party tools) into workflows, automations, and enterprise platforms
  • Implement AI engineering practices such as prompt engineering, API integration, testing/validation of AI solutions, and adherence to the firm’s AI governance standards
  • Develop reusable AI-enabled solution components (connectors, templates, workflows) that accelerate adoption across business functions
What we offer
  • medical
  • dental
  • vision
  • paid time off
  • 401k match
  • paid parental leave
  • education assistance
  • Fulltime

AI Researcher

Perplexity is seeking top-tier AI Research Scientists and Engineers to advance o...
Location
United States , San Francisco; Palo Alto
Salary:
210000.00 - 470000.00 USD / Year
Perplexity
Expiration Date
Until further notice
Requirements
  • Proven experience with large-scale LLMs and Deep Learning systems
  • Strong programming skills in Python/PyTorch
  • Experience with post-training techniques and reinforcement learning
  • Self-starter with a willingness to take ownership of tasks
  • Passion for tackling challenging problems
  • Minimum 2-6 years of experience on relevant projects (depending on seniority level)
Job Responsibility
  • Post-train SOTA LLMs using the latest supervised and reinforcement learning techniques (SFT/DPO/GRPO)
  • Leverage our rich query/answer dataset to scale model performance across Sonar, Deep Research, Comet, and Search products
  • Stay current with the latest LLM research, especially in model training, optimization, and personalization techniques
  • Implement preference optimization and personalization capabilities to enhance user experience
  • Invent in-house improvements and optimizations to enhance SOTA models
  • Turn research ideas into algorithms and run experiments to launch new models
  • Own full-stack data, training, and evaluation pipelines required for model development
  • Build robust and effective training frameworks (on top of Megatron/PyTorch) for post-training LLMs
  • Implement necessary infrastructure and components to support cutting-edge model training at scale
  • Integrate models seamlessly into our product ecosystem
What we offer
  • Equity
  • Health
  • Dental
  • Vision
  • Retirement
  • Fitness
  • Commuter and dependent care accounts
  • Fulltime

Senior Software Engineer, AI & ML Ops

Hyundai AutoEver America seeks a seasoned Senior AI/ML Engineer to architect, de...
Location
United States , Irvine
Salary:
103170.00 - 158873.00 USD / Year
Hyundai AutoEver America
Expiration Date
Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Engineering, AI, or related field; advanced degrees/certifications are a plus
  • 8+ years of software engineering experience, including 3+ years in AI/ML solution development
  • Proven experience designing and deploying LLM-based solutions, traditional ML models, RAG systems, and agent workflows
  • Strong expertise in Python, TensorFlow/PyTorch, Hugging Face, prompt engineering, vector databases, and AI orchestration
  • Hands-on experience with AWS SageMaker/Bedrock, Azure OpenAI, or Azure ML Studio, plus MLOps best practices (CI/CD, testing, model monitoring)
  • Proficiency in frontend frameworks (React), cloud-native deployment (Docker/Kubernetes), microservice APIs, and relational/NoSQL databases
Job Responsibility
  • Architect and develop scalable AI/ML and LLM-based systems, including RAG pipelines, agentic workflows, predictive models, and generative AI solutions
  • Build full‑stack AI applications, including React-based dashboards and front‑end interfaces integrated with backend services and cloud infrastructure
  • Develop data pipelines and ML Ops workflows using Python, SQL, AWS/Azure platforms, and monitoring tools to train, deploy, and optimize models
  • Lead cross-functional AI initiatives, deliver PoCs/MVPs, ensure compliance with AI governance, and integrate AI features into enterprise and user-facing systems
  • Provide technical leadership and mentorship, guiding standards, code reviews, model documentation, and best practices in AI/ML development
  • Continuously improve AI performance and reliability through prompt engineering, architecture enhancements, and data optimization
What we offer
  • comprehensive medical/dental coverage
  • generous PTO
  • education assistance
  • annual merit increase eligibility
  • Fulltime

Principal AI Architect

As a Principal AI Architect, you will define and drive the end-to-end Cloud + AI...
Location
United States , Multiple Locations
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
  • Master's Degree in Computer Science, Engineering or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
  • 10+ years of experience in software engineering or solution architecture, with demonstrable success building and operating complex systems.
  • 10+ years of experience designing, implementing, and optimizing AI solutions encompassing data pipelines, model training and serving, and production MLOps.
  • 10+ years of full-stack development across frontend, backend, and cloud infrastructure, with a proficient command of data engineering, AI/ML systems, and deployment architectures.
  • 2+ project experience with NLP and LLM-based systems.
  • Proficient coding skills in C#, Python, JavaScript, and React.
Job Responsibility
  • Own the reference architecture and technical roadmap for WWL’s AI/Agentic platform capabilities (e.g., orchestration, tools/plugins, memory, retrieval, evaluation, observability, governance).
  • Translate skilling business objectives into platform investments and architectural decisions, balancing speed-to-value with security, compliance, cost, and long-term maintainability.
  • Establish clear architectural guardrails and decision frameworks (e.g., “build vs. buy,” “Copilot Studio vs. Foundry,” “RAG vs. fine-tune,” “central vs. federated patterns”).
  • Lead architecture/design reviews for major initiatives; drive alignment on system boundaries, contracts, dependency management, and resiliency.
  • Define and standardize architecture patterns (multi-tenant SaaS, event-driven architectures, secure data access, model routing, agent safety controls).
  • Create reusable templates, “golden paths,” and reference implementations to accelerate engineering delivery across teams and reduce fragmentation.
  • Embed Responsible AI principles into agentic solution design (human-in-the-loop, safety mitigations, evaluation, transparency, and auditability).
  • Partner with security and compliance stakeholders to ensure services meet required controls and operational standards, and to drive alignment between policy intent and implementation.
  • Define secure patterns for prompt/data handling, secrets management, identity, and access governance for AI systems.
  • Fulltime

Staff Engineer-Applied AI

Join us as a Staff Engineer-Applied AI. At Barclays, we don’t just adapt to the ...
Location
India , Bengaluru
Salary:
Not provided
Barclays
Expiration Date
Until further notice
Requirements
  • A Bachelor’s degree (or above) in computer science, engineering, mathematics, or a related discipline
  • Experience in developing software applications at scale on cloud platforms (AWS, Azure, or GCP)
  • Fluent in Python, Java, or Go and their frameworks
  • Hands-on experience in GenAI and Agentic AI frameworks such as LangGraph/LangChain, CrewAI, Strands SDK, Google ADK, etc.
  • Familiarity with the MCP and A2A protocols
  • Solid understanding of working with LLMs and writing efficient prompts that provide accurate context for LLM agents
  • Hands-on experience in developing and deploying Agentic AI solutions using cloud native services like AWS Bedrock or Azure AI Foundry or GCP Vertex AI
  • Strong understanding of AI implementation in software development and legacy code transformation
  • Experience in developing APIs and integrating LLM models into software applications
  • Ability to work on system design from ideation through completion with limited supervision
Job Responsibility
  • Lead the development of GenAI & Agentic AI systems & solutions to build intelligent conversational AI and workflow automation for specific business domains
  • Design and develop full-stack multi-agent AI systems using agentic AI frameworks, LLMs, and related technologies; build function tools and integrate them with back-end systems using REST APIs, MCP, or A2A protocols
  • Design, develop, and expose agentic AI back ends, and wrap existing back-end systems with MCP servers for use by other agentic front-end and back-end systems
  • Apply context engineering techniques such as context caching and context compression to improve LLM performance and response times
  • Implement guardrails and policy enforcement for safety and security risks, as required to comply with the organization’s data privacy and security standards
  • Collaborate with cross-functional teams to identify requirements and develop solutions to meet business needs within the organization
  • Communicate effectively with both technical and non-technical stakeholders, including senior leadership
  • Develop and maintain tools and frameworks for prompt-based model training, evaluation, and optimization
  • Analyze and interpret data to evaluate model performance and identify areas of improvement
  • Develop and deliver high-quality software solutions using industry-aligned programming languages, frameworks, and tools
What we offer
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution
  • Modern workspaces, collaborative areas, and state-of-the-art meeting rooms
  • Facilities include wellness rooms, on-site cafeterias, fitness centers, and tech-equipped workstations
  • Health and wellness
  • A place where you can belong
  • More than work
  • Fulltime