Rapid7 is seeking a Principal AI Engineer to lead the architectural evolution of our AI Center of Excellence. In this role, you will design and own the end-to-end distributed systems that make advanced ML, LLMs, and agentic AI reliable, scalable, and secure at an enterprise level.
Job Responsibilities:
Own the end-to-end system architecture for AI, ML, and Agentic platforms, ensuring they are reliable, scalable, and secure
Design complex data ingestion pipelines, feature stores, and inference microservices that bridge the gap between research and production
Establish architectural standards and reference patterns for LLM orchestration, RAG systems, and multi-agent workflows
Lead architectural reviews as the final technical authority, making critical trade-offs across accuracy, latency, cost, and reliability
Requirements:
Exceptional ability to reason at the system and architecture level to make long-term technical decisions
Courageous and principled decision-making when navigating high-stakes, ambiguous problem spaces
Proven mentorship of Staff and Senior engineers, fostering growth in architectural thinking and technical rigor
Accountability for the long-term technical health and security of AI systems across multiple teams
13+ years of experience in Data Science, ML Engineering, or Applied AI with a focus on large-scale systems
Hands-on mastery of LLM orchestration frameworks such as LangChain and LangGraph for agentic workflows
Deep expertise in designing RAG pipelines and managing vector database retrieval at scale
Advanced proficiency in the AWS ecosystem, specifically Bedrock, SageMaker, EKS, and Lambda
Expertise in MLOps standards, including model registries, drift detection, and automated retraining frameworks
Strong background in deep learning for NLP and sequence-based problems like malware behaviour modelling
Proficiency in Infrastructure as Code (Terraform) and CI/CD for ML workloads
Experience implementing robust guardrails and evaluation frameworks (e.g., Promptfoo, HELM) for autonomous systems