Site Reliability Engineer

Edpuzzle

Location:
Spain, Barcelona

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

45000.00 - 59000.00 EUR / Year

Job Description:

We’re looking for a passionate Site Reliability Engineer to pioneer the SRE strategy of our Security and Infrastructure Team in Barcelona. The right person will help us create the best possible product for teachers and empower them to engage their students with videos. If you’re a self-starter who’s eager to contribute to the education sector, you’ll feel right at home with us.

As the key reference point for all things SRE, you’ll have the autonomy to shape our systems from the ground up. This role is perfect for someone ready to lead and innovate, making a significant impact on our cloud infrastructure and observability strategies using Datadog.

You’ll be responsible for ensuring the reliability, scalability, and maintainability of our systems, handling everything from our cloud infrastructure to in-depth observability and comprehensive monitoring. By working closely with our DevOps and Engineering teams, you’ll drive the design and implementation of resilient systems, manage incidents effectively, and champion best practices for observability and incident response.

Job Responsibility:

  • Work with the Product, Infrastructure and Engineering teams to find the best technical solutions by participating in discussions and sharing your opinions
  • Take ownership of the problems that are being worked on, understanding why they are needed by the users, carrying out your own research, making your own proposals and working on the implementation while relying on your teammates for help when needed
  • Communicate effectively in a team in order to maximize productivity, ownership, and focus to help projects reach the finish line with the best possible outcome and by the project deadline
  • Design a cloud infrastructure that is secure, scalable, and highly available on AWS
  • Engage in proactive monitoring and observability with comprehensive tools and practices that not only detect and warn, but also predict potential system issues before they affect our users
  • Lead the charge in root cause analysis for production and infrastructure issues, transforming challenges into learning opportunities
  • Provision, configure and maintain cloud infrastructure as code
  • Take part in the on-call rotation, ensuring reliability and uptime for our users
  • Write technical documentation, contributing to our technical knowledge base and empowering your peers
  • Perform other exciting duties as opportunities and needs arise.

Requirements:

  • At least 3 years of experience in Site Reliability Engineering, DevOps Engineering, System Administration or Cloud Infrastructure Engineering for a web-based product with a focus on observability and reliability
  • Good knowledge of Amazon Web Services (AWS), CloudWatch and Datadog
  • Experience with software release management and deployment pipelines (Git, CI/CD)
  • Experience with Infrastructure as Code using AWS CDK
  • Experience writing JavaScript, TypeScript or Node.js code
  • Pragmatic with technologies: you understand that tech is a tool to solve a product problem, never the end goal
  • Excellent ability to communicate your ideas, regardless of the audience
  • Product-oriented: You make all your technology decisions with the final user in mind
  • You are naturally drawn towards understanding the bigger picture and recognize when there's a need for improvement, applying your intentional and rational thought process to address complex issues
  • You are able to work independently, manage your time deliberately to meet deadlines, and keep pursuing a goal despite difficulties, all while maintaining a flexible mindset
  • You are based in Barcelona and have a permit to work in Spain.

Nice to have:

  • Experience with MongoDB or OpenSearch database administration
  • Experience deploying and maintaining complex cloud infrastructures serving high traffic web applications
  • Experience with complex backend architectures such as Hexagonal Architecture and Domain Driven Design (DDD)
  • Experience with other cloud providers such as Azure or Google Cloud Platform
  • ... or another amazing skill you bring to the table that we haven’t thought of yet!

What we offer:

  • On-call compensation
  • 24 days’ paid holidays plus December 24th and 31st
  • Flexible working hours and reduced working time on Fridays to support work-life balance
  • €2000 annual allowance for meals with Cobee
  • Private health insurance policy with AXA
  • Access to Wellhub to support physical and emotional well-being
  • Flexible remuneration for childcare
  • Flexible remuneration for public transport
  • Flexible remuneration for health insurance of immediate family members (spouse and/or children)
  • Training and development (CodelyTV, Cloud Academy, etc.)
  • Fully stocked pantry with a variety of snacks and drinks in the Barcelona office
  • Team-building events during working hours to connect, learn, and create lasting bonds with passionate colleagues

Additional Information:

Job Posted:
December 08, 2025

Employment Type:
Fulltime
Work Type:
Hybrid work
Similar Jobs for Site Reliability Engineer

Senior AI Site Reliability Engineer

At Schwab, you will build a rewarding career while making a difference in the li...
Location:
United States, San Francisco
Salary:
190000.00 - 270000.00 USD / Year
Charles Schwab
Expiration Date:
January 20, 2026
Requirements:
  • 8+ years of software development or reliability engineering experience, with 4+ years as a hands-on senior engineer in startups and/or large organizations
  • Bachelor’s degree in Computer Science or related field
  • 5+ years of experience building and operating complex products from scratch and running them in production
  • 3+ years of experience supporting applications that use Artificial Intelligence (AI) models to deliver real business impact
  • 3+ years of experience building and maintaining data pipelines and infrastructure for large datasets
  • 3+ years of experience with containers and cloud-native applications, and the ability to operationalize them in the public cloud with infrastructure as code
  • Experience implementing monitoring, alerting, and incident response for large-scale distributed systems
  • Proven track record in driving reliability, scalability, and performance improvements for production AI systems
Job Responsibility:
  • Design, implement, and manage the reliability and operational excellence of GenAI applications and platforms
  • Work closely with architects, engineers, and business leaders to align reliability practices with Schwab’s enterprise strategy
  • Mentor and coach junior engineers, helping to build strong operational practices and foster a culture of continuous improvement
  • Lead by example in solving complex reliability challenges, advancing SRE standards, and driving rapid iteration from concept to production
What we offer:
  • 401(k) with company match and Employee stock purchase plan
  • Paid time for vacation, volunteering, and 28-day sabbatical after every 5 years of service for eligible positions
  • Paid parental leave and family building benefits
  • Tuition reimbursement
  • Health, dental, and vision insurance
  • Bonus or incentive opportunities
Employment Type:
Fulltime

Site Reliability Engineer

Join our client, a leading financial institution at the forefront of innovation,...
Location:
United States, Austin
Salary:
57.00 - 63.33 USD / Hour
Aquent
Expiration Date:
Until further notice
Requirements:
  • Proven experience leading engineering teams and delivering projects using Scrum and efficient release practices
  • Strong background in converting high-level designs into low-level designs and providing technical oversight
  • Demonstrated experience in designing, architecting, and deploying cloud-native applications, specifically on GCP
  • Proficiency with various database technologies, including MongoDB, Aerospike, SQL Server, and PostgreSQL
  • Expertise in containerization technologies such as Docker and Kubernetes, and building/managing CI/CD pipelines
  • Experience leveraging AI-Driven software development tools to enhance productivity, code comprehension, and documentation
  • Proven track record of integrating and applying AI/Machine Learning models for data analytics, visualization, automation, and problem-solving
  • Ability to maintain high quality standards while delivering within tight schedules
  • Exceptional collaborative mindset with a bias for action, engaging effectively with product management, architects, and other domains
  • Strong ability to work with internal, external, and offshore stakeholders
Job Responsibility:
  • Drive Technical Leadership & Project Delivery: Lead engineering teams through the entire project lifecycle, leveraging agile methodologies like Scrum to ensure efficient delivery and robust release practices
  • Architect & Design Cloud-Native Solutions: Translate high-level architectural visions into detailed low-level designs, providing expert technical oversight for the development and deployment of cutting-edge cloud-native applications
  • Champion Reliability & Scalability: Design, architect, and deploy highly available and scalable cloud-native applications on platforms such as GCP, ensuring optimal performance and resilience
  • Optimize Data Management: Leverage your expertise with diverse database technologies, including MongoDB, Aerospike, SQL Server, and PostgreSQL, to build and maintain robust data solutions
  • Advance DevOps & Automation: Implement and optimize containerization strategies using technologies like Docker and Kubernetes, and establish sophisticated CI/CD pipelines to streamline development and deployment
  • Innovate with AI/ML: Integrate and apply AI/Machine Learning models to enhance data analytics, visualization, automation, and creatively solve complex business and technical challenges
  • Foster Collaboration & Mentorship: Work closely with diverse stakeholders across product management, architecture, and other engineering domains, while actively mentoring and coaching multiple teams to elevate technical capabilities
  • Influence & Present Solutions: Effectively engage subject matter experts, present complex architectural solutions to governance boards and stakeholders, and advocate for data-driven proposals
What we offer:
  • Subsidized health, vision, and dental plans
  • Paid sick leave
  • Retirement plans with a match

Senior Site Reliability Engineer

Affirm is reinventing credit to make it more honest and friendly, giving consume...
Location:
Spain
Salary:
85000.00 - 115000.00 EUR / Year
Affirm
Expiration Date:
Until further notice
Requirements:
  • 4+ years of experience designing, developing and launching backend systems at scale using scripting and development languages like Bash, Python or Kotlin
  • A track record of developing highly available distributed systems using technologies like AWS, MySQL and Kubernetes
  • Meaningful experience contributing to or driving parts of the Incident Lifecycle process, enabling actionable insights that improve the quality culture, reliability, resilience, and system performance
  • 4+ years working in a Site Reliability or Production Engineering team
  • Experience defining a technical plan for the delivery of a significant feature or system component with an elegant, simple and extensible design
  • Experience making impactful changes in a large code base, having developed a suite of tools and practices that enable you and your team to do so safely
  • Strong verbal and written communication skills that support effective collaboration with our global engineering team
  • On-call rotation: taking part in an on-call rotation is a requirement for this role
Job Responsibility:
  • You will be responsible for owning and delivering quarterly goals for your team, leading engineers on your team through ambiguity to solve open-ended problems, and ensuring that everyone is supported throughout delivery
  • You will support your peers and stakeholders in the product development lifecycle by collaborating with infrastructure, product management, developer experience & analytics by participating in ideation, articulating technical constraints, and partnering on decisions that properly consider risks and trade-offs
  • You will proactively identify technical solutions and operational processes that strengthen incident readiness, response, and post-incident analysis
  • You will support the operations and availability of your team’s artifacts by creating and monitoring metrics, escalating when needed, and supporting “keep the lights on” & on-call efforts
  • You will foster a culture of quality and ownership on your team by setting or improving code review and design standards for your team, and advocating for them beyond your team through your writing and tech talks
  • You will help develop talent on your team by providing feedback and guidance, and leading by example
What we offer:
  • Flexible Spending Wallets for tech, food and lifestyle
  • Away Days - wellness days to take off work and recharge
  • Learning & Development programs
  • Parental benefit
  • Employee Resource & Community Groups
  • Health care coverage - Affirm covers all premiums for all levels of coverage for you and your dependents
  • Flexible Spending Wallets - generous stipends for spending on Technology, Food, various Lifestyle needs, and family forming expenses
  • Time off - competitive vacation and holiday schedules allowing you to take time off to rest and recharge
  • ESPP - An employee stock purchase plan enabling you to buy shares of Affirm at a discount
Employment Type:
Fulltime

Site Reliability Engineer

You develop cloud platform according to modern principles. You advise our custom...
Location:
Spain, Valencia
Salary:
Not provided
MaibornWolff GmbH
Expiration Date:
Until further notice
Requirements:
  • Ideally, a degree in computer science or comparable training
  • Sound technical understanding
  • An understanding of how to build and run a secure application in the cloud
  • Experience with container orchestration, ideally with Kubernetes
  • Experience with Infrastructure-as-Code tools such as Terraform, Helm, Ansible, or CDK
  • Experience in setting up the release management process using modern CI/CD systems
  • Knowledge of a cloud provider (AWS, Azure, Google Cloud), ideally with certification
  • Development skills in at least one object-oriented, functional or scripting language
  • Very good English and good German skills
Job Responsibility:
  • Develop cloud platform according to modern principles
  • Advise customers on the sensible use of services in the cloud with regard to effort, costs and maintenance
  • Live a vibrant DevOps culture internally and carry it to customers
  • Help customers introduce the right release processes and implement them with modern CI/CD tools (Azure DevOps, GitLab, GitHub)
  • Develop and integrate monitoring and logging infrastructure to improve application maintainability
  • Design and develop scalable and fail-safe IT architectures
What we offer:
  • Home Office & Office
  • Flexible Working Hours
  • Part-Time Models
  • Working Time Account
  • Sabbatical
  • 30 days of paid vacation
  • An annual training budget of 1.5 gross monthly salaries for training, certifications, conferences, and more
  • Corporate seminars
  • Christmas parties
  • Private health and dental insurance

Site Reliability Engineer

About LogRocket: Founded in 2016, LogRocket's goal is to make every experience o...
Location:
United States, Boston
Salary:
135000.00 - 220000.00 USD / Year
LogRocket
Expiration Date:
Until further notice
Requirements:
  • At least 4 years of experience as a Site Reliability Engineer or in a related role
  • Ability to read and understand product code
  • Familiarity with the state of the art in cloud technologies, including common providers, specific tools of the trade, and their strengths and weaknesses
  • Experience operating applications and databases with demanding scalability or availability requirements
  • Proven expertise in modern container orchestration practices
  • A strong understanding of the performance, architecture, tooling, and cost of cloud systems
  • A security focused mindset with a solid understanding of incident response and risk mitigation
  • A strong collaborator who is transparent about progress on tasks, seeks feedback early and often, works effectively with the team and customers
Job Responsibility:
  • Improve quality of pager alerts while reducing noise
  • Maintain awareness of engineering initiatives across the organization and monitor their impact on stability, cost, and performance
  • Keep infrastructure up-to-date to take advantage of security patches and new features
  • Improve operational security without sacrificing engineering independence
What we offer:
  • Catered lunch and an impressive array of your favorite snacks
  • Unlimited vacation policy
  • Health, Dental, Vision benefits, 401k, commuter benefits
  • Generous stock options
  • Regular team outings and activities
Employment Type:
Fulltime

Site Reliability Engineer

As a highly skilled Site Reliability Engineer (SRE), you will contribute to buil...
Location:
United States, New York City; San Francisco
Salary:
160000.00 - 300000.00 USD / Year
Hebbia
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience
  • 5+ years software development experience at a venture-backed startup or top technology firm
  • Proven experience as a Site Reliability Engineer, DevOps Engineer, or similar role
  • Strong expertise in managing CI/CD pipelines and deployment automation
  • Proficiency in cloud platforms such as AWS, Azure, or Google Cloud (we are an AWS shop)
  • Solid understanding of containerization and orchestration technologies such as Docker and Kubernetes
  • Experience with monitoring and observability tools such as Datadog, Prometheus, Grafana, or similar
  • Knowledge of infrastructure-as-code (IaC) tools such as Terraform or CloudFormation
  • Familiarity with security best practices and tools for infrastructure and application security
  • Excellent problem-solving skills and the ability to troubleshoot complex issues
Job Responsibility:
  • Assist in managing deployment pipelines to facilitate smooth and efficient software releases
  • Help implement and maintain observability solutions for monitoring system performance and reliability
  • Support local development environments to optimize developer workflows
  • Work with development teams to ensure infrastructure aligns with project requirements
  • Contribute to improving the security of our infrastructure by assisting with proactive measures and audits
  • Assist in developing and maintaining automation scripts and tools to enhance operational efficiency
  • Help troubleshoot and resolve infrastructure and application issues to minimize downtime and maintain smooth operations
  • Participate in evaluating and integrating new technologies to enhance the scalability, reliability, and security of our infrastructure
What we offer:
  • PTO: Unlimited
  • Insurance: Medical + Dental + Vision + 401K
  • Eats: Catered lunch daily + DoorDash dinner credit if you ever need to stay late
  • Parental leave policy: 3 months non-birthing parent, 4 months for birthing parent
  • Fertility benefits: $15k lifetime benefit
  • New hire equity grant: competitive equity package with unmatched upside potential
Employment Type:
Fulltime

Staff Site Reliability Engineer

We are looking for a Site Reliability Engineer to own our internal systems infra...
Location:
United States, Sunnyvale
Salary:
175000.00 - 250000.00 USD / Year
Figure
Expiration Date:
Until further notice
Requirements:
  • Strong experience with Linux/Unix systems administration
  • Proficiency in programming/scripting
  • Extensive experience with cloud platforms (Azure, AWS, GCP) and on-prem hardware architectures
  • Experience designing, deploying, and operating high-availability, fault-tolerant, and distributed systems
  • Mastery of infrastructure as code (Terraform, CloudFormation, Ansible…)
  • Familiarity with monitoring, logging, and alerting tools (Prometheus, Grafana, Datadog…)
  • Solid understanding of networking fundamentals (TCP/IP, DNS, HTTP, load balancers, firewalls)
  • Experience defining Service Level Objectives (SLO), developing runbooks/incident response plans, facilitating post-mortems and managing systems assets
  • Ability to work in cross-functional teams with developers, infra, and product teams
  • Excellent verbal and written communication skills
Job Responsibility:
  • Be the go-to person for mission-critical infrastructure enabling critical operations such as Source Configuration Management, CI/CD systems, software distribution, supplier portals, manufacturing and more
  • Migrate SaaS to self-hosted solutions to enhance security and reliability
  • Implement monitoring and alerting systems, and define incident response plans and runbooks
  • Reduce human workload by automating deployment and scaling
  • Establish strong relationships with stakeholders to identify infrastructure needs and establish Service Level Objectives
  • Use a data-driven approach to demonstrate service robustness and track optimization work
  • Partner with the security team to ensure that security remediations and updates are applied in a timely manner
Employment Type:
Fulltime

Principal Site Reliability Engineer

Location:
United States, Ft. Meade
Salary:
Not provided
CipherLogix
Expiration Date:
Until further notice
Requirements:
  • Fourteen (14) years experience in software development/engineering, including requirements analysis, software development, installation, integration, evaluation, enhancement, maintenance, testing, and problem diagnosis/resolution
  • Ten (10) years experience in system engineering/architecture
  • Ten (10) years experience working with products that support highly distributed, massively parallel computation needs such as HBase, Hadoop, Cloudbase/Accumulo, Bigtable, Cassandra, Scality, etc.
  • At least ten (10) years experience writing software scripts using scripting languages such as Perl, Python, or Ruby for software automation
  • At least four (4) years experience managing and monitoring large Cloud System (>200 nodes). Cloud Systems Administrator or Developer Certification
  • Experience in performing and providing technical direction for the development, engineering, interfacing, integration, and testing of complete hardware/software systems to include monitoring technical health of a system, improving organizational processes, implementation of postmortem (failure) analysis and incident management
  • Ten (10) years experience in a cleared environment
  • Ten (10) years demonstrated experience developing software for one of the following: Windows, UNIX, or Linux OS
  • Knowledge and experience with developing distributed storage routing and querying algorithms
  • Experience in developing documentation required to support a program’s technical issues and training situations
Employment Type:
Fulltime