Junior Site Reliability Engineer

accesso

Location:
United Kingdom

Contract Type:
Not provided

Salary:

Not provided

Job Description:

As a Jr. Site Reliability Engineer, you will 'make things scale', which includes supporting the delivery and operation of the managed accesso Horizon product in customers’ cloud environments (AWS/Azure/GCP). You will work under mentor guidance to deploy, operate and support customer environments, automate tasks, and learn site reliability and cloud best practices.

Job Responsibility:

  • Assist with provisioning and deploying accesso Horizon components to customer cloud accounts using Infrastructure as Code (Terraform)
  • Help maintain CI/CD pipelines (GitHub Actions) for application and infrastructure deployments
  • Support monitoring, logging and alerting (Prometheus, Grafana & Coralogix) and respond to basic alerts with supervision
  • Implement and improve basic automation and scripting
  • Participate in incident triage, root cause investigation and follow-up tasks
  • Follow security and compliance requirements for customer cloud environments (identity, secrets, network controls)
  • Produce and maintain operational runbooks, deployment guides and change notes
  • Participate in the on-call rotation as an L1 responder
  • The role may require work outside normal working hours
  • Learn and apply accesso Horizon product architecture and configuration

Requirements:

  • Some practical exposure to cloud platforms (AWS/Azure/GCP) through coursework, internships, or self-led projects
  • Ability to self-learn with assistance from Senior Engineers
  • Basic scripting ability using Python or Bash
  • Familiarity with basic Linux systems and the command line
  • Understanding of Git and basic CI/CD concepts
  • Good written and verbal communication
  • Customer-focused approach
  • Ability to work with minimal direction
  • Willingness to learn, take direction and work within a team

Nice to have:

  • Experience with Terraform, Docker, Kubernetes (EKS/AKS/GKE) or monitoring tools
  • Familiarity with security fundamentals (IAM, network ACLs, secrets management)
  • Experience supporting a SaaS or managed service

What we offer:

  • Competitive compensation package including an annual bonus opportunity
  • 8 days of paid bank holiday leave and 26 days of paid annual leave (paid leave increases with tenure)
  • 8 hours of paid Volunteer Time Off
  • Inclusive Family Benefits, including a $7,500 benefit for surrogacy, adoption, and fertility
  • Robust health insurance scheme with the opportunity to participate in a private medical scheme after satisfactory performance
  • Matching pension scheme (up to 8%)
  • Unlimited access to Udemy for Business
  • Flexible work schedule

Additional Information:

Job Posted:
December 05, 2025

Employment Type:
Fulltime
Work Type:
Remote work


Similar Jobs for Junior Site Reliability Engineer

Lead Site Reliability Engineer

Groupon is a marketplace where customers discover new experiences and services e...
Location
India , Bangalore
Salary
Not provided
Groupon
Expiration Date
Until further notice
Requirements
  • 10+ years in systems engineering
  • at least 5 years in SRE or DevOps roles
  • expertise in cloud platforms (GCP, AWS) and container orchestration (Kubernetes, Docker)
  • proficiency in programming and scripting languages like Python, Go, and Bash
  • advanced knowledge of Infrastructure as Code (IaC) tools such as Terraform and Ansible
  • deep understanding of networking, DNS, load balancing, and security principles
  • proven track record of managing high-availability systems in demanding environments
  • exceptional analytical and problem-solving skills
Job Responsibility
  • Architect and maintain fault-tolerant systems, ensuring uptime SLAs of 99.9% or higher
  • drive automation in infrastructure management and deployment using Terraform, Ansible, Kubernetes, and similar tools
  • create and optimize CI/CD pipelines to ensure reliable, secure, and efficient software delivery
  • build and enhance comprehensive observability solutions, including monitoring, logging, and alerting systems using Prometheus, Grafana, and the ELK stack
  • collaborate with stakeholders to define and achieve SLIs, SLOs, and error budgets aligned with business needs
  • lead incident response during on-call rotations, ensuring rapid resolution and root cause analysis for critical issues
  • design and execute performance testing, capacity planning, and scalability strategies for evolving workloads
  • proactively identify and resolve bottlenecks, increasing system performance and developer efficiency
  • mentor junior engineers, fostering a collaborative and growth-oriented team environment
  • guide architectural decisions that drive innovation and enhance system reliability
What we offer
  • The opportunity to work with cutting-edge technologies in a transformative environment
  • a collaborative and innovative work culture that values your expertise and contributions
  • professional growth and leadership development pathways tailored to your aspirations
  • a chance to leave a lasting impact by shaping the future of reliable and scalable systems

Staff Engineer, Site Reliability

LearnUpon is looking for a Staff Site Reliability Engineer to join our team in I...
Location
Ireland , Dublin
Salary
Not provided
LearnUpon
Expiration Date
Until further notice
Requirements
  • 7+ years of experience in a software or Ops role
  • 5+ years of cloud engineering experience, with at least 2 years experience with AWS
  • Experience deploying Microservice environments, using containerisation technologies such as Kubernetes and Docker
  • Experience in designing and implementing Observability tech stacks
  • Have championed the benefits of Observability to Engineering teams
  • Can architect the design of SLO/SLI implementation that balances the needs of different teams
  • Familiar with cost analysis of Observability metrics gathering, Engineering effort, and tooling
  • Experience building and supporting large-scale distributed systems that back a consumer app or website with associated requirements of performance, security and disaster recovery
  • Experience implementing IaC (e.g. CloudFormation, Terraform), automation tooling (e.g. Puppet, Ansible) and CI/CD (e.g. Jenkins, Travis CI, GitLab)
  • Able to effectively communicate technical ideas to and collaborate with both technical and non-technical peers
Job Responsibility
  • Identifying opportunities to improve and scale our infrastructure for performance, observability, maintainability, and cost, by creating innovative solutions
  • Leading our efforts to build an observability function that incorporates application metrics, application transaction tracking, and event log management
  • Driving the processes to maintain resilient, scalable and cost-effective infrastructure
  • Working with other Engineering teams to provide infrastructure solutions that meet their ongoing requirements
  • Building tools focused on measuring, monitoring and alerting, with an eye towards self-service in order to promote Engineers’ ownership of observability
  • Reacting quickly to changing customer and business needs
  • Participating in the on-call rota
  • Mentoring junior talent
What we offer
  • Work in a fun and supportive environment with regular team events
  • Excellent career progression
  • Structured learning environment
  • Competitive salary and company ESOP
  • Private health insurance
  • 26 days annual leave
  • Fulltime

Senior Site Reliability Engineer

We're looking for a Senior Site Reliability Engineer for our Currents team, resp...
Location
United States , New York City
Salary
129600.00 - 232200.00 USD / Year
Braze
Expiration Date
Until further notice
Requirements
  • Bachelor’s in Computer Science, Software Engineering, or a related STEM field
  • Five (5) years of experience in any role/occupation/position involving software engineering or site reliability engineering
  • Experience must include: Using distributed systems to deploy and monitor live applications such as Kubernetes or Docker Swarm
  • Working with alerting software (Sentry, Datadog, and/or PagerDuty)
  • Utilizing programming languages (Java, Kotlin, and/or Ruby) to understand and contribute to the codebase
  • Storing data in relational and non-relational databases such as Postgres and MongoDB
  • Data streaming or queuing systems to build data pipelines with technologies like Kafka, Sidekiq or SQS and SNS
  • Leveraging continuous integration tools such as Jenkins or Buildkite
  • Collaborating with engineers through pull requests and code reviews in version control software such as GitHub or GitLab
Job Responsibility
  • Solve live performance and reliability issues and prevent their recurrence
  • Write and review code, educating engineers and building a culture of reliability
  • Practice sustainable incident response and blameless postmortems
  • Define and enable standards for monitoring, reliability, and performance
  • Bridge the gap between infrastructure and platform engineering teams
  • Support and improve services by planning for scale and reliability
  • Guide junior engineers in SRE best practices, software engineering, and agile project leadership
What we offer
  • Competitive compensation that may include equity
  • Retirement and Employee Stock Purchase Plans
  • Flexible paid time off
  • Comprehensive benefit plans covering medical, dental, vision, life, and disability
  • Family services that include fertility benefits and equal paid parental leave
  • Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
  • A curated in-office employee experience, designed to foster community, team connections, and innovation
  • Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching
  • Employee Resource Groups that provide supportive communities within Braze
  • Fulltime

Principal SRE

The Principal Site Reliability Engineer will be a senior technical expert respon...
Location
India , Pune
Salary
Not provided
Barclays
Expiration Date
Until further notice
Requirements
  • 12+ years in software engineering or infrastructure roles
  • at least 5 years focused on reliability engineering or SRE
  • proven experience building and operating fault-tolerant, highly available systems at scale
  • strong knowledge of distributed systems, resiliency patterns (circuit breakers, retries, failover), and disaster recovery strategies
  • expertise across infrastructure (compute, storage, networking), application architecture, databases, and integration patterns
  • ability to troubleshoot complex technical issues across distributed systems and perform deep root cause analysis
  • skilled at working with development, operations, and architecture teams to embed reliability into design and delivery
Job Responsibility
  • Drive strategies to improve reliability, maintainability, and scalability across payment flows and platform components
  • conduct deep technical assessments of system architectures, identifying risks and recommending improvements for fault tolerance and disaster recovery
  • act as a senior escalation point for production incidents, lead RCA, and implement permanent fixes to prevent recurrence
  • define and enforce reliability patterns, frameworks, and best practices
  • advocate and implement chaos engineering principles to validate system resilience under real-world failure scenarios
  • design and implement full-stack observability solutions, including metrics, logging, distributed tracing, and alerting
  • develop automation for failover, capacity management, and self-healing mechanisms to reduce operational risk
  • partner with development, infrastructure, and production support teams to embed reliability into the SDLC
  • analyze service risk assessments and production incidents to identify systemic issues and drive long-term improvements
  • promote operational excellence and a mindset of designing for failure across all engineering teams
What we offer
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution
  • Fulltime

Staff Observability Operations Engineer

We are currently seeking several experienced and highly skilled Staff Observabil...
Location
United States , Hartford
Salary
130295.00 - 260590.00 USD / Year
CVS Health
Expiration Date
Until further notice
Requirements
  • 7+ Years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications
  • 5+ Years in a Site Reliability Engineering (SRE) role deploying and managing modern observability solutions
  • 5+ Years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana)
  • Experience developing and administering ServiceNow ITOM event management solutions
  • Experience deploying and managing service reliability platforms (e.g., xMatters, OpsGenie, PagerDuty)
  • Experience with and deep knowledge of cloud environments, cloud monitoring platforms, and container orchestration tools (e.g., AWS/CloudTrail, Azure/Monitor, GCP/GCM, Kubernetes, OpenShift)
  • Proficiency in Python and other scripting languages such as Ansible, PowerShell, Bash for automation and configuration
  • Hands-on experience deploying, managing, and administering observability platforms
  • Hands-on experience leading, coordinating, and performing migration of application, platform, and infrastructure observability solutions
  • Proven ability to troubleshoot and resolve complex technical issues
Job Responsibility
  • Deploy and implement modern observability solutions
  • Manage and administer observability and event management platforms
  • Coordinate and manage release cycles for observability platforms
  • Troubleshoot and resolve incidents related to observability platforms
  • Continuously monitor and enhance platform performance
  • Collaborate with cross-functional stakeholders
  • Provide training and mentoring to junior engineers
  • Ensure compliance and security of observability platforms
  • Maintain documentation of observability platform configurations
  • Generate and analyze reports on platform performance and capacity
What we offer
  • Affordable medical plan options
  • a 401(k) plan (including matching company contributions)
  • an employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs
  • confidential counseling and financial coaching
  • Paid time off
  • flexible work schedules
  • family leave
  • dependent care resources
  • colleague assistance programs
  • Fulltime

Reliability Engineer II

The BizOps team is looking for a Site Reliability Engineer who can help us solve...
Location
United States of America , O Fallon
Salary
76000.00 - 127000.00 USD / Year
Mastercard
Expiration Date
Until further notice
Requirements
  • BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience
  • Experience with algorithms, data structures, scripting, pipeline management, and software design
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
  • Ability to help debug and optimize code and automate routine tasks
  • Experience in dealing with difficult situations and making decisions with a sense of urgency
  • Interest in designing, analyzing and troubleshooting large-scale distributed systems
  • Experience in working across development, operations, and product teams to prioritize needs and to build relationships
  • Experience in industry standard CI/CD tools like Git/BitBucket, Jenkins, Maven, Artifactory, and Chef
  • Experience designing and implementing an effective and efficient CI/CD flow that gets code from dev to prod with high quality and minimal manual effort
Job Responsibility
  • Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement
  • Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns
  • Support services before they go live through activities such as system design consulting, capacity planning and launch reviews
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead Mastercard in DevOps automation and best practices
  • Practice sustainable incident response and blameless postmortems
  • Take a holistic approach to problem solving by connecting the dots during a production event through the various technology stacks that make up the platform, to optimize mean time to recover
  • Work with a global team spread across tech hubs in multiple geographies and time zones
  • Share knowledge and mentor junior resources
What we offer
  • insurance (including medical, prescription drug, dental, vision, disability, life insurance)
  • flexible spending account and health savings account
  • paid leaves (including 16 weeks of new parent leave and up to 20 days of bereavement leave)
  • 80 hours of Paid Sick and Safe Time, 25 days of vacation time and 5 personal days, pro-rated based on date of hire
  • 10 annual paid U.S. observed holidays
  • 401k with a best-in-class company match
  • deferred compensation for eligible roles
  • fitness reimbursement or on-site fitness facilities
  • eligibility for tuition reimbursement
  • Fulltime

Senior Service Delivery Leader

At IKEA, we are committed to transforming the way we manage and deliver services...
Location
India , Bangalore
Salary
Not provided
IKEA
Expiration Date
Until further notice
Requirements
  • Strong Service Management skills and an understanding of how to integrate a strong, modern engineering culture across Group Digital and digital organizations in the markets, with a proven track record (at least 10 years) of managing services in a global organization
  • Strong understanding of software development best practices, and of how to lead, develop, define, plan, and execute a roadmap to meet business requirements together with relevant stakeholders
  • Demonstrable relevant knowledge of technology and/or software engineering within the relevant areas combined with good knowledge of agile ways of working, how to enable a product- and service-led organization, knowledge of how to set direction, create and manage plans, set budgets and goals, and follow up on OKRs across service delivery framework, SLAs, and KPIs
  • Proven analytical skills and experience in making decisions based on both hard and soft data
  • Strong negotiation and influencing skills, with the ability to build trustful relationships and hold stakeholders accountable at all levels (junior team members or senior management) both internally and externally
  • Excellent written and verbal communication skills, with the ability to engage and communicate with senior business leaders
  • Degree with a focus on Engineering, Technology, or related areas/equivalent combination of education and experience
  • 10+ years of diverse experience in Digital Foundation or Digital products and service delivery with a proven track record of delivering products and services that provide substantial value
  • 10+ years of experience with the services and products within the area and proven knowledge and ability to transform and optimize processes and behaviours
  • 8+ years of demonstrable experience working in agile/scrum and Software Engineering environments in complex global organizations
Job Responsibility
  • Align Service Management practices across areas to ensure consistent, robust, and complete implementation of service delivery guardrails
  • Collaborate with (Senior) Engineering Domain Managers to secure the competence and capacity of engineering teams required for great service delivery
  • Build and maintain the service management community in the assigned area, ensuring SLAs are met
  • Ensure Service Management & Operations capabilities are embedded in engineering teams with sufficient expert support
  • Identify goals and drive implementation of Service Delivery practices, ensuring standardized and coherent ways of working
  • Drive the implementation of Service Management-as-Code to ensure stability and rapid recovery from incidents.
What we offer
  • Some travel between Digital Hubs
  • Collaboration and co-creation in the office
  • Fulltime

BizOps Engineer I

The Transaction Stream BizOps team is looking for a Site Reliability Engineer wh...
Location
India , Pune
Salary
Not provided
Mastercard
Expiration Date
Until further notice
Requirements
  • BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience
  • Experience with algorithms, data structures, scripting, pipeline management, and software design
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
  • Ability to help debug and optimize code and automate routine tasks
  • Experience in dealing with difficult situations and making decisions with a sense of urgency
  • Interest in designing, analyzing and troubleshooting large-scale distributed systems
  • Appetite for change and pushing the boundaries of what can be done with automation
  • Experience in working across development, operations, and product teams to prioritize needs and to build relationships
Job Responsibility
  • Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement
  • Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns
  • Support services before they go live through activities such as system design consulting, capacity planning and launch reviews
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead Mastercard in DevOps automation and best practices
  • Practice sustainable incident response and blameless postmortems
  • Take a holistic approach to problem solving by connecting the dots during a production event through the various technology stacks that make up the platform, to optimize mean time to recover
  • Work with a global team spread across tech hubs in multiple geographies and time zones
  • Share knowledge and mentor junior resources
  • Fulltime