Site Reliability Engineer - Core

Blockchain

Location:
United Kingdom, London

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are looking for a Site Reliability Engineer to join our Core team to encourage infrastructure best practices across our organization, allowing us to securely scale a distributed financial platform that touches millions of people a day. Our platform tackles some of the most interesting problems in crypto for millions of our customers and continues to grow rapidly. The SRE team at Blockchain combines software and systems engineering to provide a platform that abstracts complexity for increased security, reliability and rapid product delivery.

As a member of the Core team, you will be tasked with developing an in-depth understanding of the infrastructure needs of our products. You will establish and maintain creative engineering solutions to improve our customers’ experience by building the necessary tooling. Crucially, you will also guide and educate developer teams so that they can deliver new features in a rapid, secure and scalable manner.

Job Responsibility:

  • Play a critical role in evolving our infrastructure as we develop solutions to complex technical problems involving reliability, latency, bandwidth and, most importantly, security
  • Be an integral part of improving observability, monitoring and alerting throughout the platform
  • Help co-ordinate work across different areas of the company to ensure the most efficient path of execution
  • Centralize, wherever possible, common streams of work that are currently duplicated across developer teams
  • Focus heavily on writing tooling to replace manual, repetitive work in a scalable way
  • Work in a fast-paced, dynamic environment, complementing our existing high-calibre team

Requirements:

  • Experience with containerization and service orchestration, including best practices and security
  • Strong knowledge of at least one programming language
  • Knowledge of Linux, including an understanding of resource allocation, networking and/or internals
  • Experience working with cloud solutions (GCP or AWS)
  • Deep understanding and demonstrable experience with modern monitoring tools such as Prometheus, Datadog, Grafana, Telegraf
  • Experience with infrastructure as code tools
  • Solid background with configuration management tools
  • Experience using GitOps and CI to make changes, preferably GitHub Actions
  • Experience with messaging systems such as Kafka
  • Experience with database management

Nice to have:

  • Experience with HashiCorp Nomad, Consul and Vault is a plus
  • Experience with Golang, Python, and Bash is a plus
  • Experience with complex Terraform deployments is a plus
  • Experience with SaltStack is a plus
  • Experience working in Data Centers is a plus
  • Knowledge of routing and switching protocols is a plus

What we offer:

  • Full-time salary based on experience and meaningful equity in an industry-leading company
  • Hybrid model: work from home and from our awesome office in the heart of London
  • Unlimited vacation policy: work hard and take time when you need it
  • Work from Anywhere Policy: You can work remotely from anywhere in the world for up to 20 days per year
  • Apple equipment
  • The opportunity to be a key player and build your career at a rapidly expanding, global technology company in an emerging field
  • Flexible work culture

Additional Information:

Job Posted:
December 06, 2025

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Site Reliability Engineer - Core

Senior Site Reliability Engineer

We are seeking an experienced Senior Site Reliability Engineer (L3) to join our ...
Location:
India, Chennai
Salary:
Not provided
Arcadia
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
  • 8–10+ years of experience in SRE/DevOps/Cloud Engineering, with deep hands-on exposure to AWS and Kubernetes
  • Strong hands-on experience with:
      • Terraform & Infrastructure as Code
      • AWS core services (EKS, IAM, RDS, EC2, VPC, CloudWatch, CloudTrail, GuardDuty)
      • Jenkins + Groovy, GitHub Actions, ArgoCD, FluxCD
      • Kubernetes troubleshooting and operations
      • Prometheus/Grafana/Datadog observability stacks
  • Proven ability to operate in high-scale, high-uptime, multi-environment production systems
  • Experience building automation via Python/Bash and reducing operational toil
  • Strong understanding of incident management, root cause analysis, and reliability engineering principles
Job Responsibility:
  • Design, build, and maintain AWS infrastructure (EKS, VPC, RDS, IAM, CloudWatch, CloudTrail, GuardDuty, Load Balancers, S3, CloudFront) using Terraform and CloudFormation
  • Lead all aspects of Kubernetes operations including cluster upgrades, performance tuning, CNI troubleshooting, workload scaling, Helm chart packaging, and GitOps deployments
  • Own and evolve our CI/CD ecosystem across Jenkins (Groovy scripting), GitHub Actions, AWS CodePipeline, ArgoCD, and FluxCD
  • Improve platform reliability by reducing operational toil through automation, scripting (Python/Bash), and proactive system hardening
  • Implement and enhance observability across Prometheus, Grafana, Loki, Tempo, Datadog, and CloudWatch—ensuring actionable alerting, dashboards, and metrics alignment with SLO/SLIs
  • Drive FinOps initiatives, identifying cost inefficiencies and working with engineering teams to implement best practices, tagging standards, budgeting, and resource right-sizing
  • Manage database operations across MySQL and PostgreSQL including backups, performance tuning, replication, and operational runbooks
  • Maintain and improve secret management using Vault, AWS Secrets Manager, and Parameter Store
  • Strengthen cloud security posture with IAM least privilege, CSPM reviews, audit readiness, GuardDuty/CloudTrail monitoring, and environment hardening
  • Troubleshoot complex production issues across networking, Kubernetes, compute, databases, and CI/CD systems
What we offer:
  • Competitive compensation and employee stock options
  • Hybrid/remote-first working model (India-based role, with global collaboration)
  • Flexible leave policy
  • Comprehensive medical insurance (self + family members)
  • Annual performance cycle + quarterly recognition awards
  • A supportive, diverse engineering culture grounded in empathy, teamwork, and innovation
Employment Type:
Full-time

Senior Site Reliability Engineer

As a Senior Site Reliability Engineer on the Platform team, you will identify is...
Location:
United States, Denver; San Francisco
Salary:
138000.00 - 191000.00 USD / Year
Checkr
Expiration Date:
Until further notice
Requirements:
  • Degree in Computer Science (or related field)
  • 6+ years of experience in building tools with Python (preferred), GoLang, or Ruby
  • 6+ years of experience in maintaining and observing production customer-facing environments in AWS or Azure
  • 6+ years of experience as a member of an incident response team
  • Deep understanding of the fundamental infrastructure and platform concepts behind a micro-service architecture, REST APIs, and asynchronous queueing models
  • Experience with observability platforms and frameworks like Datadog, Splunk, Grafana, Prometheus, or OpenTelemetry
  • Strong collaboration, documentation, communication, and project management skills
  • Experience with container orchestration using Kubernetes/Docker/Terraform
  • Experience driving platform adoption across engineering teams, guided by a self-service and product-first approach
  • A passion for customer-centricity and building relationships with other teams
Job Responsibility:
  • Collaborate, drive, and execute architectural discussions with cross-functional teams
  • Lead cross-team projects and SREs' technical roadmap to enable engineering and help Checkr customers
  • Design, build, ship, and maintain the core observability libraries, tools, and patterns used by all of Checkr’s engineering teams
  • Proactively engage across teams to foster service reliability, efficiency, and scalability
  • Troubleshoot complex production issues across the stack, with respect to performance, availability, and data quality
  • Present detailed technical information and benefits of the Checkr platform to a wide array of customers, including operations, developers, technical architects, and executives
What we offer:
  • A fast-paced and collaborative environment
  • Learning and development allowance
  • Competitive cash and equity compensation and opportunities for advancement
  • 100% medical, dental, and vision coverage
  • Up to $25K reimbursement for fertility, adoption, and parental planning services
  • Flexible PTO policy
  • Monthly wellness stipend, home office stipend
  • In-office perks such as lunch four times a week, commuter stipend, and an abundance of snacks and beverages
Employment Type:
Full-time

Site Reliability Engineer/ Sr DevOps

We are offering a contract to permanent employment opportunity for a Site Reliab...
Location:
United States, Woodland Hills
Salary:
Not provided
Robert Half
Expiration Date:
Until further notice
Requirements:
  • Minimum of 5 years of experience in a similar role
  • Proven expertise in Amazon EC2
  • Experience with Ansible for configuration management
  • Knowledge of Apache ANT+ and Apache Tomcat
  • Familiarity with Atlassian Jira for project management
  • Experience in A/B testing methodologies
  • Proficiency in Agile Scrum methodologies
  • Demonstrated skills in automation processes
  • Comprehensive understanding of AWS Technologies
  • Ability to perform Cluster Analysis
Job Responsibility:
  • Architect and design applications for migration, ensuring they align with compliance standards and best practices
  • Actively participate in building solutions and gain an acute understanding of core infrastructure services and their interaction with applications
  • Provide technical leadership to offshore teams, aiding in the distribution of leadership tasks
  • Execute scripting tasks within a C# and .NET environment
  • Understand and manage the interaction between Service Bus, messaging queues, and other applications, and their subsequent impact on infrastructure
  • Work extensively with Azure and AWS ecosystems
  • Ensure the smooth functioning of applications by understanding the intricacies of infrastructure services
  • Utilize AWS and Azure expertise in scripting within C# and .NET environment
  • Handle the interaction of Service Bus or messaging queues with other applications and its impact on infrastructure
  • Engage in hands-on work to build solutions while understanding the interaction of core infrastructure services with applications
What we offer:
  • Medical, vision, dental, and life and disability insurance
  • Eligibility to enroll in company 401(k) plan
Employment Type:
Full-time

Infrastructure Engineer

Descript is on a mission to make audio and video content creation and editing fa...
Location:
United States, San Francisco
Salary:
191000.00 - 250000.00 USD / Year
Descript
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in production/site-reliability engineering OR 5+ years of server-side software engineering with an interest in working on core infrastructure
  • A solid understanding of at least two of: public cloud infrastructure, Linux systems administration, and DevOps tooling.
  • Basic coding skills to work on automation and technical guardrails.
  • Strong written and verbal communication skills, and the ability to collaborate with other functions
  • Experience mentoring engineers, including code reviews, architecture discussions, and leadership skills
Job Responsibility:
  • Develop technical and business solutions that enable engineers to improve the quality and reliability of product features and systems that they build.
  • Drive improvements to the reliability of our core infrastructure, such as production clusters, networking, databases, and observability systems.
  • Champion best practices during reviews of code, technical designs, and launch plans.
  • Own our incident management and fire drill processes.
  • Work with engineering leadership to set goals and prioritize production reliability.
What we offer:
  • Generous healthcare package
  • 401k matching program
  • Catered lunches
  • Flexible vacation time
Employment Type:
Full-time

Site Reliability Engineer II

As an intermediate Site Reliability Engineer on the Core Infrastructure team in ...
Location:
Canada, Toronto
Salary:
115000.00 - 165000.00 CAD / Year
PagerDuty
Expiration Date:
Until further notice
Requirements:
  • 3+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering roles
  • Hands-on experience operating Linux-based systems in production environments
  • Working knowledge of networking fundamentals, such as load balancing, DNS, TLS, and ingress traffic flow
  • Experience with container orchestration (e.g., EKS, Kubernetes)
  • Experience working on cloud-native infrastructure (e.g., AWS, GCP, Azure), including networking and compute concepts
  • Proficiency in at least one programming language (e.g., Python, Ruby, Go)
  • Experience with Infrastructure as Code (e.g., Terraform, CloudFormation)
Job Responsibility:
  • Support and improve foundational infrastructure, including networking, compute platforms, Kubernetes clusters, and ingress/traffic management systems
  • Contribute to the reliability and scalability of PagerDuty's core platform by hardening existing systems and supporting the rollout of new infrastructure capabilities
  • Participate in agile rituals (standups, planning, retros) and communicate progress/risks early
  • Stay current on technical trends to suggest innovative tools and approaches to interesting problems
  • Monitor system health using metrics, logs, and alerts, and participate in 24/7 on-call rotations to help detect, respond to, and resolve incidents
What we offer:
  • Competitive salary
  • Comprehensive benefits package
  • Flexible work arrangements
  • Company equity
  • ESPP (Employee Stock Purchase Program)
  • Retirement or pension plan
  • Generous paid vacation time
  • Paid holidays and sick leave
  • Dutonian Wellness Days & HibernationDuty - companywide paid days off in addition to PTO
  • Paid parental leave: 22 weeks for pregnant parent, 12 weeks for non-pregnant parent
Employment Type:
Full-time

Director, Equipment Reliability Center of Excellence

The Reliability Manager is responsible for developing and implementing reliabili...
Location:
United States, Mapleton
Salary:
119900.00 - 199800.00 USD / Year
Evonik Industries
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Mechanical/Electrical or Chemical Engineering with strong maintenance reliability experience
  • 5-10 years in manufacturing with leadership and general industrial management experience
  • Strong background and broad-based experience in the complex field of maintenance or reliability engineering
  • A thorough knowledge of technical codes, standards and regulations is required
  • This is a self-motivated position that requires excellent leadership, analytical, written and verbal communication skills
  • Responsiveness and professionalism are critical as this position communicates with Site Manager and Engineering and Maintenance Manager frequently
  • Must have the ability to effectively collaborate with senior management, both locally and globally, and positively add value to short-term and long-term strategic planning
  • Must be analytical and have the ability to problem solve in a concise and logical manner
  • Ability to communicate effectively, both verbally and in writing, and manage expectations to create trust and credibility across a broad spectrum of the company
  • Ability to effectively articulate and explain market trends internally and externally
Job Responsibility:
  • Develop and implement reliability and asset management strategies to improve equipment performance, optimize asset lifecycle and reduce failure rates for the Mapleton site
  • Lead and mentor plant engineers and reliability engineers, providing guidance on best practices and methodologies
  • Drive continuous improvement initiatives using reliability-centered maintenance and other methodologies
  • Collaborate with cross-functional teams (maintenance, operations, engineering, safety, etc.) to ensure alignment on reliability goals and improve asset utilization and performance
  • Evaluate and prioritize asset investments based on risk, performance, and business impact, ensuring alignment with organizational objectives
  • Monitor important reliability trends and technical developments for development of new applications
  • Communicate and liaise with key Evonik contact personnel in the Care Solutions business line, as well as Technical Services and Technology and Engineering Americas, to solve reliability issues at Mapleton
What we offer:
  • Medical, dental, and vision benefits
  • Paid time off plan
  • 401(k) savings plans
  • Health Savings Account (HSA)
  • Flexible Spending Accounts (FSAs)
  • Employee Assistance Program
  • Voluntary Benefits and Employee Discounts
  • Disability benefits
  • Life Insurance
  • Parental leave
Employment Type:
Full-time

Site Reliability Engineer

As a member of Kalshi’s engineering team, you’ll help build the next-generation ...
Location:
United States, New York
Salary:
100000.00 - 250000.00 USD / Year
Kalshi
Expiration Date:
Until further notice
Requirements:
  • 4+ years of software engineering experience
  • Experience designing, building, scaling, and maintaining production services and service-oriented architectures
  • Strong system design, coding, debugging, performance-tuning, and observability skills
  • High-quality coding practices with strong testing discipline
  • Excellent written and verbal communication
  • Comfort working transparently across teams
  • Strong interpersonal skills across junior-to-principal engineering levels
  • Ability to think clearly under pressure and dive into any layer of the stack
  • Passion for building an open financial system that connects the world
  • Willingness to participate in on-call rotations and swiftly resolve issues
Job Responsibility:
  • Improve observability, reliability, and service availability by defining and measuring key metrics
  • Build automation and systems that eliminate toil and reduce operational burden
  • Collaborate with core infrastructure engineers to performance-tune and optimize cloud deployments (Docker, Terraform, Kubernetes, EC2, etc.)
  • Partner with product teams to minimize service disruptions and automate incident response
  • Identify and analyze reliability problems across the stack, designing and implementing software for significant, long-term improvements
  • Mentor engineers and drive a culture where reliability is a core engineering value
  • Write high-quality, well-tested code that supports internal and external customer needs
  • Debug complex technical issues and improve system usability, operability, and diagnosability
  • Review feature designs across the company and ensure security, safety, scalability, and architectural clarity
  • Build and maintain integrations with third-party vendors
What we offer:
  • Equity and benefits
Employment Type:
Full-time

Senior Platform Engineer - AWS

We’re currently looking for a skilled and enthusiastic Senior Platform Engineer ...
Location:
Germany, Hamburg or Berlin
Salary:
73000.00 - 90000.00 EUR / Year
About You
Expiration Date:
Until further notice
Requirements:
  • 5+ years of professional experience in Platform Engineering, DevOps, or Site Reliability Engineering (SRE), with a significant focus on cloud infrastructure
  • Fluency in scripting languages (e.g., Python, Go, Bash) for system automation, tooling development, and operational tasks
  • Deep expertise in managing and scaling production workloads within a major public cloud provider (e.g., AWS, Azure, or GCP), including strong familiarity with core services like Compute, Networking, Identity & Access Management (IAM), and Managed Database
  • Proven mastery of Infrastructure-as-Code (IaC) using AWS CloudFormation and/or Terraform in complex, multi-account environments
  • Demonstrated experience designing, implementing, and maintaining robust CI/CD pipelines
  • Solid knowledge of monitoring and logging solutions
  • Excellent communication and documentation skills, with the ability to articulate complex technical issues to technical stakeholders
Job Responsibility:
  • Own and evolve the Commerce Cloud’s AWS infrastructure through the application of Infrastructure-as-Code (IaC) principles to ensure scalability, high availability, and cost efficiency
  • Design, implement, and optimize CI/CD pipelines and operational workflows utilizing tools such as GitLab CI, AWS CloudFormation, and Terraform
  • Establish and enforce comprehensive, high-quality documentation for all infrastructure, operational playbooks, and critical architecture decisions
  • Act as a subject matter expert and trusted advisor, partnering with application development teams to architect and provision infrastructure that meets their specific workload requirements
  • Drive collaborative efforts with GCP Platform Engineers on cross-cloud initiatives and work closely with Information Security Engineers to design and implement security controls and governance policies
  • Spearhead the evaluation and adoption of emerging cloud and platform technologies, continuously seeking opportunities to improve platform performance and developer experience
What we offer:
  • Hybrid working
  • Sports courses
  • Free access to code.talks
  • Exclusive employee discounts
  • Free drinks
  • Language courses
  • Free Laracasts account
  • Company parties
  • Help in the relocation process
  • Mobility subsidy
Employment Type:
Full-time