Senior Site Reliability Engineer - Data Pipeline

Bloomreach

Location: Czechia
Contract Type: Not provided
Salary: Not provided

Job Description:

Bloomreach is building the world’s premier agentic platform for personalization. We’re revolutionizing how businesses connect with their customers, building and deploying AI agents to personalize the entire customer journey. The Data Pipeline team is a backend-focused engineering team built on strong DevOps principles. We believe in autonomy, and we trust data. As the team grows, we need to support it by onboarding another DevOps/SRE engineer to pair with our existing one, forming an effective duo that helps the team accelerate.

Job Responsibility:

  • Your task is to build and maintain an ecosystem where engineers can safely and efficiently develop, debug, and operate their services running on GCP and Kubernetes, using Dataflow, Dataproc, and Python with Go
  • You make sure the services have a high level of observability, enabling us to provide quality service to our customers
  • You ensure services can scale vertically and horizontally based on current load and on operational and telemetry data (OTEL, Prometheus, VictoriaMetrics)
  • You make sure the team has enough insight into the health of our services (Grafana, alerting, PagerDuty)
  • You help the team fulfill the security requirements of ISO and SOC2 audits by enforcing security principles such as key distribution, key rotation, service-level authorisation and authentication, data encryption in transit, data isolation, resource limits, quality of service, and audit logs (mainly via Envoy proxies)
  • You contribute to our tooling, so we have tools in place for debugging, troubleshooting, and performance testing
  • You automate manual and semi-manual steps in deployment and instance setup
  • You are hands-on with L3 support and incident resolution
  • You make sure CI pipelines have linters, security scans, and code smell detection, enabling engineers to produce quality MRs

Requirements:

  • You can articulate how your contributions have transformed the way engineers work and think by fostering a strong DevOps/SRE culture
  • You can demonstrate how impactful your work as an SRE or DevOps Engineer can be in connection to business success
  • You understand the importance of the "you build it, you run it" principle, and you love the feeling of ownership
  • You are mindful of the costs associated with running our service, which translates into effective vertical and horizontal pod autoscaling and detailed telemetry insights
  • You believe infrastructure as code is the only thing that can bring stability to chaos
  • Terraform is your daily bread, and Helm deployments are your second-best friend
  • You use telemetry data and metrics to provide feedback to engineers on how the application and services behave
  • You can find your way around a complex service architecture using distributed debugging
  • You have experience with Python and a solid grasp of engineering practices
  • You don’t hesitate to participate in the on-call rotation for 24/7 support
  • You know how to behave in a remote-first environment
  • You are able to learn and adapt

Nice to have:

Experience with Go or with ETL pipelines is a big advantage

What we offer:
  • A great deal of freedom and trust
  • We have defined our 5 values and the 10 underlying key behaviors that we strongly believe in
  • We believe in flexible working hours to accommodate your working style
  • We work virtual-first with several Bloomreach Hubs available across three continents
  • We organize company events to experience the global spirit of the company and get excited about what's ahead
  • We encourage and support our employees to engage in volunteering activities - every Bloomreacher can take 5 paid days off to volunteer
  • We have a People Development Program -- participating in personal development workshops on various topics run by experts from inside the company
  • Our resident communication coach Ivo Večeřa is available to help navigate work-related communications & decision-making challenges
  • Our managers are strongly encouraged to participate in the Leader Development Program
  • Bloomreachers utilize the $1,500 professional education budget on an annual basis to purchase education products (books, courses, certifications, etc.)
  • The Employee Assistance Program -- with counselors -- is available for non-work-related challenges
  • Subscription to Calm - sleep and meditation app
  • We organize ‘DisConnect’ days where Bloomreachers globally enjoy one additional day off each quarter
  • We facilitate sports, yoga, and meditation opportunities for each other
  • Extended parental leave up to 26 calendar weeks for Primary Caregivers
  • Restricted Stock Units or Stock Options are granted depending on a team member’s role, seniority, and location
  • Everyone gets to participate in the company's success through the company performance bonus
  • We offer an employee referral bonus of up to $3,000 paid out immediately after the new hire starts
  • We reward & celebrate work anniversaries -- Bloomversaries

Additional Information:

Job Posted: January 05, 2026
Employment Type: Fulltime
Work Type: Remote work

Similar Jobs for Senior Site Reliability Engineer - Data Pipeline

Principal Software Engineer, Trusted Data Platform

As a Principal Software Engineer, you will be a technical leader and hands-on co...
Location: India, Bangalore
Salary: Not provided
Company: Atlassian
Expiration Date: Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related technical field
  • 10+ years of experience in backend software development, focusing on distributed systems and storage solutions
  • 5+ years of experience working with AWS storage services (S3, DynamoDB, EBS, EFS, FSx, Glacier)
  • Strong expertise in system design, architecture, and scalability for large-scale storage solutions
  • Proficiency in at least one major backend programming language (Kotlin, Java, Go, Rust, or Python)
  • Experience designing and implementing highly available, fault-tolerant, and cost-efficient storage architectures
  • Deep understanding of distributed systems, replication strategies, sharding, and caching
  • Knowledge of data security, encryption best practices, and compliance requirements (SOC2, GDPR, HIPAA)
  • Experience leading engineering teams, mentoring senior engineers, and driving technical roadmaps
  • Proficiency with observability tools, performance monitoring, and troubleshooting at scale
Job Responsibility:
  • Designing and optimizing high-scale, distributed storage systems built on AWS storage technologies
  • Shaping the architecture, performance, and reliability of backend storage solutions that power critical applications at scale
  • Designing, implementing, and optimizing backend storage services that support high throughput, low latency, and fault tolerance
  • Working closely with senior engineers, architects, and cross-functional teams to drive scalability, availability, and efficiency improvements in large-scale storage solutions
  • Leading technical deep dives, architecture reviews, and root cause analyses to resolve complex production issues related to storage performance, consistency, and durability
  • Driving best practices in distributed system design, security, and cloud cost optimization
  • Mentoring senior engineers, contributing to technical roadmaps, and helping shape the long-term storage strategy
  • Collaborating with Site Reliability Engineers (SREs) to implement observability, monitoring, and disaster recovery strategies, ensuring high availability and compliance with industry standards
  • Advocating for automation, Infrastructure-as-Code (IaC), and DevOps best practices, leveraging tools like Terraform, AWS CloudFormation, Kubernetes (EKS), and CI/CD pipelines to enable scalable deployments and operational excellence
What we offer:
  • Atlassians can choose where they work – whether in an office, from home, or a combination of the two
  • Atlassians have more control over supporting their family, personal goals, and other priorities
  • We can hire people in any country where we have a legal entity
  • Interviews and onboarding are conducted virtually
  • Whatever your preference - working from home, an office, or in between - you can choose the place that's best for your work and your lifestyle

Senior Software Engineer, Backend

As a Senior Software Engineer, Backend specializing in database architecture and...
Location: United States, San Francisco
Salary: 150000.00 - 240000.00 USD / Year
Company: Chef Robotics
Expiration Date: Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
  • 7+ years of professional experience in backend development roles with demonstrated leadership experience
  • Expert knowledge of relational databases (MySQL, PostgreSQL) including schema design, optimization, and administration
  • Strong proficiency with Python and JavaScript/TypeScript with advanced software engineering skills
  • Extensive experience leading projects with at least two web frameworks: Flask, FastAPI, Django, Node.js, or Next.js
  • Proven experience designing and implementing RESTful and GraphQL APIs at scale
  • Advanced understanding of containerization (Docker) and orchestration (Kubernetes) technologies
  • Experience with cloud infrastructure and deployment (AWS, GCP, or Azure) in production environments
  • Proven experience leading complex backend projects and mentoring junior engineers
  • Understanding of data requirements for robotics or automation systems
Job Responsibility:
  • Lead the design, implementation, and optimization of database schemas to support robot operations, telemetry, recipe management, and system analytics
  • Develop robust data migration strategies and version control for database schema evolution
  • Implement efficient query optimization and indexing strategies to support high-throughput robot operations
  • Establish data integrity protocols and backup systems to ensure operational continuity across customer deployments
  • Create scalable data access layers that balance security, performance, and maintainability
  • Mentor team members on database design patterns and optimization techniques
  • Lead the development and maintenance of scalable APIs to serve robot control systems, dashboards, and monitoring tools
  • Design and implement secure authentication and authorization mechanisms across backend services
  • Develop robust middleware for processing and validating data between robotics subsystems
  • Create service interfaces that enable efficient communication between robotics components and cloud services
What we offer:
  • medical, dental, and vision insurance
  • commuter benefits
  • flexible paid time off (PTO)
  • catered lunch
  • 401(k) matching
  • early-stage equity
Employment Type: Fulltime

Senior Security Operations Engineer II

As a Senior Security Operations Engineer, you’ll play a key role in ensuring the...
Location: United States, Scottsdale
Salary: Not provided
Company: Axon
Expiration Date: Until further notice
Requirements:
  • 7+ years of experience in operations, site reliability, or infrastructure engineering roles
  • Strong experience securing and managing cloud environments (e.g., AWS, Azure) and containerized workloads
  • Deep understanding of Linux systems, networking, distributed systems, and their associated security controls
  • Proficiency in automation, scripting, and security tooling integration to streamline operations and enforcement
  • Experience with security monitoring, alerting, SIEM platforms, and observability tools
  • Solid grasp of CI/CD practices with integrated security testing and compliance checks
  • Experience managing Kubernetes clusters and running containerized workloads in production
  • Experience deploying and administering any of the following: scalable cloud-native secrets solutions such as AWS KMS or Azure Key Vault; PKI solutions such as EJBCA, Smallstep, or Venafi; or vaulting solutions such as HashiCorp Vault
Job Responsibility:
  • Implementing and improving automated security checks in CI/CD pipelines to prevent vulnerabilities from reaching production
  • Writing, reviewing, and maintaining security-focused infrastructure-as-code for scalable and compliant deployments
  • Investigating security incidents, performing root cause analysis, and implementing long-term mitigation strategies
  • Collaborating with developers to develop new features, services, and infrastructure requirements
  • Enhancing security observability through improved log collection, metrics, and alerting configurations
  • Maintaining and improving security runbooks, incident response playbooks, and internal security tooling for operational efficiency
  • Resolve security/infrastructure incidents by participating in high impact/high visibility incidents as a participant and ideally as an incident commander
  • Maintain and secure critical infrastructure components such as PKI (Public Key Infrastructure) and IAM (Identity & Access Management) systems, ensuring reliability, scalability, and compliance with organizational and industry security standards
  • Build and maintain secure, reliable, and scalable infrastructure that protects core services and sensitive data
  • Troubleshoot and resolve complex operational and system-level issues across environments
What we offer:
  • Competitive salary and 401k with employer match
  • Discretionary paid time off
  • Paid parental leave for all
  • Medical, Dental, Vision plans
  • Fitness Programs
  • Emotional & Mental Wellness support
  • Learning & Development programs
  • Snacks in our offices
Employment Type: Fulltime

Senior Data Platform Engineer

We are looking for a Senior Data Platform Engineer to join our Data & Machine Le...
Location: France, Paris
Salary: Not provided
Company: Doctolib
Expiration Date: Until further notice
Requirements:
  • More than 7 years of experience as Site Reliability Engineer, Data Ops, Data Platform Engineer or in a similar role, with a proven track record of building and maintaining complex data infrastructures
  • Strong proficiency in data engineering and infrastructure tools and technologies, such as stream and events processing (Kafka, PubSub, Firehose) and Kubernetes
  • Expertise in programming languages like Python
  • Familiar with cloud infrastructure and services, preferably AWS, Azure, or GCP, and have experience with infrastructure-as-code tools such as Terraform
  • Excellent problem-solving skills with a focus on identifying and resolving data infrastructure bottlenecks and performance issues
Job Responsibility:
  • Design and implement a scalable and reliable data infrastructure that supports the collection, processing, storage, and analysis of large-scale datasets while pushing security and privacy best practices
  • Build and maintain data pipelines that efficiently extract, transform, and load data from various sources into our data warehouse
  • Implement automation and orchestration tools to streamline infrastructure provisioning, data workflows, reduce manual effort, and improve operational efficiency
  • Monitor data platform for performance and reliability, identify and troubleshoot issues, and implement proactive solutions to ensure data quality and availability
  • Streamline and monitor platform costs, identify optimizations and saving opportunities while collaborating with data engineers, data scientists, and other stakeholders
What we offer:
  • Free comprehensive health insurance for you and your children
  • Parent Care Program: receive one additional month of leave on top of the legal parental leave
  • Free mental health and coaching services through our partner Moka.care
  • For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
  • Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
  • Up to 14 days of RTT
  • A subsidy from the work council to refund part of the membership to a sport club or a creative class
  • Lunch voucher with Swile card
Employment Type: Fulltime

Senior Site Reliability Engineer

We're looking for a Senior Site Reliability Engineer for our Currents team, resp...
Location: United States, New York City
Salary: 129600.00 - 232200.00 USD / Year
Company: Braze
Expiration Date: Until further notice
Requirements:
  • Bachelor’s in Computer Science, Software Engineering, or a related STEM field
  • Five (5) years of experience in any role/occupation/position involving software engineering or site reliability engineering
  • Experience must include: Using distributed systems to deploy and monitor live applications such as Kubernetes or Docker Swarm
  • Working with alerting software (Sentry, Datadog, and/or PagerDuty)
  • Utilizing programming languages (Java, Kotlin, and/or Ruby) to understand and contribute to the codebase
  • Storing data in relational and non-relational databases such as Postgres and MongoDB
  • Data streaming or queuing systems to build data pipelines with technologies like Kafka, Sidekiq or SQS and SNS
  • Leveraging continuous integration tools such as Jenkins or Buildkite
  • Collaborating with engineers through pull requests and code reviews in version control software such as GitHub or GitLab
Job Responsibility:
  • Solve live performance and reliability issues and prevent their recurrence
  • Write and review code, educating engineers and building a culture of reliability
  • Practice sustainable incident response and blameless postmortems
  • Define and enable standards for monitoring, reliability, and performance
  • Bridge the gap between infrastructure and platform engineering teams
  • Support and improve services by planning for scale and reliability
  • Guide junior engineers in SRE best practices, software engineering, and agile project leadership
What we offer:
  • Competitive compensation that may include equity
  • Retirement and Employee Stock Purchase Plans
  • Flexible paid time off
  • Comprehensive benefit plans covering medical, dental, vision, life, and disability
  • Family services that include fertility benefits and equal paid parental leave
  • Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
  • A curated in-office employee experience, designed to foster community, team connections, and innovation
  • Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching
  • Employee Resource Groups that provide supportive communities within Braze
Employment Type: Fulltime

Senior Site Reliability Engineer - Data Pipeline

The Data Pipeline team is a backend-focused engineering team that is built on st...
Location: Slovakia
Salary: 3500.00 EUR / Month
Company: Bloomreach
Expiration Date: Until further notice
Requirements:
  • You can articulate how your contributions have transformed the way engineers work and think by fostering a strong DevOps/SRE culture.
  • You can demonstrate how impactful your work as an SRE or DevOps Engineer can be in connection to business success
  • You understand the importance of the "you build it, you run it" principle, and you love the feeling of ownership
  • You are mindful of the costs associated with running our service, which translates into effective vertical and horizontal pod autoscaling and detailed telemetry insights.
  • You believe infrastructure as code is the only thing that can bring stability to chaos
  • Terraform is your daily bread, and Helm deployments are your second-best friend
  • You use telemetry data and metrics to provide feedback to engineers on how the application and services behave
  • You can find your way around a complex service architecture using distributed debugging
  • You have experience with Python and a solid grasp of engineering practices
  • Experience with Go or with ETL pipelines is a big advantage
Job Responsibility:
  • Build and maintain an ecosystem where engineers can safely and efficiently develop, debug, and operate their services running on GCP and Kubernetes, using Dataflow, Dataproc, and Python with Go
  • Make sure the services have a high level of observability, enabling us to provide quality service to our customers
  • Ensure services can scale vertically and horizontally based on current load and on operational and telemetry data (OTEL, Prometheus, VictoriaMetrics)
  • Ensure the team has enough insight into the health of our services (Grafana, alerting, PagerDuty)
  • Help the team fulfill the security requirements of ISO and SOC2 audits by enforcing security principles such as key distribution, key rotation, service-level authorisation and authentication, data encryption in transit, data isolation, resource limits, quality of service, and audit logs (mainly via Envoy proxies)
  • Contribute to our tooling, so we have tools in place for debugging, troubleshooting, and performance testing
  • Automate manual and semi-manual steps in deployment and instance setup
  • Be hands-on with L3 support and incident resolution
  • Ensure CI pipelines have linters, security scans, and code smell detection, enabling engineers to produce quality MRs
What we offer:
  • A great deal of freedom and trust
  • Flexible working hours
  • Work virtual-first with several Bloomreach Hubs available across three continents
  • Company events
  • 5 paid days off to volunteer
  • People Development Program
  • Communication coach available
  • Leader Development Program
  • $1,500 professional education budget annually
  • Employee Assistance Program with counselors
Employment Type: Fulltime

Senior Site Reliability Engineer

The Core Services Infrastructure and Security team in Microsoft Teams provides t...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements:
  • Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration
  • OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration
  • OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility
Job Responsibility:
  • Improve reliability by developing and refining monitoring, alerting, dashboards, and automated recovery mechanisms across critical control‑plane and data‑plane systems
  • Troubleshoot complex issues involving traffic routing, gateway behavior, certificate and TLS issues, DNS, CDN interactions, and network security policies
  • Serve as a Designated Responsible Individual (DRI) on a rotational basis triaging incidents, driving mitigations, documenting learnings, and helping improve live‑site processes
  • Participate in root‑cause analyses (RCAs) and implement durable fixes that eliminate recurring issues
  • Optimize the performance, reliability, and availability of services through data‑driven analysis using metrics, logs, and distributed tracing
  • Work closely with partner engineering teams (security, networking, microservices, compliance, governance) to deliver integrated improvements across shared infrastructure layers
  • Influence project designs by providing SRE perspective on reliability, scalability, testability, operability, and security
  • Contribute to documentation, patterns, and best practices that raise the operational bar across the broader Teams engineering organization
  • Identify opportunities for automation using scripts, pipelines, policy‑driven guardrails, or AI‑enabled tooling to reduce manual toil and increase engineering productivity
Employment Type: Fulltime