Database Reliability Engineer

PointClickCare

Location:
United States

Contract Type:
Not provided

Salary:

120000.00 - 179000.00 USD / Year

Job Description:

The Database Reliability Engineer (DBRE) is responsible for managing, building, maintaining, monitoring, and troubleshooting the cloud-based MySQL database infrastructure that our mission-critical SaaS application depends on. This role also focuses heavily on automation and coding to reduce operational toil. The DBRE will collaborate closely with Engineering and SRE teams to support new product development and ensure reliable database integration across the platform.

Job Responsibility:

  • Manage, build, maintain, monitor, and troubleshoot the cloud-based MySQL database infrastructure that our mission-critical SaaS application depends on
  • Focus heavily on automation and coding to reduce operational toil
  • Collaborate closely with Engineering and SRE teams to support new product development and ensure reliable database integration across the platform
  • Work on observability of MySQL database metrics and ensure database performance and reliability objectives are consistently met
  • Work with the DBA team to identify areas of operational toil and implement automations/processes to manage PCC’s MySQL database systems at scale
  • Apply a data-driven approach to performance tuning, availability improvements, and operational optimization
  • Provide database support to Engineering and SRE teams, including review of database migrations, query performance, schema/design improvements, and standardizing MySQL configuration and deployment patterns
  • Assist the DBA team with performance troubleshooting and root-cause analysis

Requirements:

  • 3+ years of experience working with relational database systems
  • Strong hands-on experience with MySQL (administration, performance tuning, replication, HA/DR)
  • 1+ years in a DBRE or database-focused engineering role
  • Experience working in cloud environments (AWS, GCP, or Azure — Azure preferred)
  • Coding and automation experience (Python, PowerShell, SQL, etc.)
  • Experience with Infrastructure-as-Code tools such as Ansible and Terraform
  • Experience working with source control systems such as Git
  • MySQL experience preferred
  • PostgreSQL is a plus
  • Experience working with VLDBs (1+ TB) and managing large database fleets (100+ instances)
  • Experience with scripting or programming languages such as PowerShell, C#, SQL, and Python
  • On-call or after-hours work will be required

Nice to have:

Additional experience with the following tools/systems is a plus: Git, Jira, Azure Cloud, Grafana, and various NoSQL technologies

What we offer:
  • Benefits starting from Day 1!
  • Retirement Plan Matching
  • Flexible Paid Time Off
  • Wellness Support Programs and Resources
  • Parental & Caregiver Leaves
  • Fertility & Adoption Support
  • Continuous Development Support Program
  • Employee Assistance Program
  • Allyship and Inclusion Communities
  • Employee Recognition … and more!
  • Bonus

Additional Information:

Job Posted:
December 11, 2025

Employment Type:
Full-time
Work Type:
Remote work

Similar Jobs for Database Reliability Engineer

Database Reliability Engineer

We are committed to providing our customers with reliable and secure services at...
Location:
Netherlands
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science or a related field
  • At least 5 years of experience in Reliability Engineering, QA, or customer-facing engineering
  • Previous experience operating ClickHouse or other SQL databases in production
  • Excellent understanding of distributed database internals and SQL; ClickHouse expertise in particular is a major plus
  • Scripting experience with Shell or Python, and ability to read and understand C++ code
  • Knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform
  • You are a strong problem-solver and have solid production debugging skills
  • You thrive in a fast-paced environment as part of a global team, and you see yourself as a partner with the business with the shared goal of moving the business forward
  • You have a high level of responsibility, ownership, and accountability
  • Excellent communication skills
Job Responsibility:
  • Continuously improve the reliability and performance of ClickHouse core
  • Improve and create metrics and alerts for ClickHouse to be able to identify and prevent problems in production before they affect customers
  • Dig deeper into the most common problems customers encounter in ClickHouse core to identify root causes, submit bug fixes and issue reports, and suggest improvements
  • Enhance and refine incident response processes and post-mortem analysis for outages related to ClickHouse core, including working with support and Cloud teams to communicate with impacted customers
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize customer impact
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Full-time

Database Reliability Engineer

We are committed to providing our customers with reliable and secure services at...
Location:
Germany
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science or a related field
  • At least 5 years of experience in Reliability Engineering, QA, or customer-facing engineering
  • Previous experience operating ClickHouse or other SQL databases in production
  • Excellent understanding of distributed database internals and SQL; ClickHouse expertise in particular is a major plus
  • Scripting experience with Shell or Python, and ability to read and understand C++ code
  • Knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform
  • You are a strong problem-solver and have solid production debugging skills
  • You thrive in a fast-paced environment as part of a global team, and you see yourself as a partner with the business with the shared goal of moving the business forward
  • You have a high level of responsibility, ownership, and accountability
  • Excellent communication skills
Job Responsibility:
  • Continuously improve the reliability and performance of ClickHouse core
  • Improve and create metrics and alerts for ClickHouse to be able to identify and prevent problems in production before they affect customers
  • Dig deeper into the most common problems customers encounter in ClickHouse core to identify root causes, submit bug fixes and issue reports, and suggest improvements
  • Enhance and refine incident response processes and post-mortem analysis for outages related to ClickHouse core, including working with support and Cloud teams to communicate with impacted customers
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize customer impact
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Database Reliability Engineer - Core Team

We are committed to providing our customers with reliable and secure services at...
Location:
United Kingdom
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science or a related field
  • At least 5 years of experience in Reliability Engineering, QA, or customer-facing engineering
  • Previous experience operating ClickHouse or other SQL databases in production
  • Excellent understanding of distributed database internals and SQL; ClickHouse expertise in particular is a major plus
  • Scripting experience with Shell or Python, and ability to read and understand C++ code
  • Knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform
  • You are a strong problem-solver and have solid production debugging skills
  • You thrive in a fast-paced environment as part of a global team, and you see yourself as a partner with the business with the shared goal of moving the business forward
  • You have a high level of responsibility, ownership, and accountability
  • Excellent communication skills
Job Responsibility:
  • Continuously improve the reliability and performance of ClickHouse core
  • Improve and create metrics and alerts for ClickHouse to be able to identify and prevent problems in production before they affect customers
  • Dig deeper into the most common problems customers encounter in ClickHouse core to identify root causes, submit bug fixes and issue reports, and suggest improvements
  • Enhance and refine incident response processes and post-mortem analysis for outages related to ClickHouse core, including working with support and Cloud teams to communicate with impacted customers
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize customer impact
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Senior Database Engineer

We’re looking for a skilled Data Reliability Engineer to join our team for a cli...
Salary:
Not provided
Zoolatech
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in Data Engineering, Database Reliability, or Infrastructure Operations
  • Strong expertise in PostgreSQL on AWS, including tuning, replication, backups, and HA configurations
  • Experience operating RDBMS databases (PostgreSQL, MySQL, etc.) and Kubernetes technologies is highly desirable
  • Experience provisioning and operating NoSQL databases at scale, such as Elasticsearch, ElastiCache, DynamoDB, Neo4j, MongoDB, and Cassandra
  • Advanced SQL scripting and query optimization skills
  • Experience with data systems monitoring, alerting, and performance tuning
  • Strong programming/scripting in Java, Python, or Shell
  • Proven experience in designing or supporting complex data ecosystems
  • Solid understanding of cloud infrastructure (preferably AWS) and Infrastructure as Code tools (Terraform)
  • Familiarity with event streaming platforms (Kafka), and observability stacks (New Relic, ELK, etc.)
Job Responsibility:
  • Own and optimize the reliability, availability, and performance of data infrastructure across production systems
  • Lead the design and implementation of resilient, secure, and observable data systems
  • Collaborate with SRE, Security, and Engineering teams to enforce data infrastructure standards and align on architectural decisions
  • Design and implement automation around provisioning, uptime monitoring, data refresh, integrity, backups, and disaster recovery
  • Support application developers with performance tuning, complex query optimization, and database design reviews
  • Analyze and resolve performance bottlenecks and incidents with a focus on long-term solutions
  • Participate in on-call rotation to support production systems and ensure high availability
  • Actively contribute to improving incident response and observability through metrics, alerting, and runbooks
  • Work with technologies such as Java, Ruby on Rails, PostgreSQL, AWS, Kafka, S3, Elasticsearch
What we offer:
  • Paid Vacation
  • Sick Days
  • Floating Holidays
  • Sport/Insurance Compensation
  • English Classes
  • Charity
  • Training Compensation

Senior Database Engineer

We’re looking for a skilled Data Reliability Engineer to join our team for a cli...
Location:
United States
Salary:
Not provided
Zoolatech
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in Data Engineering, Database Reliability, or Infrastructure Operations
  • Strong expertise in PostgreSQL on AWS, including tuning, replication, backups, and HA configurations
  • Experience operating RDBMS databases (PostgreSQL, MySQL, etc.) and Kubernetes technologies is highly desirable
  • Experience provisioning and operating NoSQL databases at scale, such as Elasticsearch, ElastiCache, DynamoDB, Neo4j, MongoDB, and Cassandra
  • Advanced SQL scripting and query optimization skills
  • Experience with data systems monitoring, alerting, and performance tuning
  • Strong programming/scripting in Java, Python, or Shell
  • Proven experience in designing or supporting complex data ecosystems
  • Solid understanding of cloud infrastructure (preferably AWS) and Infrastructure as Code tools (Terraform)
  • Familiarity with event streaming platforms (Kafka), and observability stacks (New Relic, ELK, etc.)
Job Responsibility:
  • Own and optimize the reliability, availability, and performance of data infrastructure across production systems
  • Lead the design and implementation of resilient, secure, and observable data systems
  • Collaborate with SRE, Security, and Engineering teams to enforce data infrastructure standards and align on architectural decisions
  • Design and implement automation around provisioning, uptime monitoring, data refresh, integrity, backups, and disaster recovery
  • Support application developers with performance tuning, complex query optimization, and database design reviews
  • Analyze and resolve performance bottlenecks and incidents with a focus on long-term solutions
  • Participate in on-call rotation to support production systems and ensure high availability
  • Actively contribute to improving incident response and observability through metrics, alerting, and runbooks
  • Work with technologies such as Java, Ruby on Rails, PostgreSQL, AWS, Kafka, S3, Elasticsearch
What we offer:
  • Paid Vacation
  • Sick Days
  • Floating Holidays
  • Sport/Insurance Compensation
  • English Classes
  • Charity
  • Training Compensation
Employment Type:
Full-time

Site Reliability Engineer

About LogRocket: Founded in 2016, LogRocket's goal is to make every experience o...
Location:
United States, Boston
Salary:
135000.00 - 220000.00 USD / Year
LogRocket
Expiration Date:
Until further notice
Requirements:
  • At least 4 years of experience as a Site Reliability Engineer or in a related role
  • Ability to read and understand product code
  • Familiarity with the state of the art in cloud technologies, including common providers, specific tools of the trade, and their strengths and weaknesses
  • Experience operating applications and databases with demanding scalability or availability requirements
  • Proven expertise in modern container orchestration practices
  • A strong understanding of the performance, architecture, tooling, and cost of cloud systems
  • A security focused mindset with a solid understanding of incident response and risk mitigation
  • A strong collaborator who is transparent about progress on tasks, seeks feedback early and often, works effectively with the team and customers
Job Responsibility:
  • Improve quality of pager alerts while reducing noise
  • Maintain awareness of engineering initiatives across the organization and monitor their impact on stability, cost, and performance
  • Keep infrastructure up-to-date to take advantage of security patches and new features
  • Improve operational security without sacrificing engineering independence
What we offer:
  • Catered lunch and an impressive array of your favorite snacks
  • Unlimited vacation policy
  • Health, Dental, Vision benefits, 401k, commuter benefits
  • Generous stock options
  • Regular team outings and activities
Employment Type:
Full-time

Site Reliability Engineer

Join our client, a leading financial institution at the forefront of innovation,...
Location:
United States, Austin
Salary:
57.00 - 63.33 USD / Hour
Aquent
Expiration Date:
Until further notice
Requirements:
  • Proven experience leading engineering teams and delivering projects using Scrum and efficient release practices
  • Strong background in converting high-level designs into low-level designs and providing technical oversight
  • Demonstrated experience in designing, architecting, and deploying cloud-native applications, specifically on GCP
  • Proficiency with various database technologies, including MongoDB, Aerospike, SQL Server, and PostgreSQL
  • Expertise in containerization technologies such as Docker and Kubernetes, and building/managing CI/CD pipelines
  • Experience leveraging AI-driven software development tools to enhance productivity, code comprehension, and documentation
  • Proven track record of integrating and applying AI/Machine Learning models for data analytics, visualization, automation, and problem-solving
  • Ability to maintain high quality standards while delivering within tight schedules
  • Exceptional collaborative mindset with a bias for action, engaging effectively with product management, architects, and other domains
  • Strong ability to work with internal, external, and offshore stakeholders
Job Responsibility:
  • Drive Technical Leadership & Project Delivery: Lead engineering teams through the entire project lifecycle, leveraging agile methodologies like Scrum to ensure efficient delivery and robust release practices
  • Architect & Design Cloud-Native Solutions: Translate high-level architectural visions into detailed low-level designs, providing expert technical oversight for the development and deployment of cutting-edge cloud-native applications
  • Champion Reliability & Scalability: Design, architect, and deploy highly available and scalable cloud-native applications on platforms such as GCP, ensuring optimal performance and resilience
  • Optimize Data Management: Leverage your expertise with diverse database technologies, including MongoDB, Aerospike, SQL Server, and PostgreSQL, to build and maintain robust data solutions
  • Advance DevOps & Automation: Implement and optimize containerization strategies using technologies like Docker and Kubernetes, and establish sophisticated CI/CD pipelines to streamline development and deployment
  • Innovate with AI/ML: Integrate and apply AI/Machine Learning models to enhance data analytics, visualization, automation, and creatively solve complex business and technical challenges
  • Foster Collaboration & Mentorship: Work closely with diverse stakeholders across product management, architecture, and other engineering domains, while actively mentoring and coaching multiple teams to elevate technical capabilities
  • Influence & Present Solutions: Effectively engage subject matter experts, present complex architectural solutions to governance boards and stakeholders, and advocate for data-driven proposals
What we offer:
  • Subsidized health, vision, and dental plans
  • Paid sick leave
  • Retirement plans with a match

Senior Site Reliability Engineer

This is a role at Baxter where your work impacts saving and sustaining lives thr...
Location:
United States, Deerfield
Salary:
96000.00 - 132000.00 USD / Year
Baxter
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in computer science, IT, or related field (or equivalent experience)
  • Prior experience in Site Reliability Engineering and cloud-based infrastructure management
  • Experience in enterprise engineering, including 24x7 uptime, regulated environments, and planning/operations
  • Azure administration and operations experience, with certifications a plus
  • Knowledge of related technologies, including cloud, encryption, and security protocols
  • Systems administration experience in Windows and Linux environments
  • Proven problem-solving skills and experience with scripting and automation tools
  • Ability to create accurate documentation and reports, with excellent communication skills
  • Applicants must be authorized to work for any employer in the U.S.
  • Unable to sponsor or take over sponsorship of an employment visa at this time.
Job Responsibility:
  • Drive strategies to ensure 24x7 availability of services and business continuity for customer-facing healthcare software applications and platforms hosted on Microsoft Azure cloud
  • Manage and administer Azure resources, including virtual machines, databases, and networking components
  • Define and document operating procedures to ensure required security, privacy and other compliance standards are maintained for digital solutions deployed in cloud
  • Manage process, planning, and execution for Disaster Recovery (DR) and Business Continuity Planning (BCP)
  • Define and refine Operations SLAs to maintain high level of Customer Satisfaction
  • Establish non-functional requirements to meet SLAs
  • Establish infrastructure and application monitoring dashboards and workflow for automatic routing of notifications
  • Define key performance indicators that can be monitored, measured, and used to derive opportunities
  • Standardize site metrics for stakeholders, reporting on various KPIs including SLAs, availability, capacity utilization, service metrics and cost utilization
  • Work closely with DevOps Engineers to automate infrastructure provisioning and deployment processes.
What we offer:
  • Support for Parents
  • Continuing Education/Professional Development
  • Employee Health & Well-Being Benefits
  • Paid Time Off
  • 2 Days a Year to Volunteer
  • Medical and dental coverage starting day one
  • Insurance coverage for basic life, accident, short-term and long-term disability
  • Business travel accident insurance
  • Employee Stock Purchase Plan (ESPP)
  • 401(k) Retirement Savings Plan
Employment Type:
Full-time