Senior Site Reliability Engineer Job at HiveWatch (El Segundo)

Senior Site Reliability Engineer

HiveWatch

Location:
United States, El Segundo

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

183,000 - 235,000 USD / Year

Job Description:

HiveWatch is seeking a Staff Site Reliability Engineer to join our Platform Team, where you'll architect and maintain mission-critical edge infrastructure that connects our SaaS platform to customer systems. You'll ensure exceptional performance, reliability, and observability across our distributed environment while providing technical leadership to our growing engineering team. This role reports directly to our VP of Engineering.

Job Responsibilities:

Own the reliability of mission-critical systems including production monitoring, alerting, and capacity planning
Debug and resolve complex production issues across the full stack, from infrastructure to application code
Participate in a regular on-call rotation to provide 24/7 coverage for critical systems
Perform root cause analysis requiring deep code-level investigation and implement preventive measures
Build automation and tooling to reduce operational toil and improve system reliability
Maintain CI/CD pipelines and observability infrastructure, and optimize database performance
Increase the resiliency, scalability, and maintainability of production environments
Establish on-call procedures and disaster recovery processes
Provide technical leadership and mentorship to foster engineering excellence and reliability culture

Requirements:

7+ years of software engineering experience with strong coding skills in production environments
5+ years of SRE, DevOps, or production operations experience
Expertise with cloud platforms (AWS preferred) and containerized applications (Docker, Kubernetes)
Experience with Infrastructure as Code (Terraform, CloudFormation, or similar)
Proficiency in at least one object-oriented programming language in our tech stack (Java, Kotlin, Python)
Hands-on experience with relational databases and SQL performance optimization
Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or equivalent)
Strong debugging skills across distributed systems and microservices architectures
Bachelor's degree in Computer Science, Engineering, or equivalent practical experience

Nice to have:

Experience with our tech stack: Kotlin, Rust, TypeScript, Python
Expertise in AWS architecture and services
Experience in physical security, IoT, or edge computing environments
Expertise with advanced AWS services (Kinesis, Lambda, EKS, RDS)
Experience with Terraform and Terragrunt specifically
Background in high-availability, multi-tenant SaaS environments
Experience establishing SRE practices and culture from the ground up
Track record of leading incident response and post-mortem processes
Experience mentoring and developing junior engineers
Knowledge of security best practices and compliance requirements
Experience with edge computing and distributed system architectures
Previous experience in a startup or high-growth environment (50-200 employees)

What we offer:

Comprehensive health coverage: medical, dental, vision, and life insurance
Cutting-edge work in an emerging field with huge growth potential
Competitive compensation packages designed to reward top talent
A modern, newly renovated HQ right on Main Street in El Segundo, CA
401(k) with a 4% company match to help you invest in your future (match launches in 2026)
Flexible paid time off so you can recharge when you need it
Additional benefits include ClassPass credits and a discount on pet insurance
A family-friendly, compassionate culture that values balance and belonging
Eligibility to participate in the HiveWatch Equity Incentive Plan

Additional Information:

Job Posted:
December 09, 2025

Employment Type:

Full-time

Work Type:

On-site work