Corporate Tools is looking for a Site Reliability Engineer. You will be a traditional company employee. This is a remote position, but if you’re near one of our local offices, you’re welcome to come hang out with us in-office as well. Our main offices are in Post Falls, ID, and Spokane, WA; we also have satellite offices in Austin, TX, and Salt Lake City, UT. You’ll be working 40 hours a week and, of course, enjoying great company benefits. Our Site Reliability Engineer should help keep our systems steady, secure, and running like a well-oiled machine (except without actual oil). You’ll work closely with our DevOps engineers to build out tools and automation that make things faster, easier, and less painful for everyone.
Job Responsibilities:
Stop problems before they start
Fix issues quickly and learn from them
Help keep systems steady, secure, and running
Work closely with DevOps engineers to build out tools and automation
Take ownership
Requirements:
Bachelor's degree in Computer Science, Software Engineering, or equivalent practical experience
5+ years of experience in software engineering
2+ years of experience in site reliability engineering, DevOps, or infrastructure engineering roles
Deep experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code tools such as Terraform, CloudFormation, or Pulumi
Strong proficiency with Kubernetes, Docker, and container orchestration in production environments
Hands-on experience with observability and monitoring tools like Prometheus, Grafana, OpenTelemetry, Sentry, or New Relic
Proven ability to design and implement highly available, fault-tolerant systems and lead proactive incident response efforts
Experience with performance tuning, database optimization, and caching strategies (e.g., PostgreSQL, Redis, Memcached)
Demonstrated ability to drive reliability improvements, reduce operational toil, and foster a culture of resilience and continuous improvement
Experience leading reliability-focused initiatives such as post-incident reviews, capacity planning, and root cause analysis
Experience in site reliability engineering within Ruby on Rails environments
Familiarity with the Grafana observability stack and related tools (e.g., Alloy, Loki, Tempo, Prometheus)
In-depth experience with AWS services, including ECS, EKS, Route 53, and other related tools
Proven ability to collaborate across teams to improve service reliability, reduce incident frequency, and drive operational excellence
Troubleshoot and resolve complex production issues, applying SRE best practices to minimize impact and prevent recurrence
What we offer:
100% employer-paid medical, dental and vision for employees
Annual review with raise option
22 days Paid Time Off accrued annually, and 4 holidays
After 3 years, PTO increases to 29 days
Employees transition to flexible time off after 5 years with the company: not accrued, not capped, take time off when you want
Paid Parental Leave
Up to 6% company matching 401(k) with no vesting period
Quarterly allowance
Open concept office with friendly coworkers
Creative environment where you can make a difference