Senior Software Engineer - Observability and Reliability

Sigma Computing

Location:
United States, New York City

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

150000.00 - 220000.00 USD / Year

Job Description:

We are growing the engineering team and looking for engineers who have the chops to build and deliver world-class technology. You will be part of a talented team of engineers with a shared mission to make data easily accessible.

Job Responsibility:

  • Build observability tools and platforms, including: metrics, logging, distributed tracing, dashboarding, alerting, application performance management
  • Build with modern tools and languages like Go, OpenTelemetry, and Kubernetes
  • Participate in on-call rotation and ensure uptime of services
  • Create runtime tools/processes that optimize cloud triaging and limit downtime
  • Define best practices around making our systems and services measurable
  • Collaborate with peers and stakeholders through design and code reviews to ensure best practices across available technologies. We expect successful candidates to spend the majority of their time coding.

Requirements:

  • Strong Computer Science fundamentals
  • 5+ years of industry experience building and maintaining high-quality software, especially software other engineers use
  • You apply a product mindset to infrastructure systems and feel accomplished enabling others
  • Desire to be a great teammate and have fun at work
  • Strong sense of craftsmanship, and a healthy academic curiosity

Nice to have:

  • Experience building systems for data analytics
  • Distributed systems monitoring and profiling skills
  • Knowledge of cloud application security models
  • Experience administering cloud service infrastructure (GCP, AWS, Azure)
  • Startup experience

What we offer:

  • Equity
  • Generous health benefits
  • Flexible time off policy. Take the time off you need!
  • Paid bonding time for all new parents
  • Traditional and Roth 401k
  • Commuter and FSA benefits
  • Lunch Program
  • Dog friendly office

Additional Information:

Job Posted:
December 12, 2025

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Senior Software Engineer - Observability and Reliability

Senior Software Engineer

The Wikimedia Foundation is looking for a Senior Software Engineer to join our t...
Location:
United States of America
Salary:
141352.00 - 175725.00 USD / Year
Wikimedia Foundation
Expiration Date:
Until further notice
Requirements:
  • Being comfortable working in a semi-ambiguous environment, similar to that of a startup
  • Experience in supporting complex web applications running on Amazon Web Services or other comparable cloud platforms
  • Experience working with Kafka or similar distributed event processing systems
  • Experience working with Node.js and Go applications
  • Comfortable with configuration management and orchestration tools (ECS, Kubernetes), and modern observability infrastructure (monitoring, metrics and logging)
  • Aptitude for automation and streamlining of tasks
  • Comfortable with shell and scripting languages used in an SRE/Operations engineering context (e.g. Python, Go, Bash, Ruby, etc.)
  • Good understanding of Linux/Unix fundamentals and debugging skills
  • Strong English language skills and ability to work independently, as an effective part of a globally distributed team
  • B.S. or M.S. in Computer Science or equivalent in related work experience
Job Responsibility:
  • Bringing your creativity to improve our current infrastructure
  • Being a key part of planning our future technical roadmap
  • Maintaining and improving the reliability of highly used commercial data feeds
  • Supporting new code/feature deployments
  • Troubleshooting, debugging, and following up on emerging issues in our application stack and its surroundings
  • Assisting in the architectural design of new services and making them operate at scale
  • Incident response, diagnosis and follow-up on system outages or alerts across Wikimedia Enterprise’s production infrastructure
  • Sharing our values and working in accordance with them
Employment Type:
Full-time

Senior Software Engineer - Search

Truveta is the world’s first health provider led data platform with a vision of ...
Location:
United States, Seattle
Salary:
155000.00 - 190000.00 USD / Year
Truveta
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Information Systems, or a related field (advanced degree a plus)
  • 5+ years of professional software engineering experience
  • Designing, building, and operating distributed systems at scale
  • Writing production-quality, efficient, multi-threaded code that runs reliably in cloud environments
  • Architecting and implementing search system features (indexing, querying, optimization), including building robust test frameworks
  • Reviewing data specifications and handling large-scale data storage and distribution using specialized protocols
  • Debugging and resolving complex production issues in distributed systems
  • Proven experience with cloud-native architectures and DevOps practices (preferably Azure, though AWS/GCP experience is relevant)
Job Responsibility:
  • Design, build, and maintain index, query, and search system features utilized to aggregate and analyze health data
  • Architect, implement, and test new index and query features
  • Optimize end-to-end index performance
  • Plan, architect, and deploy highly scalable and highly reliable search systems
  • Implement relevant compliance controls and conduct thorough security reviews
  • Drive observability, reliability, and automation across the infrastructure and platform
  • Monitor emerging technology in the search and infrastructure domains, evaluate applicability, and champion adoption where appropriate
  • Contribute to knowledge sharing and best practices within the team
What we offer:
  • Comprehensive benefits with strong medical, dental and vision insurance plans
  • 401K plan
  • Professional development & training opportunities for continuous learning
  • Work/life autonomy via flexible work hours and flexible paid time off
  • Generous parental leave
  • Regular team activities (virtual and in-person)
  • Additional compensation such as incentive pay and stock options
Employment Type:
Full-time

Senior Software Engineer, Infrastructure Observability

We have an opening for a Senior Software Engineer on our Infrastructure Team, wi...
Location:
United States
Salary:
180000.00 - 225000.00 USD / Year
Temporal
Expiration Date:
Until further notice
Requirements:
  • Demonstrated ability to develop horizontally scalable, resilient, and high performance distributed systems in a production environment
  • Experience designing, implementing, deploying, and supporting large scale, geographically distributed observability and/or high throughput data streaming/processing pipelines, or similar
  • Expert in one or more high-level programming languages, preferably Go
  • Expert-level Kubernetes skills
  • Expert-level query development skills, preferably SQL
  • Hands-on experience with one or more cloud providers, preferably AWS or GCP
  • Thorough understanding of computer architecture, operating systems, and networking
  • Familiarity with best practices regarding monitoring, instrumenting, and configuring infrastructure
  • User-first mindset
  • Motivated by impact
Job Responsibility:
  • Lead the end-to-end Software Development Lifecycle: goals & requirements solicitation, design & review, implementation, operationalization & deployment, support & maintenance
  • Formulate feature designs, review with stakeholders, iterate to incorporate feedback and drive consensus
  • Clearly document design choices and operational knowledge to successfully deploy and manage the software you develop
  • Provide appropriate test and production readiness coverage for unit, integration, and performance of your feature ownership area
  • Set a high bar for technical excellence and take pride in the software you develop
  • Design and build multi-component, distributed systems that operate at scale
  • Investigate issues with a methodical approach to identify a root cause
  • Understand performance and reliability implications of design options at scale. Make related tradeoffs
  • Participate in the team’s on-call rotation
  • Expert-level knowledge of architecture and services of assigned domain. Strong command over all aspects of the Temporal ecosystem
What we offer:
  • Unlimited PTO, 12 Holidays + 2 Floating Holidays
  • 100% Premiums Coverage for Medical, Dental, and Vision
  • AD&D, LT & ST Disability, and Life Insurance (Standard & Supplemental Available)
  • Empower 401K Plan
  • Additional Perks for Learning & Development, Lifestyle Spending, In-Home Office Setup, Professional Memberships, WFH Meals, Internet Stipend and more
  • $3,600 / Year Work from Home Meals
  • $1,500 / Year Career Development & Learning
  • $1,200 / Year Lifestyle Spending Account
  • $1,000 / Year In-Home Office Setup (In addition to Temporal issued equipment)
  • $500 / Year Professional Memberships
Employment Type:
Full-time

Senior Software Engineer - Developer Productivity

As a Software Engineer focused on Developer Productivity, you will work on desig...
Location:
United States, San Mateo
Salary:
170000.00 - 260000.00 USD / Year
Skydio
Expiration Date:
Until further notice
Requirements:
  • Understanding of cloud platform architecture, especially networking, security, storage, and resilient application topologies
  • Familiarity with Bazel, Starlark, and maintaining rule sets
  • Prior experience implementing Continuous Deployment practices
  • Can write and test software in Go and Python
  • Bachelor’s degree in Computer Science or relevant experience
Job Responsibility:
  • Identify and lead internal cross-team projects end-to-end with a keen eye for simplicity, reliability, and a low-friction developer experience
  • Feature and app development to streamline developer workflows, which span on-premises workstations, cloud workstations, backend services and other development productivity improvements
  • Build and maintain tooling common to engineering to improve deployments, observability, and scalability
  • Identify ways to deliver software updates to our customers more quickly
  • Improve the functionality, performance, and reliability of core build architecture and corresponding build infrastructure services including remote execution, remote cache, and build analytics
  • Educate developers and evangelize best practices on code quality, development workflows, and test
What we offer:
  • Equity in the form of stock options
  • Comprehensive benefits packages
  • Relocation assistance may also be provided for eligible roles
  • Paid vacation time
  • Sick leave
  • Holiday pay
  • 401K savings plan
  • Group health insurance plans
Employment Type:
Full-time

Senior Software Engineer, Release Engineering

We’re looking for a Senior Software Engineer to join our Release Engineering tea...
Location:
United States
Salary:
143000.00 - 203000.00 USD / Year
dbt Labs
Expiration Date:
Until further notice
Requirements:
  • Experience designing, operating, or improving CI/CD systems for large-scale distributed applications
  • Proficiency with one or more of the following: Helm, ArgoCD, Terraform, GitHub Actions, or Kubernetes
  • Familiarity with infrastructure-as-code practices and the principles of reliable, observable systems
  • Background in Python (or other modern language) development for automation or platform tooling
  • A collaborative mindset and interest in enabling other developers through tooling and platform improvements
  • Experience working asynchronously as part of a fully remote, distributed team
Job Responsibility:
  • Design, build, and maintain components of our CI/CD platform to make deployments safer, faster, and more reliable
  • Lead initiatives that improve automation, observability, and self-service capabilities for engineers
  • Collaborate across teams to identify friction points in our delivery process and build tools to eliminate them
  • Evolve our release architecture to support dbt Cloud’s multi-cloud, cell-based infrastructure at scale
  • Continuously improve developer experience by refining build pipelines, release workflows, and infrastructure-as-code practices
What we offer:
  • Unlimited vacation
  • 401k w/3% guaranteed contribution
  • Excellent healthcare
  • Paid Parental Leave
  • Wellness stipend
  • Home office stipend
Employment Type:
Full-time

Senior Software Engineer II, Release Engineering

We’re looking for a Senior Software Engineer to join our Release Engineering tea...
Location:
United States
Salary:
153000.00 - 218000.00 USD / Year
dbt Labs
Expiration Date:
Until further notice
Requirements:
  • Experience designing, operating, or improving CI/CD systems for large-scale distributed applications
  • Proficiency with one or more of the following: Helm, ArgoCD, Terraform, GitHub Actions, or Kubernetes
  • Familiarity with infrastructure-as-code practices and the principles of reliable, observable systems
  • Background in Python (or other modern language) development for automation or platform tooling
  • A collaborative mindset and interest in enabling other developers through tooling and platform improvements
  • Experience working asynchronously as part of a fully remote, distributed team
Job Responsibility:
  • Design, build, and maintain components of our CI/CD platform to make deployments safer, faster, and more reliable
  • Lead initiatives that improve automation, observability, and self-service capabilities for engineers
  • Collaborate across teams to identify friction points in our delivery process and build tools to eliminate them
  • Evolve our release architecture to support dbt Cloud’s multi-cloud, cell-based infrastructure at scale
  • Continuously improve developer experience by refining build pipelines, release workflows, and infrastructure-as-code practices
What we offer:
  • Unlimited vacation
  • 401k w/3% guaranteed contribution
  • Excellent healthcare
  • Paid Parental Leave
  • Wellness stipend
  • Home office stipend
Employment Type:
Full-time

Senior Software Engineer, Observability

The Observability team at Airtable ensures that engineers have the tools they ne...
Location:
United States, San Francisco; New York; Seattle
Salary:
196000.00 - 270000.00 USD / Year
Airtable
Expiration Date:
Until further notice
Requirements:
  • 6+ years of software engineering experience
  • 3+ years focused on observability or infrastructure at scale
  • Demonstrated success implementing and running production-grade logging, metrics, or tracing systems
  • Proficiency in distributed systems concepts, data streaming pipelines, and container orchestration (Kubernetes)
  • Deep hands-on knowledge of tools such as Prometheus, Grafana, Datadog, OpenTelemetry, ELK Stack, Loki, or ClickHouse
  • Comfort with at least one programming language (e.g., Go, Python, Java) to build and maintain observability tooling
  • Experience mentoring engineers and collaborating across multiple teams
  • Strong communication skills
  • Eagerness to own high-impact initiatives
  • Proven ability to balance short-term fixes with long-term strategic vision
Job Responsibility:
  • Architect and scale core observability systems
  • Lead the design and evolution of logging, metrics, and tracing pipelines
  • Evaluate and integrate new technologies (e.g., OpenTelemetry, ClickHouse, ELK stack)
  • Guide and mentor a growing team of infrastructure engineers
  • Define and uphold coding standards and operational excellence
  • Partner with Deploy Infrastructure, Service Orchestration, and Product teams
  • Align infrastructure decisions with business goals
  • Own end-to-end reliability for observability tools and establish SLAs, SLOs, and error budgets
  • Optimize performance and cost of large-scale data pipelines
  • Shape the observability roadmap
What we offer:
  • Opportunity to receive benefits
  • Restricted stock units
  • May include incentive compensation
  • Comprehensive benefit offerings
Employment Type:
Full-time

Senior Software Engineer, Wikidata Platform

The Wikimedia Foundation is seeking a Senior Software Engineer to join the team ...
Location:
Not provided
Salary:
141352.00 - 175725.00 USD / Year
Wikimedia Foundation
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience as a backend or platform engineer working on distributed systems or data platforms
  • Deep understanding of database and knowledge graph representation technologies and standards
  • Proficiency in Java, C++, or other systems languages. Ability to set up, scale, and investigate systems is more important than expertise in a particular language.
  • Experience building and operating production-grade services with SLOs
  • Familiarity with modern observability tools (metrics, logging, tracing)
  • Understanding of graph databases, search indexes, or data processing pipelines
  • Ability to work collaboratively across disciplines and communicate clearly across technical and non-technical audiences
  • A commitment to learning, resilience, and contributing to a mission-driven engineering culture
Job Responsibility:
  • Design, build, and maintain backend systems and APIs that power Wikidata’s query infrastructure
  • Improve reliability, observability, and automation of the Wikidata Query Service and data pipelines
  • Collaborate with SRE, data engineers, and product teams to ensure stability and scalability under growing usage
  • Monitor production systems, respond to operational incidents, and proactively identify and resolve bottlenecks
  • Support platform migrations and system upgrades (e.g., triple stores, streaming ingestion)
  • Contribute to deployment automation, CI/CD workflows, and service instrumentation
  • Participate in code reviews, design discussions, and technical planning
  • Document systems and share knowledge with team members and Wikimedia’s broader technical community
Employment Type:
Full-time