Senior Software Engineer, Observability

Airtable

Location:
United States, San Francisco

Category:
IT - Software Development

Contract Type:
Employment contract

Salary:

196000.00 - 270000.00 USD / Year

Job Description:

The Observability team at Airtable ensures that engineers have the tools they need to measure performance, monitor reliability, and debug issues in real time. The mission is to provide actionable insights into errors and crashes, fueling a better and more reliable experience for millions of users. The team builds logging, metrics, and tracing systems leveraged by nearly every engineering team. They also work on LLM observability for AI-powered features, providing visibility into prompts, model calls, and RAG components.

Job Responsibility:

  • Architect and scale core observability systems
  • Lead the design and evolution of logging, metrics, and tracing pipelines
  • Evaluate and integrate new technologies (e.g., OpenTelemetry, ClickHouse, ELK stack)
  • Guide and mentor a growing team of infrastructure engineers
  • Define and uphold coding standards and operational excellence
  • Partner with Deploy Infrastructure, Service Orchestration, and Product teams
  • Align infrastructure decisions with business goals
  • Own end-to-end reliability for observability tools and establish SLAs, SLOs, and error budgets
  • Optimize performance and cost of large-scale data pipelines
  • Shape the observability roadmap
  • Extend observability to LLM and AI features
  • Instrument prompts, model calls, and RAG pipelines
  • Design online and offline evaluation loops for LLM quality
  • Build dashboards and alerts for AI feature performance
  • Partner with AI and Product teams to define SLOs for AI features

Requirements:

  • 6+ years of software engineering experience
  • 3+ years focused on observability or infrastructure at scale
  • Demonstrated success implementing and running production-grade logging, metrics, or tracing systems
  • Proficiency in distributed systems concepts, data streaming pipelines, and container orchestration (Kubernetes)
  • Deep hands-on knowledge of tools such as Prometheus, Grafana, Datadog, OpenTelemetry, ELK Stack, Loki, or ClickHouse
  • Comfort with at least one programming language (e.g., Go, Python, Java) to build and maintain observability tooling
  • Experience mentoring engineers and collaborating across multiple teams
  • Strong communication skills
  • Eagerness to own high-impact initiatives
  • Proven ability to balance short-term fixes with long-term strategic vision
  • A passion for enabling engineering organizations through reliable, intuitive observability tools
  • Commitment to measuring success by team velocity and confidence

Nice to have:

  • Experience with LLM observability for AI-powered features
  • Experience instrumenting prompts, model calls, and RAG pipelines
  • Experience designing evaluation loops for LLM quality
  • Experience building dashboards and alerts for token usage, error rates, guardrail triggers, and model performance

What we offer:
  • Opportunity to receive benefits
  • Restricted stock units
  • May include incentive compensation
  • Comprehensive benefit offerings

Additional Information:

Job Posted:
December 05, 2025

Employment Type:
Full-time

Work Type:
On-site work

Similar Jobs for Senior Software Engineer, Observability

Senior Software Engineer

The Wikimedia Foundation is looking for a Senior Software Engineer to join our t...
Location:
United States of America
Salary:
141352.00 - 175725.00 USD / Year
Wikimedia Foundation
Expiration Date:
Until further notice
Requirements:
  • Being comfortable working in a semi-ambiguous environment, similar to that of a startup
  • Experience in supporting complex web applications running on Amazon Web Services or other comparable cloud platforms
  • Experience working with Kafka or similar distributed event processing systems
  • Experience working with Node.js and Go applications
  • Comfortable with configuration management and orchestration tools (ECS, Kubernetes), and modern observability infrastructure (monitoring, metrics and logging)
  • Aptitude for automation and streamlining of tasks
  • Comfortable with shell and scripting languages used in an SRE/Operations engineering context (e.g. Python, Go, Bash, Ruby, etc.)
  • Good understanding of Linux/Unix fundamentals and debugging skills
  • Strong English language skills and ability to work independently, as an effective part of a globally distributed team
  • B.S. or M.S. in Computer Science or equivalent in related work experience
Job Responsibility:
  • Bringing your creativity to improve our current infrastructure
  • Being a key part of planning our future technical roadmap
  • Maintaining and improving the reliability of highly used commercial data feeds
  • Supporting new code/feature deployments
  • Troubleshooting, debugging and following-up on emerging issues in our application stack and its surroundings
  • Assisting in the architectural design of new services and making them operate at scale
  • Incident response, diagnosis and follow-up on system outages or alerts across Wikimedia Enterprise’s production infrastructure
  • Sharing our values and working in accordance with them
Employment Type:
Full-time

Senior Software Engineer - Search

Truveta is the world’s first health provider led data platform with a vision of ...
Location:
United States, Seattle
Salary:
155000.00 - 190000.00 USD / Year
Truveta
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Information Systems, or a related field (advanced degree a plus)
  • 5+ years of professional software engineering experience
  • Designing, building, and operating distributed systems at scale
  • Writing production-quality, efficient, multi-threaded code that runs reliably in cloud environments
  • Architecting and implementing search system features (indexing, querying, optimization), including building robust test frameworks
  • Reviewing data specifications and handling large-scale data storage and distribution using specialized protocols
  • Debugging and resolving complex production issues in distributed systems
  • Proven experience with cloud-native architectures and DevOps practices (preferably Azure, though AWS/GCP experience is relevant)
Job Responsibility:
  • Design, build, and maintain index, query, and search system features utilized to aggregate and analyze health data
  • Architecting, implementing, and testing new index and query features
  • Optimizing end-to-end index performance
  • Planning, architecting, and deploying highly scalable and highly reliable search systems
  • Implement relevant compliance controls and conduct thorough security reviews
  • Drive observability, reliability, and automation across the infrastructure and platform
  • Monitor emerging technology in the search and infrastructure domains, evaluate applicability, and champion adoption where appropriate
  • Contribute to knowledge sharing and best practices within the team
What we offer:
  • Comprehensive benefits with strong medical, dental and vision insurance plans
  • 401K plan
  • Professional development & training opportunities for continuous learning
  • Work/life autonomy via flexible work hours and flexible paid time off
  • Generous parental leave
  • Regular team activities (virtual and in-person)
  • Additional compensation such as incentive pay and stock options
Employment Type:
Full-time

Senior Software Engineer, Infrastructure Observability

We have an opening for a Senior Software Engineer on our Infrastructure Team, wi...
Location:
United States
Salary:
180000.00 - 225000.00 USD / Year
Temporal
Expiration Date:
Until further notice
Requirements:
  • Demonstrated ability to develop horizontally scalable, resilient, and high performance distributed systems in a production environment
  • Experience designing, implementing, deploying, and supporting large scale, geographically distributed observability and/or high throughput data streaming/processing pipelines, or similar
  • Expert in one or more high-level programming languages, preferably Go
  • Expert-level Kubernetes skills
  • Expert-level query development skills, preferably SQL
  • Hands-on experience with one or more cloud providers, preferably AWS or GCP
  • Thorough understanding of computer architecture, operating systems, and networking
  • Familiarity with best practices regarding monitoring, instrumenting, and configuring infrastructure
  • User-first mindset
  • Motivated by impact
Job Responsibility:
  • Lead the end-to-end Software Development Lifecycle: goals & requirements solicitation, design & review, implementation, operationalization & deployment, support & maintenance
  • Formulate feature designs, review with stakeholders, iterate to incorporate feedback and drive consensus
  • Clearly document design choices and operational knowledge to successfully deploy and manage the software you develop
  • Provide appropriate test and production readiness coverage for unit, integration, and performance of your feature ownership area
  • Set a high bar for technical excellence and take pride in the software you develop
  • Design and build multi-component, distributed systems that operate at scale
  • Investigate issues with a methodical approach to identify a root cause
  • Understand performance and reliability implications of design options at scale. Make related tradeoffs
  • Participate in the team’s on-call rotation
  • Expert-level knowledge of architecture and services of assigned domain. Strong command over all aspects of the Temporal ecosystem
What we offer:
  • Unlimited PTO, 12 Holidays + 2 Floating Holidays
  • 100% Premiums Coverage for Medical, Dental, and Vision
  • AD&D, LT & ST Disability, and Life Insurance (Standard & Supplemental Available)
  • Empower 401K Plan
  • Additional Perks for Learning & Development, Lifestyle Spending, In-Home Office Setup, Professional Memberships, WFH Meals, Internet Stipend and more
  • $3,600 / Year Work from Home Meals
  • $1,500 / Year Career Development & Learning
  • $1,200 / Year Lifestyle Spending Account
  • $1,000 / Year In-Home Office Setup (In addition to Temporal issued equipment)
  • $500 / Year Professional Memberships
Employment Type:
Full-time

Senior Software Engineer - Developer Productivity

As a Software Engineer focused on Developer Productivity, you will work on desig...
Location:
United States, San Mateo
Salary:
170000.00 - 260000.00 USD / Year
Skydio
Expiration Date:
Until further notice
Requirements:
  • Understand cloud platforms architecture, especially networking, security, storage, and resilient application topologies
  • Familiarity with Bazel, Starlark, and maintaining rule sets
  • Prior experience implementing Continuous Deployment practices
  • Can write and test software in Go and Python
  • Bachelor’s degree in Computer Science or relevant experience
Job Responsibility:
  • Identify and lead internal cross-team projects end-to-end with a keen eye for simplicity, reliability, and a low-friction developer experience
  • Feature and app development to streamline developer workflows, which span on-premises workstations, cloud workstations, backend services and other development productivity improvements
  • Build and maintain tooling common to engineering to improve deployments, observability, and scalability
  • Identify ways to deliver software updates to our customers more quickly
  • Improve the functionality, performance, and reliability of core build architecture and corresponding build infrastructure services including remote execution, remote cache, and build analytics
  • Educate developers and evangelize best practices on code quality, development workflows, and test
What we offer:
  • Equity in the form of stock options
  • Comprehensive benefits packages
  • Relocation assistance may also be provided for eligible roles
  • Paid vacation time
  • Sick leave
  • Holiday pay
  • 401K savings plan
  • Group health insurance plans
Employment Type:
Full-time

Senior Software Engineer - Observability and Reliability

We are growing the engineering team and looking for engineers who have the chops...
Location:
United States, New York City
Salary:
150000.00 - 220000.00 USD / Year
Sigma Computing
Expiration Date:
Until further notice
Requirements:
  • Strong Computer Science fundamentals
  • 5+ years industry experience building and maintaining high-quality software, especially software other engineers use
  • You apply a product mindset to infrastructure systems and feel accomplished enabling others
  • Desire to be a great teammate and have fun at work
  • Strong sense of craftsmanship, and a healthy academic curiosity
Job Responsibility:
  • Build observability tools and platforms, including: metrics, logging, distributed tracing, dashboarding, alerting, application performance management
  • Build with modern tools and languages like Go, OpenTelemetry, and Kubernetes
  • Participate in on-call rotation and ensure uptime of services
  • Create runtime tools/processes that optimize cloud triaging and limit downtime
  • Define best practices around making our systems and services measurable
  • Collaborate with peers and stakeholders through design and code reviews to ensure best practices amongst available technologies. We expect successful candidates to be coding a majority of their time
What we offer:
  • Equity
  • Generous health benefits
  • Flexible time off policy. Take the time off you need!
  • Paid bonding time for all new parents
  • Traditional and Roth 401k
  • Commuter and FSA benefits
  • Lunch Program
  • Dog friendly office
Employment Type:
Full-time

Senior Software Engineer

We’re hiring a Senior Software Engineer to join our platform team and improve th...
Location:
Poland, Warsaw
Salary:
94000.00 - 121000.00 USD / Year
Invisible Technologies
Expiration Date:
Until further notice
Requirements:
  • 6+ years of software engineering work experience
  • Demonstrated expertise with Python and TypeScript
  • Experience designing web APIs and packaged libraries
  • Fluency in DevOps practices and tooling (CI/CD, observability, Infra as code, etc.)
  • Ability to strategize across multiple layers of abstraction and reduce complexity
Job Responsibility:
  • Lead the design and evolution of internal tools that abstract the details of deployment
  • Implement architectural changes to the platform to guarantee security and scalability
  • Partner with engineering teams to drive adoption and support their usage of our tools
  • Collaborate with SRE and Security to manage real-world instances of our platform
  • Author documentation to ease onboarding and user-friendliness of the tools we provide
What we offer:
  • Bonuses and equity are included in offers above entry level
Employment Type:
Full-time

Senior Software Engineer, Platform Observability

Everlaw is looking for a Senior Software Engineer that brings experience in buil...
Location:
United States, Oakland
Salary:
164000.00 - 208000.00 USD / Year
Everlaw
Expiration Date:
Until further notice
Requirements:
  • BS or MS in Computer Science, or equivalent coursework
  • At least 3 years of experience building logging, metrics, and tracing infrastructure
  • Proficiency in coding in a language such as C, C++, C#, Java, Python, JavaScript, Go, or Rust
  • Experience with Infrastructure as Code and container solutions to manage cloud environments (e.g., Terraform, Ansible, Docker)
  • At least 1 year of experience leading multi-developer efforts, including planning, technical breakdown, and coordination
  • Excellent communication and collaboration skills
  • Please note that at this time, Everlaw is not sponsoring U.S. employment visas for this role. Due to federal contract requirements, Everlaw may only hire US citizens for this position.
Job Responsibility:
  • Build observability strategies to support application and infrastructure metrics, logs, traces, dashboards, and alerts
  • Develop and maintain infrastructure as code (IaC) using tools such as Terraform and Ansible
  • Monitor usage trends to identify opportunities to optimize efficiency and performance of our metrics database and logging tools
  • Improve our on-call and incident management processes by encouraging deeper understanding, communication, and trust
  • Support developer projects by influencing design and implementation of infrastructure features as well as providing technical guidance
  • Support compliance efforts by promoting continuous documentation of our processes and involvement in audits
  • Provide technical mentorship to other engineers by sharing your technical knowledge and becoming an expert in an area of our code base
What we offer:
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
  • Paid parental leave and approximately 10 days (80 hours) per year of sick leave
  • Seventeen paid vacation days plus 11 federal holidays
  • Membership to Modern Health to help employees prioritize mental health and wellness
  • Annual allocation for Learning & Development opportunities and applicable professional membership dues
  • Company-sponsored life and disability insurance
  • Work in Uptown Oakland, just steps from the BART line and dozens of restaurants and walking distance to Lake Merritt
Employment Type:
Full-time

Senior Software Engineer, Release Engineering

We’re looking for a Senior Software Engineer to join our Release Engineering tea...
Location:
United States
Salary:
143000.00 - 203000.00 USD / Year
dbt Labs
Expiration Date:
Until further notice
Requirements:
  • Experience designing, operating, or improving CI/CD systems for large-scale distributed applications
  • Proficiency with one or more of the following: Helm, ArgoCD, Terraform, GitHub Actions, or Kubernetes
  • Familiarity with infrastructure-as-code practices and the principles of reliable, observable systems
  • Background in Python (or other modern language) development for automation or platform tooling
  • A collaborative mindset and interest in enabling other developers through tooling and platform improvements
  • Experience working asynchronously as part of a fully remote, distributed team
Job Responsibility:
  • Design, build, and maintain components of our CI/CD platform to make deployments safer, faster, and more reliable
  • Lead initiatives that improve automation, observability, and self-service capabilities for engineers
  • Collaborate across teams to identify friction points in our delivery process and build tools to eliminate them
  • Evolve our release architecture to support dbt Cloud’s multi-cloud, cell-based infrastructure at scale
  • Continuously improve developer experience by refining build pipelines, release workflows, and infrastructure-as-code practices
What we offer:
  • Unlimited vacation
  • 401(k) with 3% guaranteed contribution
  • Excellent healthcare
  • Paid Parental Leave
  • Wellness stipend
  • Home office stipend
Employment Type:
Full-time