Sr. Software Engineer, Observability

Dialpad

Location:
India, Bengaluru

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As a Sr. Software Engineer in Observability, you’ll be responsible for our metrics and log collection platform. You’ll work closely with other Infrastructure engineers to determine resource usage and requirements. You’ll also help create tooling, libraries, and documentation that enable other engineers to instrument their own projects. In addition, you’ll keep our team aware of trends in the larger observability/monitoring industry.

Job Responsibility:

  • Develop and improve instrumentation for monitoring and logging the health and availability of services
  • Develop and maintain the observability stack within Dialpad engineering
  • Define best practices and standards around making systems and services measurable and work with various teams to get those best practices applied
  • Create tools and libraries for other engineering teams to enable them to build self-monitoring capabilities
  • Create and own internal documentation used by the other engineering teams
  • Stay up-to-date with the latest trends in observability, logging, monitoring, and cloud technologies
  • Collaborate with different engineering teams to integrate observability practices into their workflows
  • Participate in a rotating on-call schedule within the larger Infrastructure Engineering division

Requirements:

  • Background in Systems and/or Software Engineering
  • Experience in designing, automating, maintaining, and optimizing observability platforms (logging, metrics, and tracing)
  • Experience with configuration management tools such as Ansible, Terraform, etc.
  • Experience with Public Cloud environments such as GCP, AWS, etc.
  • Familiarity with languages such as Python, Go, Rust, etc.

Nice to have:

  • Previous direct experience with Grafana, Loki, Prometheus
  • Experience with Linux
  • Experience with Kubernetes (including GKE/EKS) and building containerized applications
  • Undergraduate degree in Computer Science or Engineering

What we offer:
  • Competitive benefits and perks
  • Robust training program
  • Inclusive office environment
  • Recognized Great Place to Work culture

Additional Information:

Job Posted:
December 29, 2025

Work Type:
On-site work

Similar Jobs for Sr. Software Engineer, Observability

Sr. Manager, Software Engineering (Search)

As a Senior Engineering Manager – Search, you will lead and inspire a talented t...
Location:
India, Hyderabad
Salary:
Not provided
Highspot
Expiration Date
Until further notice
Requirements:
  • 3+ years of experience managing engineering teams, with a proven record of developing and scaling backend or search systems
  • 7+ years of total software development experience with cloud-native SaaS platforms
  • Strong background in search & recommendation technologies such as Lucene, Solr, OpenSearch, Elasticsearch, RAG, or similar frameworks
  • Deep understanding of enterprise search architecture, schema design, and relevance tuning
  • Proven success building REST APIs, distributed systems, and integrating services using AWS or similar cloud platforms
  • Experience with object-oriented and functional programming languages, such as JavaScript/TypeScript, Python, or Ruby
  • Familiarity with machine learning and AI concepts for ranking, personalization, or content recommendations
  • Track record of attracting and developing diverse talent, fostering a collaborative and inclusive culture
  • Strong leadership, communication, and stakeholder management skills, with the ability to balance technical depth with strategic decision-making
Job Responsibility:
  • Lead, mentor, and grow a team of search and backend engineers focused on high-impact, scalable search solutions
  • Own the technical vision for search architecture combining traditional and vector-based search, including relevance, ranking models, and distributed indexing systems
  • Drive execution excellence — set goals, manage delivery timelines, and ensure consistent progress against engineering objectives
  • Collaborate with Product and Data Science to translate customer and business needs into measurable search and content recommendation improvements
  • Optimize and scale our enterprise search stack (Lucene, Solr, ZooKeeper, or similar technologies) to support massive data volumes
  • Oversee the design and delivery of highly available distributed services and RESTful APIs integrated into Highspot’s platform
  • Partner with DevOps to ensure reliability, observability, and performance across multiple data centers
  • Champion AI-driven enhancements to improve personalization, ranking, and search recommendations
  • Foster a culture of quality, inclusion, and accountability, emphasizing mentorship, continuous learning, and technical excellence
  • Partner cross-functionally to ensure alignment between platform strategy and product outcomes, including stakeholder communication and risk management
  • Fulltime

Sr Software Engineer

The Sr Software (Java) Developer is responsible for establishing and implementin...
Location:
Canada, Mississauga
Salary:
120800.00 - 170800.00 USD / Year
Citi
Expiration Date
Until further notice
Requirements:
  • 6+ years of strong hands-on experience in coding (Java)
  • deep expertise in system design and microservices architecture
  • experience with trunk-based development, feature flags, and progressive delivery strategies
  • proficiency in TDD, BDD, and automation-first mindset to ensure high test coverage and reliability
  • strong understanding of CI/CD pipelines, and DevOps practices
  • experience conducting code reviews, vulnerability assessments, and secure coding
  • familiarity with modern cloud-native technologies (AWS, Kubernetes, Docker)
  • excellent problem-solving skills and ability to work in fast-paced, agile environments
  • strong communication and collaboration skills
Job Responsibility:
  • design, develop, and maintain robust, scalable, and high-performance applications
  • implement trunk-based development practices to enable continuous integration and rapid delivery
  • develop clean, maintainable, and testable code following SOLID principles and software design best practices
  • ensure high levels of unit test coverage, test-driven development (TDD), and behavior-driven development (BDD)
  • actively contribute to hands-on coding, code reviews, and refactoring to maintain high engineering standards
  • drive the adoption of modern engineering ways of working, including Agile, DevOps, and CI/CD
  • apply Behavior-Driven Development (BDD), Test-Driven Development (TDD), and unit testing to ensure code quality and functionality
  • collaborate effectively in agile environments, embracing DevOps principles and fostering a culture of continuous delivery and improvement
  • mentor junior engineers and foster a culture of engineering excellence and continuous learning
  • partner with architects, product owners, and cross-functional teams to design scalable and distributed systems
  • Fulltime

Sr Software Development Engineer

As Highspot continues to scale rapidly, building a robust and efficient platform...
Location:
India, Hyderabad
Salary:
Not provided
Highspot
Expiration Date
Until further notice
Requirements:
  • 10+ years of experience in software or infrastructure engineering
  • At least 5 years focused on platform engineering or cloud infrastructure at scale
  • Proven success designing and operating internal developer platforms in AWS and/or Azure environments
  • Expert-level experience with Kubernetes, including provisioning, cluster lifecycle management, workload orchestration, and multi-tenant design
  • Strong expertise in Terraform, GitOps tools (e.g., ArgoCD), and CI/CD systems (e.g., GitHub Actions, Spinnaker)
  • Deep understanding of cloud networking, IAM, service meshes, and container orchestration at scale
  • Familiar with the CNCF landscape and how to leverage open-source tools to solve platform problems
  • Passion for developer experience
  • Track record of technical leadership, mentoring, and influencing engineering culture at a large scale
  • Bachelor’s or Master’s in Computer Science or related discipline, or equivalent practical experience
Job Responsibility:
  • Design and build scalable platform capabilities that empower engineering teams to ship features reliably, securely, and quickly
  • Create and maintain developer-facing tools and paved paths (e.g., CI/CD pipelines, Kubernetes platforms, observability stacks, secrets management)
  • Implement Infrastructure-as-Code and GitOps patterns to promote consistency, automation, and compliance across environments
  • Collaborate with product, security, and compliance stakeholders to build platform services that meet SLAs and governance standards
  • Drive efforts to standardize and simplify infrastructure across cloud environments (AWS, Azure), enabling secure multi-cloud operation
  • Lead incident response, reliability engineering, and observability improvements that ensure platform uptime and performance
  • Act as a technical mentor and thought leader, guiding teams on infrastructure architecture, platform adoption, and best practices
  • Define and execute on a strategic roadmap to evolve the internal platform in line with company growth and technology direction
  • Fulltime

Sr Staff Software Engineer - Compute Platform

We are seeking a highly experienced Senior Staff Engineer to lead the technical ...
Location:
United States, Sunnyvale
Salary:
267000.00 - 297000.00 USD / Year
Uber
Expiration Date
Until further notice
Requirements:
  • 10+ years of software engineering experience, including expertise in distributed systems or infrastructure engineering
  • Deep expertise in Kubernetes internals, container runtimes, and cloud-native compute platforms
  • Strong background in containerization, resource scheduling, and cluster management at scale
  • Hands-on experience with performance tuning, reliability engineering, and cost optimization in compute environments
  • Excellent leadership, communication, and organizational skills, with a track record of building and mentoring high-performing teams
  • Strong coding proficiency in one or more languages such as Go, Java, or Python
  • Demonstrated ability to drive cross-functional technical initiatives and deliver impactful results
Job Responsibility:
  • Own the technical vision, architecture, and strategy for the global compute platform org
  • Define and execute the roadmap for our compute platform, focusing on scalability, performance, and efficiency
  • Drive architectural decisions and set technical direction for compute scheduling, resource allocation, and container orchestration systems
  • Ensure high availability and reliability of the compute platform through best-in-class observability, automation, and incident response practices
  • Drive adoption of best practices in scalability, availability, and security for multi-tenant compute environments
  • Evaluate emerging technologies in cloud-native ecosystems and guide their integration into the platform
  • Partner with product and infrastructure teams to deliver high-impact, cross-organizational initiatives
  • Mentor and coach engineers, helping grow their technical depth and leadership skills
  • Influence company-wide engineering standards and practices
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • All full-time employees are eligible to participate in a 401(k) plan
  • Eligible for various benefits
  • Fulltime

Sr Software Engineer, Gen AI

Instrumentl automates grant discovery and management for nonprofits. We’re a mis...
Location:
United States, Oakland
Salary:
175000.00 - 220000.00 USD / Year
Helpcare AI
Expiration Date
Until further notice
Requirements:
  • 5+ years of professional software engineering experience
  • 2+ years working with modern LLMs (as an IC)
  • Startup experience and comfort operating in fast, scrappy environments is a plus
  • Proven production impact: You’ve taken LLM/RAG systems from prototype to production, owned reliability/observability, and iterated post‑launch based on evals and user feedback
  • LLM agentic systems: Experience building tool/function‑calling workflows, planning/execution loops, and safe tool integrations (e.g., with LangChain/LangGraph, LlamaIndex, Semantic Kernel, or custom orchestration)
  • RAG expertise: Strong grasp of document ingestion, chunking/windowing, embeddings, hybrid search (keyword + vector), re‑ranking, and grounded citations. Experience with re‑rankers/cross‑encoders, hybrid retrieval tuning, or search/recommendation systems
  • Embeddings & vector stores: Hands‑on with embedding model selection/versioning and vector DBs (e.g., pgvector, FAISS, Pinecone, Weaviate, Milvus, Qdrant)
  • Evaluation mindset: Comfort designing eval suites (RAG/QA, extraction, summarization), using automated and human‑in‑the‑loop methods; familiarity with frameworks like Ragas/DeepEval/OpenAI Evals or equivalent
  • Infrastructure & languages: Proficiency in Python (FastAPI, Celery) and TypeScript/Node
Job Responsibility:
  • Design agentic systems & ship AI to production: Turn prototypes into resilient, observable services with clear SLAs, rollback/fallback strategies, and cost/latency budgets. Build tool‑using LLM “agents” (task planning, function/tool calling, multi‑step workflows, guardrails) for tasks like grant discovery, application drafting, and research assistance
  • Own RAG end‑to‑end: Ingest and normalize content, choose chunking/embedding strategies, implement hybrid retrieval, re‑ranking, citations, and grounding. Continuously improve recall/precision while managing index health
  • Manage embeddings at scale: Select, evaluate, and migrate embedding models; maintain vector stores (e.g., pgvector/FAISS/Pinecone/Weaviate/Milvus/Qdrant); monitor drift and rebuild strategies
  • Fine‑tune & build evaluation: Run SFT/LoRA or instruction‑tuning on curated datasets; evaluate the ROI vs. prompt engineering/model selection; manage data versioning and reproducibility. Create offline and online eval harnesses (helpfulness, groundedness, hallucination, toxicity, latency, cost), synthetic test sets, red‑teaming, and human‑in‑the‑loop review
  • Collaborate cross‑functionally while raising engineering standards: Work side by side with Product, Design, and GTM on scoping, UX, and measurement; run experiments (A/B, canaries), interpret results, and iterate. Write clear, maintainable code, add tests and docs, and contribute to reliability practices (alerts, dashboards, incident response)
What we offer:
  • 100% covered health, dental, and vision insurance for employees, 50% for dependents
  • Generous PTO policy, including parental leave
  • 401(k)
  • Company laptop + stipend to set up your home workstation
  • Company retreats for in-person time with your colleagues
  • Work with awesome nonprofits around the US. We partner with incredible organizations doing meaningful work, and you get to help power their success
  • Fulltime

Sr. Director, Product Management, DevX - Operational Intelligence & Observability

The operational intelligence and observability team within our Cloud Operations ...
Location:
United States, McLean; Richmond; New York; Plano; San Francisco; Chicago
Salary:
245100.00 - 335700.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements:
  • At least 9 years of experience working in Product Management
  • Currently has, or is in the process of obtaining, one of the following, with an expectation that the required degree will be obtained on or before the scheduled start date: a Bachelor's Degree in a quantitative field (Statistics, Economics, Operations Research, Analytics, Mathematics, Computer Science, Computer Engineering, Software Engineering, Mechanical Engineering, Information Systems or a related quantitative field); or a Master's Degree in a quantitative field (Statistics, Economics, Operations Research, Analytics, Mathematics, Computer Science, Computer Engineering, Software Engineering, Mechanical Engineering, Information Systems or a related quantitative field) or an MBA with a quantitative concentration
Job Responsibility:
  • Demonstrate proficiency in five key areas: Human Centered - Obsesses about internal and external customer needs to reimagine and innovate product solutions
  • Business Focused - Delivers game-changing outcomes by focusing on leverage and execution excellence
  • Technology Driven - Leverages technology to deliver innovative and resilient solutions that enable both near term and long term value
  • Integrated Problem Solving - Identifies and resolves complex problems to deliver outcomes while mitigating product risks
  • Transformational Leadership - Leads cross functional teams to solve customer problems and drive organizational alignment
What we offer:
  • Performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
  • A comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • Fulltime

Sr Software Engineer - Python

As a Sr. Software Engineer, you serve as a specialist in the engineering team th...
Location:
India, Bangalore
Salary:
Not provided
Blue Yonder
Expiration Date
Until further notice
Requirements:
  • Bachelor’s degree in computer science is required; Master’s is preferred
  • 4+ years of software engineering experience building production software
  • Experience in Frontend technologies, JavaScript, TypeScript, React
  • Good working knowledge of Kubernetes and other virtualized execution technologies
  • 1+ years of experience working on at least one cloud environment, GCP preferred
  • 4+ years of Python programming experience with excellent understanding of Object-Oriented Design & Patterns
  • 3+ years of experience in building REST APIs
  • 1+ years of working experience with Kafka and its integration with cloud services
  • 3+ years of Linux scripting experience
  • 1+ years working with traditional and new relational SQL DBMS
Job Responsibility:
  • Design, architect, implement and help operate the Machine Learning platform
  • Develop and gain insight into the application architecture
  • Distill an abstract architecture into concrete design and influence the implementation
  • Observe inefficiencies, both in cost and reliability, in existing processes
  • Research alternative solutions using custom or existing open source technologies
  • Design replacement processes and components
  • Implement processes, extending and configuring open source components
  • Work with the ML DevOps and Support teams to operate the ML platform
  • Help implement DevOps best practices for in-house and open source components
  • Ensure smooth operation via monitoring and alerting facilities
  • Fulltime

Sr. Distinguished Software Engineer (Anti-Money Laundering)

As a Sr. Distinguished Engineer at Capital One, you will be a part of a communit...
Location:
United States, McLean; Richmond; New York
Salary:
286200.00 - 392000.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements:
  • Bachelor’s Degree
  • At least 9 years of experience in Software Engineering and solution architecture
  • At least 9 years of experience in Cloud computing (AWS, Microsoft Azure, Google Cloud)
  • At least 9 years of experience in Data architecture
Job Responsibility:
  • Decompose complex problems into practical and operational solutions
  • Ensure the quality of technical design and implementation
  • Serve as an authoritative expert on non-functional system characteristics, such as observability, resiliency, and operational excellence
  • Continue learning and injecting advanced technical knowledge into our community
  • Handle several projects simultaneously, balancing your time to maximize impact
  • Act as a role model and mentor within the tech community, helping to coach and strengthen the technical expertise and know-how of our engineering and product community
  • Learn the business of our stakeholders and conceive of creative technical solutions that solve for their goals
  • Develop full stack applications with a product engineering mindset, spanning frontend and backend ecosystems that balance simplicity with flexibility
  • Utilize AWS cloud Infrastructure across the entire stack (IaaS primitives to PaaS offerings)
  • Utilize best practices for modern engineering operations including observability, SLOs, Continuous Deployment, and Incident Management that embrace a 'you build it, you run it' mentality
What we offer:
  • Performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
  • A comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • Fulltime