Sr. Software Engineer, Observability

Dialpad

Location:
India, Bengaluru

Contract Type:
Not provided

Salary:

Not provided

Job Description:

As a Sr. Software Engineer in Observability, you’ll be responsible for our metrics and log collection platform. You’ll work closely with other Infrastructure engineers to determine resource usage and requirements. You’ll also help create tooling, libraries, and documentation that enable other engineers to instrument their own projects. In addition, you’ll keep our team aware of trends in the larger observability/monitoring industry.

Job Responsibility:

  • Develop and improve instrumentation for monitoring and logging the health and availability of services
  • Develop and maintain the observability stack within Dialpad engineering
  • Define best practices and standards around making systems and services measurable and work with various teams to get those best practices applied
  • Create tools and libraries for other engineering teams to enable them to build self-monitoring capabilities
  • Create and own internal documentation used by the other engineering teams
  • Stay up-to-date with the latest trends in observability, logging, monitoring, and cloud technologies
  • Collaborate with different engineering teams to integrate observability practices into their workflows
  • Participate in a rotating on-call within the larger Infrastructure Engineering division.

Requirements:

  • Background in Systems and/or Software Engineering
  • Experience in designing, automating, maintaining, and optimizing observability platforms (logging, metrics, and tracing)
  • Experience with configuration management tools such as Ansible, Terraform, etc.
  • Experience with Public Cloud environments such as GCP, AWS, etc.
  • Familiarity with languages such as Python, Go, Rust, etc.

Nice to have:

  • Previous direct experience with Grafana, Loki, Prometheus
  • Experience with Linux
  • Experience with Kubernetes (including GKE/EKS) and building containerized applications
  • Undergraduate degree in Computer Science or Engineering.

What we offer:

  • Competitive benefits and perks
  • Robust training program
  • Inclusive office environment
  • Recognized Great Place to Work culture.

Additional Information:

Job Posted:
December 29, 2025

Work Type:
On-site work

Similar Jobs for Sr. Software Engineer, Observability

Sr. Manager, Software Engineering (Search)

As a Senior Engineering Manager – Search, you will lead and inspire a talented t...
Location:
India, Hyderabad
Salary:
Not provided
Highspot
Expiration Date:
Until further notice
Requirements:
  • 3+ years of experience managing engineering teams, with a proven record of developing and scaling backend or search systems
  • 7+ years of total software development experience with cloud-native SaaS platforms
  • Strong background in search & recommendation technologies such as Lucene, Solr, Opensearch, Elasticsearch, RAG, or similar frameworks
  • Deep understanding of enterprise search architecture, schema design, and relevance tuning
  • Proven success building REST APIs, distributed systems, and integrating services using AWS or similar cloud platforms
  • Experience with object-oriented and functional programming languages, such as JavaScript/TypeScript, Python, or Ruby
  • Familiarity with machine learning and AI concepts for ranking, personalization, or content recommendations
  • Track record of attracting and developing diverse talent, fostering a collaborative and inclusive culture
  • Strong leadership, communication, and stakeholder management skills, with the ability to balance technical depth with strategic decision-making
Job Responsibility:
  • Lead, mentor, and grow a team of search and backend engineers focused on high-impact, scalable search solutions
  • Own the technical vision for search architecture, combining traditional and vector-based approaches, including relevance, ranking models, and distributed indexing systems
  • Drive execution excellence — set goals, manage delivery timelines, and ensure consistent progress against engineering objectives
  • Collaborate with Product and Data Science to translate customer and business needs into measurable search and content recommendation improvements
  • Optimize and scale our enterprise search stack (Lucene, Solr, ZooKeeper, or similar technologies) to support massive data volumes
  • Oversee the design and delivery of highly available distributed services and RESTful APIs integrated into Highspot’s platform
  • Partner with DevOps to ensure reliability, observability, and performance across multiple data centers
  • Champion AI-driven enhancements to improve personalization, ranking, and search recommendations
  • Foster a culture of quality, inclusion, and accountability, emphasizing mentorship, continuous learning, and technical excellence
  • Partner cross-functionally to ensure alignment between platform strategy and product outcomes, including stakeholder communication and risk management
  • Fulltime

Sr Software Engineer

The Sr Software (Java) Developer is responsible for establishing and implementin...
Location:
Canada, Mississauga
Salary:
120800.00 - 170800.00 USD / Year
Citi
Expiration Date:
Until further notice
Requirements:
  • 6+ years of strong hands-on experience in coding (Java)
  • deep expertise in system design and microservices architecture
  • experience with trunk-based development, feature flags, and progressive delivery strategies
  • proficiency in TDD, BDD, and automation-first mindset to ensure high test coverage and reliability
  • strong understanding of CI/CD pipelines, and DevOps practices
  • experience conducting code reviews, vulnerability assessments, and secure coding
  • familiarity with modern cloud-native technologies (AWS, Kubernetes, Docker)
  • excellent problem-solving skills and ability to work in fast-paced, agile environments
  • strong communication and collaboration skills
Job Responsibility:
  • design, develop, and maintain robust, scalable, and high-performance applications
  • implement trunk-based development practices to enable continuous integration and rapid delivery
  • develop clean, maintainable, and testable code following SOLID principles and software design best practices
  • ensure high levels of unit test coverage, test-driven development (TDD), and behavior-driven development (BDD)
  • actively contribute to hands-on coding, code reviews, and refactoring to maintain high engineering standards
  • drive the adoption of modern engineering ways of working, including Agile, DevOps, and CI/CD
  • apply Behavior-Driven Development (BDD), Test-Driven Development (TDD), and unit testing to ensure code quality and functionality
  • collaborate effectively in agile environments, embracing DevOps principles and fostering a culture of continuous delivery and improvement
  • mentor junior engineers and foster a culture of engineering excellence and continuous learning
  • partner with architects, product owners, and cross-functional teams to design scalable and distributed systems
  • Fulltime

Sr Software Development Engineer

As Highspot continues to scale rapidly, building a robust and efficient platform...
Location:
India, Hyderabad
Salary:
Not provided
Highspot
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in software or infrastructure engineering
  • At least 5 years focused on platform engineering or cloud infrastructure at scale
  • Proven success designing and operating internal developer platforms in AWS and/or Azure environments
  • Expert-level experience with Kubernetes, including provisioning, cluster lifecycle management, workload orchestration, and multi-tenant design
  • Strong expertise in Terraform, GitOps tools (e.g., ArgoCD), and CI/CD systems (e.g., GitHub Actions, Spinnaker)
  • Deep understanding of cloud networking, IAM, service meshes, and container orchestration at scale
  • Familiar with the CNCF landscape and how to leverage open-source tools to solve platform problems
  • Passion for developer experience
  • Track record of technical leadership, mentoring, and influencing engineering culture at a large scale
  • Bachelor's or Master’s in Computer Science or related discipline, or equivalent practical experience
Job Responsibility:
  • Design and build scalable platform capabilities that empower engineering teams to ship features reliably, securely, and quickly
  • Create and maintain developer-facing tools and paved paths (e.g., CI/CD pipelines, Kubernetes platforms, observability stacks, secrets management)
  • Implement Infrastructure-as-Code and GitOps patterns to promote consistency, automation, and compliance across environments
  • Collaborate with product, security, and compliance stakeholders to build platform services that meet SLAs and governance standards
  • Drive efforts to standardize and simplify infrastructure across cloud environments (AWS, Azure), enabling secure multi-cloud operation
  • Lead incident response, reliability engineering, and observability improvements that ensure platform uptime and performance
  • Act as a technical mentor and thought leader, guiding teams on infrastructure architecture, platform adoption, and best practices
  • Define and execute on a strategic roadmap to evolve the internal platform in line with company growth and technology direction
  • Fulltime

Sr. Distinguished Software Engineer (Anti-Money Laundering)

As a Sr. Distinguished Engineer at Capital One, you will be a part of a communit...
Location:
United States, McLean; Richmond; New York
Salary:
286200.00 - 392000.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s Degree
  • At least 9 years of experience in Software Engineering and solution architecture
  • At least 9 years of experience in Cloud computing (AWS, Microsoft Azure, Google Cloud)
  • At least 9 years of experience in Data architecture
Job Responsibility:
  • Decompose complex problems into practical and operational solutions
  • Ensure the quality of technical design and implementation
  • Serve as an authoritative expert on non-functional system characteristics, such as observability, resiliency, and operational excellence
  • Continue learning and injecting advanced technical knowledge into our community
  • Handle several projects simultaneously, balancing your time to maximize impact
  • Act as a role model and mentor within the tech community, helping to coach and strengthen the technical expertise and know-how of our engineering and product community
  • Learn the business of our stakeholders and conceive of creative technical solutions that solve for their goals
  • Develop full stack applications with a product engineering mindset, spanning frontend and backend ecosystems that balance simplicity with flexibility
  • Utilize AWS cloud Infrastructure across the entire stack (IaaS primitives to PaaS offerings)
  • Utilize best practices for modern engineering operations including observability, SLOs, Continuous Deployment, and Incident Management that embrace a 'you build it, you run it' mentality
What we offer:
  • performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
  • comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • Fulltime

Sr Principal Software Engineer

Zuora provides a platform for managing subscription-based businesses, handling b...
Location:
India, Bengaluru
Salary:
Not provided
Zuora
Expiration Date:
Until further notice
Requirements:
  • BTech/BE in Computer Science, Engineering, or related discipline
  • 16+ years of experience in full-stack development and enterprise-scale architecture
  • Expert-level experience with Java/Spring, data structures & algorithms, and building large-scale distributed systems
  • Deep understanding of system design, microservices frameworks (Spring Boot, Dropwizard, etc.), PaaS environments, and modern web/cloud technologies (REST, gRPC, JSON, Protobufs)
  • Strong background in database design, object modeling, and API ecosystem development
  • Awareness of trade-offs in architecture (scalability vs. cost, flexibility vs. complexity) to deliver long-term value
  • Proven ability to design scalable, high-performance platforms supporting millions of users or transactions
  • Track record of driving architecture roadmaps and influencing technical direction at the org or business-unit level
  • Ability to influence without authority, guiding multiple engineering teams toward a common vision
  • Excellent communication and storytelling skills to align executives, PMs, and engineers
Job Responsibility:
  • Architect and deliver secure, reliable, and scalable payment solutions that support core transaction lifecycles end-to-end
  • Design and evolve payments frameworks, APIs, and flows including checkout forms, hosted payment pages, and payment links that accelerate product development and improve global adoption
  • Guide system design trade-offs by balancing simplicity, performance, security, compliance, and cost throughout the payment journey
  • Architect and evolve payment orchestration layers to intelligently route transactions across multiple providers, optimize costs, maximize authorization rates, ensure redundancy, and reduce regional or processor dependencies
  • Anticipate fintech and regulatory trends to define an architectural vision that ensures competitiveness, compliance, and readiness for future payment innovations
  • Enable frictionless checkout experiences with embedded forms, localized hosted flows, and customizable links that maximize conversion and minimize abandonment
  • Support global readiness with diverse payment methods, multicurrency support, localized checkout experiences, and adherence to regional compliance requirements
  • Partner across Product, Risk, Compliance, and UX to ensure payment architecture aligns with customer needs, regulatory demands, and business growth goals
  • Provide technical thought leadership by mentoring engineers and architects, raising standards for design and execution in payment systems
  • Champion engineering excellence through adoption of best practices, observability, metrics-driven improvements, and continuous innovation in payments transaction processing
What we offer:
  • Competitive compensation, corporate bonus program and performance rewards, company equity and retirement programs
  • Medical insurance
  • Generous, flexible time off
  • Paid holidays, “wellness” days and company wide end of year break
  • 6 months fully paid parental leave
  • Learning & Development stipend
  • Opportunities to volunteer and give back, including charitable donation match
  • Free resources and support for your mental wellbeing
  • Fulltime

Sr Software Engineering

Microsoft’s Health and Life Sciences (HLS) team is dedicated to empowering healt...
Location:
Canada, Vancouver
Salary:
114400.00 - 203900.00 CAD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience with Kubernetes for container orchestration
  • Strong understanding of distributed systems and cloud-native design principles
  • Excellent problem-solving skills and ability to work collaboratively in cross-functional teams
Job Responsibility:
  • Collaborate with SREs to design and implement features that improve system reliability, observability, and performance
  • Develop secure, scalable software components for healthcare applications built on Microsoft Cloud technologies
  • Partner with architects and product teams to integrate resiliency best practices into application design and safe deployment practices
  • Contribute to automated testing and validation frameworks to ensure high availability and disaster recovery readiness
  • Write scripts to automate operational tasks, improving efficiency and reducing manual effort
  • Participate in on-call rotations as the Directly Responsible Individual (DRI), taking ownership of issues and driving resolution
  • Participate in incident reviews and postmortems to drive continuous improvement in reliability and security
  • Fulltime

Sr. Engineer, ML Platform

As the leading delivery platform in the region, we have a unique responsibility ...
Location:
United Kingdom, London
Salary:
Not provided
Delivery Hero
Expiration Date:
Until further notice
Requirements:
  • Strong software engineering background with experience in building distributed systems or platforms designed for machine learning and AI workloads
  • Expert-level proficiency in Python and familiarity with ML frameworks (TensorFlow, PyTorch), infrastructure tooling (MLflow, Kubeflow, Ray), and popular APIs (Hugging Face, OpenAI, LangChain)
  • Experience implementing modern MLOps practices, including model lifecycle management, CI/CD, Docker, Kubernetes, model registries, and infrastructure-as-code tools (Terraform, Helm)
  • Demonstrated experience working with cloud infrastructure, ideally AWS or GCP, including Kubernetes clusters (GKE/EKS), serverless architectures, and managed ML services (e.g., Vertex AI, SageMaker)
  • Proven experience with generative AI technologies: transformers, embeddings, prompt engineering strategies, fine-tuning vs. prompt-tuning, vector databases, and retrieval-augmented generation (RAG) systems
  • Experience designing and maintaining real-time inference pipelines, including integrations with feature stores, streaming data platforms (Kafka, Kinesis), and observability platforms
  • Familiarity with SQL and data warehouse modeling; capable of managing complex data queries, joins, aggregations, and transformations
  • Solid understanding of ML monitoring, including identifying model drift, decay, latency optimization, cost management, and scaling API-based genAI applications efficiently
  • Bachelor’s degree in Computer Science, Engineering, or a related field
Job Responsibility:
  • Design, build, and maintain scalable, reusable, and reliable ML platforms and tooling that support the entire ML lifecycle, including data ingestion, model training, evaluation, deployment, and monitoring for both traditional and generative AI models
  • Develop standardized ML workflows and templates using MLflow and other platforms, enabling rapid experimentation and deployment cycles
  • Implement robust CI/CD pipelines, Docker containerization, model registries, and experiment tracking to support reproducibility, scalability, and governance in ML and genAI
  • Collaborate closely with genAI experts to integrate and optimize genAI technologies, including transformers, embeddings, vector databases (e.g., Pinecone, Redis, Weaviate), and real-time retrieval-augmented generation (RAG) systems
  • Automate and streamline ML and genAI model training, inference, deployment, and versioning workflows, ensuring consistency, reliability, and adherence to industry best practices
  • Ensure reliability, observability, and scalability of production ML and genAI workloads by implementing comprehensive monitoring, alerting, and continuous performance evaluation
  • Integrate infrastructure components such as real-time model serving frameworks (e.g., TensorFlow Serving, NVIDIA Triton, Seldon), Kubernetes orchestration, and cloud solutions (AWS/GCP) for robust production environments
  • Drive infrastructure optimization for generative AI use-cases, including efficient inference techniques (batching, caching, quantization), fine-tuning, prompt management, and model updates at scale
  • Partner with data engineering, product, infrastructure, and genAI teams to align ML platform initiatives with broader company goals, infrastructure strategy, and innovation roadmap
  • Contribute actively to internal documentation, onboarding, and training programs, promoting platform adoption and continuous improvement
  • Fulltime

Sr. Software Engineer II - Cloud Platform

We’re not just building better tech. We’re rewriting how data moves and what the...
Salary:
Not provided
Confluent
Expiration Date:
Until further notice
Requirements:
  • Strong programming skills, with a focus on reading, debugging, and evolving existing code in languages like Go, Java, or Python
  • Solid understanding of systems internals, including filesystems, memory management, network stacks, and kernel behavior
  • Hands-on experience running Kubernetes in production, and a deep understanding of containers and modern cloud-native workflows
  • Demonstrated experience with at least one major public cloud provider such as AWS, GCP, or Azure
  • A strong bias for automation and reproducibility; comfortable working with Kubernetes, GitOps, Terraform, CI/CD pipelines, and observability tools
  • Confidence in diagnosing complex systems, handling incidents, and driving continuous improvements to reliability
  • A genuine interest in how large-scale systems behave in the real world, and a drive to make them better every day
  • Exceptional teamwork, collaboration skills, and the ability to work independently as part of a globally distributed team
  • Solid written and verbal communication skills coupled with strong motivation to help others
Job Responsibility:
  • Design, build, and evolve internal infrastructure services written in Go, often as Kubernetes operators, that power the core platform behind Confluent Cloud
  • Own the systems that make cloud infrastructure secure, scalable, observable, and reliable, using GitOps, Terraform, Prometheus, Grafana, and a strong foundation in Linux, networking, and public cloud
  • Collaborate with engineers across Confluent to enable fast, safe, and autonomous deployment of services through shared platform tooling and best practices
  • Take shared responsibility for the full lifecycle of our infrastructure: availability, performance, monitoring, incident response, and capacity planning
  • Work on systems at scale, across tens of thousands of instances and multiple regions, with a focus on smooth, fast, and safe operations
  • Influence the architecture and operational strategy behind the critical infrastructure that supports all of Confluent’s cloud services
  • Participate in an 8-hour, follow-the-sun on-call rotation
What we offer:
  • Remote-First Work
  • Robust Insurance Benefits
  • Flexible Time Away
  • The Best Teammates
  • Experience Ambassadors
  • Open and Honest Culture
  • Well-Being and Growth
  • Offers Equity