Monitoring & Observability Engineer

Citi

Location:
India, Chennai

Contract Type:
Not provided

Salary:

Not provided

Job Description:

The Monitoring & Observability Engineer is a senior-level position requiring expertise in a wide range of monitoring tools, including APM (AppDynamics), Splunk and others. The role drives the monitoring agenda for the Global Consumer Bank, delivering best-in-class monitoring across all regions and applications and incubating new capabilities and technologies.

Job Responsibility:

  • Drive best-in-class monitoring using a range of tools across all regions of the Global Consumer Bank
  • Drive POCs and incubate new features and capabilities
  • Be forward-looking and ensure long-term strategic success
  • Work closely with the monitoring operations teams, production support, performance test teams, operations and application owners to deliver best-in-class monitoring
  • Explain complicated performance bottlenecks to stakeholders
  • Understand complicated application architecture, including Java app servers, Web Servers, Cloud (PCF, AWS, Google), Kubernetes, TIBCO, mainframe
  • Build advanced dashboards and queries
  • Be a subject matter expert for the Global Consumer Bank, including conducting brown bags and office hours
  • Recommend product customization for system integration
  • Identify problem causality, business impact and root causes
  • Advise or mentor junior team members
  • Impact the engineering function by influencing decisions through advice, counsel or facilitating services
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency

Requirements:

  • 3-7 years of relevant experience in an Engineering & IT role
  • 2+ years of hands-on working experience, including a strong understanding of UI/UX principles and best practices
  • Proficient in JavaScript, TypeScript, HTML, CSS, React, and Node.js
  • Experience with backend technologies and databases (e.g., MongoDB)
  • Experience with Python Programming
  • Experience with version control systems (e.g., Git)
  • Strong problem-solving and analytical skills
  • Excellent communication and collaboration skills
  • Create modular and reusable React components to streamline development and maintain consistency across the application
  • Continuously improve existing applications, addressing bugs, and implementing new features
  • Good exposure to microservices/micro front end architecture
  • Experience with building applications on a cloud platform
  • Experience with CI/CD tools (e.g. GitHub Actions)
  • Portfolio showcasing UI/UX projects
  • 2+ years of hands-on working experience with enterprise monitoring systems (such as AppDynamics, Grafana or other APM solutions)
  • Automation scripting (such as Python, PowerShell, etc)
  • Drive the implementation and configuration of the Enterprise Observability solution (AppDynamics) to meet organizational monitoring needs
  • Plan, design, build and manage Observability for applications running in a multi-cloud environment
  • Perform regular updates, patches, and upgrades to observability tools to ensure they are up-to-date and secure
  • 2+ years working with Splunk (or an alternative log analytics tool)
  • 2+ years of experience with a range of architecture tech stacks including Java app servers, Web Servers, Cloud (PCF, AWS, Google), Kubernetes, TIBCO, mainframe
  • Experience with synthetic monitoring tools (ideally Micro Focus BSM / APM)
  • Ability to converse with application owners, architects, performance testers to pinpoint application performance bottlenecks via the monitoring & observability tools
  • Big Data / AIOps experience is a strong plus, including experience with the Splunk Machine Learning Toolkit
  • Experience working in Financial Services or a large complex and/or global environment
  • Experience working with diverse stakeholders, including operations, application developers and performance testing
  • Project Management experience
  • Consistently demonstrates clear and concise written and verbal communication
  • Comprehensive knowledge of design metrics, analytics tools, benchmarking activities and related reporting to identify best practices
  • Demonstrated analytic/diagnostic skills
  • Ability to work in a matrix environment and partner with virtual teams
  • Ability to work independently, multi-task, and take ownership of various parts of a project or initiative
  • Ability to work under pressure and manage to tight deadlines or unexpected changes in expectations or requirements
  • Proven track record of operational process change and improvement

Nice to have:

  • Big Data / AIOPS experience
  • Experience with the Splunk Machine Learning Toolkit
  • Experience working in Financial Services or a large complex and/or global environment

Additional Information:

Job Posted:
April 29, 2025

Employment Type:
Full-time
Work Type:
On-site work
Similar Jobs for Monitoring & Observability Engineer

Observability engineer

Be the eyes and ears of our platforms with a role that puts you at the heart of ...
Location:
Bulgaria, Sofia
Salary:
Not provided
European Bank for Reconstruction and Development
Expiration Date
Until further notice
Requirements:
  • Designing, Implementing and Supporting COTS and Open Source monitoring solutions
  • Understanding of software development principles and troubleshooting application issues
  • Understanding of infrastructure management principles and troubleshooting practices
  • Understanding of performance monitoring approaches
  • Knowledge of Azure monitoring services, container monitoring
  • Understanding of telemetry standards for interoperability
  • Intermediate to advanced technology certification in the given specialism
  • Entry-level service management certification such as ITIL Foundation
Job Responsibility:
  • Design, automate, and optimize observability platforms for logging, metrics, and tracing
  • Expertise in consolidating and analysing application / system logs at enterprise scale, including familiarity with distributed tracing technologies, integrating with ITSM platforms
  • Proficient in scripting languages (Python, Bash, PowerShell) for task automation
  • Experience with Terraform or Ansible for deploying and configuring monitoring / logging infrastructure
  • Strong understanding of protocols (WMI, SSH, SNMP) and methods (API, Traps) for data gathering
What we offer:
  • Varied, stimulating and engaging work
  • A working culture that embraces inclusion and celebrates diversity
  • An environment that places sustainability, equality and digital transformation at the heart of what we do
  • Flexible working
  • Fulltime

Lead Director – Observability Engineering

At CVS Health, we’re building a world of health around every consumer and surrou...
Location:
United States
Salary:
144200.00 - 288400.00 USD / Year
CVS Health
Expiration Date
December 31, 2025
Requirements:
  • 10+ years of experience leading software development teams developing and managing applications for IT operations, SRE, logging and/or observability, with at least 5 years in a leadership role within a large enterprise (Fortune 100)
  • 10+ years of experience designing, developing, and implementing observability systems for large-scale, distributed systems, encompassing legacy and modern technologies
  • Experience leading a major logging and observability platform migration. Demonstrable experience building custom monitoring solutions
  • Proven experience building and implementing operational data models. Experience designing and deploying data lakes and data pipelines at massive scale in an enterprise environment. Experience with enterprise demand analysis, capacity planning, and performance engineering
  • Deep knowledge of, and experience with on-premises infrastructure, cloud infrastructure, and application architectures
  • Strong background in cloud-native technologies and architectures (e.g., Kubernetes, Docker, microservices) and an understanding of the unique challenges they pose to observability
  • Proven experience developing automation solutions and workflows for deployment, event correlation, and incident remediation
  • Experience and/or expertise with the following: Core Platforms & Languages: Python, Java, JavaScript, XML, JSON - 5 years
  • Application Programming Interfaces (API): REST, SOAP - 5 years
  • Source Control: GitHub/GitOps - 5 years
Job Responsibility:
  • Program Development and Modernization: Develop a plan to rationalize and modernize observability platforms, delivering an efficient observability ecosystem that meets the unique needs of CVS Health
  • Spearhead technology enablement for the transition of services from numerous legacy platforms, improving operational visibility and predictability
  • This involves designing and implementing complex solutions to collect, process, and manage structured and unstructured data at massive scale, optimizing built and purchased platforms to ensure efficient and performant operations, and ensuring solutions align with the organization’s goals
  • Team Leadership and Mentoring: Provide guidance and leadership to the Observability Engineering team
  • This involves hiring and developing talent, mentoring and supporting team members, assigning tasks, and ensuring projects are on track
  • Foster collaboration and knowledge sharing within the team
  • Architecture and Design: Help define the overall architecture of the Observability environment, including observability standards, data models, integrations, and security controls
  • Ensure our platforms are scalable, reliable, and aligned with best practices
  • Leverage open source and commercial software to deliver and maintain resilient, reliable, cost-effective platforms tailored to the needs of CVS Health
  • Project Management: Engage executives, department heads, and IT teams to plan, execute, and oversee Observability projects
What we offer:
  • Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
  • Benefit solutions that address the different needs and preferences of our colleagues including paid time off, flexible work schedules, family leave, dependent care resources, colleague assistance programs, tuition assistance, retiree medical access and many other benefits depending on eligibility
  • Fulltime

Observability Engineer

Observability Engineer position within the Engineering team of Digital Business ...
Location:
Poland
Salary:
Not provided
HSBC
Expiration Date
January 13, 2026
Requirements:
  • Hands-on experience with AppDynamics and Splunk in enterprise environments
  • Strong knowledge of OpenTelemetry and distributed tracing concepts
  • Proficiency in instrumenting applications for metrics, logs, and traces
  • Familiarity with scripting or programming languages (e.g., Python, Java, Go)
  • Strong analytical and problem-solving skills
  • Fluency in written and spoken English
  • Comfortable working in a multicultural/global environment
  • Strong level of IT literacy, including broad knowledge of IT infrastructure (operating systems, networking, security)
  • Proven Agile delivery experience
  • Able to operate both autonomously and collaboratively in diverse global teams
Job Responsibility:
  • Design, deploy, and maintain observability solutions using AppDynamics, Splunk, and OpenTelemetry
  • Develop and implement monitoring, logging, and tracing strategies for distributed applications and microservices
  • Create and maintain dashboards, alerts, and reports to provide actionable insights for engineering and operations teams
  • Collaborate with Product Engineering, Service Delivery and SRE teams to define observability requirements and best practices
  • Stay current with industry trends and emerging observability tools and practices
  • Adhere to HSBC policy, procedures, and control requirements applicable to day-to-day working
What we offer:
  • Competitive salary
  • Annual performance-based bonus
  • Additional bonuses for recognition awards
  • Multisport card
  • Private medical care
  • Life insurance
  • One-time reimbursement of home office set-up (up to 800 PLN)
  • Corporate parties & events
  • CSR initiatives
  • Nursery and kindergarten discounts
  • Fulltime

Observability Engineer – Splunk Focus

Join our growing Monitoring team! As a Splunk Specialist, you will collaborate c...
Location:
Portugal, Lisbon
Salary:
Not provided
Inetum
Expiration Date
Until further notice
Requirements:
  • Proven expertise in Splunk Enterprise
  • Strong experience with Splunk ITSI
  • Knowledge of Cribl
  • Ability to design and implement Splunk dashboards
  • Familiarity with automation tools (e.g., Ansible)
  • Experience working in multi-regional teams is a plus
Job Responsibility:
  • Provide support for monitoring tools: Splunk (Enterprise & ITSI), OpenTelemetry, Cribl, SolarWinds, Dynatrace
  • Automate daily tasks using Ansible
  • Assist development and production teams in migrating to the new Splunk Enterprise and ITSI platforms
  • Build dashboards and define relevant metrics
  • Propose and implement improvements across tools, processes, and KPIs
  • Fulltime

Federal Observability Engineer

You will be part of a larger technical team, working as an Observability Enginee...
Location:
United States, Hill AFB
Salary:
105500.00 - 243000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements:
  • US Citizenship Required
  • Secret Clearance Required
  • DD8750 - Security Plus or higher Security Certification (CISSP, CASP, etc)
  • Bachelor's degree preferred or Associate degree holder (technical field) with 6-8 years working experience in related fields
  • Strong understanding of cloud computing platforms (AWS, Azure, GCP)
  • Experience with containerization technologies (Docker, Kubernetes)
  • Proficiency in scripting languages (Python, Go, Bash)
  • Experience with SQL and NoSQL databases
  • Knowledge of networking protocols (TCP/IP, HTTP)
  • Proven experience with the OpsRamp platform is a strong plus
Job Responsibility:
  • Designing, implementing, and maintaining observability infrastructure in an OpsRamp environment
  • Working as part of a larger technical team supporting HPE's PCE environment and Cloud infrastructure for a Federal Customer
  • Configuring and managing data sources, defining and monitoring key performance indicators (KPIs), and analyzing performance trends
  • Configuring log collection, aggregation, and analysis within the OpsRamp platform
  • Creating and managing alerts, defining escalation paths, and integrating with incident management systems
  • Developing and implementing automated workflows and remediation actions within the OpsRamp platform
  • Designing and building custom dashboards and reports to provide key insights into system health and performance
  • Integrating OpsRamp with other monitoring and observability tools as needed
  • Ensuring data quality and integrity within the OpsRamp platform
  • Troubleshooting and resolving performance issues, application errors, and other operational problems
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
  • Fulltime

Staff Observability Operations Engineer

We are currently seeking several experienced and highly skilled Staff Observabil...
Location:
United States, Hartford
Salary:
130295.00 - 260590.00 USD / Year
CVS Health
Expiration Date
Until further notice
Requirements:
  • 7+ years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications
  • 5+ years in a Site Reliability Engineering (SRE) role deploying and managing modern observability solutions
  • 5+ Years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana)
  • Experience developing and administering ServiceNow ITOM event management solutions
  • Experience deploying and managing service reliability platforms (e.g., xMatters, OpsGenie, PagerDuty)
  • Experience with and deep knowledge of cloud environments, cloud monitoring platforms, and container orchestration tools (e.g., AWS/CloudTrail, Azure/Monitor, GCP/GCM, Kubernetes, OpenShift)
  • Proficiency in Python and other scripting languages such as Ansible, PowerShell, Bash for automation and configuration
  • Hands-on experience deploying, managing, and administering observability platforms
  • Hands-on experience leading, coordinating, and performing migration of application, platform, and infrastructure observability solutions
  • Proven ability to troubleshoot and resolve complex technical issues
Job Responsibility:
  • Deploy and implement modern observability solutions
  • Manage and administer observability and event management platforms
  • Coordinate and manage release cycles for observability platforms
  • Troubleshoot and resolve incidents related to observability platforms
  • Continuously monitor and enhance platform performance
  • Collaborate with cross-functional stakeholders
  • Provide training and mentoring to junior engineers
  • Ensure compliance and security of observability platforms
  • Maintain documentation of observability platform configurations
  • Generate and analyze reports on platform performance and capacity
What we offer:
  • Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
  • Paid time off, flexible work schedules, family leave, dependent care resources, and colleague assistance programs
  • Fulltime

Senior Software Engineer, Platform Observability

Everlaw is looking for a Senior Software Engineer that brings experience in buil...
Location:
United States, Oakland
Salary:
164000.00 - 208000.00 USD / Year
Everlaw
Expiration Date
Until further notice
Requirements:
  • BS or MS in Computer Science, or equivalent coursework
  • At least 3 years of experience building logging, metrics, and tracing infrastructure
  • Proficiency in coding in a language such as C, C++, C#, Java, Python, JavaScript, Go or Rust
  • Experience with Infrastructure as Code and container solutions to manage cloud environments (e.g., Terraform, Ansible, Docker)
  • At least 1 year of experience leading multi-developer efforts, including planning, technical breakdown, and coordination
  • Excellent communication and collaboration skills
  • Please note that at this time, Everlaw is not sponsoring U.S. employment visas for this role. Due to federal contract requirements, Everlaw may only hire US citizens for this position.
Job Responsibility:
  • Build observability strategies to support application and infrastructure metrics, logs, traces, dashboards, and alerts
  • Develop and maintain infrastructure as code (IaC) using tools such as Terraform and Ansible
  • Monitor usage trends to identify opportunities to optimize efficiency and performance of our metrics database and logging tools
  • Improve our on-call and incident management processes by encouraging deeper understanding, communication, and trust
  • Support developer projects by influencing design and implementation of infrastructure features as well as providing technical guidance
  • Support compliance efforts by promoting continuous documentation of our processes and involvement in audits
  • Provide technical mentorship to other engineers by both sharing your technical knowledge and becoming an expert in an area of our code base
What we offer:
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
  • Paid parental leave and approximately 10 days (80 hours) per year of sick leave
  • Seventeen paid vacation days plus 11 federal holidays
  • Membership to Modern Health to help employees prioritize mental health and wellness
  • Annual allocation for Learning & Development opportunities and applicable professional membership dues
  • Company-sponsored life and disability insurance
  • Work in Uptown Oakland, just steps from the BART line and dozens of restaurants and walking distance to Lake Merritt
  • Fulltime

Senior Software Engineer, Infrastructure Observability

We have an opening for a Senior Software Engineer on our Infrastructure Team, wi...
Location:
United States
Salary:
180000.00 - 225000.00 USD / Year
Temporal
Expiration Date
Until further notice
Requirements:
  • Demonstrated ability to develop horizontally scalable, resilient, and high performance distributed systems in a production environment
  • Experience designing, implementing, deploying, and supporting large scale, geographically distributed observability and/or high throughput data streaming/processing pipelines, or similar
  • Expert in one or more high-level programming languages, preferably Go
  • Expert-level Kubernetes skills
  • Expert-level query development skills, preferably SQL
  • Hands-on experience with one or more cloud providers, preferably AWS or GCP
  • Thorough understanding of computer architecture, operating systems, and networking
  • Familiarity with best practices regarding monitoring, instrumenting, and configuring infrastructure
  • User-first mindset
  • Motivated by impact
Job Responsibility:
  • Lead the end-to-end Software Development Lifecycle: goals & requirements solicitation, design & review, implementation, operationalization & deployment, support & maintenance
  • Formulate feature designs, review with stakeholders, iterate to incorporate feedback and drive consensus
  • Clearly document design choices and operational knowledge to successfully deploy and manage the software you develop
  • Provide appropriate test and production readiness coverage for unit, integration, and performance of your feature ownership area
  • Set a high bar for technical excellence and take pride in the software you develop
  • Design and build multi-component, distributed systems that operate at scale
  • Investigate issues with a methodical approach to identify a root cause
  • Understand performance and reliability implications of design options at scale. Make related tradeoffs
  • Able to participate in the team’s on-call rotation
  • Expert-level knowledge of architecture and services of assigned domain. Strong command over all aspects of the Temporal ecosystem
What we offer:
  • Unlimited PTO, 12 Holidays + 2 Floating Holidays
  • 100% Premiums Coverage for Medical, Dental, and Vision
  • AD&D, LT & ST Disability, and Life Insurance (Standard & Supplemental Available)
  • Empower 401K Plan
  • Additional Perks for Learning & Development, Lifestyle Spending, In-Home Office Setup, Professional Memberships, WFH Meals, Internet Stipend and more
  • $3,600 / Year Work from Home Meals
  • $1,500 / Year Career Development & Learning
  • $1,200 / Year Lifestyle Spending Account
  • $1,000 / Year In-Home Office Setup (In addition to Temporal issued equipment)
  • $500 / Year Professional Memberships
  • Fulltime