Site Reliability Operations III

Walmart

Location:
United States of America, Bentonville

Contract Type:
Employment contract

Salary:

80000.00 - 155000.00 USD / Year

Job Description:

The Command & Control Center is the nerve center for Walmart Global Technology. On the Logistics Support team, we proactively monitor critical supply chain applications and infrastructure, providing early warnings and rapid response to potential disruptions. Our team ensures seamless operations by swiftly mitigating incidents and leveraging advanced automation and AI-driven monitoring to keep Walmart’s supply chain resilient and efficient.

Job Responsibility:

  • Monitor and alert on software or system performance, determining thresholds for monitoring metrics and triggering alerts based on those thresholds
  • Supervise specific procedures to proactively check the health of applications and infrastructure, including a variety of operating systems, hardware, and software
  • Investigate and diagnose incidents to restore a failed IT service as quickly as possible and within specified SLAs
  • Document troubleshooting steps and service restoration details for knowledge management
  • Act as liaison between Tech and external support to resolve escalated incidents and ensure timely closure
  • Record and classify received incidents and undertake immediate corrective action for moderate complexity queries under moderate supervision
  • Research and recommend alternative actions for incident resolution
  • Contribute to command-and-control related activities focused on restoration of complex outages
  • Conduct complex maintenance procedures for applications independently
  • Monitor and evaluate the performance of the application by tracking and analyzing appropriate metrics
  • Perform maintenance (corrective, adaptive, perfective) and re-engineering activities
  • Analyze application logs, maintenance activity data, performance data, and provide analysis
  • Evaluate change requests to identify those which are valid and feasible
  • Troubleshoot performance and availability bottlenecks for assigned application independently
  • Triage to detect and determine symptom versus cause of defects
  • Actively provide data for and participate in RCA
  • Build, maintain, and enhance effective internal and external partnerships
  • Influence technical outcomes and assist in communicating shared goals with diverse groups and parties
  • Identify and address additional partner technical needs and educate them on value creation
  • Communicate with other individuals or teams to solve shared business problems cooperatively
  • Bring ideas and technical solutions proactively to business partners and stakeholders

Requirements:

  • Strong communication and interpersonal skills
  • Experience with Jira, Looper, and Kubernetes
  • Familiarity with Grafana and ability to write queries (PromQL)
  • GitHub experience
  • Database knowledge is preferable but not required
  • Ability to work independently and make decisions with guidance
  • Comprehension of changes to methodologies and resources, and ability to articulate the same
  • Experience with cloud applications and ability to pull logs
  • Strong analytical and problem-solving skills
  • Ability to work collaboratively with cross-functional teams
  • Experience with incident management and troubleshooting
  • Strong technical skills, including proficiency in monitoring and alerting, incident management, and DevOps orientation
  • Immigration sponsorship is not available for this role

Nice to have:

  • Experience in site reliability operations, site and system administration, infrastructure management, or related area
  • Master's degree in site reliability operations, site and system administration, infrastructure management, or related area
  • SRE certification (for example, IBM Cloud Site Reliability Engineer)
  • We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly. The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart’s accessibility standards and guidelines for supporting an inclusive culture.
What we offer:
  • Multiple health plan options, including vision & dental plans for you & dependents
  • Financial benefits including 401(k), stock purchase plans, life insurance and more
  • Associate discounts in-store and online
  • Education assistance for Associates and dependents
  • Parental Leave
  • Pay during military service
  • Paid time off, including vacation, sick, and parental leave
  • Short-term and long-term disability for when you can't work because of injury, illness, or childbirth
  • Incentive awards for your performance
  • Maternity and parental leave, PTO, health benefits
  • Performance-based bonus awards
  • Company discounts
  • Adoption and surrogacy expense reimbursement

Additional Information:

Job Posted:
January 07, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Site Reliability Operations III

Site Reliability Engineer III

Under limited supervision, the Site Reliability Engineer III is responsible for ...
Location:
United States, Birmingham
Salary:
Not provided
Alliance Automotive UK LV Ltd
Expiration Date
Until further notice
Requirements:
  • Typically requires a bachelor's degree and five (5) or more years of related experience or an equivalent combination
  • Understanding of Kubernetes, containers, clusters, and elastic scalability
  • Expertise in SRE principles
  • Mindset of continually finding ways to drive scalability, stability, and performance
  • Cloud Services experience with Google Cloud Platform (GCP)
  • Experience with API, service-based or microservice-based architecture
  • Proficiency in infrastructure, network, database, operating systems, or security troubleshooting and remediation
  • Architecture-level knowledge of Windows and Linux and Infrastructure systems
  • Experience with production deployment, monitoring, and operational support for enterprise-class applications (Dynatrace a plus)
  • Experience working with Continuous Integration/Continuous Deployment tools
Job Responsibility:
  • Gathers and analyzes metrics from monitoring platforms to assist in performance tuning and fault tolerance
  • Partners with development teams to improve services through testing and release procedures
  • Participates in system design, platform management and capacity planning
  • Balances feature development speed and reliability with service-level objectives
  • Works closely with the incident response team to restore service to normal operation
  • Understands debugging and applies troubleshooting skills
  • Investigates, blocks and rate-limits unwanted traffic
  • Utilizes monitoring systems and dashboards for proactive changes and alerting
  • Establishes continuous process improvement cycles where the process, performance, and supporting technologies are reviewed and enhanced where applicable
  • Performs other duties as assigned
What we offer:
  • Options for healthcare coverage, 401(k), tuition reimbursement, vacation, sick, and holiday pay
Employment Type:
Full-time

Site Reliability Engineer III

Zuora’s Cloud Engineering teams are responsible for Cloud infrastructures, monit...
Location:
India, Chennai
Salary:
Not provided
Zuora
Expiration Date
Until further notice
Requirements:
  • 6-8 years of relevant experience in SRE/DevOps
  • Proven hands-on working experience with core AWS services (e.g., EC2, VPC, S3, RDS, IAM, CloudWatch, EKS/ECS)
  • Deep expertise in infrastructure-as-code principles using Terraform for provisioning and state management
  • Expert-level knowledge and practical experience with configuration management tools such as Puppet and/or Ansible
  • Strong experience setting up, maintaining, and enhancing Continuous Integration/Continuous Deployment pipelines using Jenkins
  • Proficiency in scripting languages, particularly Python and/or Shell scripting, for developing automation tools and performing system administration tasks
  • Advanced knowledge of Linux operating systems, including performance tuning, troubleshooting, security, and networking fundamentals
  • Working knowledge and operational experience with distributed messaging queues, specifically Kafka
Job Responsibility:
  • Maintain and improve the reliability, scalability, and performance of our production systems, targeting a high-availability environment
  • Design, implement, and maintain automation solutions for infrastructure provisioning, deployment, configuration management, and monitoring using Terraform and Jenkins
  • Administer, manage, and optimize our cloud infrastructure primarily hosted on AWS, focusing on cost efficiency and secure operations
  • Develop and maintain infrastructure-as-code using Puppet and/or Ansible to ensure consistent and reproducible environments
  • Participate in on-call rotation, troubleshoot and resolve critical production incidents, and conduct comprehensive post-mortems to prevent recurrence
  • Apply strong Linux administration skills to manage, patch, and secure operating systems and underlying infrastructure
  • Manage and optimize distributed messaging systems, specifically Kafka, ensuring high throughput and data integrity
What we offer:
  • Competitive compensation, variable bonus and performance reward opportunities, and retirement programs
  • Medical Insurance
  • Generous, flexible time off
  • Paid holidays, “wellness” days, and a company-wide end-of-year break
  • Learning & Development stipend
  • Opportunities to volunteer and give back, including charitable donation match
  • Free resources and support for your mental wellbeing

Site Reliability Engineer III

Under limited supervision, the Site Reliability Engineer III is responsible for ...
Location:
United States, Birmingham, Alabama
Salary:
Not provided
Genuine Parts Company
Expiration Date
Until further notice
Requirements:
  • Typically requires a bachelor's degree and five (5) or more years of related experience or an equivalent combination
  • Understanding of Kubernetes, containers, clusters, and elastic scalability
  • Expertise in SRE principles
  • Mindset of continually finding ways to drive scalability, stability, and performance
  • Cloud Services experience with Google Cloud Platform (GCP)
  • Experience with API, service-based or microservice-based architecture
  • Proficiency in infrastructure, network, database, operating systems, or security troubleshooting and remediation
  • Architecture-level knowledge of Windows and Linux and Infrastructure systems
  • Experience with production deployment, monitoring, and operational support for enterprise-class applications (Dynatrace a plus)
  • Experience working with Continuous Integration/Continuous Deployment tools
Job Responsibility:
  • Gathers and analyzes metrics from monitoring platforms to assist in performance tuning and fault tolerance
  • Partners with development teams to improve services through testing and release procedures
  • Participates in system design, platform management and capacity planning
  • Balances feature development speed and reliability with service-level objectives
  • Works closely with the incident response team to restore service to normal operation
  • Understands debugging and applies troubleshooting skills
  • Investigates, blocks and rate-limits unwanted traffic
  • Utilizes monitoring systems and dashboards for proactive changes and alerting
  • Establishes continuous process improvement cycles where the process, performance, and supporting technologies are reviewed and enhanced where applicable
  • Performs other duties as assigned
What we offer:
  • Options for healthcare coverage, 401(k), tuition reimbursement, vacation, sick, and holiday pay
Employment Type:
Full-time

Maintenance Technician III

Summit Skilled Solutions is seeking an experienced Maintenance Technician III to...
Location:
United States, Orlando
Salary:
28.00 - 35.00 USD / Hour
Summit Skilled Solutions
Expiration Date
Until further notice
Requirements:
  • High School Diploma or equivalent required
  • Minimum of two (2) years of training through an accredited trade school or college; certificate or degree preferred
  • Minimum of 7 years of experience in mechanical, electrical, plumbing, carpentry, or industrial maintenance
  • Valid state driver’s license
  • Forklift operator certification
  • Scissor lift and aerial lift (JLG) certification
  • Universal CFC certification required
  • Any required state or national trade licenses must be obtained and maintained by the employee
  • Proficient in the use of hand tools and small and large power tools
Job Responsibility:
  • Comply with all safety policies and procedures, including OSHA regulations and lockout/tagout requirements
  • Conduct routine “shift rounds” to inspect systems and equipment, identify issues, and document performance data
  • Maintain, troubleshoot, and repair facility equipment, including: Electrical installation, repair, and maintenance of equipment and controls
  • Installation, maintenance, and repair of plumbing and piping systems and related components
  • Installation, repair, and maintenance of mechanical and electrical operating equipment and machinery
  • Perform routine and ongoing assessments of building system operations
  • Conduct testing and data analysis to verify proper operation of site equipment
  • Monitor mechanical, electrical, and other facility systems to ensure reliable operation
  • Perform work in accordance with manufacturing standards and approved change-management processes
  • Complete administrative tasks including parts ordering, purchase order creation, vendor coordination, and participation in job and project meetings
Employment Type:
Full-time

Electronics Technician III

We are seeking an AV Software Engineer to join our Security and Electronic System...
Location:
Germany, Stuttgart
Salary:
Not provided
M.C. Dean, Inc
Expiration Date
Until further notice
Requirements:
  • Active Secret Clearance Required
  • U.S. Citizenship
  • Ability to travel up to 25%
  • HS diploma or GED
  • Military Electronics Training (minimum 720 classroom hours) or Graduation from an accredited Electronics Technician program or Graduation from an Electrical Apprenticeship program
  • An additional three (3) years of electronics installation and/or maintenance activities
  • 6+ years of electronics installation and/or maintenance activities on multiple systems and with multiple customer programs
  • Strong oral, written, and presentation skills
  • Demonstrated background working with multidisciplinary teams
  • Demonstrated time management and organization skills to meet deadlines and quality objectives
Job Responsibility:
  • Executes various technical tasks and responsibilities within field operations
  • Performs on-site installations, maintenance, troubleshooting, and repairs of equipment and systems
  • Ensures the functionality and reliability of various technologies
  • Conducts site surveys, configures hardware and software, tests systems for proper operation, and provides technical support to customers
  • Utilizes and comprehends project Safety plan (JHA, AHA, PFW), enforcing M.C. Dean handbook and policies
  • Participates in quality reviews of M.C. Dean design documentation, analyzing and interpreting drawing packages to evaluate constructability
  • Tracks project metrics and participates in weekly resource allocation meetings
  • Verifies correct charges in timesheets on projects
  • Executes installation and maintenance activities within planned durations, ensuring completion of detailed documentation
  • Tracks and inventories tools, conducts tool inspections, and organizes material ordering and receiving
What we offer:
  • A collaborative team inspired by the way engineering and innovation enhance customer outcomes, improve lives, and change the world for the better
  • An opportunity to lead and build a business with the support of an industry-leading firm that has been in business for 75 years
  • Investment in your skills and expertise through a combination of professional and technical training programs, including leadership training and tuition reimbursement
  • Open and transparent communication with senior leadership as well as local office management
Employment Type:
Full-time

Principal Process Engineer III

The Principal Process Engineer III provides process engineering support to the I...
Location:
United States, Houston
Salary:
Not provided
Airswift Sweden
Expiration Date
Until further notice
Requirements:
  • A Bachelor of Science degree in Process / Chemical Engineering or a relevant Engineering discipline
  • 15+ years’ industry-related experience, with a minimum of 10 years supporting the production of oil & gas facilities
  • Strong technical knowledge of separation, dehydration, and compression is required
  • Knowledge of flow assurance risks associated with offshore deepwater production facilities
  • Experience with process & hydraulic simulation tools (e.g., Aspentech HYSYS)
  • Detailed expertise in Process Safety
  • Experience with relevant process safety risk assessments and functional safety tools
  • Experience in Incident Response (or willing to participate)
  • Professional Engineer / Chartered Engineer status preferred
  • Working knowledge of API codes affecting process design (e.g., API RP 14C, 14E, 520, 521)
Job Responsibility:
  • Provide technical leadership in process engineering across topsides equipment, subsea flowline, and onshore storage and off-take equipment to the operations and maintenance teams
  • Foster effective relationships with both office and site-based teams
  • Provide subsea flowline and topsides equipment process engineering support to Operations
  • Develop process solutions to ensure safe and reliable operations, which may include additional new or temporary equipment
  • Lead or participate as a member of Root Cause Analysis (RCA) teams as required
  • Manage the design, detailed engineering, and work management planning of facilities process modifications using an MOC process
  • Participate in HAZOPs, HAZIDs, LOPAs, other risk assessments, and Design Reviews
  • Provide governance and oversight to ensure engineering practices are carried out in accordance with the required technical standards, processes, and procedures while operating in compliance with local regulations and corporate standards
  • Own and continually improve the process surveillance tools and KPIs across the operated facilities
  • Provide technical expertise and direction in “Case to Operate” scenarios related to process engineering

Maintenance Supervisor

Maintenance Supervisor for Puma Energy, responsible for managing contractors, pr...
Location:
South Africa, Sandton
Salary:
Not provided
Puma Energy
Expiration Date
Until further notice
Requirements:
  • University degree in Mechanical, Civil, or Electrical Engineering or another engineering discipline, or 5+ years of experience, preferably in petroleum industry maintenance and construction
  • Proven experience in an industrial capacity with proven record of contractor management
  • Experience establishing and promoting best practices to help improve project visibility and predictability
  • Previous supervisory and management experience is preferred
  • Proficiency in maintenance management software (TAG, MMS, etc.)
  • Financial Acumen is required
  • Strong understanding of Maintenance methods, materials, and regulations
  • Excellent leadership and contractor management skills
  • Strong problem-solving abilities and the capacity to make quick, effective decisions
  • Excellent communication skills, with the ability to liaise effectively with various stakeholders
Job Responsibility:
  • Manage Puma Energy contractors
  • To provide an effective and efficient Management Service translating these into functional requirements, technical specifications, project formulation, design, tender formulation, change management, standard setting, search and selection of vendors, installers and supports, construction and budget
  • Providing efficient leadership and management to the construction and maintenance field operations and ensure that Puma’s standards and policies are complied with
  • Improve equipment availability and reliability, aligning both with operational needs (increase MTBF and decrease MTTR)
  • Approve the Annual Maintenance Plan and the Annual Maintenance Budget (Opex)
  • Control costs against the equipment budget, correcting deviations based on analysis of the cost structure: i) materials, ii) outside service providers, iii) tools, and iv) labor
  • Ensure contractor management for projects and Maintenance to reflect quality of work in a lower TCO of equipment/components, to sustain equipment performance over the assets’ lifecycle
  • Drive contractor Key Performance Indicators across all functions to ensure all Retail, Terminals, and Business-to-Business maintenance programs are timely executed and delivered to the highest workmanship and standards
  • Maintain an updated "5 Year Replacement Program" of equipment and components
Employment Type:
Full-time

Mobile Maintenance Engineer Electrical Bias

As a Foot Mobile Maintenance Engineer, you will be responsible for delivering bo...
Location:
United Kingdom, London
Salary:
44000.00 - 46000.00 GBP / Year
paretofm
Expiration Date
Until further notice
Requirements:
  • City & Guilds Level III (or equivalent) in Electrical discipline
  • City & Guilds Level III (or equivalent) in Mechanical discipline
  • 18th Edition I.E.E
  • Strong understanding of Electricity at Work Regulations and safe working practices
  • Proven experience in maintenance, testing, and fault finding across building services systems
Job Responsibility:
  • Carry out planned preventative maintenance (PPM) and reactive works on mechanical and electrical systems
  • Maintain plant, equipment, and associated components in good working order
  • Review and monitor subcontractor performance during specialist service visits and repair works
  • Complete simPRO instruction sets accurately, including photographic evidence for completed tasks
  • Maintain site logbooks and compliance records for electrical, mechanical, water systems, and health & safety
  • Escalate any issues relating to statutory planned maintenance to the Supervisor or Management team
  • Maintain comfortable environmental conditions through effective operation of HVAC systems
  • Respond to plant and equipment failures, carrying out fault finding on mechanical and electrical systems
  • Respond to BMS alarms and use the system for first-line diagnostics
  • Identify and report potential hazards or safety concerns promptly
Employment Type:
Full-time