Site Reliability Engineering Support Lead role focused on application support, developing and maintaining CI/CD frameworks, and leading technical support teams in a hybrid work environment. The job involves addressing production issues, mentoring team members in SRE principles, and ensuring the availability and reliability of critical systems.
Job Responsibilities:
Taking end-to-end ownership of application support and issue resolution for production systems
Implementing, monitoring, and maintaining CI/CD frameworks
Developing new capabilities and coordinating their implementation across a large number of teams, including infrastructure, developer tools, and information security
Influencing a culture of Site Reliability Engineering. Engaging in training and mentoring to help other engineers develop an SRE mindset
Providing first-line post-deployment technical support at the L1 and L2 levels for applications and/or associated production systems, including diagnostics and network health monitoring
Coordinating and/or deploying hands-on fixes, patches, and software updates at the application level and, as appropriate, at the network level
Managing a team of technical support engineers who provide technical support to users
Escalating complex problems to the L3 level of expertise within the organization, along with observations from investigative and diagnostic assessments
Coordinating the investigation of recurring technical issues affecting user systems and seeing them through to resolution
Escalating and resolving production incidents, guiding the team, and tracking incidents to closure
Monitoring and tracking issue/incident tickets through the service desk and other channels, and researching, diagnosing, troubleshooting, and identifying solutions to resolve system/application issues in a timely manner, while ensuring service level agreements (SLAs) are met or exceeded
Providing periodic and/or emergency support (i.e., on-call support) as needed
Planning, testing, and driving execution of contingency plans for production systems to ensure availability
Understanding and driving impact analysis to identify the root cause of production issues
Using ServiceNow or a comparable ticketing tool
Testing from an operations and availability perspective: reviewing releases for operational gaps and testing them to ensure a smooth rollout
Requirements:
Solid SRE process experience
5+ years of experience leading a high-performance, 24x7 DevOps or SysOps team
Proficiency in Windows administration, Office 365, Exchange, SharePoint, Active Directory, Backup, Networking and Infrastructure
Experience with Microsoft Windows and Windows Server operating systems
Experience tracking tickets and resolving them on time
Hands-on experience with ticketing tools (ServiceNow)
Excellent verbal, written, presentation and interpersonal communication skills
Ability to make complex technical matters easy to understand for non-technical audiences
What we offer:
Competitive base salary (which is annually reviewed)
Hybrid working model (up to 2 days working at home per week)
Additional benefits to support you and your family to be well, live well and save well.