BTI Professionals provide expert third-line reliability and operational support for BT International’s Global Fabric Network-as-a-Service (NaaS) product, ensuring high availability, performance, and service resilience. We are seeking a Site Reliability Engineering Specialist to support the reliability and operability of the Global Fabric service. This role focuses on service reliability for a live NaaS product, working closely with network and platform teams rather than owning the underlying platform.
Job Responsibilities:
Provide SRE ownership for the Global Fabric NaaS service, ensuring availability, performance, and resilience
Support safe, automated change into production using CI/CD, GitOps, and automated testing
Operate and improve monitoring and observability using Dynatrace, Prometheus, and Elasticsearch
Troubleshoot incidents across Kubernetes-hosted applications, Linux systems, networking, and service integrations
Act as a third-line escalation point, participating in a 24x7 on-call rota
Manage incidents via ServiceNow and track defects and improvements in Jira
Contribute to Scrum ceremonies and PI planning, supporting Agile delivery
Drive automation using Ansible and scripting to reduce operational toil
Mentor and support L2 engineers, improving runbooks, troubleshooting practices, and operational readiness
Requirements:
Experience supporting large-scale, high-availability services in an ISP / NaaS / network-centric environment
Strong Linux troubleshooting and systems knowledge
Hands-on Kubernetes experience operating applications in production
Experience delivering changes using GitOps and CI/CD pipelines (including release validation and rollback awareness)
Working knowledge of incident/problem management in ServiceNow and delivery tracking in Jira (Scrum / PI planning)
Experience with observability tooling: Dynatrace, Prometheus, Elasticsearch, plus event/messaging platforms such as Kafka
Solid networking fundamentals to support effective troubleshooting
Automation experience with Ansible and at least one of Python / Go / Bash
Experience integrating or operating services with LDAP (authentication/authorisation, troubleshooting access issues)
Nice to have:
Exposure to platform or infrastructure operations (VMs, Kubernetes upgrades, storage troubleshooting)
Knowledge of BGP, IS-IS
Experience with Cisco, Juniper, or Nokia platforms
Experience supporting automated testing and controlled production deployments
What we offer:
Cafeteria package: HUF 600,000/year
Performance-based bonus
Comprehensive private health care package for all employees, which can be extended to family members
Nursery support for mothers returning from maternity leave
Extended paternity leave: 10+10 fully paid days
Commuting allowance
Home office allowance
Employee discount opportunities
Highly affordable mobile packages, also available for family members
Car allowance
New high-class offices in both Budapest and Debrecen
Wide range of company and community programmes (including support for various sports activities)
Family-friendly culture
Smart working approach (hybrid working model: 3 days together, 2 wherever)