Azure is the fastest-growing business in Microsoft’s history and the foundation of Microsoft’s commercial cloud services. Within Azure Core, we are advancing Foundational Observability, elevating existing standards and introducing innovations that set a new benchmark for reliability and resilience. As a Senior Software Engineer, you will design and build solutions that deliver step-change improvements in telemetry, detection, and recovery across core infrastructure and foundational services. Your work will enable rapid, localized issue detection and resilient recovery at global scale, ensuring Azure Core meets the highest standards of performance and operational excellence. You will collaborate across Azure and partner teams to integrate with existing systems while introducing modern approaches that maximize impact and efficiency. This role also involves leveraging and contributing to open-source frameworks and communities. You will work in a fast-paced environment, solving complex problems that require creativity and teamwork to deliver meaningful business outcomes.
Job Responsibilities:
Collaborates with appropriate stakeholders to determine user requirements for a scenario
Drives identification of dependencies and the development of design documents for a product, application, service, or platform
Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance and maintainability, effectiveness, and return on investment (ROI)
Leverages subject-matter expertise of product features and partners with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items
Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor the system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system/product/service for simple and complex problems when appropriate
Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
3 years of experience designing and building backend distributed systems or cloud-scale services
3 years of experience collaborating across teams and delivering high-quality solutions
2 years of experience with service reliability engineering and incident management for mission-critical systems
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
Nice to have:
Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
1 year of experience with observability (telemetry, logging, metrics, detection) and operational excellence
Familiarity with open-source frameworks and standards related to observability
Track record of improving system reliability and performance at scale