If you've ever watched a massive data pipeline process billions of records without breaking a sweat, felt genuine satisfaction debugging a business-critical schema migration, or gotten excited about shaving milliseconds off pipeline latency, then you're in the right spot. We're looking for the best data engineers: engineers who have built and scaled data systems that others depend on, who take pride in delivering rock-solid data quality, and who genuinely enjoy the craft of data engineering. If you're the type of person who celebrates when your monitoring dashboards show all green and gets energized by the challenge of making data flow seamlessly across complex systems, this role is for you.

Join us to architect and implement the data backbone that powers Copilot for millions of users worldwide. You'll own the full data lifecycle, from building lightning-fast ETL pipelines that handle massive scale to crafting experimentation frameworks that drive product decisions. We need someone who thrives on solving complex data challenges, loves collaborating with brilliant teammates, and gets genuinely excited about building infrastructure that just works. In our fast-paced environment, you'll have the freedom to innovate and the support to build world-class data products that make a real impact.

This position requires you to be local to the San Francisco or Redmond area and in the office 3 days a week.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Starting January 26, 2026, MAI employees are expected to work from a designated Microsoft office at least four days a week if they live within 50 miles (U.S.) or 25 miles (non-U.S., country-specific) of that location. This expectation is subject to local law and may vary by jurisdiction.
Job Responsibilities:
Build, maintain, and enhance data ETL pipelines for processing large-scale data with low latency and high throughput to support Copilot operations (a minimal sketch of this kind of batch ETL step follows this list)
Design and maintain high-throughput, low-latency experimentation reporting pipelines that enable data scientists and product teams to measure model performance and user engagement
Own data quality initiatives, including monitoring, alerting, validation, and remediation processes, to ensure data integrity across all downstream systems
Implement robust schema management solutions that enable quick and seamless schema evolution without disrupting downstream consumers (see the schema-evolution sketch after this list)
Develop and maintain data infrastructure that supports real-time and batch processing requirements for machine learning model training and inference
Collaborate with ML engineers and data scientists to optimize data access patterns and improve pipeline performance for model evaluation workflows
Design scalable data architectures that can handle growing data volumes and evolving business requirements
Implement comprehensive monitoring and observability solutions for data pipelines, including SLA tracking and automated alerting (a freshness-check sketch follows this list)
Partner with cross-functional teams to understand data requirements and translate them into efficient technical solutions
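To ground the pipeline bullets above, here is a minimal sketch of the kind of batch ETL step they describe, written in PySpark. The paths, columns, and table layout are hypothetical placeholders for illustration, not details from this posting.

```python
# Minimal PySpark batch ETL sketch: extract raw events, clean and
# aggregate them, and load a partitioned table for downstream readers.
# Paths, columns, and table layout are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("copilot-usage-etl").getOrCreate()

# Extract: one day of raw event logs (placeholder path).
raw = spark.read.parquet("s3://example-bucket/raw/events/date=2026-01-26/")

# Transform: drop malformed rows, normalize timestamps, aggregate per user/day.
daily = (
    raw.filter(F.col("user_id").isNotNull())
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .groupBy("user_id", F.to_date("event_ts").alias("event_date"))
    .agg(F.count("*").alias("event_count"))
)

# Load: write partitioned output so consumers can prune by date.
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_usage/"
)
```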
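The schema-management bullet is about letting producers add fields without breaking consumers. One common pattern, sketched here under the assumption of Parquet storage (the path and the client_version column are invented for the example), is additive evolution: new optional columns land in new partitions, and readers reconcile old and new.

```python
# Additive schema evolution sketch: a producer adds an optional column;
# old partitions lack it, new ones carry it. Readers reconcile the two.
# The path and the client_version column are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("schema-evolution-demo").getOrCreate()

# mergeSchema unions the schemas of all Parquet partitions; rows from
# partitions written before the column existed come back as NULL.
events = spark.read.option("mergeSchema", "true").parquet(
    "s3://example-bucket/curated/daily_usage/"
)

# Consumers defend against the column's absence instead of assuming
# every producer has already upgraded.
if "client_version" not in events.columns:
    events = events.withColumn("client_version", F.lit(None).cast("string"))
```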
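And for the monitoring and data quality bullets, a minimal freshness-SLA check of the sort such pipelines run between stages; the path, column, threshold, and alerting behavior are all assumptions for illustration.

```python
# Freshness SLA check sketch: fail loudly when the newest event in a
# table is older than the SLA allows. Path, column, threshold, and the
# alerting behavior are placeholder assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("freshness-check").getOrCreate()

FRESHNESS_SLA_SECONDS = 2 * 60 * 60  # hypothetical 2-hour SLA

lag_seconds = (
    spark.read.parquet("s3://example-bucket/raw/events/")
    .agg(
        (
            F.unix_timestamp(F.current_timestamp())
            - F.unix_timestamp(F.max("event_ts"))
        ).alias("lag_seconds")
    )
    .collect()[0]["lag_seconds"]
)

if lag_seconds > FRESHNESS_SLA_SECONDS:
    # A production check would emit a metric and page on-call instead.
    raise RuntimeError(f"Freshness SLA breached: data is {lag_seconds}s behind")
```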
Requirements:
Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years of experience in business analytics, data science, software development, data modeling, or data engineering
OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6+ years of experience in business analytics, data science, software development, data modeling, or data engineering
OR equivalent experience
Experience building and maintaining production data pipelines at scale using technologies such as Apache Spark, Kafka, or similar distributed processing frameworks
Experience writing production-quality Python, Scala, or Java code for data processing applications
Experience building and scaling experimentation frameworks
Experience with cloud data platforms (Azure, AWS, or GCP) and their data services
Experience with schema management and data governance practices
Experience with real-time data processing and streaming architectures
Experience with data orchestration frameworks such as Airflow, Prefect, Dagster, or similar workflow management systems (see the DAG sketch after this list)
Experience with containerization technologies (Docker, Kubernetes) for data pipeline deployment
Demonstrated experience with data quality frameworks and monitoring solutions
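As a concrete reference for the orchestration requirement, here is a minimal Airflow 2.4+ DAG wiring a daily extract, transform, and validate chain; the dag_id and task bodies are hypothetical.

```python
# Minimal Airflow (2.4+) DAG sketch: a daily extract -> transform ->
# validate chain. The dag_id and task bodies are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract() -> None:
    print("pull yesterday's raw events")  # placeholder

def transform() -> None:
    print("clean and aggregate")  # placeholder

def validate() -> None:
    print("run data quality checks")  # placeholder

with DAG(
    dag_id="daily_usage_pipeline",
    start_date=datetime(2026, 1, 1),
    schedule="@daily",  # `schedule` replaced `schedule_interval` in Airflow 2.4
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    validate_task = PythonOperator(task_id="validate", python_callable=validate)

    extract_task >> transform_task >> validate_task
```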