About the Role: Partner with stakeholders and lead team efforts to build and maintain machine learning backend services and solutions that support user-facing products, downstream services, or infrastructure tools and platforms used across Uber.
Job Responsibilities:
Design and build tools to empower production teams to innovate and productionize state-of-the-art deep learning models at Uber
Develop and maintain scalable, end-to-end deep learning training systems and frameworks
Ensure distributed training tools are reliable, efficient, and flexible enough to support new production use cases
Collaborate with cross-functional teams including machine learning engineers, backend engineers, data scientists, and data engineers to deliver robust ML solutions for Uber
Requirements:
Master's degree in a relevant field (CS, EE, Math, Stats, etc.) and 6 years of full-time software engineering work experience in deep learning
Proficiency in Python and PyTorch
Expertise in designing, debugging, and optimizing distributed deep learning systems
Working experience with distributed training in PyTorch at scale (e.g., data parallelism, model parallelism)
Strong ability to translate complex DL requirements and problems into scalable solutions
Nice to have:
Expertise in distributed training frameworks such as DDP, DeepSpeed, FSDP, or TorchRec
Familiarity with C++, Go, or CUDA programming
Expertise in optimizing GPU/TPU training performance and data loading efficiency
Familiarity with large-scale distributed infrastructure tools such as Ray, OpenAI Triton, or PyTorch Lightning
Experience building and deploying end-to-end machine learning systems in production
Experience training large models (10B+ parameters), such as large recommendation systems or large language models (LLMs)
PhD in a relevant field (CS, EE, Math, Stats, etc.)
What we offer:
Eligible to participate in Uber's bonus program
May be offered an equity award and other types of compensation
All full-time employees are eligible to participate in a 401(k) plan