This newly established team will operate independently from our core trading operations, serving as a long-term innovation engine to explore, develop, and apply the most advanced AI and machine learning technologies to complex problems. From foundational model development to novel AI systems, the AI Lab will be at the forefront of pushing boundaries.
Job Responsibilities:
Build the compute platform and machine learning libraries for large-scale machine learning and simulation workloads
Focus on compute platform stability and efficiency on both CPU and GPU clusters, making the platform observable and scalable
Use cluster monitoring and profiling tools to identify bottlenecks and optimize both infrastructure and software systems
Design, build, and improve our compute platform for model training and simulations on petabyte-scale data, across a wide range of machine learning models, by leveraging our existing research infrastructure
Requirements:
Solid experience running production machine learning infrastructure at large scale
Experience in designing, deploying, profiling and troubleshooting in Linux-based computing environments
Proficiency in containerization, parallel computing and distributed training algorithms
Experience with storage solutions for large-scale, cluster-based, data-intensive workloads
Nice to have:
Experience supporting machine learning researchers or data scientists running production workloads
What we offer:
Competitive compensation and long-term incentives
Access to cutting-edge infrastructure, compute, and proprietary datasets
A high-trust, low-bureaucracy environment with exceptional resources