Reinforcement learning intern Job at Enchanted Tools (Paris)

AI Research Engineer - Reinforcement Learning

At Helsing we deliver AI-based capabilities and the enabling infrastructure that...

Location

Germany, Munich

Salary:

Not provided

Helsing

Expiration Date

Until further notice

Requirements

Hold MSc in machine learning with a speciality in either reinforcement learning, multi-agent systems, automation and control, or robotics
Have excellent communication skills and the ability to report and present research findings clearly and efficiently both internally and externally
Are passionate about keeping up-to-date with current research and enjoy reimplementing / extending papers on state-of-the-art Deep Learning-based approaches
Possess solid software engineering skills, writing clean and well-structured code in Python and/or languages like Rust, Java, or modern C++, and experience deploying AI software to production including testing, QA, and monitoring

Job Responsibility

Design, train and deploy agents in complex multi-agent environments
Contribute to our reinforcement learning stack by implementing, improving and extending the current state of the art in multi-agent reinforcement learning
Be a part of impactful projects and collaborate with people across several teams and backgrounds to integrate cutting-edge ML/AI into our production systems

What we offer

Competitive compensation and stock options
Relocation support
Social and education allowances
Regular company events and all-hands to bring together employees as one team across Europe
A hands-on onboarding program (affectionately labelled “AI-duction”), in which you will be familiarising yourself with our tools and ML pipelines used across the company

Fulltime

Associate Director, Reinforcement Learning (ML)

Lead Amgen’s strategy and execution for Reinforcement Learning from Human Feedba...

Location

United States, Thousand Oaks; Jacksonville

Salary:

Not provided

Amgen

Expiration Date

Until further notice

Requirements

Doctorate degree and 3 years of Computer Science, IT or related field experience
Master’s degree and 5 years of Computer Science, IT or related field experience
Bachelor’s degree and 7 years of Computer Science, IT or related field experience
Associate’s degree and 12 years of Computer Science, IT or related field experience
High school diploma / GED and 14 years of Computer Science, IT or related field experience
Deep, hands-on expertise in Reinforcement Learning from Human Feedback (RLHF) and/or advanced reinforcement learning, including reward modeling, policy optimization, exploration strategies, and offline/online evaluation
Demonstrated experience deploying RLHF or RL systems into production for real-world applications (e.g., large language models, recommendation systems, decision support tools, or workflow automation), ideally in healthcare, life sciences, or other regulated domains
Strong background in modern machine learning and deep learning, with practical experience in Python and frameworks such as PyTorch or TensorFlow, and familiarity with LLM ecosystems and tooling
Experience driving sophisticated, cross-functional initiatives, collaborating with non-technical stakeholders (e.g., physicians, scientists, commercial leaders, compliance, legal) and translating needs into impactful AI solutions
Strong ability to communicate complex technical topics simply, tailoring content to senior executives and non-technical audiences

Job Responsibility

Lead the design and development of RLHF systems including reward modeling, policy optimization, safety and alignment mechanisms, and evaluation frameworks for large language models and other AI systems
Drive hands-on technical execution, particularly for high-impact projects, reviewing architectures, experimentation plans, and code, and helping the team navigate scientific and engineering trade-offs
Establish best-practice pipelines for human feedback, partnering closely with internal customer teams to define feedback protocols, annotation quality standards, and governance for RLHF data
Define and track success metrics for RLHF systems, balancing offline and online evaluation, A/B tests, safety and robustness criteria, and business or scientific outcomes
Collaborate across Amgen leaders to ensure RLHF solutions are aligned with strategy, compliant with policy, and integrated into real workflows
Partner with Data, Platform and Technology teams to ensure that RLHF workloads are supported by scalable data platforms, model hosting, experimentation infrastructure, and MLOps best practices
Champion responsible and compliant AI, working with Legal, Compliance, and Information Security to implement governance around human feedback, data usage, model behavior, transparency, and risk management in a regulated environment
Communicate insights and influence senior stakeholders, creating clear narratives, roadmaps, and recommendations that help executives understand RLHF trade-offs, risks, and opportunities

What we offer

A comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
Stock-based long-term incentives
Award-winning time-off plans
Flexible work models where possible

Machine Learning Research Associate

The Machine Learning research team at Hewlett Packard Labs seeks highly motivate...

Location

United States, Milpitas

Salary:

43.27 - 93.15 USD / Hour

Hewlett Packard Enterprise

Expiration Date

Until further notice

Requirements

Pursuing a Ph.D. degree (with significant research and innovation experience) in a relevant discipline (e.g. machine learning, computer science, electrical engineering, statistics, etc.)
Track record of world-class innovative contributions and ideas in machine learning
Experience in deep learning, LLM, Agentic AI, and reinforcement learning research
Experience in developing deep learning software with high proficiency in data structures and algorithms
Experience in Machine Learning frameworks like PyTorch - required
Strong programming skills and experience with Python
Software development experience in Deep Learning, GPU acceleration, and Model Optimization
Demonstrated effective communication and collaboration skills
Demonstrated ability to produce original research, with papers published in top-tier conferences or journals

Job Responsibility

Provide thought leadership and technical influence both internally and externally to HPE
Work on cutting-edge machine learning research focusing on Large Language Models, Agentic AI, and Reinforcement Learning
Contribute along the full range from initial novel ideas to design, development, implementation, evaluation, and technology transfer
Publish in top AI conferences and workshops, including NeurIPS, AAAI, and ICML.

What we offer

Health & Wellbeing
Personal & Professional Development
Unconditional Inclusion

Fulltime

Machine Learning Research Scientist

This role focuses on cutting-edge research and development in Artificial Intelli...

Location

United States, Milpitas

Salary:

117500.00 - 270000.00 USD / Year

Hewlett Packard Enterprise

Expiration Date

Until further notice

Requirements

PhD in Computer Science, Electrical Engineering, or a related field, with a dissertation focused on Machine Learning
Extensive experience in deep learning research, preferably in Large Language Models or Reinforcement Learning
Experience developing applications with deep learning frameworks like PyTorch, with high software proficiency
Strong programming skills in Python, data structures, and algorithms are required
Experience with ML model optimization, GPU acceleration, heterogeneous computation, system software, and performance optimization is desired
Experience with Python web frameworks (Django, Flask) is a plus but not required

Job Responsibility

Conducting research, developing solutions, and creating intellectual property in emerging fields like reinforcement learning, LLMs, digital twins, clean energy, data center optimization, and sustainability
Developing advanced technologies for analysis, optimization, time series forecasting, uncertainty quantification, and control
Providing thought leadership, collaborating internally and externally, and contributing to HPE’s strategy by identifying emerging technologies
Publishing in top conferences like NeurIPS, AAAI, and ACL
Developing patent applications
Software development, GPU acceleration, model optimization, and real-time data streaming to create robust AI solutions for real-world use cases

What we offer

A competitive salary and extensive social benefits
Diverse and dynamic work environment
Work-life balance and support for career development
Health and wellbeing programs
Personal and professional development programs
Diversity, inclusion, and belonging initiatives

Fulltime

Engineering Director

We are seeking a seasoned Engineering Director who thrives in challenging and fa...

Location

Puerto Rico, Aguadilla

Salary:

Not provided

Hewlett Packard Enterprise

Expiration Date

Until further notice

Requirements

Significant work experience as a director or in a similar position working across multiple stakeholder organizations, with at least 10 years of people leadership experience specific to SW and Cloud engineering
Solid experience leading SW development across storage, networking, on-prem, and SaaS is a must
Experience in setting up geographically distributed sites
Must have a strong background in software development lifecycle including cloud infrastructure
Familiarity with agile methodologies and tools like JIRA
Prior experience in cloud product development and deployments, with end-to-end ownership and accountability
Solid understanding of fundamental AI and machine learning concepts, including supervised and unsupervised learning, deep learning, reinforcement learning, natural language processing, computer vision, and statistical modeling
Extensive business acumen, technical knowledge, and industry experience encompassing one or more engineering, technology, and product domains
Demonstrated abilities to drive transformation across a business with exceptional skills in the management of change

Job Responsibility

Oversee the Puerto Rico site's daily operations, strategic planning, and cross-functional team leadership for Hybrid Cloud
Recruit, mentor, and manage teams of AI/ML engineers, QA Engineers, Design Engineers and innovation specialists to deliver cutting-edge solutions
Continuously evaluate new tools, platforms, and frameworks in AI/ML to drive competitive advantage and operational efficiency
Ensure alignment with corporate goals while fostering a high-performance culture, operational efficiency, and employee engagement
Lead the development and execution of AI/ML strategies that align with business goals and drive innovation across products, services, or operations
Create strategic and tactical operations and resource plans, goals, and priorities for the assigned organization based on the business and technology roadmap and functional objectives
Engage with various senior leaders across the organization, program managers, R&D, support, Quality, product managers, technical leaders and executives to communicate program status, escalate issues, and guide and influence strategic decision-making
Manage senior relationships and escalated issues with outsourced partners and suppliers, including setting expectations regarding deliverables, product quality, schedules, and costs; ensure that the organization is effectively leveraging outsourced resources
Identify opportunities for and drive organizational initiatives and programs to support business process improvements and cost reductions

What we offer

Health & Wellbeing
Personal & Professional Development
Unconditional Inclusion

Fulltime

Research Scientist Intern, Reinforcement Learning

We’re looking for a curious and motivated Reinforcement Learning Intern to help ...

Location

United Kingdom, London

Salary:

Not provided

Wayve

Expiration Date

Until further notice

Requirements

Currently pursuing a PhD or Master's in Computer Science, Robotics, Electrical Engineering, or a related field, with a focus on Machine Learning, AI, or Computer Vision
Experience in research in Reinforcement Learning
Interest in one or more: synthetic data, representation learning, and Offline RL
Comfortable working in Python and libraries like PyTorch, NumPy, and Pandas
A principled mindset: you enjoy brainstorming, making assumptions, building, testing, and iterating on ideas to see what works

Job Responsibility

Help advance the next generation of decision-making systems for autonomous driving
Work embedded in a research team to develop scalable RL algorithms that enable vehicles to learn complex behaviors directly from experience — both in simulation and the real world

What we offer

Competitive compensation and benefits
A dynamic and fast-paced work environment in which you will grow every day, learning on the job from a diverse team of the brightest researchers and engineers in this space
A culture that is ego-free, respectful and welcoming
Potential to publish your research work at a top-flight conference
The chance to be part of a truly mission-driven organisation and an opportunity to shape the future of autonomous driving

PhD Autonomy Engineer Intern - Planning & Controls (Reinforcement Learning)

Skydio builds the world’s most advanced autonomous drones used across inspection...

Location

Switzerland, Zurich

Salary:

50.00 EUR / Hour

Skydio

Expiration Date

Until further notice

Requirements

PhD student in Robotics, Machine Learning, Controls, or related field
Strong fundamentals in RL, control theory, and motion planning; comfort with safety/robustness concepts
Proficient in Python (PyTorch/JAX/Ray RLlib) and at least one of C++ or CUDA
Hands-on experience with robotics simulation (Isaac Lab/MuJoCo/PyBullet) and sim2real techniques
Experience training/deploying policies for navigation, manipulation, or locomotion on real robots or autonomous vehicles

Job Responsibility

Develop and deploy reinforcement learning (and adjacent policy-learning methods) that make Skydio aircraft plan, navigate, and control themselves more intelligently—safely, reliably, and efficiently—across our ecosystem: handheld apps, ground control, cloud autonomy services, and fleet workflows
Navigation & avoidance in the wild: Train policies that adapt online to cluttered 3D scenes (forests, bridges, urban canyons), complementing our geometric stack for robust obstacle avoidance and dynamic goal-seeking
RL-augmented planning: Fuse learned cost shaping / value functions with trajectory optimization for smooth, agile flight with tight safety envelopes and mission constraints
Sim → Real at scale: Build scalable datasets and training loops with Isaac Lab, domain randomization, residual learning, and safety filters; validate on real drones weekly
Human-in-the-loop shared control: Learn assistive policies that blend pilot intent, autonomy priors, and uncertainty-aware behaviors for intuitive control handoffs
Fleet & multi-agent: Explore decentralized coordination for coverage, pursuit, and collaborative mapping with minimal comms

Critical Environment Technical Trainer

As a CO+I Learning CE Technical Trainer you will contribute to establishing tech...

Location

United Kingdom, London

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree AND 4+ years' experience in training, education, critical environments (CE), cloud systems, datacenter environments, server environments, or computer technologies - OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Align learning preparation with relevant metrics for measuring success (e.g., cloud consumption, survey feedback)
Build awareness of the latest features, technologies, and processes and incorporate those into current learning experiences to ensure quality learning environment and delivery
Deliver single- to multi-day training or other learning experiences in person or virtually with minimal supervision, leveraging a variety of delivery methods, including presentations, discussions, labs, and simulations
Use digitally enhanced instructor-led resources to effectively deliver technology-based learning experiences that prepare our Critical Environment technicians for local qualifications
Apply classroom management techniques such as flexible learning, behavioral reinforcement, and time management to reinforce learning while engaging with learners to ensure content and concepts are understood and aligned with operational goals
Create a friendly, supportive environment and encourage learners to ask questions; monitor the progress of learners to appropriately reinforce important learning; participate in hands-on lab activities to further build upon learning; and provide course content feedback to the broader Learning team using appropriate internal team channels as necessary to ensure consistency and relevance across our curriculum portfolio
Embody our culture and values

Fulltime

Reinforcement learning intern

Enchanted Tools

Location:
France, Paris

Category:
Research and Development

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Additional Information:

Job Posted:
December 08, 2025
