Reinforcement learning intern

Enchanted Tools

Location:
France, Paris

Contract Type:
Not provided

Salary:

Not provided

Job Description:

As a Reinforcement Learning Intern, you will help develop and implement learning-based navigation and control algorithms for the Mirokai humanoid robot, which balances dynamically on a ball. You will work closely with the team to extend our simulation environments, train agents, and validate policies on real hardware. This internship offers deep hands-on experience in RL for real-world robotics — from simulation to deployment.

Job Responsibility:

  • Develop, debug, and test reinforcement learning algorithms for locomotion and navigation on a dynamically balancing base
  • Extend simulation environments (Isaac Sim / Isaac Lab) to support training and evaluation of RL policies
  • Integrate trained policies into the Mirokai software stack and validate them on physical robots
  • Analyze performance, stability, and sim-to-real transfer aspects
  • Stay up to date with recent research in reinforcement learning for robotics

Requirements:

  • BSc in Robotics, Engineering, Computer Science, or a related field
  • Coursework or project experience in reinforcement learning or learning-based control
  • Strong Python skills and knowledge of a deep learning framework such as PyTorch, JAX, or TensorFlow
  • Familiarity with simulation environments such as Isaac Sim, MuJoCo, or Gazebo
  • Solid analytical and problem-solving abilities

Additional Information:

Job Posted:
December 08, 2025

Work Type:
On-site work

Similar Jobs for Reinforcement learning intern

AI Research Engineer - Reinforcement Learning

At Helsing we deliver AI-based capabilities and the enabling infrastructure that...
Location:
Germany, Munich
Salary:
Not provided
Helsing
Expiration Date
Until further notice
Requirements:
  • Hold an MSc in machine learning with a speciality in either reinforcement learning, multi-agent systems, automation and control, or robotics
  • Have excellent communication skills and the ability to report and present research findings clearly and efficiently both internally and externally
  • Are passionate about keeping up-to-date with current research and enjoy reimplementing / extending papers on state-of-the-art Deep Learning-based approaches
  • Possess solid software engineering skills, writing clean and well-structured code in Python and/or languages like Rust, Java, or modern C++, and experience deploying AI software to production including testing, QA, and monitoring
Job Responsibility:
  • Design, train and deploy agents in complex multi-agent environments
  • Contribute to our reinforcement learning stack by implementing, improving and extending the current state of the art in multi-agent reinforcement learning
  • Be a part of impactful projects and collaborate with people across several teams and backgrounds to integrate cutting-edge ML/AI into our production systems
What we offer:
  • Competitive compensation and stock options
  • Relocation support
  • Social and education allowances
  • Regular company events and all-hands to bring together employees as one team across Europe
  • A hands-on onboarding program (affectionately labelled “AI-duction”), in which you will be familiarising yourself with our tools and ML pipelines used across the company
Contract Type:
Full-time

Associate Director, Reinforcement Learning (ML)

Lead Amgen’s strategy and execution for Reinforcement Learning from Human Feedba...
Location:
United States, Thousand Oaks; Jacksonville
Salary:
Not provided
Amgen
Expiration Date
Until further notice
Requirements:
  • Doctorate degree and 3 years of Computer Science, IT or related field experience
  • Master’s degree and 5 years of Computer Science, IT or related field experience
  • Bachelor’s degree and 7 years of Computer Science, IT or related field experience
  • Associate’s degree and 12 years of Computer Science, IT or related field experience
  • High school diploma / GED and 14 years of Computer Science, IT or related field experience
  • Deep, hands-on expertise in Reinforcement Learning from Human Feedback (RLHF) and/or advanced reinforcement learning, including reward modeling, policy optimization, exploration strategies, and offline/online evaluation
  • Demonstrated experience deploying RLHF or RL systems into production for real-world applications (e.g., large language models, recommendation systems, decision support tools, or workflow automation), ideally in healthcare, life sciences, or other regulated domains
  • Strong background in modern machine learning and deep learning, with practical experience in Python and frameworks such as PyTorch or TensorFlow, and familiarity with LLM ecosystems and tooling
  • Experience driving sophisticated, cross-functional initiatives, collaborating with non-technical stakeholders (e.g., physicians, scientists, commercial leaders, compliance, legal) and translating needs into impactful AI solutions
  • Strong ability to communicate complex technical topics simply, tailoring content to senior executives and non-technical audiences
Job Responsibility:
  • Lead the design and development of RLHF systems including reward modeling, policy optimization, safety and alignment mechanisms, and evaluation frameworks for large language models and other AI systems
  • Drive hands-on technical execution, particularly for high-impact projects, reviewing architectures, experimentation plans, and code, and helping the team navigate scientific and engineering trade-offs
  • Establish best-practice pipelines for human feedback, partnering closely with internal customer teams to define feedback protocols, annotation quality standards, and governance for RLHF data
  • Define and track success metrics for RLHF systems, balancing offline and online evaluation, A/B tests, safety and robustness criteria, and business or scientific outcomes
  • Collaborate across Amgen leaders to ensure RLHF solutions are aligned with strategy, compliant with policy, and integrated into real workflows
  • Partner with Data, Platform and Technology teams to ensure that RLHF workloads are supported by scalable data platforms, model hosting, experimentation infrastructure, and MLOps best practices
  • Champion responsible and compliant AI, working with Legal, Compliance, and Information Security to implement governance around human feedback, data usage, model behavior, transparency, and risk management in a regulated environment
  • Communicate insights and influence senior stakeholders, creating clear narratives, roadmaps, and recommendations that help executives understand RLHF trade-offs, risks, and opportunities
What we offer:
  • A comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
  • A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
  • Stock-based long-term incentives
  • Award-winning time-off plans
  • Flexible work models where possible

Machine Learning Research Associate

The Machine Learning research team at Hewlett Packard Labs seeks highly motivate...
Location:
United States, Milpitas
Salary:
43.27 - 93.15 USD / Hour
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements:
  • Pursuing a Ph.D. degree (with significant research and innovation experience) in a relevant discipline (e.g. machine learning, computer science, electrical engineering, statistics, etc.)
  • Track record of world-class innovative contributions and ideas in machine learning
  • Experience in deep learning, LLM, Agentic AI, and reinforcement learning research
  • Experience in developing deep learning software with high proficiency in data structures and algorithms
  • Experience in Machine Learning frameworks like PyTorch - required
  • Strong programming skills and experience with Python
  • Software development experience in Deep Learning, GPU acceleration, and Model Optimization
  • Demonstrated effective communication and collaboration skills
  • Demonstrated ability to conduct original research, with papers published in top-tier conferences or journals
Job Responsibility:
  • Provide thought leadership and technical influence both internally and externally to HPE
  • Work on cutting-edge machine learning research focusing on Large Language Models, Agentic AI, and Reinforcement Learning
  • Contribute along the full range from initial novel ideas to design, development, implementation, evaluation, and technology transfer
  • Publish in top AI conferences and workshops, including NeurIPS, AAAI, and ICML.
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
Contract Type:
Full-time

Machine Learning Research Scientist

This role focuses on cutting-edge research and development in Artificial Intelli...
Location:
United States, Milpitas
Salary:
117500.00 - 270000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements:
  • PhD in Computer Science, Electrical Engineering, or related fields focusing on Machine Learning for the dissertation
  • extensive experience in deep learning research, preferably in Large Language Models or Reinforcement Learning
  • experience developing applications with deep learning frameworks like PyTorch with a high software proficiency
  • strong programming skills in Python, data structures, and algorithms are required
  • experience with ML model optimization, GPU acceleration, heterogeneous computation, system software, and performance optimization desired
  • experience in Python Web Frameworks – Django, Flask - a plus but not required.
Job Responsibility:
  • conducting research, developing solutions, and creating intellectual property in emerging fields like reinforcement learning, LLMs, digital twins, clean energy, data center optimization, and sustainability
  • developing advanced technologies for analysis, optimization, time series forecasting, uncertainty quantification, and control
  • providing thought leadership, collaborating internally and externally, and contributing to HPE’s strategy by identifying emerging technologies
  • publishing in top conferences like NeurIPS, AAAI, and ACL
  • developing patent applications
  • software development, GPU acceleration, model optimization, and real-time data streaming to create robust AI solutions for real-world use cases.
What we offer:
  • a competitive salary and extensive social benefits
  • diverse and dynamic work environment
  • work-life balance and support for career development
  • health and wellbeing programs
  • personal and professional development programs
  • diversity, inclusion, and belonging initiatives.
Contract Type:
Full-time

Engineering Director

We are seeking a seasoned Engineering Director who thrives in challenging and fa...
Location:
Puerto Rico, Aguadilla
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements:
  • Significant work experience as a director or in a similar position working across multiple stakeholder organizations, with at least 10 years of people leadership experience specific to SW and Cloud engineering
  • Solid experience leading SW development across storage, networking, on-prem, and SaaS is a must
  • Experience in setting up geographically distributed sites
  • Must have a strong background in software development lifecycle including cloud infrastructure
  • Familiarity with agile methodologies and tools like JIRA
  • Prior experience in cloud product development and deployments; end-to-end ownership and accountability
  • Solid understanding of fundamental AI and machine learning concepts, including supervised and unsupervised learning, deep learning, reinforcement learning, natural language processing, computer vision, and statistical modeling
  • Extensive business acumen, technical knowledge, and industry experience encompassing one or more engineering, technology, and product domains
  • Demonstrated abilities to drive transformation across a business with exceptional skills in the management of change
Job Responsibility:
  • Oversee the Puerto Rico Site daily operations, strategic planning and cross-functional team leadership for Hybrid Cloud
  • Recruit, mentor, and manage teams of AI/ML engineers, QA Engineers, Design Engineers and innovation specialists to deliver cutting-edge solutions
  • Continuously evaluate new tools, platforms, and frameworks in AI/ML to drive competitive advantage and operational efficiency
  • Ensure alignment with corporate goals while fostering a high-performance culture, operational efficiency, and employee engagement
  • Lead the development and execution of AI/ML strategies that align with business goals and drive innovation across products, services, or operations
  • Create strategic and tactical operations and resource plans, goals, and priorities for assigned organization based on business and technology roadmap and functional objectives
  • Engage with various senior leaders across the organization, program managers, R&D, support, Quality, product managers, technical leaders and executives to communicate program status, escalate issues, and guide and influence strategic decision-making
  • Manage senior relationships and escalated issues with outsourced partners and suppliers, including setting expectations regarding deliverables, product quality, schedules, and costs; ensure that the organization is effectively leveraging outsourced resources
  • Identify opportunities for and drive organizational initiatives and programs to support business process improvements and cost reductions
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
Contract Type:
Full-time

Research Scientist Intern, Reinforcement Learning

We’re looking for a curious and motivated Reinforcement Learning Intern to help ...
Location:
United Kingdom, London
Salary:
Not provided
Wayve
Expiration Date
Until further notice
Requirements:
  • Currently pursuing a PhD or Masters in Computer Science, Robotics, Electrical Engineering, or a related field, with a focus on Machine Learning, AI, or Computer Vision
  • Experience in research in Reinforcement Learning
  • Interest in one or more: synthetic data, representation learning, and Offline RL
  • Comfortable working in Python and libraries like PyTorch, NumPy, and Pandas
  • A principled mindset: you enjoy brainstorming, making assumptions, building, testing, and iterating on ideas to see what works
Job Responsibility:
  • Help advance the next generation of decision-making systems for autonomous driving
  • Work embedded in a research team to develop scalable RL algorithms that enable vehicles to learn complex behaviors directly from experience — both in simulation and the real world
What we offer:
  • Competitive compensation and benefits
  • A dynamic and fast-paced work environment in which you will grow every day, learning on the job from a diverse team of the brightest researchers and engineers in this space
  • A culture that is ego-free, respectful and welcoming
  • Potential to publish your research work at a top-flight conference
  • The chance to be part of a truly mission-driven organisation and an opportunity to shape the future of autonomous driving

PhD Autonomy Engineer Intern - Planning & Controls (Reinforcement Learning)

Skydio builds the world’s most advanced autonomous drones used across inspection...
Location:
Switzerland, Zurich
Salary:
50.00 EUR / Hour
Skydio
Expiration Date
Until further notice
Requirements:
  • PhD student in Robotics, Machine Learning, Controls, or related field
  • Strong fundamentals in RL, control theory, and motion planning; comfort with safety/robustness concepts
  • Proficient in Python (PyTorch/JAX/Ray RLlib) and at least one of C++ or CUDA
  • Hands-on experience with robotics simulation (Isaac Lab/MuJoCo/PyBullet) and sim2real techniques
  • Experience training/deploying policies for navigation, manipulation, or locomotion on real robots or autonomous vehicles
Job Responsibility:
  • Develop and deploy reinforcement learning (and adjacent policy-learning methods) that make Skydio aircraft plan, navigate, and control themselves more intelligently—safely, reliably, and efficiently—across our ecosystem: handheld apps, ground control, cloud autonomy services, and fleet workflows
  • Navigation & avoidance in the wild: Train policies that adapt online to cluttered 3D scenes (forests, bridges, urban canyons), complementing our geometric stack for robust obstacle avoidance and dynamic goal-seeking
  • RL-augmented planning: Fuse learned cost shaping / value functions with trajectory optimization for smooth, agile flight with tight safety envelopes and mission constraints
  • Sim → Real at scale: Build scalable datasets and training loops with Isaac Lab, domain randomization, residual learning, and safety filters; validate on real drones weekly
  • Human-in-the-loop shared control: Learn assistive policies that blend pilot intent, autonomy priors, and uncertainty-aware behaviors for intuitive control handoffs
  • Fleet & multi-agent: Explore decentralized coordination for coverage, pursuit, and collaborative mapping with minimal comms

Critical Environment Technical Trainer

As a CO+I Learning CE Technical Trainer you will contribute to establishing tech...
Location:
United Kingdom, London
Salary:
Not provided
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • Bachelor's Degree AND 4+ years' experience in training, education, critical environments (CE), cloud systems, datacenter environments, server environments, or computer technologies, OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Align learning preparation with relevant metrics for measuring success (e.g., cloud consumption, survey feedback)
  • Build awareness of the latest features, technologies, and processes and incorporate those into current learning experiences to ensure quality learning environment and delivery
  • Deliver single- to multi-day training or other learning experiences in person or virtually with minimal supervision, leveraging a variety of delivery methods, including presentations, discussions, labs, and simulations
  • Use digitally enhanced instructor-led resources to effectively deliver technology-based learning experiences that prepare our Critical Environment technicians for local qualifications
  • Apply classroom management techniques such as flexible learning, behavioral reinforcement, and time management to reinforce learning while engaging with learners to ensure content and concepts are understood and aligned with operational goals
  • Create a friendly, supportive environment and encourage learners to ask questions; monitor the progress of learners to appropriately reinforce important learning; participate in hands-on lab activities to further build upon learning; and provide course content feedback to the broader Learning team using appropriate internal team channels as necessary to ensure consistency and relevance across our curriculum portfolio
  • Embody our culture and values
Contract Type:
Full-time