Research Scientist / Engineer – Multimodal Capabilities

Luma AI

Location:
United States, Palo Alto

Contract Type:
Not provided

Salary:

187500.00 - 395000.00 USD / Year

Job Description:

This is a high-impact opportunity to define the future of what our models can do. As a first-principles researcher, you will tackle the most ambitious questions at the heart of our mission: how can the fusion of vision, audio, and language unlock entirely new, magical behaviors in AI? You will not just be improving existing systems; you will be charting the course for the next generation of model capabilities, designing the core experiments that will shape the future of our technology and products.

Job Responsibility:

  • Research and define the next frontier of multimodal capabilities, identifying key gaps in our current models and designing the experiments to solve them
  • Design and execute novel experiments, datasets, and methodologies to systematically improve model performance across vision, audio, and language
  • Develop and pioneer new evaluation frameworks and benchmarking approaches to precisely measure novel multimodal behaviors and capabilities
  • Collaborate deeply with other research teams to translate your findings into our core training recipes and unlock new product experiences
  • Build and prototype compelling demonstrations that showcase the groundbreaking multimodal capabilities you have unlocked

Requirements:

  • PhD or equivalent research experience in a field related to AI, Machine Learning, or Computer Science
  • Strong programming skills in Python and deep, hands-on experience with PyTorch
  • Proven track record of working with multimodal data pipelines and curating large-scale datasets for research
  • Deep, fundamental understanding of at least one of the core modalities: computer vision, audio processing, or natural language processing
  • Thrive on tackling the most ambitious, open-ended research challenges in a fast-paced, collaborative environment

Nice to have:

  • Direct expertise working with complex, interleaved multimodal data (video, audio, text)
  • Hands-on experience training or fine-tuning Vision Language Models (VLMs), Audio Language Models, or large-scale generative video models from scratch
  • A strong publication record in top-tier AI conferences (e.g., NeurIPS, ICML, CVPR, ICLR)
  • Experience leading ambitious, open-ended research projects from ideation to tangible results

Additional Information:

Job Posted:
January 13, 2026

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for Research Scientist / Engineer – Multimodal Capabilities

Sr. Applied Research Scientist

We’re looking for a Sr. Applied Research Scientist to lead efforts in building l...
Location:
United States
Salary:
280000.00 - 380000.00 USD / Year
Runway
Expiration Date:
Until further notice
Requirements:
  • 4+ years of relevant ML engineering or research experience in language models
  • Very strong programming skills and ability to write clean and maintainable research code
  • Deep interest in building human-in-the-loop systems for creativity
  • Passion for seeing research through from initial conception to eventual application
  • Experience mentoring and teaching other researchers
  • Strong communication, collaboration, and documentation skills
Job Responsibility:
  • Lead efforts in building large language models and vision language models that power Runway’s research and tools, with a focus on multimodal capabilities and reasoning
Employment Type:
Fulltime

Research Scientist - Generative AI

As a Research Scientist in the Emergent Machine Intelligence Team at Hewlett Pac...
Location:
United States, Santa Barbara
Salary:
101900.00 - 234500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • PhD in Computer Science, Artificial Intelligence, Machine Learning, Physics, Mathematics, or other related fields
  • 3-5 years working experience with training and fine-tuning generative AI models including LLMs, diffusion models, or Energy-Based Models
  • Proven track record of research in generative models, demonstrated through publications, patents, or publicly available projects
  • Proficiency in programming languages commonly used in AI research, such as Python, and experience with AI/ML frameworks (e.g., TensorFlow, PyTorch)
  • Deep understanding of machine learning algorithms and principles, especially in the context of generative AI
  • Strong mathematical background, with excellent skills in areas such as statistics, probability, linear algebra
  • Creative and analytical thinking abilities, with a passion for solving complex problems
  • Excellent communication skills, capable of conveying complex ideas clearly and engaging with both technical and non-technical audiences
Job Responsibility:
  • Conduct high-quality research in generative AI, including but not limited to designing algorithms for pre-training and post-training current autoregressive and diffusion models for multimodal data
  • Design, implement, and validate new algorithms and models for augmented LLMs, pushing the boundaries of AI capabilities
  • Develop and prototype novel algorithms for fine-tuning, retrieval augmented generation, and in-context learning for various generative models
  • Develop algorithms for training and inference in Energy-Based Models
  • Collaborate with cross-functional teams to apply research findings to develop new products or enhance existing ones
  • Publish research papers in top-tier journals and conferences, sharing findings with the broader scientific community
  • Stay abreast of the latest AI research and trends, identifying opportunities for innovation and improvement
  • Mentor junior researchers and engineers, fostering a culture of knowledge sharing and collaboration
  • Develop prototypes and proof-of-concept implementations to demonstrate the potential of research findings
  • Engage with the academic community by attending conferences, workshops, and seminars
What we offer:
  • A competitive salary and extensive social benefits
  • Diverse and dynamic work environment
  • Work-life balance and support for career development
Employment Type:
Fulltime

Research Scientist - Generative AI

This role involves conducting high-quality research in generative AI, designing ...
Location:
United States
Salary:
101900.00 - 234500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • PhD in Computer Science, Artificial Intelligence, Machine Learning, Physics, Mathematics, or other related fields
  • 3-5 years working experience with training and fine-tuning generative AI models including LLMs, diffusion models, or Energy-Based Models
  • Proven track record of research in generative models, demonstrated through publications, patents, or publicly available projects
  • Proficiency in programming languages commonly used in AI research, such as Python, and experience with AI/ML frameworks (e.g., TensorFlow, PyTorch)
  • Deep understanding of machine learning algorithms and principles, especially in the context of generative AI
  • Strong mathematical background, with excellent skills in areas such as statistics, probability, linear algebra
  • Creative and analytical thinking abilities, with a passion for solving complex problems
  • Excellent communication skills, capable of conveying complex ideas clearly and engaging with both technical and non-technical audiences
Job Responsibility:
  • Conduct high-quality research in generative AI, including but not limited to designing algorithms for pre-training and post-training current autoregressive and diffusion models for multimodal data
  • Design, implement, and validate new algorithms and models for augmented LLMs, pushing the boundaries of AI capabilities
  • Develop and prototype novel algorithms for fine-tuning, retrieval augmented generation, and in-context learning for various generative models
  • Develop algorithms for training and inference in Energy-Based Models
  • Collaborate with cross-functional teams to apply research findings to develop new products or enhance existing ones
  • Publish research papers in top-tier journals and conferences, sharing findings with the broader scientific community
  • Stay abreast of the latest AI research and trends, identifying opportunities for innovation and improvement
  • Mentor junior researchers and engineers, fostering a culture of knowledge sharing and collaboration
  • Develop prototypes and proof-of-concept implementations to demonstrate the potential of research findings
  • Engage with the academic community by attending conferences, workshops, and seminars
What we offer:
  • A competitive salary and extensive social benefits
  • Diverse and dynamic work environment
  • Work-life balance and support for career development
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
Employment Type:
Fulltime

AI Research Scientist

We are seeking AI Researchers to join the Product and Applied Research (PAR) Med...
Location:
United States, Menlo Park
Salary:
154000.00 - 217000.00 USD / Year
Meta
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Has obtained a PhD in Computer Science, AI/ML, or a relevant technical field
  • 1+ year of industry research experience in LLM/NLP, computer vision, or related AI/ML models
  • Experience owning and/or driving complex technical projects from end-to-end
  • Skilled in model training, data, or inference & efficiency for image, video, and/or related multimodal models
  • Proficient in media generation, understanding, and/or grounding
  • Programming experience in Python and hands-on experience with frameworks like PyTorch or Spark
  • Demonstrated significant industry influence in the field of AI and/or recently published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV)
Job Responsibility:
  • Contribute to the training of next-generation multimodal foundation models, advance their capabilities in understanding, generation, and grounding, and enable them for downstream product use-cases
  • Support creative data sourcing, high-quality pre/mid/post-training data curation, and scale and optimize data pipelines for multimodal large language models (LLMs)
  • Lead, collaborate, and execute on research that pushes forward the state of the art in multimodal reasoning and generation research, and prioritize research that can be directly applied to Meta’s product development
What we offer:
  • bonus
  • equity
  • benefits
Employment Type:
Fulltime

Research Scientist Intern, Real-Time Multimodal AI

Reality Labs is building the future of connection through world-class AR/VR hard...
Location:
United States, Burlingame
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a PhD degree in Computer Science, Machine Learning, Electrical Engineering, or a related field
  • 2+ years of research experience in one or more of the following areas: multimodal learning, vision-language models, large language models, or foundation model fine-tuning
  • Hands-on experience fine-tuning large foundation models (e.g., LLaVA, InternVL, Qwen-VL, LLaMA, or similar)
  • Strong programming skills in Python
  • Experience with deep learning frameworks such as PyTorch
  • Excellent communication skills and ability to work independently
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility:
  • Research and develop novel approaches for fine-tuning large multimodal foundation models (vision-language, audio-visual) for real-time applications
  • Design and implement efficient inference pipelines for deploying fine-tuned models in real-time communication scenarios
  • Explore agentic architectures that leverage fine-tuned models as tools within larger AI systems
  • Collaborate with cross-functional teams to integrate models into prototype experiences
  • Document and present research progress with the goal of publishing findings at top-tier ML/CV conferences
  • Contribute to building working prototypes that demonstrate the capabilities of fine-tuned multimodal models

AI Research Scientist, Media - MSL PAR

We are seeking AI Researchers to join the Product and Applied Research (PAR) Med...
Location:
United States, Menlo Park
Salary:
122000.00 - 181000.00 USD / Year
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
  • Currently has, or is in the process of obtaining, a PhD in Computer Science, AI/ML, or a relevant technical field
  • PhD research and/or industry research experience in LLM/NLP, computer vision, or related AI/ML models
  • Experience working on complex technical projects from end-to-end
  • Skilled in model training, data, or inference & efficiency for image, video, and/or related multimodal models
  • Proficient in media generation, understanding, and/or grounding
  • Programming experience in Python and hands-on experience with frameworks like PyTorch or Spark
  • Demonstrated significant industry influence in the field of AI and/or recently published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV)
Job Responsibility:
  • Contribute to the training of next-generation multimodal foundation models, advance their capabilities in understanding, generation, and grounding, and enable them for downstream product use-cases
  • Support creative data sourcing, high-quality pre/mid/post-training data curation, and scale and optimize data pipelines for multimodal large language models (LLMs)
  • Lead, collaborate, and execute on research that pushes forward the state of the art in multimodal reasoning and generation research, and prioritize research that can be directly applied to Meta’s product development
What we offer:
  • bonus
  • equity
  • benefits
Employment Type:
Fulltime

ML Tech Lead Manager

Abridge is building an AI platform for clinical conversations to solve the admin...
Location:
United States, San Francisco
Salary:
246500.00 - 290000.00 USD / Year
Abridge
Expiration Date:
Until further notice
Requirements:
  • 7+ years of experience in machine learning/NLP, with a strong track record of impactful publications and deployed systems
  • 3+ years of experience as technical lead for a project of 4 or more individuals
  • 2+ years of direct management experience, managing Researchers and Applied Scientists
  • Deeply technical, with expertise across NLP and LLMs—e.g., pre-training/post-training/SFT/RL, retrieval augmented generation (RAG), reasoning capabilities, multilinguality, multimodality, synthetic data generation, noise robustness, judges, and evaluation
  • Comfortable bridging research, product, and engineering, translating novel ideas into robust, scalable systems that deliver real-world impact
  • Fluency with libraries for scientific computing (e.g., SciPy, NumPy) and machine learning (e.g., PyTorch, TensorFlow, Scikit-learn, Pandas)
  • A thoughtful leader: you foster a feedback-rich culture, set a high bar for scientific rigor, and empower your team through mentorship, clear vision, and mutual trust
  • Up-to-date on the latest in NLP and ML research, with excitement for continuous learning
  • Motivated to work in a fast-paced, collaborative environment where your team’s science has direct clinical impact
  • Must be willing to work from our SF office at least 3x per week
Job Responsibility:
  • Lead, mentor, and scale a high-impact team of NLP/ML researchers, fostering a culture of technical rigor, creativity, and scientific excellence
  • Set R&D strategy that aligns with company priorities
  • Grow talent pipelines by recruiting world-class researchers and nurturing career development through coaching and mentorship
  • Balance long-term vision with near-term impact, ensuring research not only advances the field but also translates into product-grade systems
  • Partner with engineering teams and clinicians to operationalize novel algorithms into clinician-facing workflows, optimizing both quality and efficiency
  • Ensure safety and robustness in deployment, shaping the standards for evaluation in clinical AI
  • Represent Abridge in the global ML/NLP research community through publications, conference presentations, and industry/academic partnerships
What we offer:
  • Generous Time Off: 14 paid holidays, flexible PTO for salaried employees, and accrued time off for hourly employees
  • Comprehensive Health Plans: Medical, Dental, and Vision coverage for all full-time employees and their families
  • Generous HSA Contribution: If you choose a High Deductible Health Plan, Abridge makes monthly contributions to your HSA
  • Paid Parental Leave: Generous paid parental leave for all full-time employees
  • Family Forming Benefits: Resources and financial support to help you build your family
  • 401(k) Matching: Contribution matching to help invest in your future
  • Personal Device Allowance: Tax free funds for personal device usage
  • Pre-tax Benefits: Access to Flexible Spending Accounts (FSA) and Commuter Benefits
  • Lifestyle Wallet: Monthly contributions for fitness, professional development, coworking, and more
  • Mental Health Support: Dedicated access to therapy and coaching to help you reach your goals
Employment Type:
Fulltime

Tech Lead Manager - Behaviour Learning for Embodied AI

The Science organisation at Wayve advances foundational research in embodied AI ...
Location:
United Kingdom, London
Salary:
Not provided
Wayve
Expiration Date:
Until further notice
Requirements:
  • Years of experience in applied ML/AI roles with strong hands-on contributions
  • Demonstrated track record of impactful technical work in one or more of: multimodal learning, reinforcement learning, generative models, latent action modelling, optimisation, or planning
  • Experience building large-scale ML infrastructure and working with high-dimensional temporal data (e.g., video, multi-sensor inputs)
  • Deep understanding of the end-to-end lifecycle of ML research and deployment
  • Strong Python and PyTorch engineering fundamentals, with experience developing research-grade, production-oriented tools
  • Proven ability to shape technical strategy and lead architectural design for ML systems
  • Publications at top-tier ML conferences such as NeurIPS, ICML, CoRL or ICLR
  • Clear and thoughtful communicator, capable of influencing technical direction and mentoring others without formal reporting lines
Job Responsibility:
  • Architect the future – Design and evolve models for efficient, robust, and adaptable autonomy, setting a high technical bar for quality and innovation
  • Accelerate research impact – Partner with team members to test, scale, and productionise research ideas - from architecture design to data strategy. Provide technical guidance and feedback on research design, implementation, and evaluation. Implement scalable, high-throughput training pipelines for models with temporal context and develop and evaluate novel data sampling strategies to accelerate training and generalisation
  • Get hands-on when it matters – Lead from the front by contributing directly to key system components, codebases, and experiments, especially during high-leverage moments. Contribute directly as an IC on core research and development tasks (~60-70% of time)
  • Disrupt thoughtfully – Challenge assumptions, ask sharp questions, and champion bold ideas that push us beyond incremental gains and toward breakthrough advances
  • Make things happen – Lead a high-performing, cross-functional team of applied scientists and ML engineers working across ML, RL, representation learning, and planning, among other areas. Work closely with the team manager to drive quarterly planning and execution of research-engineering initiatives, enabling rapid iteration and delivery in high-ambiguity environments. Translate ambiguity into action and ensure technical progress tracks with our mission
  • Champion change – Lead through ambiguity. Balance structure and adaptability to help your team navigate evolving priorities, novel research, and complex organisational change