Junior Research Infrastructure Engineer

Meshy LLC

Location:
United States, Sunnyvale

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are seeking a Product-Minded Junior Research Infrastructure Engineer to join our growing team. This is a '70/30' role: you will spend 70% of your time on hardcore backend and infrastructure—tackling complex distributed systems—and 30% of your time building intuitive internal tools that transform our platform capabilities into a seamless product experience for researchers. You will design, build, and operate distributed data systems that power large-scale ingestion, processing, and transformation of datasets used for AI model training. This is a versatile role: you’ll own end-to-end pipelines, ensure data quality and scalability, and collaborate closely with ML researchers to prepare diverse datasets for cutting-edge model training. You’ll thrive in our fast-paced startup environment, where problem-solving, adaptability, and wearing multiple hats are the norm.

Job Responsibility:

  • Participate in the design and implementation of distributed task orchestration systems using Temporal or Celery
  • Architect pipelines across cloud object storage (S3, GCS), data lakes, and metadata catalogs
  • Implement partitioning, sharding, and caching strategies to ensure data processing pipelines are resilient, highly available, and consistent
  • Design, implement, and maintain distributed ingestion pipelines for structured and unstructured data (images, 3D/2D assets, binaries)
  • Build scalable ETL/ELT workflows to transform, validate, and enrich datasets for AI/ML model training and analytics
  • Support preprocessing of unstructured assets (e.g., images, 3D/2D models, video) for training pipelines, including format conversion, normalization, augmentation, and metadata extraction
  • Implement validation and quality checks to ensure datasets meet ML training requirements
  • Collaborate with ML researchers to quickly adapt pipelines to evolving pretraining and evaluation needs
  • Use infrastructure-as-code (Terraform, Kubernetes, etc.) to manage scalable and reproducible environments
  • Manage data assets using Databricks Asset Bundles (DABs) and build rigorous CI/CD pipelines (GitHub Actions)
  • Focus on maximizing cluster utilization (CPU/Memory) and optimizing EC2 instance allocation to aggressively reduce compute costs
  • Take ownership of the platform's 'Interface' by building Data Explorers and management consoles using React or Next.js
  • Actively listen to researchers and data scientists to iterate on UI/UX based on their feedback
  • Simplify complex CLI operations into intuitive GUI interactions to boost overall developer experience (DevEx)

Requirements:

  • 2+ years of experience in software engineering, backend development, or distributed systems
  • Strong programming skills in Python (Scala/Java/C++ a plus)
  • Familiarity with distributed frameworks (Spark, Dask, Ray) and cloud platforms (AWS/GCP/Azure)
  • Experience with workflow orchestration tools (Temporal, Celery, or Airflow)
  • Proficiency with Infrastructure as Code (Terraform) and CI/CD tools (GitHub Actions)
  • Experience building web applications or internal tools using React or Next.js
  • A 'product-first' mindset: an interest in how users interact with infrastructure and a desire to build clean, functional interfaces

Nice to have:

  • Experience handling large-scale unstructured datasets (images, video, binaries, or 3D/2D assets)
  • Familiarity with AI/ML training data pipelines, including dataset versioning, augmentation, and sharding
  • Exposure to computer graphics or 3D/2D data processing
  • Kubernetes (K8s) for distributed workloads and cluster orchestration
  • Data lakehouse platforms (specifically Databricks and DABs)
  • Familiarity with GPU-accelerated computing and HPC clusters
  • Experience with 3D/2D asset processing (geometry transformations, rendering pipelines)
  • Located in or near one of our employee hubs: Bay Area, CA, or Seattle, WA

What we offer:

  • Competitive salary, equity, and benefits package
  • Opportunity to work with a talented and passionate team at the forefront of AI and 3D technology
  • Flexible work environment, with options for remote and on-site work
  • Opportunities for fast professional growth and development
  • An inclusive culture that values creativity, innovation, and collaboration
  • Unlimited, flexible time off
  • Stock options available for core team members
  • 401(k) plan for employees
  • Comprehensive health, dental, and vision insurance
  • The latest and best office equipment

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Junior Research Infrastructure Engineer

Senior Machine Learning Infrastructure Engineer

As a Senior ML Infrastructure Engineer at Plus, you will design scalable archite...
Location:
United States, Santa Clara
Salary:
160000.00 - 200000.00 USD / Year
PlusAI
Expiration Date:
Until further notice
Requirements:
  • PhD or MS in Computer Science, Electrical Engineering, or related field
  • Good oral and written communication skills
  • PhD new grad or Master's with 3+ years of software engineering experience with a focus on ML infrastructure or distributed systems
  • Proficiency in Python, C++, SQL
  • Deep understanding of containerization, orchestration technologies, distributed ML workloads, and experiment tracking tools (e.g., Docker, Kubernetes, multiprocessing, Kubeflow, and MLflow)
  • Deploy and manage resources across multiple cloud platforms (AWS, GCP, or on-prem environments)
  • Proficiency in at least one deep learning framework, such as PyTorch, and data pipeline tools (e.g., Apache Airflow, Prefect)
  • Strong knowledge of distributed systems, databases, and storage solutions
  • Extensive software design and development skills
  • Ability to learn and adapt to new technologies and contribute in a productive environment
Job Responsibility:
  • Design and develop scalable, high-performance systems for training, inference, deploying, and monitoring ML models at scale
  • Build and maintain efficient data pipelines, model versioning systems, and experiment tracking frameworks
  • Collaborate with cross-functional teams, including ML researchers and engineers, to identify bottlenecks and improve platform usability
  • Implement distributed systems and storage solutions optimized for machine learning workloads
  • Drive improvements in CI/CD workflows for ML models and infrastructure
  • Ensure high availability and reliability of the ML platform by implementing robust monitoring, logging, and alerting systems
  • Stay current with industry trends and integrate relevant tools and frameworks to enhance the platform
  • Mentor junior engineers and contribute to a culture of technical excellence
  • Ensure that your work is performed in accordance with the company’s Quality Management System (QMS) requirements and contribute to continuous improvement efforts
  • Ensure team compliance with QMS, monitor quality, and drive process improvements
What we offer:
  • Work, learn and grow in a highly future-oriented, innovative and dynamic field
  • Wide range of opportunities for personal and professional development
  • Catered free lunch, unlimited snacks and beverages
  • Highly competitive salary and benefits package, including 401(k) plan
Employment Type:
Fulltime

Principal AI/ML & Innovation Engineer

We are seeking Principal AI/ML & Innovation Engineer who will be leading initiat...
Location:
Puerto Rico, Aguadilla
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or master’s degree in computer science, engineering, data science, machine learning, artificial intelligence, or closely related quantitative discipline
  • Typically, 10-15 years’ experience
  • Solid understanding of fundamental AI and machine learning concepts, including supervised and unsupervised learning, deep learning, reinforcement learning, natural language processing, computer vision, and statistical modeling
  • Proficient in implementing and deploying various machine learning algorithms, such as decision trees, random forests, support vector machines, and neural networks
  • Knowledge of popular machine learning frameworks and libraries like TensorFlow, PyTorch, or scikit-learn
  • Strong understanding of GitHub Copilot, Cursor, n8n, vibe coding, Windsurf, and similar technologies
  • Experience in Cloud Infrastructure (AWS, Azure, etc.)
  • Knowledge of Open Source, Linux, etc.
  • Understanding of DevOps, SRE
  • Expertise in deep learning techniques, architectures, and frameworks (e.g., convolutional neural networks (CNN), recurrent neural networks (RNN), generative adversarial networks (GAN), etc.)
Job Responsibility:
  • Designing, developing, and deploying advanced machine learning models and algorithms
  • Leading research initiatives to explore novel approaches and technologies
  • Designing the architecture of AI systems and ensuring scalability, performance, and reliability
  • Collaborating with other teams, such as data scientists, software engineers, and product managers
  • Providing technical leadership and mentorship to junior engineers
  • Overseeing and guiding multiple design review sessions across different projects
  • Partnering with the engineering manager and team lead to establish long-term design and implementation strategies
  • Leading efforts to incorporate feedback loops and continuous improvement processes
  • Leading meetings, ensuring efficient progress tracking, issue resolution, and team coordination
  • Creating and delivering high-level presentations and reports to executive stakeholders
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
Employment Type:
Fulltime

Senior AI and Machine Learning Engineer

We are seeking Senior AI/ML & Innovation Engineer who will be leading initiative...
Location:
United States, Aguadilla
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or master’s degree in computer science, engineering, data science, machine learning, artificial intelligence, or closely related quantitative discipline
  • Typically, 7-10 years’ experience
  • Deep understanding of machine learning algorithms, such as linear regression, decision trees, support vector machines, random forests, deep learning models (e.g., neural networks), and reinforcement learning
  • A strong foundation in mathematics and statistics
  • Proficiency in programming languages such as Python, R, or Java
  • Strong understanding of GitHub Copilot, Cursor, n8n, vibe coding, Windsurf, and similar technologies
  • Experience in Cloud Infrastructure (AWS, Azure, etc.)
  • Knowledge of Open Source, Linux, etc.
  • Understanding of DevOps, SRE
  • Advanced knowledge and experience in deep learning
Job Responsibility:
  • Conducts research and stays up to date with the latest advancements in AI and machine learning technologies, frameworks, and algorithms
  • Collaborates with cross-functional teams to understand business requirements and design AI and machine learning solutions
  • Develops, implements, and optimizes machine learning models and algorithms
  • Deploys machine learning models into production environments
  • Monitors the performance of deployed models
  • Organizes and leads comprehensive design review sessions
  • Works collaboratively with the engineering manager and team lead to set design and implementation standards
  • Regularly leads meetings
  • Has experience in providing technical leadership, mentorship, and guidance to junior team members
  • Develops and delivers strategic presentations and reports to senior stakeholders
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
Employment Type:
Fulltime

Senior Staff Security Infrastructure Engineer

Bloomreach is building the world’s premier agentic platform for personalization....
Location:
Slovakia, Bratislava; Brno; Prague
Salary:
5000.00 EUR / Month
Bloomreach
Expiration Date:
Until further notice
Requirements:
  • 6+ years of relevant experience
  • Proficiency in cloud security, network security, URL filtering, common security frameworks, and CVE lifecycle management
  • Practical IaC and scripting for automation
  • Strong cross-functional and external communication
  • Experience mentoring junior staff
  • Hands-on cloud security for AWS and GCP: design secure architectures, perform threat modeling, apply platform-native controls, and build/validate secure IaC
  • SIEM ownership and detection engineering: deploy, configure, tune, and maintain SIEM; author and test detection rules and playbooks; integrate data sources; and operate with SLA-driven alerting and incident workflows
Job Responsibility:
  • Owning current and target-state data architectures and reporting
  • Designing, implementing, and monitoring cloud (AWS/GCP) infrastructure security controls
  • Deploying, securing, configuring, and operating SIEM and other security resources
  • Identifying, triaging, and remediating infrastructure and web vulnerabilities
  • Leading incident triage and external-researcher engagement
  • Mentoring junior staff
What we offer:
  • Restricted stock units
  • Company performance bonus
  • A great deal of freedom and trust
  • Flexible working hours
  • Virtual-first work
  • Company events
  • 5 paid days off to volunteer
  • People Development Program
  • Communication coach available
  • Leader Development Program
Employment Type:
Fulltime

Senior Staff Security Infrastructure Engineer

Bloomreach is building the world’s premier agentic platform for personalization....
Location:
Czechia, Bratislava; Brno; Prague
Salary:
Not provided
Bloomreach
Expiration Date:
Until further notice
Requirements:
  • 6+ years of relevant experience
  • Proficiency in cloud security, network security, URL filtering, common security frameworks, and CVE lifecycle management
  • Practical IaC and scripting for automation
  • Strong cross-functional and external communication
  • Experience mentoring junior staff
  • Hands-on cloud security for AWS and GCP: design secure architectures, perform threat modeling, apply platform-native controls, and build/validate secure IaC
  • SIEM ownership and detection engineering: deploy, configure, tune, and maintain SIEM; author and test detection rules and playbooks; integrate data sources; and operate with SLA-driven alerting and incident workflows
Job Responsibility:
  • Owning current and target-state data architectures and reporting
  • Designing, implementing, and monitoring cloud (AWS/GCP) infrastructure security controls
  • Deploying, securing, configuring, and operating SIEM and other security resources
  • Identifying, triaging, and remediating infrastructure and web vulnerabilities
  • Leading incident triage and external-researcher engagement
  • Mentoring junior staff
What we offer:
  • A great deal of freedom and trust
  • Flexible working hours
  • Virtual-first work with several Bloomreach Hubs
  • Company events
  • 5 paid days off to volunteer
  • People Development Program
  • Communication coach available
  • Leader Development Program
  • $1,500 professional education budget annually
  • Employee Assistance Program with counselors
Employment Type:
Fulltime

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location:
United States, Palo Alto
Salary:
90000.00 - 300000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility:
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or for specialized use cases
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment Type:
Fulltime

Staff Software Engineer - AI/ML Platform

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location:
United States, Chevy Chase; New York City; Palo Alto
Salary:
115000.00 - 300000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility:
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or for specialized use cases
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment Type:
Fulltime

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location:
United States, Chevy Chase; New York City; Palo Alto
Salary:
115000.00 - 300000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility:
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or for specialized use cases
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment Type:
Fulltime