Cloud Data Engineer / SRE

Randstad

Location:
Japan, Tokyo

Contract Type:
Not provided

Salary:

7000000.00 - 9000000.00 JPY / Year

Job Description:

WFH flexibility! Global Environment! Competitive salary!

Requirements:

  • 3+ years of experience in an SRE role or similar position
  • Strong experience in creating and managing data projects, with a focus on ETL Processes and Data Lake / Data Factories management
  • Strong experience in deploying, managing, and monitoring Azure Cloud environments, particularly with data management tools
  • Hands-on experience with Azure Data Factory, Azure Synapse Analytics, Cosmos DB, and related data technologies
  • Proficiency in Infrastructure as Code (IaC) practices (e.g., Terraform, Azure Resource Manager, or Azure Bicep)
  • Proficient with Azure monitoring tools (Azure Monitor, Log Analytics, Application Insights)
  • Strong experience in building CI/CD pipelines using tools like Azure DevOps or Jenkins
  • Solid understanding of containers (Docker, Kubernetes) and serverless computing (Azure Functions)
  • Knowledge of scripting languages such as PowerShell, Python, or Bash

Nice to have:

  • Experience with distributed systems and large-scale data architecture
  • Familiarity with microservices and event-driven architectures
  • Experience with SQL and NoSQL databases
  • Knowledge of cost optimization strategies in Azure environments

What we offer:

  • Health insurance
  • Employees' pension insurance
  • Employment insurance
  • Saturdays
  • Sundays
  • Public holidays

Additional Information:

Job Posted:
February 04, 2026

Expiration:
April 14, 2026

Similar Jobs for Cloud Data Engineer / SRE

Cloud Technical Architect / Data DevOps Engineer

The role involves designing, implementing, and optimizing scalable Big Data and ...
Location:
United Kingdom, Bristol
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • An organised and methodical approach
  • Excellent time keeping and task prioritisation skills
  • An ability to provide clear and concise updates
  • An ability to convey technical concepts to all levels of audience
  • Data engineering skills – ETL/ELT
  • Technical implementation skills – application of industry best practices & design patterns
  • Technical advisory skills – experience in researching technological products / services with the intent to provide advice on system improvements
  • Experience of working in hybrid environments with both classical and DevOps
  • Excellent written & spoken English skills
  • Excellent knowledge of Linux operating system administration and implementation
Job Responsibilities:
  • Detailed development and implementation of scalable clustered Big Data solutions, with a specific focus on automated dynamic scaling, self-healing systems
  • Participating in the full lifecycle of data solution development, from requirements engineering through to continuous optimisation engineering and all the typical activities in between
  • Providing technical thought-leadership and advisory on technologies and processes at the core of the data domain, as well as data domain adjacent technologies
  • Engaging and collaborating with both internal and external teams, being a confident participant as well as a leader
  • Assisting with solution improvement activities driven either by the project or service
  • Support the design and development of new capabilities, preparing solution options, investigating technology, designing and running proof of concepts, providing assessments, advice and solution options, providing high level and low level design documentation
  • Cloud Engineering capability to leverage Public Cloud platform using automated build processes deployed using Infrastructure as Code
  • Provide technical challenge and assurance throughout development and delivery of work
  • Develop reusable common solutions and patterns to reduce development lead times, improve commonality, and lower Total Cost of Ownership
  • Work independently and/or within a team using a DevOps way of working
What we offer:
  • Extensive social benefits
  • Flexible working hours
  • Competitive salary
  • Shared values
  • Equal opportunities
  • Work-life balance
  • Evolving career opportunities
  • Comprehensive suite of benefits that supports physical, financial and emotional wellbeing
  • Full-time

Intermediate Software Engineer SRE – AI

At PointClickCare our mission is simple: to help providers deliver exceptional c...
Location:
Canada, Mississauga
Salary:
115000.00 - 128000.00 CAD / Year
PointClickCare
Expiration Date:
Until further notice
Requirements:
  • 5+ years' experience in software engineering
  • Experience with SRE principles
  • Experience with AI/ML in production environments
  • A passion for automation, intelligent systems, and operational excellence
  • Strong debugging, problem-solving, and system design skills
  • Languages: Python, Java, Bash, Terraform
  • Platforms: Azure, Kubernetes, Docker
  • Tools: Datadog, Prometheus, AppDynamics, ELK, GitHub Actions
  • ML/AI: MCP framework, AI agents, Vector store, Agent orchestration (LangChain), RAG
  • CI/CD: Jenkins, ArgoCD, Spinnaker
Job Responsibilities:
  • Build ML-based anomaly detection and pattern recognition systems
  • Enhance telemetry with smart tagging and metadata for better AI insights
  • Develop event-driven workflows and self-healing systems using AI triggers
  • Automate incident response with generative AI and custom AI agent orchestration
  • Use time-series forecasting and predictive modelling to anticipate failures
  • Optimise infrastructure with AI-powered autoscaling and cost-aware resource allocation
  • Build scalable, fault-tolerant systems in a cloud-native environment
  • Participate in on-call rotations and lead incident response for critical systems
  • Skilled in API integration for streamlined data exchange and system connectivity
  • Run internal AIOps workshops and help teams adopt AI maturity models
What we offer:
  • Benefits starting from Day 1
  • Retirement Plan Matching
  • Flexible Paid Time Off
  • Wellness Support Programs and Resources
  • Parental & Caregiver Leaves
  • Fertility & Adoption Support
  • Continuous Development Support Program
  • Employee Assistance Program
  • Allyship and Inclusion Communities
  • Employee Recognition … and more
  • Full-time

Cloud Scale Test Engineer

As a Cloud Scale Test Engineer, you will ensure the delivery of high performance...
Location:
United States, San Jose
Salary:
90400.00 - 208500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or master's degree in computer science, engineering, information systems, or related field
  • 1-4 years of experience
  • Strong Python Coding Skills
  • Passion for AI-driven automation and process optimization
  • Experience working on cloud platforms like AWS/Azure/Google Cloud
  • Experience with observability platforms like Prometheus, Grafana, OpenSearch, New Relic, etc.
  • Knowledge of distributed tracing and debugging in cloud-native environments
  • Proficiency in Git, Jira, Jenkins, and CI/CD tools
  • Basic networking knowledge
  • Experience in testing containerized applications and Kubernetes-based environments
Job Responsibilities:
  • Design, develop, and maintain robust test automation frameworks for cloud-scale distributed systems
  • Architect performance, load, and stress tests to validate system resilience under high traffic conditions
  • Build fault-injection and chaos engineering strategies to assess the reliability of distributed services
  • Develop and execute end-to-end integration, API, and system-level tests across microservices-based architectures
  • Implement continuous testing pipelines within CI/CD workflows to accelerate deployment cycles
  • Collaborate closely with development, SRE, and infrastructure teams to ensure quality best practices are embedded within the SDLC
  • Analyze system logs, telemetry data, and observability metrics to identify and mitigate potential failures before they impact production
  • Drive automation of security testing, API contract validation, and infrastructure testing
  • Participate in diagnosing critical production issues related to system reliability and performance
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Full-time

Site Reliability Engineer SRE – ML platform

Location:
United States, Sunnyvale
Salary:
Not provided
Thirdeye Data
Expiration Date:
Until further notice
Requirements:
  • 6+ years of experience in ML Ops with strong knowledge in Kubernetes, Python, MongoDB and AWS
  • Good understanding of Apache Solr
  • Proficient with Linux administration
  • Knowledge of ML models and LLMs
  • Ability to understand tools used by data scientists and experience with software development and test automation
  • Ability to design and implement cloud solutions and ability to build MLOps pipelines on cloud solutions (AWS)
  • Experience working with cloud computing and database systems
  • Experience building custom integrations between cloud-based systems using APIs
  • Experience developing and maintaining ML systems built with open-source tools
  • Experience with MLOps frameworks like Kubeflow, MLflow, DataRobot, Airflow, etc., and experience with Docker and Kubernetes
Job Responsibilities:
  • Continuous Deployment using GitHub Actions, Flux, Kustomize
  • Design and implement cloud solutions, build MLOps on AWS cloud
  • Data science model containerization and deployment using Docker, vLLM, Kubernetes
  • Communicate with a team of data scientists, data engineers, and architects, and document the processes
  • Develop and deploy scalable tools and services for our clients to handle machine learning training and inference
  • Knowledge of ML models and LLMs
  • Full-time

Engineering Manager, Infrastructure

As an Engineering Manager for the Infrastructure team, you’ll lead the engineers...
Location:
Canada; United States
Salary:
195000.00 - 285000.00 USD / Year
Apollo.io
Expiration Date:
Until further notice
Requirements:
  • 5+ years of hands-on software or infrastructure engineering experience
  • 2+ years of experience leading teams of senior and staff-level engineers in platform, SRE, or infrastructure domains
  • Proven ability to design and operate large-scale distributed systems in cloud environments (preferably GCP or AWS)
  • Expertise with Kubernetes, Docker, Terraform, Ubuntu, and CI/CD pipelines
  • Familiarity with observability tools (Grafana, Prometheus, ELK, Datadog, New Relic) and performance tuning
  • Strong grounding in networking, security, and reliability principles
  • Experience managing infrastructure costs, availability SLAs, and high-throughput systems at scale
Job Responsibilities:
  • Lead, coach, and grow a distributed team of high-impact Infrastructure Engineers
  • Partner with senior engineering leadership on strategic initiatives such as cloud migration, infrastructure scaling, platform reliability, and cost efficiency
  • Define and implement modern operational excellence practices, including SLOs, error budgets, incident reviews, and performance monitoring
  • Guide technical decision-making across key areas like Kubernetes, GCP, observability, networking, CI/CD, and IaC (Terraform, Ansible)
  • Collaborate with AI, Data, and Product Engineering teams to ensure infrastructure scalability for ML and AI-native workloads
  • Run effective 1:1s, career development conversations, and quarterly performance reviews
  • Support recruiting efforts to attract top engineering talent across time zones
What we offer:
  • Equity
  • Company bonus or sales commissions/bonuses
  • 401(k) plan
  • At least 10 paid holidays per year
  • Flex PTO
  • Parental leave
  • Employee assistance program and wellbeing benefits
  • Global travel coverage
  • Life/AD&D/STD/LTD insurance
  • FSA/HSA and medical, dental, and vision benefits
  • Full-time

Senior Database Engineer

We’re looking for a skilled Data Reliability Engineer to join our team for a cli...
Location:
Salary:
Not provided
Zoolatech
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in Data Engineering, Database Reliability, or Infrastructure Operations
  • Strong expertise in PostgreSQL on AWS, including tuning, replication, backups, and HA configurations
  • Experience operating RDBMS databases (PostgreSQL, MySQL, etc.) and Kubernetes technologies is highly desirable
  • Experience provisioning and operating NoSQL databases at scale, such as Elasticsearch, ElastiCache, DynamoDB, Neo4j, Mongo, Cassandra, etc.
  • Advanced SQL scripting and query optimization skills
  • Experience with data systems monitoring, alerting, and performance tuning
  • Strong programming/scripting in Java, Python, or Shell
  • Proven experience in designing or supporting complex data ecosystems
  • Solid understanding of cloud infrastructure (preferably AWS) and Infrastructure as Code tools (Terraform)
  • Familiarity with event streaming platforms (Kafka), and observability stacks (New Relic, ELK, etc.)
Job Responsibilities:
  • Own and optimize the reliability, availability, and performance of data infrastructure across production systems
  • Lead the design and implementation of resilient, secure, and observable data systems
  • Collaborate with SRE, Security, and Engineering teams to enforce data infrastructure standards and align on architectural decisions
  • Design and implement automation around provisioning, uptime monitoring, data refresh, integrity, backups, and disaster recovery
  • Support application developers with performance tuning, complex query optimization, and database design reviews
  • Analyze and resolve performance bottlenecks and incidents with a focus on long-term solutions
  • Participate in on-call rotation to support production systems and ensure high availability
  • Actively contribute to improving incident response and observability through metrics, alerting, and runbooks
  • Work with technologies such as Java, Ruby on Rails, PostgreSQL, AWS, Kafka, S3, Elasticsearch
What we offer:
  • Paid Vacation
  • Sick Days
  • Floating Holidays
  • Sport/Insurance Compensation
  • English Classes
  • Charity
  • Training Compensation

Senior Database Engineer

We’re looking for a skilled Data Reliability Engineer to join our team for a cli...
Location:
United States
Salary:
Not provided
Zoolatech
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in Data Engineering, Database Reliability, or Infrastructure Operations
  • Strong expertise in PostgreSQL on AWS, including tuning, replication, backups, and HA configurations
  • Experience operating RDBMS databases (PostgreSQL, MySQL, etc.) and Kubernetes technologies is highly desirable
  • Experience provisioning and operating NoSQL databases at scale, such as Elasticsearch, ElastiCache, DynamoDB, Neo4j, Mongo, Cassandra, etc.
  • Advanced SQL scripting and query optimization skills
  • Experience with data systems monitoring, alerting, and performance tuning
  • Strong programming/scripting in Java, Python, or Shell
  • Proven experience in designing or supporting complex data ecosystems
  • Solid understanding of cloud infrastructure (preferably AWS) and Infrastructure as Code tools (Terraform)
  • Familiarity with event streaming platforms (Kafka), and observability stacks (New Relic, ELK, etc.)
Job Responsibilities:
  • Own and optimize the reliability, availability, and performance of data infrastructure across production systems
  • Lead the design and implementation of resilient, secure, and observable data systems
  • Collaborate with SRE, Security, and Engineering teams to enforce data infrastructure standards and align on architectural decisions
  • Design and implement automation around provisioning, uptime monitoring, data refresh, integrity, backups, and disaster recovery
  • Support application developers with performance tuning, complex query optimization, and database design reviews
  • Analyze and resolve performance bottlenecks and incidents with a focus on long-term solutions
  • Participate in on-call rotation to support production systems and ensure high availability
  • Actively contribute to improving incident response and observability through metrics, alerting, and runbooks
  • Work with technologies such as Java, Ruby on Rails, PostgreSQL, AWS, Kafka, S3, Elasticsearch
What we offer:
  • Paid Vacation
  • Sick Days
  • Floating Holidays
  • Sport/Insurance Compensation
  • English Classes
  • Charity
  • Training Compensation
  • Full-time

Principal Network Operations Site Reliability Systems Engineer

This role entails incorporating Site Reliability Engineering (SRE) concepts into...
Location:
United States
Salary:
115500.00 - 266000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or master’s degree in computer science, computer engineering, information systems, or equivalent
  • Typically, 10+ years’ experience
  • Experience with cloud platforms
  • Experience with software development languages for console and web-based applications
  • Experience in User Interface (UI/UX) design
  • Understanding of and experience with common network infrastructure devices such as switches, routers, access points, authentication, authorization, and accounting systems and protocols, and network management utilities
  • Experience with network monitoring protocols
  • Ability to design and implement relational database solutions, time-series databases, and NoSQL database solutions
  • Excellent analytical and problem-solving skills
  • Experience in the overall architecture of software systems for products and solutions
Job Responsibilities:
  • Develops strategies and implements plans to incorporate SRE concepts into network, tool, and process designs, and leads execution of those strategies and plans
  • Evaluates LAN, WLAN, SD-WAN, AAA, Private 5G, and other network designs for fit-for-use criteria, and designs prototype analysis tools to facilitate rapid iteration of network delivery service enhancements
  • Identifies and engineers new ways to leverage data from multiple platforms to identify network performance trends and detect anomalies
  • Prototypes machine learning anomaly detection, event signature identification, and trend identification
  • Automates common incident management and problem management procedures
  • Develops organization-wide architectures, methodologies, and prototypes for software systems design and development across multiple platforms and organizations within the Global Business Unit
  • Identifies and evaluates new technologies and innovations for alignment with technology roadmap and business value
  • Creates plans for prototyping and prototype iteration
  • Reviews and evaluates designs and project activities for compliance with development guidelines and standards
  • Provides tangible feedback to improve product quality and mitigate failure risk
What we offer:
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
  • Career development programs
  • Inclusive environment celebrating individual uniqueness
  • Full-time