We are looking for a Senior Data Engineer to design, develop, and optimize our data infrastructure on Databricks. You will architect scalable pipelines using BigQuery, Google Cloud Storage, Apache Airflow, dbt, Dataflow, and Pub/Sub, ensuring high availability and performance across our ETL/ELT processes. You will leverage Great Expectations to enforce data quality standards. The role also involves building our Data Mart (Data Mesh) environment and implementing CI/CD best practices. The successful candidate has extensive knowledge of cloud-native data solutions, strong proficiency with ETL/ELT frameworks (including dbt), and a passion for building robust, cost-effective pipelines.
Job Responsibilities:
Define and implement the overall data architecture on GCP, including data warehousing in BigQuery/Databricks, data lake patterns in Google Cloud Storage, and Data Mart (Data Mesh) solutions
Integrate Terraform for Infrastructure as Code (IaC) to provision and manage cloud resources efficiently
Establish both batch and real-time data processing frameworks to ensure reliability, scalability, and cost efficiency
Design, build, and optimize ETL/ELT pipelines using Apache Airflow for workflow orchestration (a minimal orchestration sketch follows this list)
Implement dbt (Data Build Tool) transformations to maintain version-controlled data models in BigQuery, ensuring consistency and reliability across the data pipeline
Use Google Dataflow (based on Apache Beam) and Pub/Sub for large-scale streaming/batch data processing and ingestion (see the streaming sketch after this list)
Automate job scheduling and data transformations to deliver timely insights for analytics, machine learning, and reporting
Implement event-driven or asynchronous data workflows between microservices
Employ Docker and Kubernetes (K8s) for containerization and orchestration, enabling flexible and efficient microservices-based data workflows
Implement CI/CD pipelines for streamlined development, testing, and deployment of data engineering components
Enforce data quality standards using Great Expectations or similar frameworks, defining and validating expectations for critical datasets
Define and uphold metadata management, data lineage, and auditing standards to ensure trustworthy datasets
Implement security best practices, including encryption at rest and in transit, Identity and Access Management (IAM), and compliance with GDPR or CCPA where applicable
Collaborate with Data Science, Analytics, and Product teams to ensure the data infrastructure supports advanced analytics, including machine learning initiatives
Maintain Data Mart (Data Mesh) environments that cater to specific business domains, optimizing access and performance for key stakeholders
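To illustrate the orchestration responsibilities above, here is a minimal Airflow sketch of a daily ELT run that loads raw files from Google Cloud Storage into BigQuery and then executes dbt models. It assumes Airflow 2.x with the bq and dbt command-line tools available on the workers; the DAG id, bucket, dataset, and dbt paths are illustrative placeholders, not part of the actual stack.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Hypothetical daily ELT DAG: land raw events in BigQuery, then build dbt models.
    with DAG(
        dag_id="daily_elt",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        load_raw = BashOperator(
            task_id="load_raw_events",
            bash_command=(
                "bq load --source_format=NEWLINE_DELIMITED_JSON "
                "raw.events gs://example-bucket/events/{{ ds }}/*.json"
            ),
        )
        run_dbt = BashOperator(
            task_id="dbt_run",
            bash_command="dbt run --project-dir /opt/dbt --profiles-dir /opt/dbt",
        )
        test_dbt = BashOperator(
            task_id="dbt_test",
            bash_command="dbt test --project-dir /opt/dbt --profiles-dir /opt/dbt",
        )
        # Load first, then build the version-controlled models, then run dbt's built-in tests.
        load_raw >> run_dbt >> test_dbt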
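Similarly, the streaming ingestion described in the Dataflow/Pub/Sub item could be sketched as an Apache Beam pipeline like the one below. The project, subscription, table, and schema names are assumptions for illustration only; the Dataflow runner, project, and region would be supplied as pipeline options at launch.

    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Hypothetical streaming pipeline: Pub/Sub events -> parse JSON -> append to BigQuery.
    # Runner (DataflowRunner), project, and region are passed as pipeline options at launch.
    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                subscription="projects/example-project/subscriptions/events-sub"
            )
            | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "example-project:analytics.events",
                schema="event_id:STRING,user_id:STRING,event_ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )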
Requirements:
3+ years of professional experience in data engineering, with at least 1 year working with mobile data
Proven track record of building and maintaining BigQuery environments and Google Cloud Storage-based data lakes
Deep knowledge of Apache Airflow for scheduling/orchestration and ETL/ELT design
Experience implementing dbt for data transformations, RabbitMQ for event-driven workflows, and Pub/Sub + Dataflow for streaming/batch data pipelines
Familiarity with designing and implementing Data Mart (Data Mesh) solutions, as well as using Terraform for IaC
Strong coding capabilities in Python, Java, or Scala, plus scripting for automation
Experience with Docker and Kubernetes (K8s) for containerizing data-related services
Hands-on with CI/CD pipelines and DevOps tools (e.g., Terraform, Ansible, Jenkins, GitLab CI) to manage infrastructure and deployments
Proficiency in Great Expectations (or similar) to define and enforce data quality standards (a brief example follows this list)
Expertise in designing systems for data lineage, metadata management, and compliance (GDPR, CCPA)
Strong understanding of OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) systems
Excellent communication skills for both technical and non-technical audiences
High level of organization, self-motivation, and problem-solving aptitude
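As a small illustration of the data-quality requirement above, the sketch below uses Great Expectations' long-standing pandas-dataset style API (the exact interface differs between releases); the file and column names are placeholders.

    import great_expectations as ge

    # Hypothetical quality gate on a critical dataset extract (classic, pre-1.0 GE API).
    orders = ge.read_csv("orders_sample.csv")

    # Declare expectations for the columns relied on downstream.
    orders.expect_column_values_to_not_be_null("order_id")
    orders.expect_column_values_to_be_unique("order_id")
    orders.expect_column_values_to_be_between("amount_eur", min_value=0)

    # Validate all declared expectations; the result object's shape varies across GE versions.
    results = orders.validate()
    if not results.success:
        raise ValueError("Data quality checks failed for orders_sample.csv")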
Nice to have:
Machine Learning (ML) Integration: Familiarity with end-to-end ML workflows and model deployment on GCP (e.g., Vertex AI)
Advanced Observability: Experience with Prometheus, Grafana, Datadog, or New Relic for system health and performance monitoring
Security & Compliance: Advanced knowledge of compliance frameworks such as HIPAA, SOC 2, or relevant regulations
Real-Time Data Architectures: Additional proficiency in Kafka, Spark Streaming, or other streaming solutions
Certifications: GCP-specific certifications (e.g., Google Professional Data Engineer) are highly desirable
What we offer:
Flexible career path with personalized internal training and an annual budget for external learning opportunities
Flexible schedule with flextime
Option of working fully remote or from our Barcelona office
Free Friday afternoons with a 7-hour workday
35-hour workweek in July and August
Competitive salary
Full-time permanent contract
Top-tier private health insurance (including dental and psychological services)
25 days of vacation plus your birthday off, with flexible vacation options and no blackout days
Free coffee, fresh fruit, snacks, a game room, and a rooftop terrace with stunning Mediterranean views
Meal vouchers (ticket restaurant) and nursery vouchers, paid directly from your gross salary