This role is a fantastic opportunity to start your career in data engineering for AI, focusing on the essential tasks that ensure our AI teams have the high-quality data they need. You will work under the guidance of senior engineers to build, maintain, and optimize the data pipelines that are the lifeblood of our machine learning models. You'll gain hands-on experience with large-scale, diverse datasets unique to the construction industry and learn the best practices for managing data in a modern cloud environment.
Job Responsibilities:
Assist in the design, construction, and maintenance of foundational data pipelines (ETL/ELT) to support AI/ML model development (an illustrative sketch of this kind of work follows this list)
Contribute to data cleaning, transformation, and aggregation tasks to prepare datasets for machine learning applications
Perform routine data quality checks and write basic tests to ensure data integrity and reliability
Support the management of data storage solutions, including data warehouses and data lakes, under the supervision of senior engineers
Document data sources, pipeline logic, and data models to ensure clarity and maintainability for the team
Collaborate with AI/ML engineers to understand their data requirements and assist in providing them with accessible, analysis-ready datasets
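To give candidates a concrete sense of this kind of pipeline and data-quality work, here is a minimal Python sketch. It assumes a pandas-based workflow with a hypothetical projects.csv input and made-up column names; it is illustrative only, not this team's actual code.

# Minimal, illustrative sketch only; the file and column names are hypothetical.
import pandas as pd

def load_and_clean(path: str) -> pd.DataFrame:
    """Load raw project records and apply basic cleaning steps."""
    df = pd.read_csv(path)
    df = df.drop_duplicates()                                 # remove exact duplicate rows
    df = df.dropna(subset=["project_id"])                     # hypothetical required key column
    df["cost"] = pd.to_numeric(df["cost"], errors="coerce")   # coerce bad values to NaN
    return df

def basic_quality_checks(df: pd.DataFrame) -> None:
    """Routine integrity checks of the kind described above."""
    assert df["project_id"].is_unique, "duplicate project_id values found"
    assert (df["cost"].dropna() >= 0).all(), "negative cost values found"

if __name__ == "__main__":
    cleaned = load_and_clean("projects.csv")                  # hypothetical input file
    basic_quality_checks(cleaned)
    cleaned.to_parquet("projects_clean.parquet")              # analysis-ready output

In practice the exact libraries and storage formats are set by the team; the point is the shape of the work: load, clean, check, and hand over an analysis-ready dataset.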
Requirements:
Bachelor's degree in Computer Science, Engineering, Information Systems, or a related quantitative field
Foundational knowledge of database concepts and proficiency in SQL for data querying and manipulation (a short example follows this list)
Proficiency in at least one programming language, preferably Python, for scripting and data processing tasks
A conceptual understanding of data warehousing, ETL processes, and data modeling
Strong problem-solving skills, with the ability to analyze well-scoped issues and arrive at clear answers
Eagerness to learn and a strong interest in data engineering, cloud technologies, and AI/ML
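As an example of the SQL proficiency mentioned above, the following short Python snippet uses the standard-library sqlite3 module; the sensor_readings table and its columns are invented purely for illustration.

# Illustrative only: a throwaway in-memory database with a hypothetical table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_readings (site TEXT, reading REAL)")
conn.executemany(
    "INSERT INTO sensor_readings VALUES (?, ?)",
    [("site_a", 12.5), ("site_a", 13.1), ("site_b", 9.8)],
)

# Aggregate readings per site, a typical querying and manipulation task.
for site, avg_reading in conn.execute(
    "SELECT site, AVG(reading) FROM sensor_readings GROUP BY site"
):
    print(site, round(avg_reading, 2))

conn.close()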
Nice to have:
Familiarity with a cloud platform (e.g., AWS, Azure, GCP) and its core data services (e.g., S3, Blob Storage, BigQuery) or unified data platforms like Databricks
Basic experience with data pipeline orchestration tools (e.g., Apache Airflow, Prefect); a minimal example DAG appears at the end of this posting
Exposure to version control systems like Git
Relevant coursework or academic projects in data engineering, database management, or big data
Familiarity with the construction industry's data landscape
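For candidates who have not yet used an orchestration tool, here is a minimal Apache Airflow sketch of a daily extract-transform-load flow. The DAG name, schedule, and task bodies are placeholders chosen for this example, not a description of the employer's actual pipelines.

# Minimal, illustrative Airflow DAG; assumes Airflow 2.4+ for the `schedule` argument.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data from a source system")

def transform():
    print("clean and aggregate the extracted data")

def load():
    print("write analysis-ready data to the warehouse")

with DAG(
    dag_id="example_daily_etl",        # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_transform >> t_load

Prefect and other orchestrators express the same idea (tasks with explicit dependencies and a schedule) with different APIs.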