We are seeking a highly skilled, self-driven Data Testing Architect to own the design, build, and deployment of scalable ETL pipelines across hybrid environments, including Cloudera Hadoop, Red Hat OpenShift, and AWS Cloud. The role focuses on developing robust PySpark-based data processing solutions, building testing frameworks for ETL jobs, and leveraging containerization and orchestration platforms such as Docker and AWS EKS for scalable workloads.
Job Responsibilities:
Build Data Pipelines
Testing and Validation
Containerization and Orchestration
Cloud Integration
Test Data Management
Build and maintain ETL validation and testing scripts
Work with Hive, HDFS, and Oracle data sources to extract, transform, and load large-scale datasets
Develop Dockerfiles and create container images for PySpark jobs
Deploy and orchestrate ETL jobs using AWS EKS
Leverage AWS services such as S3, Lambda, and Airflow for data ingestion, event-driven processing, and orchestration
Design and develop PySpark-based ETL pipelines on Cloudera Hadoop platform
Create reusable frameworks, libraries, and templates to accelerate automation and testing of ETL jobs
Participate in code reviews, CI/CD pipelines, and maintain best practices in Spark and cloud-native development
Ensure tooling can run in CI/CD, providing real-time, on-demand test execution
Lead a team of automation professionals and guide them on projects
Own and maintain automation best practices
Define the overall strategy for automating data processes and testing
Research and implement new automation tools and techniques
Work closely with other teams and partners to ensure smooth data operations and regulatory compliance
Track key performance indicators (KPIs) related to automation
Monitor and review code check-ins from the team
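As a hedged illustration of the "ETL validation and testing scripts" responsibility above (not part of the posting itself), a minimal source-to-target reconciliation check might compare row counts and per-column checksums. All names here (`validate_load`, `column_checksum`) are illustrative:

```python
# Minimal sketch of an ETL validation check, assuming source and target
# extracts are available as lists of dicts. In practice these would come
# from Hive/Oracle queries; the helper names are illustrative only.

def column_checksum(rows, column):
    """Order-independent checksum of one column's values."""
    return sum(hash(str(r[column])) for r in rows)

def validate_load(source_rows, target_rows, key_columns):
    """Return a dict of simple pass/fail checks for one ETL load."""
    checks = {"row_count": len(source_rows) == len(target_rows)}
    for col in key_columns:
        checks[f"checksum_{col}"] = (
            column_checksum(source_rows, col) == column_checksum(target_rows, col)
        )
    return checks

# Toy data standing in for source/target extracts; target order differs,
# which the order-independent checksum tolerates.
source = [{"id": 1, "amount": "10.5"}, {"id": 2, "amount": "7.0"}]
target = [{"id": 2, "amount": "7.0"}, {"id": 1, "amount": "10.5"}]
print(validate_load(source, target, ["id", "amount"]))
```

A real framework would add schema comparison, null/duplicate checks, and tolerance thresholds, but the reconcile-by-aggregate pattern is the core idea.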
Requirements:
12-15 years of experience in data platform testing across data lineage, especially with knowledge of regulatory compliance and risk management
Detailed knowledge of data flows in relational databases and Big Data platforms
Experience with Selenium and BDD Cucumber using Java and Python
Strong experience with Python
Broad understanding of batch and stream processing, including deploying PySpark workloads to AWS EKS
Proficiency in testing on Cloudera Hadoop ecosystem
Hands-on experience with ETL
Strong knowledge of Oracle SQL and HiveQL
Solid understanding of AWS services like S3, Lambda, EKS, Airflow, and IAM
Understanding of cloud architecture using S3, Lambda, and Airflow DAGs to orchestrate ETL jobs
Familiarity with CI/CD tools
Scripting knowledge in Python
Version control: Git, Bitbucket, GitHub
Experience with BI report validation, e.g., validating Tableau dashboards and views
Strong understanding of the Wealth domain and of data regulatory and governance requirements for APAC, EMEA, and NAM
Strong problem-solving and debugging skills
Excellent communication and collaboration abilities to lead and mentor a large techno-functional team across different geographical locations
Ability to manage global teams and support multiple time zones
Strong financial acumen and great presentation skills
Able to work in an Agile environment and deliver results independently
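The orchestration pattern referenced in the requirements (Airflow DAGs triggering ETL steps) can be sketched, independently of Airflow itself, as a tiny dependency-ordered task runner. This is a conceptual sketch only; `run_dag` and the task names are illustrative, not from the posting:

```python
# Tiny sketch of DAG-style orchestration (the pattern Airflow DAGs
# implement): run each task only after its upstream tasks complete.

def run_dag(tasks, deps):
    """Run callables in tasks (name -> fn), respecting deps (name -> set of upstreams)."""
    done, order = set(), []
    while len(done) < len(tasks):
        ready = [n for n in tasks if n not in done and deps.get(n, set()) <= done]
        if not ready:
            raise ValueError("cycle or missing dependency in DAG")
        for name in sorted(ready):  # deterministic order among ready tasks
            tasks[name]()
            done.add(name)
            order.append(name)
    return order

# Example mirroring a basic ETL pipeline: extract -> transform -> load.
log = []
tasks = {
    "extract": lambda: log.append("extract"),
    "transform": lambda: log.append("transform"),
    "load": lambda: log.append("load"),
}
deps = {"transform": {"extract"}, "load": {"transform"}}
print(run_dag(tasks, deps))  # → ['extract', 'transform', 'load']
```

Airflow adds scheduling, retries, and operator abstractions on top of exactly this dependency-resolution core.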