Pyspark Data Engineer Job at Citi (Chennai)

Senior Data Engineer – Data Engineering & AI Platforms

We are looking for a highly skilled Senior Data Engineer (L2) who can design, bu...

Location

India, Chennai, Madurai, Coimbatore

Salary:

Not provided

OptiSol Business Solutions

Expiration Date

Until further notice

Requirements

Strong hands-on expertise in cloud ecosystems (Azure / AWS / GCP)
Excellent Python programming skills with data engineering libraries and frameworks
Advanced SQL capabilities including window functions, CTEs, and performance tuning
Solid understanding of distributed processing using Spark/PySpark
Experience designing and implementing scalable ETL/ELT workflows
Good understanding of data modeling concepts (dimensional, star, snowflake)
Familiarity with GenAI/LLM-based integration for data workflows
Experience working with Git, CI/CD, and Agile delivery frameworks
Strong communication skills for interacting with clients, stakeholders, and internal teams

Job Responsibility

Design, build, and maintain scalable ETL/ELT pipelines across cloud and big data platforms
Contribute to architectural discussions by translating business needs into data solutions spanning ingestion, transformation, and consumption layers
Work closely with solutioning and pre-sales teams for technical evaluations and client-facing discussions
Lead squads of L0/L1 engineers—ensuring delivery quality, mentoring, and guiding career growth
Develop cloud-native data engineering solutions using Python, SQL, PySpark, and modern data frameworks
Ensure data reliability, performance, and maintainability across the pipeline lifecycle—from development to deployment
Support long-term ODC/T&M projects by demonstrating expertise during technical discussions and interviews
Integrate emerging GenAI tools where applicable to enhance data enrichment, automation, and transformations

What we offer

Opportunity to work at the intersection of Data Engineering, Cloud, and Generative AI
Hands-on exposure to modern data stacks and emerging AI technologies
Collaboration with experts across Data, AI/ML, and cloud practices
Access to structured learning, certifications, and leadership mentoring
Competitive compensation with fast-track career growth and visibility

Fulltime

Senior Data Engineering Architect

Location

Poland

Salary:

Not provided

Lingaro

Expiration Date

Until further notice

Requirements

Proven work experience as a Data Engineering Architect or in a similar role, and strong experience in the Data & Analytics area
Strong understanding of data engineering concepts, including data modeling, ETL processes, data pipelines, and data governance
Expertise in designing and implementing scalable and efficient data processing frameworks
In-depth knowledge of various data technologies and tools, such as relational databases, NoSQL databases, data lakes, data warehouses, and big data frameworks (e.g., Hadoop, Spark)
Experience in selecting and integrating appropriate technologies to meet business requirements and long-term data strategy
Ability to work closely with stakeholders to understand business needs and translate them into data engineering solutions
Strong analytical and problem-solving skills, with the ability to identify and address complex data engineering challenges
Proficiency in Python, PySpark, SQL
Familiarity with cloud platforms and services, such as AWS, GCP, or Azure, and experience in designing and implementing data solutions in a cloud environment
Knowledge of data governance principles and best practices, including data privacy and security regulations

Job Responsibility

Collaborate with stakeholders to understand business requirements and translate them into data engineering solutions
Design and oversee the overall data architecture and infrastructure, ensuring scalability, performance, security, maintainability, and adherence to industry best practices
Define data models and data schemas to meet business needs, considering factors such as data volume, velocity, variety, and veracity
Select and integrate appropriate data technologies and tools, such as databases, data lakes, data warehouses, and big data frameworks, to support data processing and analysis
Create scalable and efficient data processing frameworks, including ETL (Extract, Transform, Load) processes, data pipelines, and data integration solutions
Ensure that data engineering solutions align with the organization's long-term data strategy and goals
Evaluate and recommend data governance strategies and practices, including data privacy, security, and compliance measures
Collaborate with data scientists, analysts, and other stakeholders to define data requirements and enable effective data analysis and reporting
Provide technical guidance and expertise to data engineering teams, promoting best practices and ensuring high-quality deliverables. Support the team throughout the implementation process, answering questions and addressing issues as they arise
Oversee the implementation of the solution, ensuring that it is implemented according to the design documents and technical specifications

What we offer

Stable employment. On the market since 2008, 1500+ talents currently on board in 7 global sites
Workation. Enjoy working from inspiring locations in line with our workation policy
Great Place to Work® certified employer
Flexibility regarding working hours and your preferred form of contract
Comprehensive online onboarding program with a “Buddy” from day 1
Cooperation with top-tier engineers and experts
Unlimited access to the Udemy learning platform from day 1
Certificate training programs. Lingarians earn 500+ technology certificates yearly
Upskilling support. Capability development programs, Competency Centers, knowledge sharing sessions, community webinars, 110+ training opportunities yearly
Grow as we grow as a company. 76% of our managers are internal promotions

Data Engineering Architect

Data engineering involves the development of solutions for the collection, trans...

Location

India

Salary:

Not provided

Lingaro

Expiration Date

Until further notice

Requirements

10+ years’ experience in the Data & Analytics area
4+ years’ experience in Data Engineering Architecture
Proficiency in Python, PySpark, SQL
Strong expertise in Azure cloud services such as ADF, Databricks, PySpark, and Logic Apps
Strong understanding of data engineering concepts, including data modeling, ETL processes, data pipelines, and data governance
Expertise in designing and implementing scalable and efficient data processing frameworks
In-depth knowledge of various data technologies and tools, such as relational databases, NoSQL databases, data lakes, data warehouses, and big data frameworks (e.g., Hadoop, Spark)
Experience in selecting and integrating appropriate technologies to meet business requirements and long-term data strategy
Ability to work closely with stakeholders to understand business needs and translate them into data engineering solutions
Strong analytical and problem-solving skills, with the ability to identify and address complex data engineering challenges

Job Responsibility

Collaborate with stakeholders to understand business requirements and translate them into data engineering solutions
Design and oversee the overall data architecture and infrastructure, ensuring scalability, performance, security, maintainability, and adherence to industry best practices
Define data models and data schemas to meet business needs, considering factors such as data volume, velocity, variety, and veracity
Select and integrate appropriate data technologies and tools, such as databases, data lakes, data warehouses, and big data frameworks, to support data processing and analysis
Create scalable and efficient data processing frameworks, including ETL (Extract, Transform, Load) processes, data pipelines, and data integration solutions
Ensure that data engineering solutions align with the organization's long-term data strategy and goals
Evaluate and recommend data governance strategies and practices, including data privacy, security, and compliance measures
Collaborate with data scientists, analysts, and other stakeholders to define data requirements and enable effective data analysis and reporting
Provide technical guidance and expertise to data engineering teams, promoting best practices and ensuring high-quality deliverables
Support the team throughout the implementation process, answering questions and addressing issues as they arise

What we offer

Stable employment
“Office as an option” model
Flexibility regarding working hours and your preferred form of contract
Comprehensive online onboarding program with a “Buddy” from day 1
Cooperation with top-tier engineers and experts
Unlimited access to the Udemy learning platform from day 1
Certificate training programs
Upskilling support
Internal Gallup Certified Strengths Coach to support your growth
Grow as we grow as a company

Software Engineer (Data Engineering)

We are seeking a Software Engineer (Data Engineering) who can seamlessly integra...

Location

India, Hyderabad

Salary:

Not provided

NStarX

Expiration Date

Until further notice

Requirements

4+ years in Data Engineering and AI/ML roles
Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field
Python, SQL, Bash, PySpark, Spark SQL, boto3, pandas
Apache Spark on EMR (driver/executor model, sizing, dynamic allocation)
Amazon S3 (Parquet) with lifecycle management to Glacier
AWS Glue Catalog and Crawlers
AWS Step Functions, AWS Lambda, Amazon EventBridge
CloudWatch Logs and Metrics, Kinesis Data Firehose (or Kafka/MSK)
Amazon Redshift and Redshift Spectrum
IAM (least privilege), Secrets Manager, SSM

Job Responsibility

Design, build, and maintain scalable ETL and ELT pipelines for large-scale data processing
Develop and optimize data architectures supporting analytics and ML workflows
Ensure data integrity, security, and compliance with organizational and industry standards
Collaborate with DevOps teams to deploy and monitor data pipelines in production environments
Build predictive and prescriptive models leveraging AI and ML techniques
Develop and deploy machine learning and deep learning models using TensorFlow, PyTorch, or Scikit-learn
Perform feature engineering, statistical analysis, and data preprocessing
Continuously monitor and optimize models for accuracy and scalability
Integrate AI-driven insights into business processes and strategies
Serve as the technical liaison between NStarX and client teams

What we offer

Competitive salary and performance-based incentives
Opportunity to work on cutting-edge AI and ML projects
Exposure to global clients and international project delivery
Continuous learning and professional development opportunities
Competitive base + commission
Fast growth into leadership roles

Fulltime

Data Engineer

At Adyen, we treat data and data artifacts as first-class citizens. They form ou...

Location

Netherlands, Amsterdam

Salary:

Not provided

Adyen

Expiration Date

Until further notice

Requirements

3+ years of experience working as a Data Engineer or in a similar role
Solid understanding of both Software and Data Engineering practices
Proficient in tools and languages such as: Python, PySpark, Airflow, Hadoop, Spark, Kafka, SQL, Git
Able to effectively communicate complex data-related concepts and outcomes to a diverse range of stakeholders
Capable of identifying opportunities, devising solutions, and handling projects independently
Experimental mindset with a ‘launch fast and iterate’ mentality
Skilled in promoting a data-centric culture within technical teams and advocating for setting standards and continuous improvement

Job Responsibility

Collaborative Solution Development: Engage with a diverse range of stakeholders, including data scientists, analysts, software engineers, product managers, and customers, to understand their requirements and craft effective solutions
Quality Pipelines and Architecture: Design, develop, deploy and operate high-quality production ELT pipelines and data architectures. Integrate data from various sources and formats, ensuring compatibility, consistency, and reliability
Data Best Practices: Help establish and share best practices in performance, code quality, data validation, data governance, and discoverability in your team and in other teams. Participate in mentoring and knowledge sharing initiatives
High Quality Data and Code: Ensure data is accurate, complete, reliable, relevant, and timely. Implement testing, monitoring and validation protocols for your code and data, leveraging tools such as Pytest
Performance Optimization: Identify and resolve performance bottlenecks in data pipelines and systems. Improve query performance and resource utilization to meet SLAs and performance requirements, using techniques such as Spark optimizations

Senior Data Engineer

Senior Data Engineer position at Checkr, building the data platform to power saf...

Location

United States, San Francisco

Salary:

162000.00 - 190000.00 USD / Year

Checkr

Expiration Date

Until further notice

Requirements

7+ years of development experience in the field of data engineering
5+ years writing PySpark
Experience building large-scale (hundreds of terabytes to petabytes) data processing pipelines, both batch and stream
Experience with ETL/ELT, stream and batch processing of data at scale
Strong proficiency in PySpark and Python
Expert understanding of database systems, data modeling, relational databases, and NoSQL (such as MongoDB)
Experience with big data technologies such as Kafka, Spark, Iceberg, and data lakes, and with the AWS stack (EKS, EMR, Serverless, Glue, Athena, S3, etc.)
Knowledge of security best practices and data privacy concerns
Strong problem-solving skills and attention to detail

Job Responsibility

Create and maintain data pipelines and foundational datasets to support product/business needs
Design and build database architectures with massive and complex data, balancing with computational load and cost
Develop audits for data quality at scale, implementing alerting as necessary
Create scalable dashboards and reports to support business objectives and enable data-driven decision-making
Troubleshoot and resolve complex issues in production environments
Work closely with product managers and other stakeholders to define and implement new features

What we offer

Learning and development reimbursement allowance
Competitive compensation and opportunity for professional and personal advancement
100% medical, dental, and vision coverage for employees and dependents
Additional vacation benefits of 5 extra days and flexibility to take time off
Reimbursement for work from home equipment
Lunch four times a week
Commuter stipend
Abundance of snacks and beverages

Fulltime

Senior Big Data Engineer

The Big Data Engineer is a senior level position responsible for establishing an...

Location

Canada, Mississauga

Salary:

94300.00 - 141500.00 USD / Year

Citi

Expiration Date

Until further notice

Requirements

5+ Years of Experience in Big Data Engineering (PySpark)
Data Pipeline Development: Design, build, and maintain scalable ETL/ELT pipelines to ingest, transform, and load data from multiple sources
Big Data Infrastructure: Develop and manage large-scale data processing systems using frameworks like Apache Spark, Hadoop, and Kafka
Proficiency in programming languages like Python or Scala
Strong expertise in data processing frameworks such as Apache Spark, Hadoop
Expertise in Data Lakehouse technologies (Apache Iceberg, Apache Hudi, Trino)
Experience with cloud data platforms like AWS (Glue, EMR, Redshift), Azure (Synapse), or GCP (BigQuery)
Expertise in SQL and database technologies (e.g., Oracle, PostgreSQL, etc.)
Experience with data orchestration tools like Apache Airflow or Prefect
Familiarity with containerization (Docker, Kubernetes) is a plus

Job Responsibility

Partner with multiple management teams to ensure appropriate integration of functions to meet goals as well as identify and define necessary system enhancements to deploy new products and process improvements
Resolve a variety of high-impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards
Provide expertise in the area and advanced knowledge of applications programming, and ensure application design adheres to the overall architecture blueprint
Utilize advanced knowledge of system flow and develop standards for coding, testing, debugging, and implementation
Develop comprehensive knowledge of how areas of business, such as architecture and infrastructure, integrate to accomplish business goals
Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary
Appropriately assess risk when business decisions are made, demonstrating consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency

What we offer

Well-being support
Growth opportunities
Work-life balance support

Fulltime

Pyspark Data Engineer

Citi

Location:
India, Chennai

Category:
IT - Software Development

Contract Type:
Not provided

Job Posted:
April 26, 2025
