
AI/ML & Data Engineer


Congruent


Location:
Chennai, India


Contract Type:
Not provided


Salary:
Not provided

Job Responsibility:

  • Data Ingestion and Preprocessing: Ability to build and maintain data pipelines that ingest unstructured data from PDFs, gazettes, HTML circulars, etc., and handle data extraction, parsing, and normalization.
  • NLP & LLM Modeling: Ability to fine-tune or prompt-tune LLMs for summarization, classification, and change detection in regulations. Ability to develop embeddings for semantic similarity.
  • Knowledge Graph Engineering: Ability to design entity relationships (regulation, control, policy) and implement retrieval over Neo4j or similar graph DBs.
  • Information Retrieval (RAG): Ability to build RAG pipelines for natural language querying of regulations (a minimal sketch follows this list).
  • Annotation and Validation: Ability to annotate training data in collaboration with SMEs and validate model outputs.
  • MLOps: Ability to build CI/CD for model retraining, versioning, and evaluation (precision, recall, BLEU, etc.).
  • API and Integration: Ability to expose ML models as REST APIs (FastAPI) for integration with the product frontend.
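
To make the RAG bullet concrete, here is a minimal retrieve-then-generate sketch using the OpenAI API and FAISS named in the requirements below, assuming the openai and faiss-cpu packages. The document snippets, model names, and index layout are illustrative assumptions, not details from the posting.

    import faiss                       # faiss-cpu
    import numpy as np
    from openai import OpenAI

    client = OpenAI()                  # reads OPENAI_API_KEY from the environment

    # Hypothetical regulation snippets standing in for parsed gazette text
    docs = [
        "Circular 2024/11: reporting thresholds raised to EUR 100k.",
        "Gazette notice 57: KYC refresh cycle shortened to 12 months.",
    ]

    def embed(texts):
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([d.embedding for d in resp.data], dtype="float32")

    vecs = embed(docs)
    faiss.normalize_L2(vecs)           # cosine similarity via inner product
    index = faiss.IndexFlatIP(vecs.shape[1])
    index.add(vecs)

    def answer(question, k=2):
        q = embed([question])
        faiss.normalize_L2(q)
        _, ids = index.search(q, k)    # retrieve the k nearest snippets
        context = "\n".join(docs[i] for i in ids[0])
        chat = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Answer only from the given context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return chat.choices[0].message.content

    print(answer("What is the new reporting threshold?"))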

Requirements:

  • 4-6 years in ML/NLP, preferably in document-heavy domains (finance, legal, policy)
  • Languages: Python, SQL
  • AI/ML/NLP: Hugging Face Transformers, OpenAI API, spaCy, scikit-learn, LangChain, RAG, LLM prompt-tuning, LLM fine-tuning
  • Vector Search: Pinecone, Weaviate, FAISS
  • Data Engineering: Airflow, Kafka, OCR (Tesseract, pdfminer); a small ingestion sketch follows this list
  • MLOps: MLflow, Docker
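
As a concrete illustration of the OCR/ingestion stack above, here is a minimal sketch using pdfminer.six for text-based PDFs with a Tesseract fallback for scanned ones. The file name is hypothetical, and pdf2image and pytesseract are assumed helper packages not named in the posting.

    from pdfminer.high_level import extract_text

    # Hypothetical file name; extract_text works directly for text-based PDFs
    text = extract_text("gazette_2026_01.pdf")

    if not text.strip():
        # Scanned PDF: rasterize pages (pdf2image requires poppler installed)
        # and fall back to Tesseract OCR
        import pytesseract
        from pdf2image import convert_from_path

        pages = convert_from_path("gazette_2026_01.pdf")
        text = "\n".join(pytesseract.image_to_string(page) for page in pages)

    print(text[:500])                  # parsing and normalization would follow here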

Additional Information:

Job Posted:
January 01, 2026



Similar Jobs for AI/ML & Data Engineer

Software Engineer (Data Engineering)

We are seeking a Software Engineer (Data Engineering) who can seamlessly integra...
Location:
Hyderabad, India
Salary:
Not provided
NStarX
Expiration Date:
Until further notice
Requirements:
  • 4+ years in Data Engineering and AI/ML roles
  • Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field
  • Python, SQL, Bash, PySpark, Spark SQL, boto3, pandas
  • Apache Spark on EMR (driver/executor model, sizing, dynamic allocation); see the PySpark sketch after this list
  • Amazon S3 (Parquet) with lifecycle management to Glacier
  • AWS Glue Catalog and Crawlers
  • AWS Step Functions, AWS Lambda, Amazon EventBridge
  • CloudWatch Logs and Metrics, Kinesis Data Firehose (or Kafka/MSK)
  • Amazon Redshift and Redshift Spectrum
  • IAM (least privilege), Secrets Manager, SSM
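
The Spark-on-EMR and S3/Parquet items above describe a common pattern; below is a minimal PySpark sketch of it. The bucket paths, column names, and app name are hypothetical, and the S3 connector is assumed to be preconfigured as it is on EMR.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("events-daily").getOrCreate()

    # Hypothetical bucket and column names
    events = spark.read.parquet("s3://example-bucket/raw/events/")
    daily = (
        events
        .withColumn("day", F.to_date("event_ts"))
        .groupBy("day")
        .count()
    )
    daily.write.mode("overwrite").parquet("s3://example-bucket/curated/events_daily/")
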
Job Responsibility:
  • Design, build, and maintain scalable ETL and ELT pipelines for large-scale data processing
  • Develop and optimize data architectures supporting analytics and ML workflows
  • Ensure data integrity, security, and compliance with organizational and industry standards
  • Collaborate with DevOps teams to deploy and monitor data pipelines in production environments
  • Build predictive and prescriptive models leveraging AI and ML techniques
  • Develop and deploy machine learning and deep learning models using TensorFlow, PyTorch, or Scikit-learn
  • Perform feature engineering, statistical analysis, and data preprocessing
  • Continuously monitor and optimize models for accuracy and scalability
  • Integrate AI-driven insights into business processes and strategies
  • Serve as the technical liaison between NStarX and client teams
What we offer:
  • Competitive salary and performance-based incentives
  • Opportunity to work on cutting-edge AI and ML projects
  • Exposure to global clients and international project delivery
  • Continuous learning and professional development opportunities
  • Competitive base + commission
  • Fast growth into leadership roles
  • Full-time

Senior Data & AI/ML Engineer - GCP Specialization Lead

We are on a bold mission to create the best software services offering in the wo...
Location:
Menlo Park, United States
Salary:
Not provided
techjays
Expiration Date:
Until further notice
Requirements:
  • GCP Services: BigQuery, Dataflow, Pub/Sub, Vertex AI (see the BigQuery sketch after this list)
  • ML Engineering: End-to-end ML pipelines using Vertex AI / Kubeflow
  • Programming: Python & SQL
  • MLOps: CI/CD for ML, Model deployment & monitoring
  • Infrastructure-as-Code: Terraform
  • Data Engineering: ETL/ELT, real-time & batch pipelines
  • AI/ML Tools: TensorFlow, scikit-learn, XGBoost
  • Min Experience: 10+ Years
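
As one small, concrete slice of the GCP stack above, here is a minimal google-cloud-bigquery query sketch; the project, dataset, and table names are hypothetical, and credentials are assumed to come from Application Default Credentials.

    from google.cloud import bigquery

    client = bigquery.Client()         # uses Application Default Credentials

    sql = """
        SELECT user_id, COUNT(*) AS sessions
        FROM `example-project.analytics.sessions`   -- hypothetical table
        GROUP BY user_id
        ORDER BY sessions DESC
        LIMIT 10
    """
    for row in client.query(sql).result():
        print(row.user_id, row.sessions)
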
Job Responsibility:
  • Design and implement data architectures for real-time and batch pipelines, leveraging GCP services such as BigQuery, Dataflow, Dataproc, Pub/Sub, Vertex AI, and Cloud Storage
  • Lead the development of ML pipelines, from feature engineering to model training and deployment using Vertex AI, AI Platform, and Kubeflow Pipelines
  • Collaborate with data scientists to operationalize ML models and support MLOps practices using Cloud Functions, CI/CD, and Model Registry
  • Define and implement data governance, lineage, monitoring, and quality frameworks
  • Build and document GCP-native solutions and architectures that can be used for case studies and specialization submissions
  • Lead client-facing PoCs or MVPs to showcase AI/ML capabilities using GCP
  • Contribute to building repeatable solution accelerators in Data & AI/ML
  • Work with the leadership team to align with Google Cloud Partner Program metrics
  • Mentor engineers and data scientists toward achieving GCP certifications, especially in Data Engineering and Machine Learning
  • Organize and lead internal GCP AI/ML enablement sessions
What we offer:
  • Best in class packages
  • Paid holidays and flexible paid time away
  • Casual dress code & flexible working environment
  • Medical Insurance covering self & family up to 4 lakhs per person

Software Engineer - Data Scientist AI/ML

Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way...
Location:
Bangalore, India
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • BS/MS in Computer Science or Data Science, Electrical Engineering, Statistics, Applied Math or equivalent fields with strong mathematical background
  • General understanding of machine learning techniques and algorithms, including clustering, anomaly detection, optimization, neural networks, graph ML, etc.
  • Experience building data science-driven solutions including data collection, feature selection, model training, post-deployment validation
  • Strong hands-on coding skills (preferably in Python) processing large-scale data sets and developing machine learning models
  • Familiar with one or more machine learning or statistical modeling tools such as NumPy, scikit-learn, MLlib, TensorFlow (see the scikit-learn sketch after this list)
  • Works well in a team setting and is self-driven
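
The clustering and anomaly-detection techniques listed above are easy to show in miniature with scikit-learn; the synthetic data and parameter choices below are illustrative only.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))      # toy 2-D data standing in for real features

    clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    outliers = IsolationForest(random_state=0).fit_predict(X)  # -1 flags anomalies
    print(clusters[:10], outliers[:10])
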
Job Responsibility:
  • Collaborate with team to understand feature, work with domain experts to identify relevant “signals” during feature engineering, deliver generic and performant ML solutions
  • Keep up to date with newest technology trends
  • Communicate results and ideas to key decision makers
  • Implement new statistical or other mathematical methodologies as needed for specific models or analysis
  • Optimize joint development efforts through appropriate database use and project design
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Full-time

Senior Data Engineer

Adtalem is a data driven organization. The Data Engineering team builds data sol...
Location:
Lisle, United States
Salary:
84835.61 - 149076.17 USD / Year
Adtalem Global Education
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, Software Engineering, or another related technical field.
  • Master's degree in Computer Science, Computer Engineering, Software Engineering, or another related technical field.
  • Two (2)+ years of experience in Google Cloud with services like BigQuery, Composer, GCS, Datastream, Dataflow, BQML, Vertex AI.
  • Six (6)+ years of experience in data engineering solutions such as data platforms, ingestion, data management, or publication/analytics.
  • Hands-on experience working with real-time, unstructured, and synthetic data.
  • Experience in real-time data ingestion using GCP Pub/Sub, Kafka, Spark, or similar.
  • Expert knowledge of Python programming and SQL.
  • Experience with cloud platforms (AWS, GCP, Azure) and their data services.
  • Experience working with Airflow as a workflow management tool and building operators to connect, extract, and ingest data as needed (see the DAG sketch after this list).
  • Familiarity with synthetic data generation and unstructured data processing.
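
Since the list above calls out Airflow and custom operators, here is a minimal Airflow 2.x DAG sketch; the DAG id, schedule, and task body are hypothetical.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def ingest():
        # Placeholder for the real extract-and-load logic
        print("pulling source data")

    with DAG(
        dag_id="daily_ingest",          # hypothetical name
        start_date=datetime(2026, 1, 1),
        schedule="@daily",              # Airflow 2.4+ keyword
        catchup=False,
    ) as dag:
        PythonOperator(task_id="ingest", python_callable=ingest)
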
Job Responsibility:
  • Architect, develop, and optimize scalable data pipelines handling real-time, unstructured, and synthetic datasets
  • Collaborate with cross-functional teams, including data scientists, analysts, and product owners, to deliver innovative data solutions that drive business growth.
  • Design, develop, deploy and support high performance data pipelines both inbound and outbound.
  • Model data platform by applying the business logic and building objects in the semantic layer of the data platform.
  • Leverage streaming technologies and cloud platforms to enable real-time data processing and analytics
  • Optimize data pipelines for performance, scalability, and reliability.
  • Implement CI/CD pipelines to ensure continuous deployment and delivery of our data products.
  • Ensure the quality of critical data elements, prepare data quality remediation plans, and collaborate with business and system owners to fix quality issues at their root.
  • Document the design and support strategy of the data pipelines
  • Capture, store and socialize data lineage and operational metadata
What we offer:
  • Health, dental, vision, life and disability insurance
  • 401k Retirement Program + 6% employer match
  • Participation in Adtalem’s Flexible Time Off (FTO) Policy
  • 12 Paid Holidays
  • Eligible to participate in an annual incentive program
  • Full-time

Software Engineer, Data Infrastructure

The Data Infrastructure team at Figma builds and operates the foundational platf...
Location:
San Francisco or New York, United States
Salary:
149000.00 - 350000.00 USD / Year
Figma
Expiration Date:
Until further notice
Requirements:
  • 5+ years of Software Engineering experience, specifically in backend or infrastructure engineering
  • Experience designing and building distributed data infrastructure at scale
  • Strong expertise in batch and streaming data processing technologies such as Spark, Flink, Kafka, or Airflow/Dagster (see the Kafka sketch after this list)
  • A proven track record of impact-driven problem-solving in a fast-paced environment
  • A strong sense of engineering excellence, with a focus on high-quality, reliable, and performant systems
  • Excellent technical communication skills, with experience working across both technical and non-technical counterparts
  • Experience mentoring and supporting engineers, fostering a culture of learning and technical excellence
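
For the streaming technologies named above, here is a minimal kafka-python consumer sketch; the topic name and broker address are hypothetical.

    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "events",                              # hypothetical topic
        bootstrap_servers="localhost:9092",    # hypothetical broker
        auto_offset_reset="earliest",
        value_deserializer=lambda b: b.decode("utf-8"),
    )
    for msg in consumer:                       # blocks, yielding messages as they arrive
        print(msg.offset, msg.value)
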
Job Responsibility:
  • Design and build large-scale distributed data systems that power analytics, AI/ML, and business intelligence
  • Develop batch and streaming solutions to ensure data is reliable, efficient, and scalable across the company
  • Manage data ingestion, movement, and processing through core platforms like Snowflake, our ML Datalake, and real-time streaming systems
  • Improve data reliability, consistency, and performance, ensuring high-quality data for engineering, research, and business stakeholders
  • Collaborate with AI researchers, data scientists, product engineers, and business teams to understand data needs and build scalable solutions
  • Drive technical decisions and best practices for data ingestion, orchestration, processing, and storage
What we offer:
  • equity
  • health, dental & vision
  • retirement with company contribution
  • parental leave & reproductive or family planning support
  • mental health & wellness benefits
  • generous PTO
  • company recharge days
  • a learning & development stipend
  • a work from home stipend
  • cell phone reimbursement
  • Full-time

SAP BTP Data Engineer

At LeverX, we have had the privilege of working on over 950 SAP projects, includ...
Location:
Not provided
Salary:
Not provided
LeverX
Expiration Date:
Until further notice
Requirements:
  • 2+ years of experience designing and developing SAP data solutions within SAP and non-SAP enterprise landscapes
  • Strong knowledge of data modeling in Data Warehouses
  • Strong knowledge of the visualization patterns, approaches, and techniques in SAP and non-SAP landscapes
  • Understanding of Data Engineering solutions within the SAP BDC landscape, such as SAP Databricks
  • Proven experience in data transformation and integration with SAP ERP, S/4HANA, and equivalent external systems
  • Good understanding of SAP data integration techniques (SDI, SDA, APIs, ODP) and protocols (OData, REST, JDBC)
  • Bachelor’s degree in Computer Science, Information Systems, or equivalent
  • English B2+
Job Responsibility:
  • Design, develop, and deploy enterprise data solutions on SAP BTP, integrating SAP and non-SAP systems
  • Analyze and resolve complex data and integration challenges, ensuring reliable and scalable solutions
  • Collaborate with data architects, functional analysts, and business stakeholders to translate requirements into data models, dashboards, and analytics
  • Lead small project teams (2–3 members) and contribute to cross-regional collaboration for consistent delivery
  • Facilitate client enablement through workshops, webinars, and hands-on sessions
  • Continuously grow expertise by staying current with SAP data and analytics technologies (e.g., SAP Business Data Cloud, AI/ML) and pursuing relevant certifications
What we offer:
  • 89% of projects use the newest SAP technologies and frameworks
  • Expert communities and internal courses
  • Valuable perks to support your growth and well-being
  • Employment security: We hire for our team, not just a specific project. If your project ends, we will find you a new one
  • Healthy work atmosphere: On average, our employees stay in the company for 4+ years

Technical Lead – AI/ML & Data Platforms

We are seeking a Technical Lead with strong managerial capabilities to drive the...
Location:
Sunnyvale, United States
Salary:
Not provided
Thirdeye Data
Expiration Date:
Until further notice
Requirements:
  • Strong expertise in data pipelines, architecture, and analytics platforms (e.g., Snowflake, Tableau)
  • Experience reviewing and optimizing data transformations, aggregations, and business logic
  • Hands-on familiarity with LLMs and practical RAG implementations
  • Knowledge of AI/ML workflows, model lifecycle management, and experimentation frameworks
  • Proven experience in managing complex, multi-track projects
  • Skilled in project tracking and collaboration tools (Jira, Confluence, or equivalent)
  • Excellent communication and coordination skills with technical and non-technical stakeholders
  • Experience working with cross-functional, globally distributed teams
Job Responsibility:
  • Coordinate multiple workstreams simultaneously, ensuring timely delivery and adherence to quality standards
  • Facilitate daily stand-ups and syncs across global time zones, maintaining visibility and accountability
  • Understand business domains and technical architecture to enable informed decisions and proactive risk management
  • Collaborate with data engineers, AI/ML scientists, analysts, and product teams to translate business goals into actionable plans
  • Track project progress using Agile or hybrid methodologies, escalate blockers, and resolve dependencies
  • Own the task lifecycle, from planning through execution, delivery, and retrospectives
  • Perform technical reviews of data pipelines, ETL processes, and architecture, identifying quality or design gaps
  • Evaluate and optimize data aggregation logic while ensuring alignment with business semantics
  • Contribute to the design and development of RAG pipelines and workflows involving LLMs
  • Create and maintain Tableau dashboards and reports aligned with business KPIs for stakeholders
  • Full-time

Senior ML Data Engineer

As a Senior Data Engineer, you will play a pivotal role in our AI/ML workstream,...
Location:
Warsaw, Poland
Salary:
Not provided
Awin Global
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in data science, data engineering, or Computer Science with a focus on math and statistics; a Master's degree is preferred
  • At least 5 years of experience as an AI/ML data engineer undertaking the above tasks and accountabilities
  • Strong foundation in computer science principles and statistical methods
  • Strong experience with cloud technology (AWS or Azure)
  • Strong experience with creating data ingestion pipelines and ETL processes
  • Strong knowledge of big data tools such as Spark, Databricks, and Python
  • Strong understanding of common machine learning techniques and frameworks, e.g., MLflow (see the tracking sketch after this list)
  • Strong knowledge of natural language processing (NLP) concepts
  • Strong knowledge of Scrum practices and an agile mindset
  • Strong analytical and problem-solving skills with attention to data quality and accuracy
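
Since MLflow is the framework the list names, here is a minimal experiment-tracking sketch; the experiment name, parameter, and metric value are hypothetical.

    import mlflow

    mlflow.set_experiment("demo-churn-model")        # hypothetical experiment name
    with mlflow.start_run():
        mlflow.log_param("model", "logistic_regression")
        mlflow.log_metric("val_auc", 0.87)           # illustrative value
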
Job Responsibility:
  • Design and maintain scalable data pipelines and storage systems for both agentic and traditional ML workloads
  • Productionise LLM- and agent-based workflows, ensuring reliability, observability, and performance
  • Build and maintain feature stores, vector/embedding stores, and core data assets for ML
  • Develop and manage end-to-end traditional ML pipelines: data prep, training, validation, deployment, and monitoring
  • Implement data quality checks, drift detection, and automated retraining processes
  • Optimise cost, latency, and performance across all AI/ML infrastructure
  • Collaborate with data scientists and engineers to deliver production-ready ML and AI systems
  • Ensure AI/ML systems meet governance, security, and compliance requirements
  • Mentor teams and drive innovation across both agentic and classical ML engineering practices
  • Participate in team meetings and contribute to project planning and strategy discussions
What we offer:
  • Flexi-Week and Work-Life Balance: We prioritise your mental health and well-being, offering you a flexible four-day Flexi-Week at full pay and with no reduction to your annual holiday allowance. We also offer a variety of different paid special leaves as well as volunteer days
  • Remote Working Allowance: You will receive a monthly allowance to cover part of your running costs. In addition, we will support you in setting up your remote workspace appropriately
  • Pension: Awin offers access to an additional pension insurance to all employees in Germany
  • Flexi-Office: We offer an international culture and flexibility through our Flexi-Office and hybrid/remote work possibilities to work across Awin regions
  • Development: We’ve built our extensive training suite Awin Academy to cover a wide range of skills that nurture you professionally and personally, with trainings conveniently packaged together to support your overall development
  • Appreciation: Thank and reward colleagues by sending them a voucher through our peer-to-peer program