Data Engineering Lead
- Lahore, Multan, Karachi, Islamabad
- WFH Flexible
- Information Technology
We are seeking a Data Engineering Lead with 8+ years of hands-on experience and a strong background in real-time and batch data processing, containerization, and cloud-based data orchestration. This role is ideal for someone who is passionate about building robust, scalable, and efficient data pipelines and thrives in agile, collaborative environments.
Key Responsibilities
- Design, build, and maintain real-time data pipelines using streaming frameworks such as Apache Kafka, Apache Flink, and Spark Structured Streaming
- Develop batch processing workflows with Apache Spark (PySpark)
- Orchestrate and schedule data workflows using frameworks such as Apache Airflow and Azure Data Factory
- Containerize applications using Docker, manage deployments with Helm, and run them on Kubernetes
- Implement modern storage solutions using open formats such as Parquet, Delta Lake, and Apache Iceberg
- Build high-performance analytics engines using tools like Trino or Presto
- Collaborate with DevOps to manage infrastructure with Terraform and integrate with CI/CD pipelines via Azure DevOps
- Ensure data quality and consistency using tools like Great Expectations
- Write modular, well-tested, and maintainable Python and SQL code
- Develop an observability layer to monitor and optimize performance across data pipelines
- Participate in agile ceremonies and contribute to sprint planning and reviews
Required Skills & Experience
- Advanced Python programming with a strong focus on modular and testable code
- Strong knowledge of SQL and experience working with large-scale datasets
- Hands-on experience with at least one major cloud platform (Azure preferred)
- Solid experience with real-time data processing (Kafka, Flink, or Spark Streaming)
- Expertise in Apache Spark (PySpark) for batch processing
- Experience implementing lakehouse architectures and working with columnar storage (e.g., ClickHouse)
- Proficient in using Azure Data Factory or Apache Airflow for data orchestration
- Experience building APIs to expose large datasets
- Solid experience with Docker, Kubernetes, and Helm
- Familiarity with data lake open formats such as Parquet, Delta Lake, and Iceberg
- Basic experience with Terraform for infrastructure provisioning
- Practical experience with data quality frameworks (e.g., Great Expectations)
- Comfortable working in agile development teams
- Proven ability in debugging and performance tuning of streaming and batch data jobs
- Experience with AI-driven tools (e.g., text-to-SQL) is a plus
We have an amazing team of 700+ individuals working on highly innovative enterprise projects and products. Our customer base includes Fortune 100 retail and CPG companies, leading store chains, fast-growing fintechs, and multiple Silicon Valley startups.
What makes Confiz stand out is our focus on processes and culture. Confiz is ISO 9001:2015 (QMS), ISO 27001:2022 (ISMS), ISO 20000-1:2018 (ITSM), and ISO 14001:2015 (EMS) certified. We have a vibrant culture of learning through collaboration and making the workplace fun.
People who work with us get to use cutting-edge technologies while contributing to the company's success as well as their own.
To know more about Confiz Limited, visit: https://www.linkedin.com/company/confiz-pakistan/