Data Engineering Lead

  • Lahore, Multan, Karachi, Islamabad
  • WFH Flexible
  • Information Technology

We are seeking a Data Engineering Lead with 8+ years of hands-on experience and a strong background in real-time and batch data processing, containerization, and cloud-based data orchestration. This role is ideal for someone who is passionate about building robust, scalable, and efficient data pipelines and who thrives in agile, collaborative environments.

Key Responsibilities

  • Design, build, and maintain real-time data pipelines using streaming frameworks such as Kafka, Apache Flink, and Spark Structured Streaming (see the sketch after this list)
  • Develop batch processing workflows with Apache Spark (PySpark)
  • Orchestrate and schedule data workflows with frameworks such as Apache Airflow and Azure Data Factory (a minimal DAG sketch also follows this list)
  • Containerize applications using Docker, manage deployments with Helm, and run them on Kubernetes
  • Implement modern storage solutions using open formats such as Parquet, Delta Lake, and Apache Iceberg
  • Build high-performance analytics engines using tools like Trino or Presto
  • Collaborate with DevOps to manage infrastructure with Terraform and integrate with CI/CD pipelines via Azure DevOps
  • Ensure data quality and consistency using tools like Great Expectations
  • Write modular, well-tested, and maintainable Python and SQL code
  • Develop an observability layer to monitor and optimize performance across data pipelines
  • Participate in agile ceremonies and contribute to sprint planning and reviews
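
For a sense of what this work looks like in practice, here is a minimal sketch of such a real-time pipeline: PySpark Structured Streaming reading JSON events from Kafka and appending them to a Delta Lake table. The broker address, topic name, schema, and storage paths are illustrative assumptions, not details of our stack.

```python
# Minimal sketch only: Kafka -> Spark Structured Streaming -> Delta Lake.
# Broker, topic, schema, and paths are assumed for illustration.
# Requires the spark-sql-kafka connector and the delta-spark package.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructType, TimestampType

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Assumed event schema for the example.
schema = (StructType()
          .add("order_id", StringType())
          .add("status", StringType())
          .add("updated_at", TimestampType()))

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
          .option("subscribe", "orders")                     # assumed topic
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Append each micro-batch to a Delta table, tracking progress in a checkpoint.
(events.writeStream
       .format("delta")
       .outputMode("append")
       .option("checkpointLocation", "/checkpoints/orders")  # assumed path
       .start("/lake/orders")                                # assumed path
       .awaitTermination())
```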
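
Orchestration, likewise, means Python-defined DAGs. Below is a minimal Airflow sketch, assuming Airflow 2.x; the DAG id, schedule, and task bodies are placeholders rather than a real workflow.

```python
# Minimal sketch of an Airflow 2.x DAG with two dependent tasks.
# DAG id, schedule, and callables are placeholders for illustration.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data")      # placeholder task body

def transform():
    print("run PySpark job")    # placeholder task body

with DAG(
    dag_id="daily_sales_batch",      # assumed name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # transform runs only after extract succeeds
    extract_task >> transform_task
```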

Required Skills & Experience

  • Advanced Python programming with a strong focus on modular and testable code
  • Strong knowledge of SQL and experience working with large-scale datasets
  • Hands-on experience with at least one major cloud platform (Azure preferred)
  • Solid experience with real-time data processing (Kafka, Flink, or Spark Streaming)
  • Expertise in Apache Spark (PySpark) for batch processing
  • Experience implementing lakehouse architectures and working with columnar storage (e.g., ClickHouse)
  • Proficient in using Azure Data Factory or Apache Airflow for data orchestration
  • Experience in building APIs to expose large datasets
  • Solid experience with Docker, Kubernetes, and Helm
  • Familiarity with data lake open formats such as Parquet, Delta Lake, and Iceberg
  • Basic experience with Terraform for infrastructure provisioning
  • Practical experience with data quality frameworks such as Great Expectations (see the sketch after this list)
  • Comfortable working in agile development teams
  • Proven ability in debugging and performance tuning of streaming and batch data jobs
  • Experience with AI-driven tools (e.g., text-to-SQL) is a plus
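
To make the data quality expectation concrete, here is a minimal Great Expectations check, written against the classic 0.x pandas API (the 1.x API differs); the sample data, column names, and threshold are assumptions for the example.

```python
# Minimal sketch of a Great Expectations check (classic 0.x pandas API).
# Sample data, column names, and the threshold are assumed for illustration.
import pandas as pd
import great_expectations as ge

df = ge.from_pandas(pd.DataFrame({
    "order_id": ["A1", "A2", None],
    "amount": [10.0, 25.5, -3.0],
}))

# Each expectation returns a validation result with a `success` flag.
not_null = df.expect_column_values_to_not_be_null("order_id")
in_range = df.expect_column_values_to_be_between("amount", min_value=0)

print(not_null.success, in_range.success)  # False, False for this sample
```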

We have an amazing team of 700+ individuals working on highly innovative enterprise projects and products. Our customer base includes Fortune 100 retail and CPG companies, leading store chains, fast-growing fintechs, and multiple Silicon Valley startups.

What makes Confiz stand out is our focus on processes and culture. Confiz is ISO 9001:2015 (QMS), ISO 27001:2022 (ISMS), ISO 20000-1:2018 (ITSM), and ISO 14001:2015 (EMS) certified. We have a vibrant culture of learning through collaboration and of making the workplace fun.

People who work with us use cutting-edge technologies while contributing to the company's success as well as their own.

To know more about Confiz Limited, visit: https://www.linkedin.com/company/confiz-pakistan/