Data Quality & Test Automation Framework Engineer

  • Lahore, Multan, Karachi, Islamabad
  • WFH Flexible
  • Delivery

We are seeking an experienced engineer to design and implement a Quality Assurance (QA) framework for automated testing across our cloud-native, real-time, and batch data pipelines. The ideal candidate will have deep expertise in Python-based test frameworks, data validation at scale, and CI/CD integration, ensuring reliability, accuracy, and performance in our data platform.

Key Responsibilities

  • Design and develop a scalable automated data quality and testing framework for real-time (Kafka/Flink/Spark Streaming) and batch (PySpark) data pipelines.
  • Integrate automated data validation (schema checks, statistical profiling, anomaly detection) into data workflows using tools like Great Expectations, Deequ, or custom-built Python solutions.
  • Build test harnesses and reusable libraries for unit, integration, and regression testing of ETL pipelines and APIs.
  • Implement continuous testing pipelines integrated with CI/CD tools (Azure DevOps, GitHub Actions, or Jenkins).
  • Collaborate with data engineers to embed data quality gates within orchestration tools (Azure Data Factory / Airflow).
  • Define and monitor data quality KPIs, thresholds, and alerting using observability tools (Datadog, Prometheus, etc.).
  • Develop mock data generation and simulation frameworks for performance and stress testing.
  • Ensure comprehensive test coverage for schema evolution, transformation logic, and downstream data consumption APIs.
  • Contribute to best practices and documentation around test-driven data engineering and a quality-first development culture.

Required Skills

  • Strong knowledge of data warehousing, lakehouse architectures, and data lakes.
  • 6+ years of overall experience in the data warehousing domain and 3+ years in ETL testing and data quality.
  • Strong Python development skills, emphasizing modular, testable, and reusable code.
  • Hands-on knowledge of test automation tools and libraries for data warehouse solutions.
  • Expertise with data quality libraries such as Great Expectations, Deequ, Soda Core, or equivalent.
  • Solid understanding of SQL and ability to validate transformations over large-scale datasets.
  • Familiarity with Apache Spark (PySpark) and real-time data processing frameworks (Kafka, Flink, Spark Streaming).
  • Experience with Azure ecosystem (Data Factory, Databricks, Storage, Synapse).
  • Experience with CI/CD pipelines and automated testing workflows.
  • Exposure to monitoring and alerting for data quality metrics.

We have an amazing team of 700+ individuals working on highly innovative enterprise projects and products. Our customer base includes Fortune 100 retail and CPG companies, leading store chains, fast-growing fintech companies, and multiple Silicon Valley startups.

What makes Confiz stand out is our focus on processes and culture. Confiz is certified to ISO 9001:2015 (QMS), ISO 27001:2022 (ISMS), ISO 20000-1:2018 (ITSM), ISO 14001:2015 (EMS), and ISO 45001:2018 (OHSMS). We have a vibrant culture of learning through collaboration and making the workplace fun.

People who work with us use cutting-edge technologies while contributing to the company's success as well as their own.

To learn more about Confiz Limited, visit: https://www.linkedin.com/company/confiz-pakistan/