Data Quality & Test Automation Framework Engineer
- Lahore, Multan, Karachi, Islamabad
- WFH Flexible
- Delivery
We are seeking an experienced engineer to design and implement a Quality Assurance (QA) framework for automated testing across our cloud-native, real-time, and batch data pipelines. The ideal candidate will have deep expertise in Python-based test frameworks, data validation at scale, and CI/CD integration, ensuring reliability, accuracy, and performance in our data platform.
Key Responsibilities
- Design and develop a scalable automated data quality and testing framework for real-time (Kafka/Flink/Spark Streaming) and batch (PySpark) data pipelines.
- Integrate automated data validation (schema checks, statistical profiling, anomaly detection) into data workflows using tools like Great Expectations, Deequ, or custom-built Python solutions.
- Build test harnesses and reusable libraries for unit, integration, and regression testing of ETL pipelines and APIs.
- Implement continuous testing pipelines integrated with CI/CD tools (Azure DevOps, GitHub Actions, or Jenkins).
- Collaborate with data engineers to embed data quality gates within orchestration tools (Azure Data Factory / Airflow).
- Define and monitor data quality KPIs, thresholds, and alerting using observability tools (Datadog, Prometheus, etc.).
- Develop mock data generation and simulation frameworks for performance and stress testing.
- Ensure comprehensive test coverage for schema evolution, transformation logic, and downstream data consumption APIs.
- Contribute to best practices and documentation around test-driven data engineering and quality-first development culture.
Required Skills
- Strong knowledge of data warehousing, lakehouse architectures, and data lakes.
- 6+ years of overall experience in the data warehousing domain and 3+ years in ETL testing and data quality.
- Strong Python development skills, emphasizing modular, testable, and reusable code.
- Hands-on knowledge of test automation tools and libraries for data warehouse solutions.
- Expertise with data quality libraries such as Great Expectations, Deequ, Soda Core, or equivalent.
- Solid understanding of SQL and ability to validate transformations over large-scale datasets.
- Familiarity with Apache Spark (PySpark) and real-time data processing frameworks (Kafka, Flink, Spark Streaming).
- Experience with the Azure ecosystem (Data Factory, Databricks, Storage, Synapse).
- Experience with CI/CD pipelines and automated testing workflows.
- Exposure to monitoring and alerting for data quality metrics.
We have an amazing team of 700+ individuals working on highly innovative enterprise projects & products. Our customer base includes Fortune 100 retail and CPG companies, leading store chains, fast-growth fintech, and multiple Silicon Valley startups.
What makes Confiz stand out is our focus on processes and culture. Confiz is ISO 9001:2015 (QMS), ISO 27001:2022 (ISMS), ISO 20000-1:2018 (ITSM), ISO 14001:2015 (EMS), and ISO 45001:2018 (OHSMS) certified. We have a vibrant culture of learning through collaboration and making the workplace fun.
People who work with us use cutting-edge technologies while contributing to the company's success as well as their own.
To learn more about Confiz Limited, visit: https://www.linkedin.com/company/confiz-pakistan/
