QA Engineer - Data (m/f/d)
Halian
- Dubai
- Contract
- Full-time
- Develop and execute test strategies, plans, cases, and automated scripts for Azure data pipelines (ETL/ELT), data lakes, warehouses, and analytics layers.
- Validate end-to-end data flows: source ingestion → bronze/silver/gold layers (Medallion architecture) → downstream BI/ML models, ensuring accuracy, completeness, consistency, timeliness, and validity.
- Write complex SQL queries and Python scripts to perform data reconciliation, profiling, anomaly detection, duplicate checks, schema enforcement, and business rule validation.
- Build and maintain automated data quality tests using frameworks like Great Expectations, dbt tests, custom scripts, or Azure-native capabilities; integrate into CI/CD pipelines (e.g., Azure DevOps).
- Test Azure services including Azure Data Factory (pipelines/orchestration), Azure Databricks (Spark transformations), Azure Synapse Analytics (SQL pools, Spark pools), Azure Data Lake Storage Gen2 (ADLS), and Delta Lake, as well as integrations such as Azure Health Data Services or FHIR APIs.
- Perform regression, integration, and performance testing on data transformations after code changes, new sources (e.g., HL7/FHIR feeds, wearables, claims data), or schema updates.
- Monitor production pipelines for data drift, quality degradation, or anomalies using Azure Monitor, alerts, and logging; investigate root causes and collaborate on fixes.
- Ensure strict compliance with HIPAA, GDPR, HITECH, and other regulations: validate PHI handling, encryption (at rest/in transit), and de-identification.
- Collaborate with data engineers, analysts, and clinical/product teams to translate healthcare business rules (e.g., ICD-10/CPT validation, patient matching) into testable assertions and quality thresholds.
- Document defects, test coverage, data lineage issues, and quality KPIs; contribute to dashboards in Azure Synapse or Power BI.
- Participate in Agile processes (sprints, stand-ups) and advocate for “shift-left” quality in Azure data workflows.