Enterprise Cloud Data Quality & Governance Training
The Data Quality Engineering & Governance Training Course is a professional training program that builds strong skills in data quality management, data governance, and enterprise data engineering. You will learn how to improve, clean, and control data across cloud platforms such as AWS, Microsoft Azure, and Google Cloud.
It covers essential topics including data profiling, data cleansing, data transformation, data governance frameworks, and data observability using modern industry tools.
The course also focuses on banking and financial compliance standards such as BCBS 239, KYC/AML, Basel III/IV, and IFRS 9, helping learners understand how data quality impacts real financial systems and regulatory reporting.
By the end of this training, you will be able to work confidently in roles like Data Engineer, Data Quality Analyst, and Data Governance Specialist, with practical experience in cloud-based data quality systems.
Course Content:
Module 1: Why Data Quality Matters
- The business case across industries.
- Covers the definition of data quality, the six dimensions (accuracy, completeness, consistency, timeliness, validity, uniqueness), the real-world cost of bad data, and the difference between data quality and data governance.
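To give a feel for how the dimensions in this module turn into numbers, here is a minimal Python sketch that scores four of them (completeness, uniqueness, validity, timeliness) on a tiny dataset. The sample records, field names, email rule, and 30-day freshness window are all illustrative assumptions, not course material or an industry standard.

```python
from datetime import date

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "email": "ana@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None,              "updated": date(2024, 5, 2)},
    {"id": 2, "email": "bob@example",     "updated": date(2023, 1, 15)},
    {"id": 4, "email": "cy@example.com",  "updated": date(2024, 4, 28)},
]

def completeness(rows, field):
    """Share of rows where the field is populated."""
    return sum(r[field] is not None for r in rows) / len(rows)

def uniqueness(rows, field):
    """Ratio of distinct field values to total rows."""
    return len({r[field] for r in rows}) / len(rows)

def validity(rows, field, rule):
    """Share of populated values that satisfy a validation rule."""
    values = [r[field] for r in rows if r[field] is not None]
    return sum(rule(v) for v in values) / len(values)

def timeliness(rows, field, as_of, max_age_days):
    """Share of rows refreshed within the allowed age."""
    return sum((as_of - r[field]).days <= max_age_days for r in rows) / len(rows)

def looks_like_email(value):
    # Deliberately crude rule for illustration, not a full email validator.
    local, _, domain = value.partition("@")
    return bool(local) and "." in domain

scores = {
    "completeness(email)": completeness(records, "email"),
    "uniqueness(id)": uniqueness(records, "id"),
    "validity(email)": validity(records, "email", looks_like_email),
    "timeliness(updated)": timeliness(records, "updated", date(2024, 5, 3), 30),
}

for name, value in scores.items():
    print(f"{name}: {value:.0%}")
```

Accuracy and consistency are left out on purpose: they usually need a trusted reference source or a second system to compare against, which a single table cannot provide.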
Module 2: Data Quality in Corporate Banking
- Regulations, risk, and real examples.
- Covers BCBS 239, KYC/AML compliance, customer master data risks, Basel III/IV and IFRS 9 reporting requirements, and data lineage in banking.
- A case study on the JPMorgan London Whale incident.
Module 3: Cloud Platforms for Data Quality
- AWS, Azure, and GCP compared.
- Covers cloud vs on-premises trade-offs, AWS Glue DataBrew, Microsoft Purview (formerly Azure Purview), Google Dataplex, and SaaS tools like Informatica and Collibra.
- A hands-on lab profiling a sample dataset.
Module 4: Data Profiling and Discovery
- Understanding what you have before fixing it.
- Covers column statistics, null rates, value distributions, duplicate detection, metadata management, and data catalogues.
- A practical lab on a realistic messy customer dataset.
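The profiling topics above (null rates, value distributions, duplicate detection) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the customer rows, the column names, and the set of markers treated as "missing" are hypothetical, and the course labs may use different tools.

```python
from collections import Counter

# Hypothetical messy customer rows; column names are illustrative only.
rows = [
    {"customer_id": "C001", "country": "SG", "age": "34"},
    {"customer_id": "C002", "country": "sg", "age": ""},
    {"customer_id": "C002", "country": "SG", "age": "41"},
    {"customer_id": "C003", "country": None, "age": "n/a"},
]

MISSING = {None, "", "n/a"}  # assumed markers for a missing value

def profile_column(rows, column):
    """Return null rate, distinct count, and most common values for one column."""
    values = [r.get(column) for r in rows]
    present = [v for v in values if v not in MISSING]
    return {
        "null_rate": 1 - len(present) / len(values),
        "distinct": len(set(present)),
        "top_values": Counter(present).most_common(2),
    }

def duplicate_keys(rows, key):
    """List key values that appear on more than one row."""
    counts = Counter(r[key] for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)

profile = {col: profile_column(rows, col) for col in rows[0]}
dupes = duplicate_keys(rows, "customer_id")

for col, stats in profile.items():
    print(col, stats)
print("duplicate customer_id values:", dupes)
```

Even on four rows the profile surfaces typical findings: inconsistent casing in `country` ("SG" vs "sg"), placeholder strings hiding missing ages, and a repeated customer key.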
Module 5: Data Cleansing and Transformation
- Fixing data the right way, at scale.
- Covers standardisation, deduplication with fuzzy matching, handling missing values, validation rules, and ETL vs ELT pipelines.
- A lab writing quality rules in AWS Glue or Google Dataprep.
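As a rough preview of the cleansing steps named above, the sketch below standardises names and cities, fills a missing value with a default, and removes near-duplicates using fuzzy string similarity from Python's standard `difflib` module. The records, the `UNKNOWN` default, and the 0.85 similarity threshold are assumptions chosen for illustration. It does not represent how AWS Glue or Google Dataprep implement these rules.

```python
from difflib import SequenceMatcher

# Hypothetical customer records; names and fields are illustrative only.
customers = [
    {"name": "  Jon  Smith ", "city": "london"},
    {"name": "John Smith",    "city": "London"},
    {"name": "Maria Garcia",  "city": None},
]

def standardise(record, default_city="UNKNOWN"):
    """Trim and collapse whitespace, normalise case, fill a missing city."""
    name = " ".join(record["name"].split()).title()
    city = record["city"].strip().title() if record["city"] else default_city
    return {"name": name, "city": city}

def is_fuzzy_match(a, b, threshold=0.85):
    """Treat two records as duplicates if names are similar and cities match."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold and a["city"] == b["city"]

def deduplicate(records, threshold=0.85):
    """Keep the first record of each fuzzy-matched group (pairwise scan)."""
    kept = []
    for rec in records:
        if not any(is_fuzzy_match(rec, k, threshold) for k in kept):
            kept.append(rec)
    return kept

cleaned = deduplicate([standardise(c) for c in customers])
for row in cleaned:
    print(row)
```

The pairwise scan is quadratic in the number of records, so it only suits small samples. At scale, matching is normally restricted to candidate groups (for example, records sharing a postcode) before any fuzzy comparison runs.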
Module 6: Data Governance and Stewardship
- Policies, roles, and accountability.
- Covers the DAMA-DMBOK framework, data owner/steward/custodian roles, building a data quality policy, data lineage, GDPR and PDPA implications, and setting up a governance committee.
Module 7: Monitoring and Observability
- Catching issues before they cause damage.
- Covers data quality KPIs, dashboards in Power BI/Looker/QuickSight, automated checks using Great Expectations and dbt tests, alerting workflows, and tools like Monte Carlo and Soda Core.
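The idea behind automated checks and alerting can be shown without any specific tool: compute a KPI for each incoming batch, compare it with a threshold, and raise an alert when it falls short. The sketch below is plain Python under stated assumptions. The `Check` class, the helper functions, the sample payment batch, and the thresholds are all hypothetical, and none of this is the API of Great Expectations, dbt, Monte Carlo, or Soda Core.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str
    metric: Callable[[list], float]  # computes a KPI between 0 and 1
    minimum: float                   # alert when the KPI falls below this

def non_null_rate(column):
    """Build a metric: share of rows where the column is populated."""
    def metric(rows):
        return sum(r.get(column) is not None for r in rows) / len(rows)
    return metric

def positive_rate(column):
    """Build a metric: share of rows where the column is greater than zero."""
    def metric(rows):
        return sum((r.get(column) or 0) > 0 for r in rows) / len(rows)
    return metric

def run_checks(rows, checks):
    """Evaluate every check and return (results, alerts)."""
    results = {c.name: c.metric(rows) for c in checks}
    alerts = [
        f"ALERT {c.name}: {results[c.name]:.0%} below {c.minimum:.0%}"
        for c in checks if results[c.name] < c.minimum
    ]
    return results, alerts

# Hypothetical batch of payment rows and thresholds.
batch = [
    {"account": "A1", "amount": 120.0},
    {"account": None, "amount": 75.5},
    {"account": "A3", "amount": -10.0},
    {"account": "A4", "amount": 42.0},
]
checks = [
    Check("account_not_null", non_null_rate("account"), minimum=0.95),
    Check("amount_positive", positive_rate("amount"), minimum=0.70),
]

results, alerts = run_checks(batch, checks)
for line in alerts:
    print(line)
```

In a real pipeline the `results` dictionary would feed a KPI dashboard, and the `alerts` list would be routed to whatever incident channel the team uses.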
Module 8: AI and ML Data Quality
- Garbage in, garbage out at model scale.
- Covers how defects amplify in ML, training data bias, feature store quality, data versioning, generative AI hallucination risks from poor source data, and EU AI Act data quality obligations.
Module 9: Industry
- Applied scenarios across sectors.
- Covers corporate banking (SWIFT messaging, nostro reconciliation), retail (product catalogues, customer 360), healthcare (patient matching, HL7 FHIR), government (open data standards), and building a cross-industry data quality scorecard.
Module 10: Capstone Project
- Learners choose a real-world scenario (retail bank, e-commerce, or public sector) and deliver a profiling report, cleansing plan, governance policy document, and monitoring dashboard mockup. Assessed through peer review and instructor evaluation.
Prerequisites:
- Basic computer literacy
- Familiarity with spreadsheets (Excel or Google Sheets)
- No prior cloud, coding, or data experience needed
- Open to all industries and backgrounds
Learning Outcomes:
- Define and distinguish the six dimensions of data quality and explain their business impact across corporate and everyday contexts.
- Identify data quality requirements tied to key regulations including BCBS 239, GDPR, Basel III/IV, and the EU AI Act.
- Select and configure cloud-native data quality tools on at least one major platform (AWS, Azure, or GCP) for profiling, cleansing, and monitoring.
- Design and implement a data cleansing workflow addressing duplication, missing values, format errors, and referential integrity issues.
- Construct a data governance framework with defined roles, ownership policies, and data lineage documentation suited to their organisation.
- Build and interpret a data quality monitoring system with automated checks, KPI dashboards, and incident response workflows.
- Evaluate data quality risks in AI and ML pipelines and propose controls to prevent model degradation from upstream defects.
International Student Fee: 1000 USD
Flexible Class Options
- Corporate Group Training | Fast-Track
- Weekend Classes for Professionals SAT | SUN
- Online Classes – Live Virtual Class (L.V.C), Online Training
