Data Quality Validation
Use when designing data quality checks, validating pipeline outputs, setting up schema validation, or using Dataform/Dataplex/Cloud DQ. Covers GCP-PDE domain: Prepare and use data for analysis (~10-15%).
kienbui19950, 2026/04/08
When to Use
- Designing data quality checks for a pipeline
- Schema validation after ingestion
- Setting up monitoring for data freshness and completeness
- Preparing for the GCP Professional Data Engineer exam
Core Jobs
1. Data Quality Dimensions
- Completeness — no unexpected NULLs; all required fields populated
- Accuracy — values within expected ranges, valid formats
- Consistency — referential integrity, no duplicates, cross-table agreement
- Freshness — data arrived within expected SLA (lag monitoring)
- Uniqueness — no duplicate records on primary key
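The five dimensions above can each be expressed as a count of violations over a batch of rows. The following is a minimal sketch in plain Python, not the API of Dataform, Dataplex, or CloudDQ. The `orders` rows, the field names (`order_id`, `customer_id`, `amount`, `loaded_at`), the non-negative amount rule, and the one-hour freshness SLA are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical sample data; table and field names are illustrative only.
CUSTOMER_IDS = {"c1", "c2"}
NOW = datetime(2026, 4, 8, 12, 0, tzinfo=timezone.utc)
ORDERS = [
    {"order_id": 1, "customer_id": "c1", "amount": 25.0,
     "loaded_at": NOW - timedelta(minutes=10)},
    {"order_id": 2, "customer_id": None, "amount": 40.0,
     "loaded_at": NOW - timedelta(minutes=20)},
    {"order_id": 2, "customer_id": "c9", "amount": -5.0,
     "loaded_at": NOW - timedelta(hours=3)},
]


def run_checks(rows, customer_ids, now, max_lag=timedelta(hours=1)):
    """Return the number of violations per data quality dimension.

    Assumes every row has a non-NULL `loaded_at` timestamp.
    """
    required = ("order_id", "customer_id", "amount", "loaded_at")
    key_counts = Counter(r["order_id"] for r in rows)
    newest = max(r["loaded_at"] for r in rows)
    return {
        # Completeness: rows with a NULL in any required field.
        "completeness": sum(
            any(r.get(f) is None for f in required) for r in rows
        ),
        # Accuracy: amounts outside the expected range (assumed: >= 0).
        "accuracy": sum(
            r["amount"] is not None and r["amount"] < 0 for r in rows
        ),
        # Consistency: non-NULL foreign keys with no matching parent row.
        "consistency": sum(
            r["customer_id"] is not None
            and r["customer_id"] not in customer_ids
            for r in rows
        ),
        # Freshness: 1 if the newest row is older than the SLA, else 0.
        "freshness": int(now - newest > max_lag),
        # Uniqueness: surplus rows that share a primary key value.
        "uniqueness": sum(c - 1 for c in key_counts.values() if c > 1),
    }


report = run_checks(ORDERS, CUSTOMER_IDS, NOW)
print(report)
```

In a real pipeline each count would typically be compared against a threshold (often zero) and the run failed or an alert raised when it is exceeded. The same checks translate directly into SQL aggregates over the target table.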