Data Warehouse (part 7): ETL design based on ETL architecture, schema architecture, SCD & DW components

Hang Nguyen
3 min read · Jun 3, 2022

In this part, we combine the knowledge from the previous parts into an ETL design. Two general guidelines apply throughout:

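  • Process dimension tables before fact tables, because fact rows reference the keys of dimension rows that must already exist
  • Look for opportunities for parallel processing, for example loading dimension tables that do not depend on each other at the same time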
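The two guidelines above can be sketched as a small orchestration function. This is only a minimal illustration, not a full scheduler: `run_etl` and the loader callables passed to it are hypothetical names, and it assumes the dimension loaders are independent of each other.

```python
from concurrent.futures import ThreadPoolExecutor


def run_etl(dimension_loaders, fact_loaders, max_workers=4):
    """Run every dimension loader, then every fact loader.

    Each loader is a zero-argument callable (hypothetical; supplied by the caller).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Independent dimension tables can be loaded concurrently.
        dim_results = list(pool.map(lambda load: load(), dimension_loaders))
        # list(...) above waits for all dimension loads to finish, so fact
        # loading only starts once every dimension is in place.
        fact_results = list(pool.map(lambda load: load(), fact_loaders))
    return dim_results, fact_results
```

Threads suit this sketch because ETL loaders usually spend most of their time waiting on the database; CPU-heavy transformations may need a process pool instead.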

ETL Design for Dimension Table

Incremental ETL for a dimension table:

Step 1: Data preparation

Identify the new and changed source rows using "Change Data Capture" (CDC) techniques:

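  • Transactional data timestamps (compare each row's timestamp with the time of the last ETL run)
  • Database logs (read the changes from the database's transaction log)
  • Last resort: database scan-and-compare (compare every source row with the copy already in the warehouse)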
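As a rough sketch of the first and last techniques, assuming rows are plain dictionaries: the function names, the `updated_at` column and the `id` key are all illustrative assumptions, not part of any particular database. Log-based CDC depends on the database product, so it is not shown here.

```python
from datetime import datetime


def changed_since(source_rows, last_run):
    """Timestamp-based CDC: keep rows modified after the previous ETL run."""
    return [row for row in source_rows if row["updated_at"] > last_run]


def scan_and_compare(source_rows, warehouse_rows, key="id"):
    """Last-resort CDC: compare every source row with the warehouse copy."""
    current = {row[key]: row for row in warehouse_rows}
    # Rows whose key is unknown to the warehouse are new.
    inserts = [r for r in source_rows if r[key] not in current]
    # Rows whose key exists but whose content differs have changed.
    updates = [r for r in source_rows
               if r[key] in current and r != current[r[key]]]
    return inserts, updates
```

For example, `changed_since(rows, datetime(2022, 6, 2))` would keep only the rows whose `updated_at` is later than June 2, 2022. The scan-and-compare version reads both tables in full, which is why it is the last resort.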

Step 2: Data transformation

Common transformations include data value unification, data type and size unification, de-duplication, dropping columns (vertical slicing), value-based row filtering (horizontal slicing), and correcting known errors.
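The transformations listed above could look like this for a customer dimension. This is a minimal sketch under stated assumptions: the column names, the gender mapping, the 50-character limit and the "known error" entry are all made up for illustration.

```python
# Hypothetical lookup tables, for illustration only.
GENDER_MAP = {"m": "M", "male": "M", "f": "F", "female": "F"}
KNOWN_FIXES = {"Untied States": "United States"}


def transform(rows):
    """Apply the common transformations to a list of dict rows."""
    seen, result = set(), []
    for row in rows:
        if row["status"] != "active":        # horizontal slicing: filter rows by value
            continue
        customer_id = int(row["customer_id"])  # data type unification
        if customer_id in seen:              # de-duplication on the business key
            continue
        seen.add(customer_id)
        result.append({
            "customer_id": customer_id,
            "name": row["name"].strip()[:50],  # data size unification
            # data value unification: map source spellings to one code
            "gender": GENDER_MAP.get(row["gender"].strip().lower(), "U"),
            # correcting known errors
            "country": KNOWN_FIXES.get(row["country"], row["country"]),
            # vertical slicing: other columns (e.g. "internal_note") are not copied
        })
    return result
```

Here de-duplication keeps the first occurrence of each key; a real pipeline might instead keep the most recent row.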
