Data Warehouse Modeling: An In-Depth Look at the Data Vault Methodology

Data Vault is a data modeling approach used to build data warehouses for enterprise-scale analytics. It has three types of entities: hubs, links, and satellites.

Hubs represent core business concepts, links represent relationships between hubs, and satellites store descriptive information about hubs and about the relationships (links) between them.
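To make the three entity types concrete, here is a minimal Python sketch. The class names, field names (`business_key`, `load_date`, `record_source`, `attributes`), and the customer/order sample data are illustrative assumptions, not a prescribed Data Vault schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Tuple


@dataclass
class Hub:
    """One row per distinct business key of a core business concept."""
    business_key: str
    load_date: datetime
    record_source: str


@dataclass
class Link:
    """One row per relationship between two or more hub rows."""
    hub_keys: Tuple[str, ...]  # business keys of the related hubs
    load_date: datetime
    record_source: str


@dataclass
class Satellite:
    """Descriptive attributes attached to a hub row or a link row."""
    parent_key: Tuple[str, ...]  # key(s) of the hub or link being described
    load_date: datetime
    record_source: str
    attributes: Dict[str, str]


# Hypothetical sample data: a customer, an order, and the relationship between them.
loaded_at = datetime(2024, 1, 1)
customer = Hub("CUST-001", loaded_at, "crm")
order = Hub("ORD-1001", loaded_at, "shop")
customer_order = Link((customer.business_key, order.business_key), loaded_at, "shop")
customer_details = Satellite(
    (customer.business_key,), loaded_at, "crm", {"name": "Ada", "city": "London"}
)
```

Note that the hubs carry only keys and load metadata; everything descriptive lives in the satellite, which is what lets attributes change without touching the hub or link.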

# Features

The Data Vault methodology represents a dynamic and flexible approach to managing Big Data and evolving data connection points in your Data Warehouse. Recently, there has been a significant shift towards using Data Vaults as governed Data Lakes. This shift addresses the key challenges we’ve identified in Data Warehousing:

- Adapting to changing business environments
- Handling massive data sets
- Reducing the complexities of Data Warehouse design
- Enhancing accessibility for business users by modeling close to the business domain
- Allowing seamless integration of new data sources without affecting the existing architecture

This method is proving to be highly effective and efficient, facilitating easier design, build, population, and modification of Data Warehouses. This is where Data Warehouse Automation can be particularly beneficial.

# Why Data Vault 2.0?

Data Vault 2.0 is described by its proponents as a prescriptive, industry-standard methodology for turning raw data into actionable intelligence that leads to tangible business outcomes. It offers a repeatable recipe for transforming raw data into the information the business finds most valuable.

Data Vault is a write-optimized approach, in contrast to a snowflake schema, which is optimized for querying. For a discussion of this trade-off, see the video “Behind the Hype: Should you ever build a Data Vault in a Lakehouse?”

# When to Use

- Managing numerous disparate data sources
- Accommodating frequent schema changes (DDL) in source OLTP databases

# Layers

- Landing Zone (LZN)
- Raw Data Vault (RDV)
- Business Data Vault (BDV)
- Universal Data Model (UDM)

# Difference between 1.0 and 2.0

Data Vault 1.0, introduced by Dan Linstedt in the early 2000s, established the core principles:

- Hub, Link, and Satellite structure
- Business keys in Hubs
- Relationships captured in Links
- Descriptive data in Satellites
- Focus on historical tracking and auditability
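The focus on historical tracking can be illustrated with an insert-only satellite: rows are never updated, and a new row is appended only when the descriptive attributes change. The sketch below is one simple way to model this in Python; the helper names `load_satellite` and `as_of` are hypothetical, and it assumes rows arrive in chronological order.

```python
from datetime import datetime

# Append-only satellite rows for a single hub key: (load_date, attributes).
satellite_rows = []


def load_satellite(rows, load_date, attributes):
    """Append a new row only if the attributes differ from the latest row."""
    if rows and rows[-1][1] == attributes:
        return False  # nothing changed, so no new history row is written
    rows.append((load_date, dict(attributes)))
    return True


def as_of(rows, when):
    """Return the attributes that were current at `when`, or None."""
    current = None
    for load_date, attributes in rows:  # assumes chronological order
        if load_date <= when:
            current = attributes
    return current


first = load_satellite(satellite_rows, datetime(2024, 1, 1), {"city": "London"})
unchanged = load_satellite(satellite_rows, datetime(2024, 2, 1), {"city": "London"})
moved = load_satellite(satellite_rows, datetime(2024, 3, 1), {"city": "Paris"})

february_view = as_of(satellite_rows, datetime(2024, 2, 15))
latest_view = as_of(satellite_rows, datetime(2024, 3, 2))
```

Because earlier rows are never overwritten, any past state can be reconstructed from the load dates, which is what makes the model auditable.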

Data Vault 2.0, released around 2013, built upon 1.0 by adding:

- Hash keys
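As a rough illustration of the hash key idea, the sketch below derives a deterministic key from one or more business keys. The normalization steps (trim, uppercase), the `||` delimiter, and the choice of MD5 are assumptions made for the example; the function name `hash_key` is hypothetical.

```python
import hashlib


def hash_key(*business_keys, delimiter="||"):
    """Derive a deterministic hash key from one or more business keys.

    Assumed normalization: trim whitespace and uppercase each key so that
    the same business key always produces the same hash.
    """
    normalized = [str(key).strip().upper() for key in business_keys]
    joined = delimiter.join(normalized)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# A hub hash key from a single business key.
customer_hk = hash_key("CUST-001")

# A link hash key from the business keys of the hubs it connects.
customer_order_hk = hash_key("CUST-001", "ORD-1001")
```

Since the key depends only on the business key itself, it can be computed independently in each load process without looking up a sequence value in the target table.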