Second Brain: Crafted, Curated, Connected, Compounded on 10月02日
VertiPaq引擎解析
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

VertiPaq引擎是一款内存中的列式数据库,为MS Excel Power Pivot、SQL Server Analysis Services (SSAS) Tabular和Power BI提供支持。当数据加载到数据模型中时,VertiPaq引擎会将其压缩并存储在RAM中,以实现高效的处理。该引擎采用内存优化的列式存储索引技术,类似于SQL Server引擎中的非聚集列式索引。VertiPaq通过预计算的哈希表(称为关系)处理连接,这些哈希表在连接的表发生变化时重建,并驻留在CPU缓存中,从而避免了常规连接性能问题。VertiPaq引擎最初于2006年推出,以其列式存储而闻名,通过优化数据压缩和查询性能,显著提高了数据检索和分析速度。

🔹VertiPaq引擎是一款内存中的列式数据库,为MS Excel Power Pivot、SQL Server Analysis Services (SSAS) Tabular和Power BI提供支持。它通过将数据压缩并存储在RAM中,实现了高效的处理。

🔸该引擎采用内存优化的列式存储索引技术,类似于SQL Server引擎中的非聚集列式索引,有效提升了数据检索和分析速度。

🔶VertiPaq通过预计算的哈希表(称为关系)处理连接,这些哈希表驻留在CPU缓存中,避免了常规连接性能问题。

🔷VertiPaq引擎最初于2006年推出,以其列式存储而闻名,通过优化数据压缩和查询性能,显著提高了数据检索和分析速度。

🔺VertiPaq引擎的开发始于2008年,由Amir Netz开发革命性算法,Christian Petculescu实现初始版本,约包含200万行代码,彻底改变了自助式BI,实现了即时结果查看。

The VertiPaq engine is an in-memory Columnar database powering MS Excel Power Pivot, SQL Server Analysis Services (SSAS) Tabular, and Power BI. When data is loaded into a data model, the VertiPaq engine compresses and stores it in RAM for efficient processing.

# How does it work?

VertiPaq memory optimized ColumnStore index, is the same technology implemented in the SQL Server engine, in the form of non-clustered columnar indexes Source

Source: Inside the VertiPaq Engine by Marco Russo

This technology is comparable to the Photon Engine used by Databricks for their compute operations, as well as similar solutions employed by Snowflake and other platforms.

Another thing to note is that VertiPaq handles joins through pre-computed hashmaps (called relationships) that are rebuilt anytime a table in the join changes. The hashmap sits in the CPU cache and this avoids regular join perf concerns. Another read-focused design choice. This doesn’t apply for CCI in SQL Server or Synapse, just the read-heavy analytics-focused implementations of VertiPaq. Alex Barbeau on Bsky

# Different Versions

Got multiple flavors of xVelocity:

    VertiPaq (xVelocity): Originally introduced as VertiPaq in 2006, this is the in-memory analytics engine that was integrated into Microsoft’s suite of data tools like Excel, PowerPivot, SQL Server Analysis Services (SSAS), and Power BI. It’s known for its columnar storage which significantly speeds up data retrieval and analysis by optimizing data compression and query performance. Citation: VertiPaq vs ColumnStore Comparison - SQLBIOptimized ColumnStore Index: This is another implementation of the xVelocity technology within SQL Server. Here, it’s used to create non-clustered columnar indexes, which are particularly effective for large data warehousing operations. The performance comparison between VertiPaq and ColumnStore indicates different scenarios where one might be preferred over the other for query performance. Citation: VertiPaq vs ColumnStore Comparison - SQLBIV-Order with Microsoft Fabric: V-Order is a specific optimization applied to Parquet files within Microsoft Fabric, which enhances read performance under various compute engines like Power BI, SQL, and Spark. It’s not directly a version of VertiPaq but leverages similar principles of columnar storage and in-memory-like access for improved efficiency. V-Order applies sorting, row group distribution, dictionary encoding, and compression, which aligns with VertiPaq’s core philosophy of optimizing data for fast retrieval. Citation: Understand V-Order - Microsoft Fabric | Microsoft Learn
      Note: While V-Order is specific to Fabric and not directly a version of VertiPaq, it represents the ongoing evolution of Microsoft’s approach to data storage and retrieval, leveraging columnar structures for performance gains, much like VertiPaq initially did.

The 8-byte rule applies to all the following technologies: xVelocity memory-optimized columnstore index can see a 400% gain in performance when using datatype 8 bytes or less in size. Some history knowledge in:

# Used in

More shared by Rif Kiamil:

    Azure SQL Database: Supports columnstore indexes, which utilize xVelocity (VertiPaq) technology.Azure SQL Managed Instance: Supports columnstore indexes, leveraging xVelocity technology.SQL Server 2012 (On-Premises) and later: Introduced non-clustered columnstore indexes in SQL Server 2012, utilizing xVelocity technology.Azure Synapse Dedicated SQL Pool (formerly SQL DW): non-clustered columnstore indexes. Azure Synapse’s Dedicated SQL Pool are tailored for its massively parallel processing (MPP Database) architecture,

# History

Based on the discussion in RW The rise and history of Power BI with Amir Netz by Kasper On BI between Amir Netz and Kasper On:

The VertiPaq development journey began around 2008 when Microsoft’s team sought to create a column store engine capable of delivering unprecedented performance. The breakthrough occurred when Amir Netz, inspired by an article stating that “data is not distributed uniformly in systems,” developed a revolutionary algorithm during one sleepless night. Christian Petculescu, one of Microsoft’s top developers, implemented the initial version, which comprised approximately 2 million lines of code.

VertiPaq’s revolutionary nature stemmed from its fundamental departure from traditional OLAP approaches. Rather than depending on disk-stored pre-computed aggregations, it utilized in-memory processing with innovative compression algorithms that could operate on compressed data directly. The engine was optimized for small to medium-sized datasets that fit in memory. Its initial performance tests were so remarkable that they raised skepticism, with Amir noting it performed “an order of magnitude faster than any other column store” – a distinction that remains true over a decade later.

The technology transformed Self-Service BI by eliminating the lengthy pre-computation times typical of traditional OLAP systems. This advancement allowed business users to make changes and view results instantly, similar to the Excel experience. Amir considers VertiPaq a “once-in-a-lifetime” technological breakthrough, describing it as “perfect” – so well-optimized that its core algorithm has remained largely unchanged for over a decade. The development team carefully introduced it to avoid disrupting users’ existing technology adoption. This shift from disk-based, pre-computed OLAP systems to interactive, in-memory analytics proved crucial for Power BI’s success and modern business intelligence.

See also Microsoft Fabric.

# Renaming

From 2022-03-19: VertiPaq has been rebranded to xVelocity. The official name is now “xVelocity in-memory analytics engine (VertiPaq)”, though both terms – xVelocity and VertiPaq – are used interchangeably when discussing Analysis Services. Source

# Other In-Memory Formats/Engines

https://www.ssp.sh/brain/VertiPaq-20250128184134529.webp0128184134529.webp">
Shared on Twitter

See more on In-Memory Formats.

# Further Readings


Origin: Tweet
References: In-Memory Formats
Created: 2023-03-03

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

VertiPaq xVelocity 内存数据库 列式存储 Power BI SQL Server
相关文章