Google Spanner Adds Columnar Engine
Monday, 11 August 2025
Google has announced a columnar engine for Spanner to extend analytical capabilities in Spanner databases.

Google describes Spanner as combining the best of relational and NoSQL databases: scalable and distributed, but supporting ACID transactions and SQL queries. Spanner is used internally at Google for many of Google's apps, including Gmail, Photos, and Calendar. It is available as a database as a service running on Google Cloud. Google's internal implementation of Spanner is the largest single database on earth and spans all of Google's data centers, hence the name Spanner.

Online transaction processing (OLTP) systems such as Spanner are optimized for high-volume, low-latency transactions, and use row-oriented storage that's efficient for individual record access. This isn't ideal for analytical workloads that need rapid aggregations and scans across large datasets.

Spanner's new columnar engine combines columnar storage with vectorized query execution. Unlike traditional row-oriented storage, where an entire row is stored contiguously, columnar storage stores data column by column. This offers several advantages for analytical workloads, including reduced I/O and more efficient scans. The reduced I/O comes from the fact that analytical queries often access only a few columns at a time, and columnar storage supports the reading of just the relevant columns. Compression is also improved because data within a single column is typically of the same data type and often exhibits similar patterns, leading to much higher compression ratios. In addition, when scanning a column, consecutive values can be processed together, for more efficient data processing.

The columnar engine makes use of Spanner's vectorized execution feature. Unlike traditional query engines that process data tuple-by-tuple (row by row), a vectorized engine processes data in batches (vectors) of rows.
This improves CPU utilization because, instead of calling a function for each individual row, a vectorized engine calls functions once for an entire batch, significantly reducing overhead. It can also result in more cache-friendly memory access patterns.

The Spanner columnar engine also improves the integration between Spanner and BigQuery via Data Boost, Spanner's fully managed service for analytical workloads. When BigQuery issues a federated query to Spanner, and that query can benefit from columnar scans and vectorized execution, Data Boost automatically makes use of the Spanner columnar engine. This means that complex analytical queries execute faster. Data Boost also helps ensure that analytical workloads are offloaded from your primary Spanner compute resources to reduce the impact on transactional operations.

Google Spanner with columnar storage is available now.
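
To make the two ideas in this article concrete, here is a minimal Python sketch of row-oriented versus columnar layout, and of row-at-a-time versus batch-at-a-time processing. It is only an illustration and not how Spanner is implemented: the `Order` record, its fields, the sample values, the batch size, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical record type used only for this illustration.
    order_id: int
    customer: str
    amount: float


# Row-oriented layout: all fields of one record are kept together.
rows = [
    Order(1, "alice", 20.0),
    Order(2, "bob", 35.5),
    Order(3, "alice", 4.5),
]

# Columnar layout: one contiguous list per column.
columns = {
    "order_id": [r.order_id for r in rows],
    "customer": [r.customer for r in rows],
    "amount": [r.amount for r in rows],
}


def sum_amount_row_at_a_time(records):
    """Visit every full record and do one unit of work per row."""
    total = 0.0
    calls = 0
    for record in records:
        total += record.amount
        calls += 1
    return total, calls


def sum_amount_in_batches(cols, batch_size=2):
    """Read only the 'amount' column and do one unit of work per batch."""
    amounts = cols["amount"]
    total = 0.0
    calls = 0
    for start in range(0, len(amounts), batch_size):
        total += sum(amounts[start:start + batch_size])
        calls += 1
    return total, calls


if __name__ == "__main__":
    print(sum_amount_row_at_a_time(rows))   # (60.0, 3)
    print(sum_amount_in_batches(columns))   # (60.0, 2)
```

Both functions return the same total, but the batched version never touches the `order_id` or `customer` columns and makes fewer calls. At this toy scale the saving is meaningless; the sketch only shows the shape of the access pattern that a real engine exploits over far larger batches.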