PipelineDB Released As PostgreSQL Extension

Written by Kay Ewbank

Thursday, 01 November 2018

PipelineDB 1.0 has been released as a PostgreSQL extension. It is intended to be used for high-performance time-series aggregation based on continuous SQL queries.

The original version of PipelineDB was released as a fork of PostgreSQL, but users wanted an extension rather than a standalone fork. Having made money from the original version, the developers have invested in reworking PipelineDB as a standard PostgreSQL extension.

PipelineDB is open source and is designed for large-scale data sets where storing lots of raw time-series data and aggregating it over and over again becomes inefficient. The developers say the value that PipelineDB adds is directly proportional to the amount of continuous aggregation that an analytics use case can benefit from. They are also clear that PipelineDB should be used for analytics use cases that only require summary data, such as realtime reporting dashboards.

PipelineDB's strength lies in running continuous aggregations over streaming time-series data, and storing only the compact output of these continuous queries as incrementally updated table rows that can be read with minimal query latency. The ideal situation is where you know in advance what queries you're going to want to run, so that the analytics are simpler, faster and cheaper because the aggregation has already been done as the data arrived. This means summary data is always available for low-latency lookups.

The developers say that:

"Even if billions and billions of rows are written to events_stream, our continuous view ensures that only one physical row per hour is actually persisted within the database. As soon as the continuous view reads new incoming events and the distinct count is updated to reflect new information, the raw events will be discarded."

This way of working enables very high raw event throughput on modest hardware, along with very low read query latencies.
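To make the idea concrete, here is a minimal Python sketch of what a continuous view of this kind does conceptually. It is not PipelineDB code and does not use its SQL interface; the class name HourlyDistinctView, the field names and the sample events are all hypothetical. Each incoming event is folded into the state for its hour and then dropped, so only one summary row per hour survives. Note that this toy keeps an exact set of users per hour, whereas a compact sketch such as HyperLogLog would keep that state at a fixed size.

```python
from collections import defaultdict
from datetime import datetime


class HourlyDistinctView:
    """Toy model of a continuous view: one summary row per hour."""

    def __init__(self):
        # hour bucket -> set of user ids seen in that hour (illustrative state)
        self._seen = defaultdict(set)

    def on_event(self, timestamp, user_id):
        # Fold the event into its hour's state; the raw event is not stored.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        self._seen[hour].add(user_id)

    def rows(self):
        # One (hour, distinct_count) row per hour, however many events arrived.
        return [(hour, len(users)) for hour, users in sorted(self._seen.items())]


view = HourlyDistinctView()
events = [  # hypothetical sample events
    (datetime(2018, 11, 1, 9, 5), "alice"),
    (datetime(2018, 11, 1, 9, 17), "bob"),
    (datetime(2018, 11, 1, 9, 42), "alice"),
    (datetime(2018, 11, 1, 10, 3), "carol"),
]
for ts, user in events:
    view.on_event(ts, user)

for hour, count in view.rows():
    print(hour.isoformat(), count)
```

Four events go in, but only two rows come out: one for the 09:00 hour with two distinct users, and one for the 10:00 hour with one.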

Alongside time-series analysis, PipelineDB can be used for what the developers call continuous transforms:

"Unlike continuous views -- which store aggregate state in incrementally updated tables -- continuous transforms are stateless and simply apply a transformation to a stream, writing out the result to another stream."
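The difference from a continuous view can be sketched in a few lines of Python. Again this is a conceptual model under stated assumptions rather than PipelineDB's own interface: the function names continuous_transform and to_fahrenheit, and the sensor readings, are invented for illustration. The transform holds no state between events; it simply maps each input event to an output event, or drops it.

```python
def continuous_transform(stream, transform):
    """Apply a stateless function to each event and emit results downstream."""
    for event in stream:
        result = transform(event)
        if result is not None:  # returning None drops the event
            yield result


def to_fahrenheit(event):
    # Hypothetical transformation: convert a reading, skip missing values.
    if event["celsius"] is None:
        return None
    return {"sensor": event["sensor"], "fahrenheit": event["celsius"] * 9 / 5 + 32}


readings = [  # hypothetical input stream
    {"sensor": "a", "celsius": 20.0},
    {"sensor": "b", "celsius": None},
    {"sensor": "a", "celsius": 100.0},
]

# The output is itself a stream that could feed another transform or a view.
output_stream = list(continuous_transform(readings, to_fahrenheit))
print(output_stream)
```

Because nothing is remembered between events, the output depends only on the events themselves, which is what distinguishes a transform from the incrementally updated state of a continuous view.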

PipelineDB ships with a number of built-in aggregates including:

HyperLogLog-based distinct counting, merging, and manipulation
Bloom filters for set membership analysis
Top-K and “heavy hitters” tracking
Distributions and percentiles analysis
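What these aggregates have in common is that they summarize a stream in a small, bounded amount of state. As an illustration of the general technique, and not of PipelineDB's implementation, here is a minimal Bloom filter in Python. The class name, the default sizes and the choice of salted SHA-256 hashing are assumptions made for this sketch.

```python
import hashlib


class BloomFilter:
    """Fixed-size set membership summary; may report false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


seen_users = BloomFilter()
for user in ["alice", "bob"]:  # hypothetical stream of user ids
    seen_users.add(user)

print(seen_users.might_contain("alice"))
```

An item that has been added is always reported as possibly present, while an unseen item is usually, but not always, reported as absent. That trade-off is what lets the filter stay the same size no matter how many events pass through it.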


More Information

PipelineDB Website

PostgreSQL 11 RC Available

PostgreSQL Improves Declarative Partitioning

PostgreSQL Adds Parallel Query Support

PostgreSQL Version 9.5

PostgreSQL 9.4 Released

PostgreSQL Plus Cloud Database

Last Updated ( Thursday, 01 November 2018 )
