PeerDB Brings Real Time Streaming To PostgreSQL
Written by Nikos Vaggalis   
Thursday, 23 November 2023

PeerDB is an ETL/ELT tool built for PostgreSQL. It makes tasks that require streaming data from PostgreSQL to third-party targets as effortless as it gets.

But the basics first. Why the need to stream data from PostgreSQL, or any database for that matter?

For a start, it's what Change Data Capture (CDC), the most popular technique for streaming data, is used for. CDC is a way of capturing changes made in the database and forwarding them in real time to external systems (such as Kafka) through connectors like the ones offered by Debezium, the open source distributed platform that turns your existing databases into event streams. There are many ways to implement CDC, such as row versioning, pub/sub, triggers and log monitoring, with the log-based approach being the most popular and automated. The use cases of CDC include real-time analytics, replication to data warehouses, queues and storage, or any other customized solution.

The most popular tool for enabling CDC is of course the open source Debezium. Compared to Debezium, PeerDB is significantly simpler to set up and manage. You just define your Peers, the source and the target, and let them exchange data like Linux does with pipes. Actually, the scheme could very well be described as a glorified pipe.

For instance, to mirror data from a Postgres instance to a Snowflake one you just have to:

CREATE PEER postgres_peer
FROM postgres (. . . );

CREATE PEER snowflake_peer
FROM snowflake (. . . );

CREATE MIRROR real_time_cdc
FROM postgres_peer
TO snowflake_peer
WITH TABLE MAPPING (transactions:transactions, users:users);

The transactions and users tables are now replicated in real time from Postgres to Snowflake, so that when you insert, update or delete rows in the Postgres tables, the same operation is mirrored on the Snowflake ones too.
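
To picture what mirroring an operation means, here is a minimal Python sketch of the general idea behind CDC: a stream of change events is replayed, in order, against a copy of the data. It is not PeerDB's implementation. The ChangeEvent class, the apply_change function and the sample rows are hypothetical names invented for this illustration, and the target is just an in-memory dictionary standing in for a real warehouse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record; real CDC tools use richer formats."""
    op: str                      # "insert", "update" or "delete"
    table: str                   # table the change belongs to
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new row contents; None for deletes


def apply_change(target: dict, event: ChangeEvent) -> None:
    """Replay one source change against the in-memory target."""
    rows = target.setdefault(event.table, {})
    if event.op in ("insert", "update"):
        # Inserts and updates both leave the latest row image in place.
        rows[event.key] = dict(event.row)
    elif event.op == "delete":
        rows.pop(event.key, None)
    else:
        raise ValueError(f"unknown operation: {event.op}")


# Changes as they might be captured from the source, in commit order.
events = [
    ChangeEvent("insert", "users", 1, {"name": "alice"}),
    ChangeEvent("insert", "users", 2, {"name": "bob"}),
    ChangeEvent("update", "users", 1, {"name": "alice smith"}),
    ChangeEvent("delete", "users", 2),
]

target = {}
for event in events:
    apply_change(target, event)

print(target)
```

After the four events are replayed, the target holds a single users row, the updated one with key 1, which is the same state the source would be in.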

Besides sporting a developer-friendly API, as seen above, PeerDB also performs well in comparison to similar tools:

  • 2x to 16x faster large data loads
    When you are moving larger datasets (10s of GB to a few TB) from Postgres to any supported target, PeerDB can be 2x to 16x faster than other tools. This speeds up initial loads in WAL-based replication, as well as Query or Watermark based replication.
  • Change Data Capture (CDC) with 5s to 60s lag on target
    PeerDB is designed for real-time streaming from Postgres. If your application is latency sensitive you can configure refresh intervals as low as a few seconds.
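
The Query or Watermark based replication mentioned in the first point can be pictured with a short Python sketch. It illustrates the general technique only and makes no claim about how PeerDB implements it: each sync copies the rows whose cursor column lies beyond the last value seen, then remembers the new high-water mark. Two in-memory SQLite databases stand in for Postgres and the target, and the sync_once function and the amount column are assumptions made for the example.

```python
import sqlite3


def sync_once(source, target, watermark):
    """Copy rows with id greater than watermark; return the new watermark."""
    rows = source.execute(
        "SELECT id, amount FROM transactions WHERE id > ? ORDER BY id",
        (watermark,),
    ).fetchall()
    target.executemany(
        "INSERT INTO transactions (id, amount) VALUES (?, ?)", rows
    )
    target.commit()
    # Keep the old watermark when nothing new was found.
    return rows[-1][0] if rows else watermark


# Stand-ins for the real source and target databases.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount REAL)")

source.executemany("INSERT INTO transactions VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
source.commit()
watermark = sync_once(source, target, 0)          # copies ids 1 and 2

source.execute("INSERT INTO transactions VALUES (3, 7.25)")
source.commit()
watermark = sync_once(source, target, watermark)  # copies only id 3

copied = target.execute("SELECT id, amount FROM transactions ORDER BY id").fetchall()
print(watermark, copied)
```

An integer cursor like this only picks up newly appended rows. It does not notice updates or deletes of rows it has already copied, which is the gap that log-based CDC fills.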

Since PeerDB talks "Postgres", it also supports native Postgres features such as:

  • Advanced data types - PeerDB natively supports replicating advanced data types from Postgres, including ARRAYs, JSON/JSONB, HSTORE, ENUMs, Geospatial etc.
  • Partitioned Tables - PeerDB has comprehensive support for replicating partitioned tables.
  • Efficient replication of TOAST (large) columns.

Being native also means that you can use the tools you are already familiar with on PeerDB as well:

  • Client tools like pgAdmin, psql to run SQL commands.
  • BI tools like Grafana, Tableau to visually monitor syncs and transforms.
  • Database migration and versioning tools like Flyway to manage your ETL.
  • Any language (Python, Go, Node.js etc.) and scheduler (Airflow) for development.

PeerDB supports a number of different modes of streaming, such as log based (CDC), cursor based (timestamp or integer) and XMIN. At the time of writing it supports the following connectors:

Of course, it is free and open source and available as a Docker image. There are also Cloud and Enterprise offerings, which are fully managed, hosted on AWS, Azure and GCP, and require a paid subscription.

To conclude, PostgreSQL never ceases to amaze. With PeerDB included, its ecosystem goes from strength to strength.

 

More Information

PeerDB official

PeerDB Github

Related Articles

pg_later - Native Asynchronous Queries Within Postgres 
