Apache SeaTunnel Reaches Top Level Status
Written by Kay Ewbank   
Monday, 12 June 2023

The Apache Software Foundation has announced that SeaTunnel has graduated to become an Apache Top-Level Project. SeaTunnel is described as a next-generation, cloud-native, high-performance, distributed tool for integrating massive volumes of data.

SeaTunnel was initially created by open source community members, data experts, and developers in China. It can ingest and synchronize massive volumes of data (tens of billions of items a day) quickly, which greatly lowers the cost of maintaining data transfers.


SeaTunnel has been designed to solve common problems in the field of data integration, such as the need to work with multiple incompatible data sources across complex synchronization scenarios: offline full synchronization, offline incremental synchronization, change data capture (CDC), real-time synchronization, and full database synchronization. It also takes care of problems that arise in the data integration and synchronization processes, such as data loss or duplication.

SeaTunnel comes with a Connector API that does not depend on a specific execution engine, and the developers say that the different types of connectors (source, transform and sink) developed using this API can run on many different engines. The current version supports the SeaTunnel Engine, Flink and Spark.
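
To illustrate the idea of engine-independent connectors, here is a minimal Python sketch. It is not SeaTunnel's actual Connector API (which is written in Java); every class and function name below is hypothetical. It only shows the general pattern: connectors implement small source, transform and sink interfaces, and two different toy "engines" run the same connectors unchanged.

```python
from typing import Iterator, List, Protocol


class Source(Protocol):
    def read(self) -> Iterator[dict]: ...


class Transform(Protocol):
    def apply(self, row: dict) -> dict: ...


class Sink(Protocol):
    def write(self, row: dict) -> None: ...


# --- Hypothetical connectors: they know nothing about any engine ---

class ListSource:
    """Reads rows from an in-memory list."""
    def __init__(self, rows: List[dict]) -> None:
        self.rows = list(rows)

    def read(self) -> Iterator[dict]:
        return iter(self.rows)


class UppercaseName:
    """Returns a copy of the row with the 'name' field uppercased."""
    def apply(self, row: dict) -> dict:
        return {**row, "name": row["name"].upper()}


class MemorySink:
    """Collects written rows in a list."""
    def __init__(self) -> None:
        self.rows: List[dict] = []

    def write(self, row: dict) -> None:
        self.rows.append(row)


# --- Two toy "engines" that depend only on the interfaces above ---

def _process(rows: List[dict], transforms: List[Transform], sink: Sink) -> None:
    for row in rows:
        for transform in transforms:
            row = transform.apply(row)
        sink.write(row)


def run_row_at_a_time(source: Source, transforms: List[Transform], sink: Sink) -> None:
    """Engine A: pushes each row through the pipeline as soon as it is read."""
    for row in source.read():
        _process([row], transforms, sink)


def run_in_batches(source: Source, transforms: List[Transform], sink: Sink,
                   batch_size: int = 2) -> None:
    """Engine B: buffers rows and processes them in fixed-size batches."""
    batch: List[dict] = []
    for row in source.read():
        batch.append(row)
        if len(batch) == batch_size:
            _process(batch, transforms, sink)
            batch = []
    if batch:  # flush the final partial batch
        _process(batch, transforms, sink)
```

Because the connectors only see the three small interfaces, swapping the engine does not require touching connector code, which is the property the SeaTunnel developers describe.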

There's also a connector plug-in that can be used to develop connectors and integrate them into the SeaTunnel project. SeaTunnel currently comes with over 100 connectors with more under development.

SeaTunnel also offers batch-stream integration and supports offline, real-time, full, and incremental synchronization. It uses a distributed snapshot algorithm to ensure data consistency. High throughput and low latency are promised through parallel reading and writing, and real-time monitoring is available.
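
The consistency goal (no lost and no duplicated rows after a failure) can be sketched with a much simpler mechanism than a distributed snapshot. The Python example below is a single-process illustration, not SeaTunnel's algorithm: it assumes a source that can be re-read from a saved offset and a sink keyed by primary key, and all names are hypothetical. The offset is committed only periodically, so a restart replays a few rows, and the keyed write makes that replay harmless.

```python
class SimulatedCrash(Exception):
    """Raised to imitate a job failing part-way through a run."""


def sync(source, sink, state, checkpoint_every=2, crash_at=None):
    """Copy rows from `source` (a list) into `sink` (a dict keyed by id).

    `state["offset"]` is the last committed read position. It is only
    advanced at checkpoints, so a restart may re-read some rows.
    """
    position = state["offset"]
    while position < len(source):
        if crash_at is not None and position == crash_at:
            raise SimulatedCrash(f"crashed before row at position {position}")
        row = source[position]
        # Upsert by primary key: writing the same row twice leaves one copy.
        sink[row["id"]] = row
        position += 1
        if position % checkpoint_every == 0:
            state["offset"] = position  # commit a checkpoint
    state["offset"] = position  # final checkpoint at end of input


if __name__ == "__main__":
    source = [{"id": i, "value": i * 10} for i in range(1, 6)]
    sink, state = {}, {"offset": 0}
    try:
        sync(source, sink, state, crash_at=3)
    except SimulatedCrash:
        pass  # the committed offset is still the last checkpoint
    sync(source, sink, state)  # restart resumes from the checkpoint
    print(sorted(sink), state)
```

In this run the row with id 3 is written before the crash but after the last checkpoint, so it is written again on restart; because the sink is keyed, the final result still contains each row exactly once.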

Other features include JDBC multiplexing and multi-table parsing of database logs. The team says this meets the need for CDC multi-table synchronization scenarios.
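
As a rough picture of what multi-table parsing of a change log involves, the following Python sketch reads one stream of change events and routes each event to the table it belongs to. The event shape (`table`, `op`, `key`, `row`) is an assumption made for this example and is not SeaTunnel's or any database's actual log format.

```python
from collections import defaultdict


def apply_change_log(events):
    """Replay a single change log into per-table state.

    Each event is a dict with hypothetical fields:
      table: target table name
      op:    "insert", "update" or "delete"
      key:   primary key of the affected row
      row:   new row contents (absent for deletes)
    Returns {table_name: {key: row}}.
    """
    tables = defaultdict(dict)
    for event in events:
        table = tables[event["table"]]
        key = event["key"]
        if event["op"] in ("insert", "update"):
            table[key] = event["row"]
        elif event["op"] == "delete":
            table.pop(key, None)  # deleting a missing row is a no-op
        else:
            raise ValueError(f"unknown operation: {event['op']!r}")
    return dict(tables)


if __name__ == "__main__":
    log = [
        {"table": "users", "op": "insert", "key": 1, "row": {"name": "Ada"}},
        {"table": "orders", "op": "insert", "key": 10, "row": {"user": 1, "total": 5}},
        {"table": "users", "op": "update", "key": 1, "row": {"name": "Ada L."}},
        {"table": "orders", "op": "delete", "key": 10},
    ]
    print(apply_change_log(log))
```

The point of the sketch is that the log is read once and fanned out to several tables, rather than each table needing its own reader over the same log.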

Finally, two job development methods are supported: coding and canvas design. The SeaTunnel web project provides visual job management along with scheduling, running and monitoring capabilities.

SeaTunnel is available for testing now.



More Information

SeaTunnel Website

SeaTunnel web project

Related Articles

Apache Iceberg Improves Spark Support

Spark BI Gets Fine Grain Security

Spark Announcements

Apache Flink ML 2.0 Released

Apache Flink 1.9 Adds New Query Engine

Apache Flink 1.5.0 Adds Support For Broadcast State
