Amazon Redshift Updates
Written by Kay Ewbank   
Thursday, 05 December 2019

Amazon has announced a number of updates to Redshift, its cloud-based data warehouse service.

Redshift data can be analyzed using ‘normal’ SQL-based tools and business intelligence applications. The service is designed to be easy to set up and manage: clusters can be created with a few clicks in the AWS Management Console, and queries can be distributed and parallelized across multiple nodes. To make Redshift easier to administer, Amazon has automated most of the common tasks associated with provisioning, configuring, monitoring, backing up, and securing a data warehouse. Redshift is based on ParAccel technology, which Amazon licensed. ParAccel itself was acquired in 2013 by Actian (formerly known as Ingres).


The updates, announced at Amazon's re:Invent conference, start with support for data lake export in Apache Parquet format. You can now unload the result of an Amazon Redshift query to your Amazon S3 data lake as Apache Parquet. Amazon says the Parquet format is up to twice as fast to unload and uses up to six times less storage in Amazon S3 than text formats.
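As a rough illustration of what a Parquet export looks like, the sketch below builds a Redshift UNLOAD statement with the FORMAT AS PARQUET option. The helper name build_unload_parquet, the table, the bucket and the IAM role ARN are all made up for the example; only the statement shape comes from Redshift itself.

```python
def build_unload_parquet(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a Redshift UNLOAD statement that writes query results as Parquet.

    Hypothetical helper for illustration; it only builds the SQL text.
    """
    # UNLOAD takes the query as a quoted string, so any single quotes
    # inside the query are doubled.
    quoted_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{quoted_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


if __name__ == "__main__":
    # Placeholder table, bucket and role names; substitute your own.
    print(build_unload_parquet(
        "SELECT * FROM sales WHERE region = 'EU'",
        "s3://example-bucket/unload/sales_",
        "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    ))
```

The resulting statement would be run against the cluster with whichever SQL client you already use.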

The next improvement to be announced is a preview of support for federated querying. The Amazon Redshift Federated Query feature lets you query and analyze data across operational databases, data warehouses, and data lakes. With Federated Query, you can now integrate queries on live data in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL with queries across your Amazon Redshift and Amazon S3 environments.
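The idea of one query spanning a warehouse and a live operational database can be sketched locally with SQLite, which lets a single connection attach a second database. This is only an analogy, not Redshift's mechanism: the two in-memory databases, the table names and the function federated_style_query are invented for the example.

```python
import sqlite3


def federated_style_query() -> list:
    """Join 'warehouse' data with 'operational' data in a single SQL query."""
    # The main database stands in for the data warehouse.
    conn = sqlite3.connect(":memory:")
    # A second, separate database stands in for the operational system.
    conn.execute("ATTACH DATABASE ':memory:' AS ops")

    conn.execute("CREATE TABLE main.sales_history (product TEXT, units_sold INTEGER)")
    conn.execute("CREATE TABLE ops.open_orders (product TEXT, units_ordered INTEGER)")
    conn.executemany(
        "INSERT INTO main.sales_history VALUES (?, ?)",
        [("widget", 120), ("gadget", 75)],
    )
    conn.executemany(
        "INSERT INTO ops.open_orders VALUES (?, ?)",
        [("widget", 8), ("gadget", 3)],
    )

    # One query reads from both databases without copying data first.
    rows = conn.execute(
        """
        SELECT h.product, h.units_sold, o.units_ordered
        FROM main.sales_history AS h
        JOIN ops.open_orders AS o ON o.product = h.product
        ORDER BY h.product
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in federated_style_query():
        print(row)
```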

Another improvement to queries in Redshift is the preview of Advanced Query Accelerator (AQUA) for Amazon Redshift. This is a new distributed and hardware-accelerated cache that Amazon says lets Redshift run up to ten times faster than any other cloud data warehouse. AQUA attempts to avoid having to move data from centralized storage to compute clusters for processing, where the network bandwidth needed to move the data can become the bottleneck. Instead, AQUA does a substantial share of data processing in place on its hardware-accelerated cache. Data-intensive tasks such as filtering and aggregation are carried out closer to the storage layer, minimizing data movement between where data is stored and the compute clusters.
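Why processing near the storage layer reduces data movement can be shown with a toy model. The sketch below is not how AQUA is implemented; the Row type and both functions are hypothetical, and "rows moved" is simply a count standing in for network traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    region: str
    amount: int


def aggregate_at_compute(storage, region):
    """Ship every row to the compute layer, then filter and sum there."""
    moved = list(storage)  # every stored row crosses the network
    total = sum(r.amount for r in moved if r.region == region)
    return total, len(moved)


def aggregate_near_storage(storage, region):
    """Filter and sum next to the data, then ship only the result."""
    total = sum(r.amount for r in storage if r.region == region)
    return total, 1  # a single aggregate value crosses the network


if __name__ == "__main__":
    table = [Row("EU", 10), Row("US", 5), Row("EU", 7), Row("APAC", 3)]
    print(aggregate_at_compute(table, "EU"))
    print(aggregate_near_storage(table, "EU"))
```

Both functions return the same total; the difference lies in the second value, the amount of data that had to travel.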

The final improvement to Redshift is support for materialized views - again, this is in preview. Materialized views can speed up query performance for repeated and predictable analytical workloads. They store pre-computed results of queries and maintain them by incrementally processing the latest changes made to the source tables. Any query that uses the materialized views gets the pre-computed results much faster. Materialized views can be created based on one or more source tables using filters, projections, inner joins, aggregations, grouping, functions and other SQL constructs.
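The incremental maintenance described above can be illustrated with a small toy in Python. It mimics a view defined as a sum per region over an append-only table and is not Redshift's implementation; the class IncrementalTotals and its methods are hypothetical names. In Redshift itself the corresponding statements are CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW.

```python
from collections import defaultdict


class IncrementalTotals:
    """Toy materialized view: SUM(amount) GROUP BY region, append-only source."""

    def __init__(self):
        self.source = []                # base table rows: (region, amount)
        self.totals = defaultdict(int)  # the stored, pre-computed results
        self._applied = 0               # source rows already folded into totals

    def insert(self, region, amount):
        """Add a row to the source table; the view is not updated yet."""
        self.source.append((region, amount))

    def refresh(self):
        """Fold in only the rows added since the last refresh.

        Returns the number of rows processed.
        """
        new_rows = self.source[self._applied:]
        for region, amount in new_rows:
            self.totals[region] += amount
        self._applied = len(self.source)
        return len(new_rows)

    def query(self):
        """Read the pre-computed results without touching the source rows."""
        return dict(self.totals)


if __name__ == "__main__":
    view = IncrementalTotals()
    view.insert("EU", 10)
    view.insert("US", 5)
    view.refresh()
    view.insert("EU", 7)
    print(view.query())    # stale until the next refresh
    print(view.refresh())  # only the one new row is processed
    print(view.query())
```

Queries read the stored totals directly, and each refresh touches only the rows that arrived since the previous one, which is where the speed-up for repeated workloads comes from.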

More details of all the new features can be found on the Redshift website. 



More Information

Amazon Redshift

Related Articles

Amazon Releases PartiQL, A One Stop Query Language

Amazon Updates Data Offerings

Amazon Redshift Ready For Data

Amazon Redshifts Big Data

New AWS Managed Services

Amazon RDS Adds Replication Feature




Last Updated ( Friday, 06 December 2019 )