Amazon Releases AWS Glue 5
Written by Kay Ewbank   
Monday, 10 February 2025

Amazon has announced the general availability of AWS Glue 5.0, with improved performance, enhanced security, and support for Amazon SageMaker Unified Studio and SageMaker Lakehouse.

AWS Glue is a serverless data integration service that Amazon says makes it easier to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning and application development.

Glue includes a collection of libraries, engines, and tools developed by the open source community. AWS Glue consists of a Data Catalog, which is a central metadata repository; an ETL engine that can automatically generate Scala or Python code; a flexible scheduler that handles dependency resolution, job monitoring, and retries; and AWS Glue DataBrew for cleaning and normalizing data with a visual interface.

The performance and security improvements in AWS Glue 5.0 come largely from upgrading the engine to Apache Spark 3.5.2, Python 3.11, and Java 17. Amazon says that Glue 5.0 uses the AWS performance-optimized Spark runtime, which it claims is 3.9 times faster than open source Spark. According to Amazon, this and other changes mean Glue 5.0 is 32% faster than AWS Glue 4.0 and reduces costs by 22%.
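
As a rough illustration of what opting in to the new runtime might look like, here is a minimal Python sketch that builds a job definition requesting Glue version 5.0. The key names follow the request shape of the AWS Glue CreateJob API; the helper name glue5_job_definition, the worker settings, and the sample role and script values are assumptions made for this example and do not come from Amazon's announcement.

```python
from typing import Any, Dict


def glue5_job_definition(
    name: str,
    role_arn: str,
    script_location: str,
    workers: int = 2,
) -> Dict[str, Any]:
    """Build a CreateJob-style request that selects the Glue 5.0 runtime.

    Hypothetical helper: it only assembles a dictionary and makes no AWS calls.
    """
    return {
        "Name": name,
        "Role": role_arn,
        # Selecting 5.0 is what opts the job in to Spark 3.5.2, Python 3.11 and Java 17.
        "GlueVersion": "5.0",
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        # Worker type and count are illustrative defaults, not recommendations.
        "WorkerType": "G.1X",
        "NumberOfWorkers": workers,
    }


if __name__ == "__main__":
    job = glue5_job_definition(
        "nightly-orders-etl",  # hypothetical job name
        "arn:aws:iam::123456789012:role/ExampleGlueRole",  # placeholder role ARN
        "s3://example-bucket/scripts/orders.py",  # placeholder script path
    )
    print(job["GlueVersion"])
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as keyword arguments to the Glue client's create_job call; an existing Glue 4.0 job would also need its script checked against the newer Spark and Python versions before switching.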

Glue 5.0 also updates its open table format support to Apache Hudi 0.15.0, Apache Iceberg 1.6.1, and Delta Lake 3.2.0. This means users get stronger tools for improving performance, cost, governance, and privacy in their data lakes.

AWS Glue 5.0 also adds Spark-native fine-grained access control with AWS Lake Formation, meaning users can apply table-, column-, row-, and cell-level permissions on Amazon S3 data lakes.
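
To give a feel for what a column-level permission looks like, the following Python sketch assembles a request in the shape used by the Lake Formation GrantPermissions API. The helper name column_select_grant and the sample principal, database, table, and column names are hypothetical; the sketch only builds the request and does not show how Glue 5.0 enforces it at run time.

```python
from typing import Any, Dict, Iterable


def column_select_grant(
    principal_arn: str,
    database: str,
    table: str,
    columns: Iterable[str],
) -> Dict[str, Any]:
    """Build a GrantPermissions-style request limiting SELECT to named columns.

    Hypothetical helper: it only assembles a dictionary and makes no AWS calls.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                # Only these columns become readable by the principal.
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


if __name__ == "__main__":
    grant = column_select_grant(
        "arn:aws:iam::123456789012:role/ExampleAnalystRole",  # placeholder principal
        "sales",  # hypothetical database
        "orders",  # hypothetical table
        ["order_id", "order_date"],
    )
    print(grant["Resource"]["TableWithColumns"]["ColumnNames"])
```

A request like this could be passed as keyword arguments to the Lake Formation client's grant_permissions call in the AWS SDK for Python. Row- and cell-level restrictions are defined separately through Lake Formation data filters and are not covered by this sketch.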

Glue 5.0 also adds support for SageMaker Lakehouse, which lets organizations unify their data across Amazon Simple Storage Service (Amazon S3) data lakes and Amazon Redshift data warehouses. Its aim is to let organizations build analytics and AI/ML applications on a single copy of data. SageMaker Lakehouse can also be used to access and query data in place with all Apache Iceberg-compatible tools and engines.

AWS Glue 5 is available now.

More Information

Amazon Glue Webpage

Related Articles

Amazon Announces AWS Glue Data Quality

AWS Glue 4 Adds Pandas Support

Amazon Open Sources Python Library for AWS Glue
