Athena Query Alerter Open Sourced
Written by Alex Denham   
Tuesday, 12 February 2019

A tool that alerts you if users are running expensive queries on the Amazon Athena query engine has been open sourced by its developers.

The Athena Alerter was developed by a team at Fandom Engineering, and is designed to overcome the problem that the AWS Athena big data query engine is easy to use, but can also easily run up large bills as a consequence of its simplicity.



It is possible to write less expensive queries by using partitions to limit the amount of data being accessed, but that requires the query writer to remember to use a partition and to understand how expensive a query could be. This is particularly hard when working in an external tool.

Because of the potential to spend money on Athena so quickly, the development team at Fandom Engineering developed Athena Alerter. This is an open source set of Lambda functions designed to work together to track which queries are run and how much data they scan (which directly maps to costs), and to notify users when they run costly queries.
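To illustrate how scanned data maps to cost, here is a minimal sketch rather than code from Athena Alerter. The function name is hypothetical, and the default price of 5 USD per terabyte is an assumption based on Athena's published on-demand pricing at the time; check the current price for your region. The sketch ignores any minimum charge or rounding that AWS applies per query.

```python
BYTES_PER_TB = 1024 ** 4


def estimate_query_cost(bytes_scanned: int, price_per_tb_usd: float = 5.0) -> float:
    """Estimate the cost of one query from the amount of data it scanned.

    price_per_tb_usd is an assumed on-demand price, not a value taken
    from Athena Alerter; pass the figure that applies to your account.
    """
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must not be negative")
    return bytes_scanned / BYTES_PER_TB * price_per_tb_usd


if __name__ == "__main__":
    # A query that scans half a terabyte.
    print(estimate_query_cost(BYTES_PER_TB // 2))
```

Under these assumptions, a query that scans a full unpartitioned table of several terabytes costs several times more than one restricted to a single partition, which is why the tool tracks bytes scanned.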

Because Fandom Engineering uses Slack for internal communication, Athena Alerter notifies users by sending Slack messages. However, the developers point out that, given the very modular nature of the tool, it's easy to adjust the notification function to use a different mechanism.

Internally, the alerting tool uses CloudTrail, Lambda, DynamoDB, SQS, and S3, and is set up using a CloudFormation script that creates all the required AWS components for you. The user needs to provide their specific configuration and then use the provided makefile.

To process the information about Athena queries, the tool first processes CloudTrail logs to learn who started which query, then uses the Athena API to track the query and get information about the amount of data scanned. This information is then pushed to DynamoDB and SQS, and users are notified.
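The first step, finding out who started which query, could look roughly like the sketch below. It is not the tool's actual code: the function name and the keys of the returned dictionary are hypothetical, and the record layout follows the general shape of CloudTrail events for the Athena StartQueryExecution call, so verify the field names against your own logs.

```python
from typing import Optional


def extract_query_start(record: dict) -> Optional[dict]:
    """Pull the interesting fields out of one CloudTrail record.

    Returns None for records that are not successful Athena
    StartQueryExecution calls.
    """
    if record.get("eventSource") != "athena.amazonaws.com":
        return None
    if record.get("eventName") != "StartQueryExecution":
        return None

    request = record.get("requestParameters") or {}
    response = record.get("responseElements") or {}
    execution_id = response.get("queryExecutionId")
    if execution_id is None:
        # The call failed, so there is no execution to track.
        return None

    return {
        "execution_id": execution_id,
        "user": (record.get("userIdentity") or {}).get("arn"),
        "query": request.get("queryString"),
        "start_time": record.get("eventTime"),
    }
```

The resulting dictionary holds what would be stored per query before the Athena API is polled for the amount of data scanned.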



At its heart the tool consists of three Lambda functions:

  • cloudtrail_handler: this function processes CloudTrail logs and adds entries to the DynamoDB table. At this stage the function provides the query, the executing user, the start time and the execution ID.
  • usage_update: this function runs every minute, takes queries that are in the “Running” state and updates the information about the amount of data scanned. Note that the Athena API does not provide information about who is executing the query, hence the tool relies on CloudTrail for that. When a query execution finishes, an SQS event is generated.
  • notification: this function runs for each SQS event, checks whether the amount of data scanned exceeded the notification threshold and, if so, generates a Slack message. If you want to process the data-scanned information differently, this function can be easily replaced with your own implementation.

Athena Alerter is available on GitHub.



More Information

Athena Alerter on GitHub

Related Articles

 AWS Lambda For The Impatient

Amazon Glacier Select Analyzes Archived Data

New AWS Services

AWS Improvements For Developers

Last Updated ( Tuesday, 12 February 2019 )