Google Open Sources Three New Hash Functions

Written by Alex Armstrong

Thursday, 24 March 2016

OK, a hash function may not be a huge piece of software, but when you know how hard it is to find a good hash function having three more open source is to be welcomed.

googleopensource

The first is a parallel implementation of SipHash, a fast cryptographically strong pseudo-random function. The Google version produces the same output as the standard SipHash but 1.5 times quicker due to the use of AVX-2 instructions. AVX instructions work with 128-bit vectors. SipHash is important because it is used to implement hash tables in a lot of modern languages, for example Python, Haskell, Ruby and Rust. It is also used in systems packages such as systemd and OpenDNS. The point really is that hash functions are everywhere.

Fast random hash functions are needed in servers and language implementations because of the possibility of hash flooding attacks. A hash table works fast as long as the table has plenty of space and the hash function is good. However, if you present a hash table with a list of data that all have the same hash then they all want to be stored in the same location and every update after the first is more costly. This can be so costly as to bring a server to a halt.

If you think this is unlikely then in 2011 an attack on a PHP server with 500KB of carefully constructed POST data took a minute of CPU time to deal with. The solution is to use a random hash function which can be set to give a different set of hash values each time it is used on the same data, e.g. SipHash.

hash3

SipTreeHash is a variant on SipHash and doesn't give the same outputs. It works by slicing the input into 8-byte chunks and computes the SipHash on each chunk in parallel, again using AVX-2 instructions, and then combines the result using a final SipHash. The only problem is that AVX-2 instructions are only available on Intel's Haswell and later processors or on the soon-to-be-released AMD processors.

As there is an overhead in performing the chunking, this is only faster than SipHash if the input is larger than 96 bytes. For large enough inputs SipTreeHash is up to three times faster than SipHash.

The final new hash function is HighwayHash - where do they get the names from? In this case the hash function is completely new and designed to make best use of AVX-2 multiply and permute instructions. The basic idea is that the 32x32 bit multiplications used produce 64-bit results and so are not easy to reverse and the permutations even out the distribution of bytes. The payoff is that it is two to three times faster than SipTreeHash and hence up to six times faster than SipHash.

The big problem is that this new hash function is not well studied and this is one of the reasons for open sourcing the code. It is hopped that HighwayHash will be examined and proved cryptographically strong.

The code is available on GitHub and, as you might expect, it is all in C++ and complied using GCC.

hash3

More Information

HighwayHash On GitHub

SipHash: a fast short-input PRF

Hashing - The Greatest Idea In Programming

Advanced Hashing

The Reason For The Weird PHP Function Names

The Lost Art Of The Storage Mapping Function

Universal Hashing

The Invertible Bloom Filter

The Bloom Filter

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on, Twitter, Facebook, Google+ or Linkedin.

Microsoft Adds IQ Layer To Fabric
04/12/2025

Microsoft is adding a "semantic intelligence layer" to Fabric to add that elevates Fabric from a unified data platform to a unified intelligence platform. The announcement says the extra layer w [ ... ]

+ Full Story

Codacy Releases AI Risk Reduction Tool
04/12/2025

Codacy has launched two new products to help control genAI coding. AI Risk Hub and AI Reviewer form a code compliance suite that organizations can use for governance of AI-generated code and sma [ ... ]

+ Full Story

More News

Comments

or email your comment to: comments@i-programmer.info

Last Updated ( Saturday, 26 March 2016 )

More Information

Related Articles

Comments