The Invertible Bloom Filter
The Invertible Bloom Filter
Written by Mike James   
Thursday, 29 December 2016
Article Index
The Invertible Bloom Filter
Delete and List

 

bloomlogo

List entries

Listing is the final operation and if you have been following the descriptions of the other operations you should be able to guess how it is going to work.

First all of the elements in the filter with a count of one have valid data and key storage elements. So the first task is to scan the storage and extract all of the elements with B[i].count=1. You can then add

(B[i].key,B[i].value) 

to the output list.

Notice there will be duplicates but this doesn't matter as you can remove duplicate keys.

This isn't the end of what you can do, however, because if you remove all such entries from the filter then there is a chance that it will undo any collisions that they caused. If this reduces the count of any element to one then another data element can be retrieved.

So the full algorithm is:

while there is an B[i].count=1 do
 add (B[i].key,B[i].value) to the output list
 perform DELETE(B[i].key,B[i].value)
end while
if the list is not empty set incomplete_list
return output_list

You can see that if all of the collisions can be undone then the list should be empty at the end of the operation,

This corresponds to the operation:

LISTENTRIES(): list all the key-value pairs being stored in B. With low (inverse polynomial in t) probability, this operation may return a partial list along with an incomplete_list status.

So how well does this work?

In terms of performance, if you are using k hash functions and the array has t elements then insertions, deletions and lookups take O(k) and listing takes O(t).

The probability that a GET operation will return a "not found" error, i.e. the data might be in the table but due to collisions it cannot be found,  when the data actually is in the table is the same as the false positive rate for the corresponding Bloom filter. 

form2

where k is the number of hash functions, m is the size of the bit array and n is the number of items stored in the bit array.

This can be made as small as you like by increasing the number of hash functions in use and the size of the table. The same is true of the probability of getting an incomplete listing of the data from the table - roughly O(t-k+2) as long as some conditions are met.

The invertible Bloom filter has lots of surprising uses including working out if two databases or two sets store the same elements. Basically what you do is create an invertible filter from one set, then delete all the elements of the other set and the elements that are left at the end of the operation are those that were only in one of the two sets.

You can even discover what the elements are and which set they belonged to, but this is another story.

bloomicon

Further Reading

The Bloom Filter

Invertible Bloom Lookup Tables

Related articles

Hashing

Hashing solves one of the basic problems of computing - finding something that you have stored somewhere.

Advanced Hashing

Extensible hashing and perfect hashing are ideas that are worth exploring.

Universal Hashing

A look at a way of creating a set of hash functions.

 

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on, Twitter, FacebookGoogle+ or Linkedin.

 

Banner


//No Comment - Trojan of Things, Disk LEDs Leak Data & So Do Phone Magnetometers
02/03/2017

• Trojan of Things: Embedding Malicious NFC Tags into Common Objects 

• LED-it-GO: Leaking (a lot of) Data from Air-Gapped Computers via the (small) Hard Drive LED

• Mobile Phone Identifi [ ... ]



AI Security At Online Conference: An Interview With Ian Goodfellow
04/03/2017

To find out more about the forthcoming AI With the Best online event we talked with Ian Goodfellow who is one of the event's keynote speakers. We also took the opportunity to find out more about his a [ ... ]


More News

 
 

 

blog comments powered by Disqus



Last Updated ( Thursday, 29 December 2016 )
 
 

   
Banner
RSS feed of all content
I Programmer - full contents
Copyright © 2017 i-programmer.info. All Rights Reserved.
Joomla! is Free Software released under the GNU/GPL License.