Bloom filter is a probabilistic test data structure that returns False if an object is not part of a set. When bloom filter returns True an object might or might not be part of a set. Shifting Bloom filter is an extension on Bloom fliter that allows for multiple set or multiset usecases.
This library is made up of 4 submodules, namely the ShiftingBloomFilter
which contains the Shifting Bloom filter it self, utils
which contains a set of utilities that are useful while using the set, exceptions
which contain the exceptions associated with the bloom filter and visualiser which contains a graphic tool that can be used to inspect the filter.
Library can be pip-installed just like this:
pip3 install ShiftingBloomFilter
The filter supports following built-in methods: len()
, bool()
(True if not empty), str()
, repr()
, obj[index]
name | type | arguments | description |
---|---|---|---|
MULTISET |
constant | N/A | constant value for initialising the filter to be used with multiset |
MULTIPLE |
constant | N/A | constant value for initialising the filter to be used with multiple sets |
ShiftingBloomFilter(length) |
class | length, hash_count, hash_source, mode, set_count | bloom filter with support for handling multisets or multiplesets |
length |
the size of the underlying bytearrray which is used to represent the filter | ||
hash_count=len(algorithms_guaranteed) |
amount of hashing functions to use. NOTE!: cannot be greater than the length of hash source | ||
hash_source=algorithms_guaranteed |
a list of hashing functions to use. | ||
length_as_power=True |
is length of filter expressed as power of 2 (True ) or is it literal (False ) |
||
mode=MULTIPLE |
MULTIPLE if there are multiple sets or MULTISET if its one set but supporting multiple elements |
||
set_count=0 |
how many sets is this filter suppoused to support | ||
obj.insert(item) |
method | item , set_no=0 |
insert item into the filter with set_no (applicable for multiple sets only) |
obj.check(item) |
method | item |
check if item is in the filter |
obj.save2file() |
method | filename=sbf.bin |
save filter to file (binary) |
obj.get_fpr() |
method | get false postitive rate for current state of the filter | |
ShiftingBloomFilter.load_from_file() |
static method | filename=sbf.bin |
load filter from binary file |
name | type | arguments | description |
---|---|---|---|
CSVDataSet(filename) |
class | filename, separator=',' | Iterative reader for csv data sets |
built-ins | repr() , next() |
||
RandomStringGenerator() |
class | string_length=4 , ascii_start=32 , ascii_end=126 , stream_length=... |
a stream of random strings of given length |
built-ins | repr() , len() , next() |
||
HashFunction(hash_base, salt) |
class | hash_base , salt |
wrapper around salted hashing function |
built-ins | repr() , obj() |
||
HashFactory(hash_family, hash_count) |
class | hash_family , hash_count |
Produces a list of salted hash functions. hash_family is a base hash function from hashlib. hash_count is number of hash functions to create |
obj.save2file() |
method | filename=hashdata.bin |
save HashFactory object to file. |
HashFactory.load_from_file() |
static method | filename=hashdata.bin |
load HashFactory object from file |
built-ins | len() , repr() , next() , obj[index] |
name | type | arguments | description |
---|---|---|---|
SBFException() |
Exception class | Top-level module excpetion | |
HashesUnavailableError(message) |
Exception class | message | Exception raised when there is error related to hashing functions |
Tool for presenting how shifting bloom filter works. Can be used as debugger if needed.
name | type | arguments | description |
---|---|---|---|
Main() |
class | title=ShiftingBloomFilter Visualiser , length=25 , hash_count=None , hash_source=None , bloom=None , deepcopy=True , mode=MULTIPLE , no_sets=1 |
Main window of Visualiser. |
title | title for the visualiser window | ||
length | length of the filter | ||
hash_count | number of hashing functions to use | ||
hash_source | source of hash functions | ||
bloom | use existing bloom filter instead of creating new one (useful for debugging) | ||
deepcopy | work on copy of a given bloom filter | ||
Main.run() |
static method | title=ShiftingBloomFilter Visualiser , length=25 , mode=MULTIPLE |
Starts the visualiser |