Implementation of Prefix-Filter, which is an incremental filter (approximate set-membership queries). If you plan on using the Prefix-Filter, please cite our paper: Prefix Filter: Practically and Theoretically Better Than Bloom. Tomer Even, Guy Even, Adam Morrison. To appear in PVLDB, 15(7).
Short talk abouth the Prefix Filter: https://www.youtube.com/watch?v=KMVtvACSGo0
- Compiler: A C++17 compiler such as GNU G++ or LLVM Clang++.
- CMake (Version 3.10 or higher).
- System: Linux.
- Hardware: Intel. Support of AVX512 is a must.
- matplotlib
- brokenaxes (From here).
- pandas
There are three main targets for three different benchmarks.
measure_perf
for benchmarking performance of insertions and lookups under various loads.measure_build
for build-time.measure_fpp
for evaluating the false positive probability, and the "effective space consumption".
There is also an example of how to use the Prefix-Filter in the file example.cpp
:
#include "Tests/wrappers.hpp"
int main(){
using spare = TC_shortcut; // Any incremental filter can replace TC_shortcut.
using prefixFilter = Prefix_Filter<spare>;
size_t filter_max_capacity = 1'000'000; // Choose any size.
prefixFilter example_filter = FilterAPI<prefixFilter>::ConstructFromAddCount(filter_max_capacity);
uint64_t x1 = 0x0123'4567'89ab'cdef;
FilterAPI<prefixFilter>::Add(x1, &example_filter); // Insertions of an item x1. Insertion can be performed only one step at a time.
bool res = FilterAPI<prefixFilter>::Contain(x1, &example_filter); // Lookup of x1.
assert(res); //No false negative.
uint64_t y1 = ~0x0123'4567'89ab'cdef;
bool res2 = FilterAPI<prefixFilter>::Contain(y1, &example_filter); // Lookup of y1.
std::cout << res2 << std::endl; // Possible false positive. (Although with one item in the filter, this is highly unlikely.)
return 0;
}
git clone -b master https://github.com/TheHolyJoker/Prefix-Filter.git
cd Prefix-Filter
mkdir build
cd build
cmake .. -DCMAKE_C_COMPILER=gcc-10 -DCMAKE_CXX_COMPILER=g++-10
t="measure_perf measure_built measure_fpp"; time cmake --build ./ --target $t -j 20
On GCC release: Older releases like 9.3 will work. We recommend using newer releases, which seems to perform better.
To run all benchmarks (from Prefix-Filter/build
directory):
cp ../RunAll.sh RunAll.sh
./RunAll.sh
Any specific target (for example measure_perf
) can be built and executed with as follows:
t="measure_perf"; time cmake --build ./ --target $t -j 20 && time taskset -c 2 ./$t
- Xor Filter:
"Xor Filters: Faster and Smaller Than Bloom and Cuckoo
Filters."
Graf, Thomas Mueller, and Daniel Lemire.
Repository.
We build upon Xor filter's benchmarks. We also used Its BBF variant, its fast Bloom filter, and its fast modulo alternative.
- Cuckoo Filter
"Cuckoo Filter: Practically Better Than Bloom." Fan B, Andersen DG, Kaminsky M, Mitzenmacher MD. Repository - Blocked Bloom Filter
We used two variants taken from the Xor filter's repository.
- Vector Quotient Filter
"Vector Quotient Filters: Overcoming The Time/Space Trade-Off In Filter Design.". Pandey P, Conway A, Durie J, Bender MA, Farach-Colton M, Johnson R. Vector quotient filters: Overcoming the time/space trade-off in filter design.
Repository.
However, we used our own implementation, called twoChoicer (In fileTC-Shortcut/TC-shortcut.hpp
).