Binary fuse filters & xor filters are probabilistic data structures which allow for quickly checking whether an element is part of a set.
Both are faster and more concise than Bloom filters, and smaller than Cuckoo filters. Binary fuse filters are a bleeding-edge development and are competitive with Facebook’s ribbon filters:
This is a Zig implementation, which provides many practical benefits:
std.mem.Allocator
implementations for the filter itself and population, enabling interesting opportunities like mmap-backed population of filters with low physical memory usage.Xor(u8)
, Xor(u16)
, BinaryFuse(u8)
, BinaryFuse(u16)
, or experiment with more exotic variants like Xor(u4)
thanks to Zig’s bit-width integers and generic type system.Zig’s safety-checking and checked overflows has also enabled us to improve the upstream C/Go implementations where overflow and undefined behavior went unnoticed.[1]
Decide if xor or binary fuse filters fit your use case better: should I use binary fuse filters or xor filters?
Get your keys into u64
values. If you have strings, structs, etc. then use something like Zig’s std.hash_map.getAutoHashFn
to convert your keys to u64
first. (“It is not important to have a good hash function, but collisions should be unlikely (~1/2^64).”)
Create a build.zig.zon
file in your project (replace $LATEST_COMMIT
with the latest commit hash):
.{
.name = "mypkg",
.version = "0.1.0",
.dependencies = .{
.fastfilter = .{
.url = "https://github.com/hexops/fastfilter/archive/$LATEST_COMMIT.tar.gz",
},
},
}
Run zig build
in your project, and the compiler instruct you to add a .hash = "..."
field next to .url
.
Then use the dependency in your build.zig
:
pub fn build(b: *std.Build) void {
...
exe.addModule("fastfilter", b.dependency("fastfilter", .{
.target = target,
.optimize = optimize,
}).module("fastfilter"));
}
In your main.zig
, make use of the library:
const std = @import("std");
const testing = std.testing;
const fastfilter = @import("fastfilter");
test "mytest" {
const allocator = std.heap.page_allocator;
// Initialize the binary fuse filter with room for 1 million keys.
const size = 1_000_000;
var filter = try fastfilter.BinaryFuse8.init(allocator, size);
defer filter.deinit(allocator);
// Generate some consecutive keys.
var keys = try allocator.alloc(u64, size);
defer allocator.free(keys);
for (keys, 0..) |key, i| {
_ = key;
keys[i] = i;
}
// Populate the filter with our keys. You can't update a xor / binary fuse filter after the
// fact, instead you should build a new one.
try filter.populate(allocator, keys[0..]);
// Now we can quickly test for containment. So fast!
try testing.expect(filter.contain(1) == true);
}
(you can just add this project as a Git submodule in yours for now, as Zig’s official package manager is still under way.)
Binary fuse filters automatically deduplicate any keys during population. If you are using a different filter type (you probably shouldn’t be!) then keys must be unique or else filter population will fail. You can use the fastfilter.AutoUnique(u64)(keys)
helper to deduplicate (in typically O(N) time complexity), see the tests in src/unique.zig
for usage examples.
To serialize the filters, you only need to encode these struct fields:
pub fn BinaryFuse(comptime T: type) type {
return struct {
...
seed: u64,
segment_length: u32,
segment_length_mask: u32,
segment_count: u32,
segment_count_length: u32,
fingerprints: []T,
...
T
will be the chosen fingerprint size, e.g. u8
for BinaryFuse8
or Xor8
.
Look at std.io.Writer
and std.io.BitWriter
for ideas on actual serialization.
Similarly, for xor filters you only need these struct fields:
pub fn Xor(comptime T: type) type {
return struct {
seed: u64,
blockLength: u64,
fingerprints: []T,
...
If you’re not sure, start with BinaryFuse8
filters. They’re fast, and have a false-positive probability rate of 1/256 (or 0.4%).
There are many tradeoffs, primarily between:
See the benchmarks section for a comparison of the tradeoffs between binary fuse filters and xor filters, as well as how larger bit sizes (e.g. BinaryFuse(u16)
) consume more memory in exchange for a lower false-positive probability rate.
Note that fuse filters are not to be confused with binary fuse filters, the former have issues with construction, often failing unless you have a large number of unique keys. Binary fuse filters do not suffer from this and are generally better than traditional ones in several ways. For this reason, we consider traditional fuse filters deprecated.
This implementation supports key iterators, so you do not need to have all of your keys in-memory, see BinaryFuse8.populateIter
and Xor8.populateIter
.
If you intend to use a xor filter with datasets of 100m+ keys, there is a possible faster implementation for construction found in the C implementation xor8_buffered_populate
which is not implemented here.
The API is generally finalized, but we may make some adjustments as Zig changes or we learn of more idiomatic ways to express things. We will release v1.0 once Zig v1.0 is released.
0.12.0-dev.706+62a0fbdae
0.12.0-dev.706+62a0fbdae
(build.zig
.path
-> .source
change.)0.12.0-dev.706+62a0fbdae
util.sliceIterator
to fastfilter.SliceIterator
SliceIterator
is now unmanaged / does not store an allocator.SliceIterator
now stores []const T
instead of []T
internally.BinaryFuseFilter.max_iterations
is now a constant.fastfilter.MeasuredAllocator
for measuring allocations.std.build.Pkg
definitionBinaryFuse
filters are now recommended by default, are generally better than Xor and Fuse filters.Fuse
filters (BinaryFuse
are much better.)initial release with support for Xor and traditional Fuse filters of varying bit sizes, key iterators, serialization, and a slice de-duplication helper.
Benchmarks were ran on both a 2019 Macbook Pro and Windows 10 desktop machine using e.g.:
zig run -O ReleaseFast src/benchmark.zig -- --xor 8 --num-keys 1000000
0.12.0-dev.706+62a0fbdae
Algorithm | # of keys | populate | contains(k) | false+ prob. | bits per entry | peak populate | filter total |
---|---|---|---|---|---|---|---|
binaryfuse8 | 1000000 | 37.5ms | 24.0ns | 0.00391115 | 9.04 | 22 MiB | 1 MiB |
binaryfuse16 | 1000000 | 45.5ms | 24.0ns | 0.00001524 | 18.09 | 24 MiB | 2 MiB |
binaryfuse32 | 1000000 | 56.0ms | 24.0ns | 0 | 36.18 | 28 MiB | 4 MiB |
xor2 | 1000000 | 108.0ms | 25.0ns | 0.2500479 | 9.84 | 52 MiB | 1 MiB |
xor4 | 1000000 | 99.0ms | 25.0ns | 0.06253865 | 9.84 | 52 MiB | 1 MiB |
xor8 | 1000000 | 103.4ms | 25.0ns | 0.0039055 | 9.84 | 52 MiB | 1 MiB |
xor16 | 1000000 | 104.7ms | 26.0ns | 0.00001509 | 19.68 | 52 MiB | 2 MiB |
xor32 | 1000000 | 102.2ms | 25.0ns | 0 | 39.36 | 52 MiB | 4 MiB |
binaryfuse8 | 10000000 | 621.2ms | 36.0ns | 0.0039169 | 9.02 | 225 MiB | 10 MiB |
binaryfuse16 | 10000000 | 666.6ms | 102.0ns | 0.0000147 | 18.04 | 245 MiB | 21 MiB |
binaryfuse32 | 10000000 | 769.0ms | 135.0ns | 0 | 36.07 | 286 MiB | 43 MiB |
xor2 | 10000000 | 1.9s | 43.0ns | 0.2500703 | 9.84 | 527 MiB | 11 MiB |
xor4 | 10000000 | 2.0s | 41.0ns | 0.0626137 | 9.84 | 527 MiB | 11 MiB |
xor8 | 10000000 | 1.9s | 42.0ns | 0.0039369 | 9.84 | 527 MiB | 11 MiB |
xor16 | 10000000 | 2.2s | 106.0ns | 0.0000173 | 19.68 | 527 MiB | 23 MiB |
xor32 | 10000000 | 2.2s | 140.0ns | 0 | 39.36 | 527 MiB | 46 MiB |
binaryfuse8 | 100000000 | 7.4s | 145.0ns | 0.003989 | 9.01 | 2 GiB | 107 MiB |
binaryfuse16 | 100000000 | 8.4s | 169.0ns | 0.000016 | 18.01 | 2 GiB | 214 MiB |
binaryfuse32 | 100000000 | 10.2s | 173.0ns | 0 | 36.03 | 2 GiB | 429 MiB |
xor2 | 100000000 | 28.5s | 144.0ns | 0.249843 | 9.84 | 5 GiB | 117 MiB |
xor4 | 100000000 | 27.4s | 154.0ns | 0.062338 | 9.84 | 5 GiB | 117 MiB |
xor8 | 100000000 | 28.0s | 153.0ns | 0.004016 | 9.84 | 5 GiB | 117 MiB |
xor16 | 100000000 | 29.5s | 161.0ns | 0.000012 | 19.68 | 5 GiB | 234 MiB |
xor32 | 100000000 | 29.4s | 157.0ns | 0 | 39.36 | 5 GiB | 469 MiB |
Legend:
0.12.0-dev.706+62a0fbdae
Algorithm | # of keys | populate | contains(k) | false+ prob. | bits per entry | peak populate | filter total |
---|---|---|---|---|---|---|---|
binaryfuse8 | 1000000 | 44.6ms | 24.0ns | 0.00390796 | 9.04 | 22 MiB | 1 MiB |
binaryfuse16 | 1000000 | 48.9ms | 25.0ns | 0.00001553 | 18.09 | 24 MiB | 2 MiB |
binaryfuse32 | 1000000 | 49.9ms | 25.0ns | 0.00000001 | 36.18 | 28 MiB | 4 MiB |
xor2 | 1000000 | 77.3ms | 25.0ns | 0.25000163 | 9.84 | 52 MiB | 1 MiB |
xor4 | 1000000 | 80.0ms | 25.0ns | 0.06250427 | 9.84 | 52 MiB | 1 MiB |
xor8 | 1000000 | 76.0ms | 25.0ns | 0.00391662 | 9.84 | 52 MiB | 1 MiB |
xor16 | 1000000 | 83.7ms | 26.0ns | 0.00001536 | 19.68 | 52 MiB | 2 MiB |
xor32 | 1000000 | 79.1ms | 27.0ns | 0 | 39.36 | 52 MiB | 4 MiB |
fuse8 | 1000000 | 69.4ms | 25.0ns | 0.00390663 | 9.10 | 49 MiB | 1 MiB |
fuse16 | 1000000 | 71.5ms | 27.0ns | 0.00001516 | 18.20 | 49 MiB | 2 MiB |
fuse32 | 1000000 | 71.1ms | 27.0ns | 0 | 36.40 | 49 MiB | 4 MiB |
binaryfuse8 | 10000000 | 572.3ms | 33.0ns | 0.0038867 | 9.02 | 225 MiB | 10 MiB |
binaryfuse16 | 10000000 | 610.6ms | 108.0ns | 0.0000127 | 18.04 | 245 MiB | 21 MiB |
binaryfuse32 | 10000000 | 658.2ms | 144.0ns | 0 | 36.07 | 286 MiB | 43 MiB |
xor2 | 10000000 | 1.2s | 39.0ns | 0.249876 | 9.84 | 527 MiB | 11 MiB |
xor4 | 10000000 | 1.2s | 39.0ns | 0.0625026 | 9.84 | 527 MiB | 11 MiB |
xor8 | 10000000 | 1.2s | 41.0ns | 0.0038881 | 9.84 | 527 MiB | 11 MiB |
xor16 | 10000000 | 1.3s | 117.0ns | 0.0000134 | 19.68 | 527 MiB | 23 MiB |
xor32 | 10000000 | 1.3s | 147.0ns | 0 | 39.36 | 527 MiB | 46 MiB |
fuse8 | 10000000 | 1.1s | 36.0ns | 0.0039089 | 9.10 | 499 MiB | 10 MiB |
fuse16 | 10000000 | 1.1s | 112.0ns | 0.0000172 | 18.20 | 499 MiB | 21 MiB |
fuse32 | 10000000 | 1.1s | 145.0ns | 0 | 36.40 | 499 MiB | 43 MiB |
binaryfuse8 | 100000000 | 6.9s | 167.0ns | 0.00381 | 9.01 | 2 GiB | 107 MiB |
binaryfuse16 | 100000000 | 7.2s | 171.0ns | 0.000009 | 18.01 | 2 GiB | 214 MiB |
binaryfuse32 | 100000000 | 8.5s | 174.0ns | 0 | 36.03 | 2 GiB | 429 MiB |
xor2 | 100000000 | 16.8s | 166.0ns | 0.249868 | 9.84 | 5 GiB | 117 MiB |
xor4 | 100000000 | 18.9s | 183.0ns | 0.062417 | 9.84 | 5 GiB | 117 MiB |
xor8 | 100000000 | 19.1s | 168.0ns | 0.003873 | 9.84 | 5 GiB | 117 MiB |
xor16 | 100000000 | 16.9s | 171.0ns | 0.000021 | 19.68 | 5 GiB | 234 MiB |
xor32 | 100000000 | 19.4s | 189.0ns | 0 | 39.36 | 5 GiB | 469 MiB |
fuse8 | 100000000 | 19.6s | 167.0ns | 0.003797 | 9.10 | 4 GiB | 108 MiB |
fuse16 | 100000000 | 20.8s | 171.0ns | 0.000015 | 18.20 | 4 GiB | 216 MiB |
fuse32 | 100000000 | 21.5s | 176.0ns | 0 | 36.40 | 4 GiB | 433 MiB |
Legend:
If it was not for the above people, I (@emidoots) would not have been able to write this implementation and learn from the excellent C implementation. Please credit the above people if you use this library.