Skip to content

cinnom/nano-cuckoo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

59 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

License Maven Central Javadocs Build Status Coverage Status

nano-cuckoo

Fast Java cuckoo filter implementation that utilizes sun.misc.Unsafe for native memory access.

What is a Cuckoo filter?

"Cuckoo Filter: Better Than Bloom" by Bin Fan, David G. Andersen, Michael Kaminsky, and Michael D. Mitzenmacher

Probabilistic Filters By Example

Maven

<dependency>
    <groupId>net.cinnom</groupId>
    <artifactId>nano-cuckoo</artifactId>
    <version>3.0.2</version>
</dependency>

General Usage

import net.cinnom.nanocuckoo.NanoCuckooFilter;

public class CuckooTest {
	
    public void generalUsageTest() {
    
        long capacity = 32;
        
        // Use Builder to create a NanoCuckooFilter. Only required parameter is capacity.
        final NanoCuckooFilter cuckooFilter = new NanoCuckooFilter.Builder( capacity )
                .withCountingEnabled( true ) // Enable counting
                .build();
        
        String testValue = "test value";
        
        // Insert a value into the filter
        cuckooFilter.insert( testValue );
        
        // Check that the value is in the filter
        boolean isValueInFilter = cuckooFilter.contains( testValue ); // Returns true
        
        // Check that some other value is in the filter
        isValueInFilter = cuckooFilter.contains( "some other value" ); // Should return false, probably
        
        // Insert the same value a couple more times
        cuckooFilter.insert( testValue );
        cuckooFilter.insert( testValue );
        
        // Get a count of how many times the value is in the filter
        int insertedCount = cuckooFilter.count( testValue ); // Returns 3 since we inserted three times with counting enabled
        
        // Delete value from the filter once
        boolean wasDeleted = cuckooFilter.delete( testValue ); // Returns true since a value was deleted
        
        // Try to delete the value up to six more times
        int deletedCount = cuckooFilter.delete( testValue, 6 ); // Returns 2 since only two copies of the value were left
        
        isValueInFilter = cuckooFilter.contains( testValue ); // Returns false since all copies of the value were deleted
        
        // Double filter capacity by doubling entries per bucket. However, this also roughly doubles max FPP.
        cuckooFilter.expand();
        
        // Close the filter when finished with it. This is optional to immediately free memory used by the filter.
        // Otherwise, the memory will be freed when the GC gets around to it.
        cuckooFilter.close();
    }
}

Serialization

Standard Java serialization is supported. Any custom implementations for StringEncoder, BucketHasher, FingerprintHasher, or RandomInt should be serializable, or filter serialization will fail.

In addition, readMemory and writeMemory can be used to directly dump and replace filter memory. The source and destination filters for readMemory and writeMemory should be built with the same parameters.

Configuration

See NanoCuckooFilter.Builder documentation for various settings.

Type Support

Currently, only String, byte[], and long (pre-hashed value) types can be inserted into the filter. Generic type support may be added later, but I generally recommend serializing data yourself for optimal performance.

Multithreading

Multithreaded insert/delete/contains/count are supported. Bucket expansion and serialization (including readMemory and writeMemory) will lock all buckets until they are finished.

About

Fast cuckoo filter implementation for Java

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages