Class ArrayCountingBloomFilter
- All Implemented Interfaces:
BitMapExtractor,BloomFilter<CountingBloomFilter>,CellExtractor,CountingBloomFilter,IndexExtractor
Any operation that results in negative counts or integer overflow of counts will mark this filter as invalid. This transition is not reversible. The
operation is completed in full, no exception is raised and the state is set to invalid. This allows the cells for the filter immediately prior to the
operation that created the invalid state to be recovered. See the documentation in isValid() for details.
All the operations in the filter assume the cells are currently valid, for example cardinality or contains operations. Behavior of an invalid
filter is undefined. It will no longer function identically to a standard Bloom filter that is the merge of all the Bloom filters that have been added to and
not later subtracted from the counting Bloom filter.
The maximum supported number of items that can be stored in the filter is limited by the maximum array size combined with the Shape. For example an
implementation using a Shape with a false-positive probability of 1e-6 and Integer.MAX_VALUE bits can reversibly store approximately 75
million items using 20 hash functions per item with a memory consumption of approximately 8 GB.
- Since:
- 4.5.0-M1
- See Also:
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.commons.collections4.bloomfilter.CellExtractor
CellExtractor.CellPredicate -
Field Summary
Fields inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
SPARSE -
Constructor Summary
ConstructorsConstructorDescriptionArrayCountingBloomFilter(Shape shape) Constructs an empty counting Bloom filter with the specified shape. -
Method Summary
Modifier and TypeMethodDescriptionbooleanadd(CellExtractor other) Adds the specified CellExtractor to this Bloom filter.int[]Return a copy of the IndexExtractor data as an int array.intGets the cardinality (number of enabled bits) of this Bloom filter.intGets the characteristics of the filter.voidclear()Clears the filter to by resetting it to its initial, unpopulated state.booleancontains(BitMapExtractor bitMapExtractor) Returnstrueif this filter contains the bits specified in the bit maps produced by the bitMapExtractor.booleancontains(IndexExtractor indexExtractor) Returnstrueif this filter contains the indices specified IndexExtractor.copy()Creates a new instance of thisArrayCountingBloomFilterwith the same properties as the current one.intGets the maximum allowable value for a cell count in this Counting filter.intgetMaxInsert(CellExtractor cellExtractor) Determines the maximum number of times the Cell Extractor could have been added.getShape()Gets the shape that was used when the filter was built.booleanisValid()Returnstrueif the internal state is valid.booleanprocessBitMaps(LongPredicate consumer) Each bit map is passed to the predicate in order.booleanprocessCells(CellExtractor.CellPredicate consumer) Performs the given action for eachcellwhere the cell count is non-zero.booleanprocessIndices(IntPredicate consumer) Each index is passed to the predicate.booleansubtract(CellExtractor other) Adds the specified CellExtractor to this Bloom filter.Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitMethods inherited from interface org.apache.commons.collections4.bloomfilter.BitMapExtractor
asBitMapArray, processBitMapPairsMethods inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
contains, contains, estimateIntersection, estimateN, estimateUnion, isEmpty, isFullMethods inherited from interface org.apache.commons.collections4.bloomfilter.CountingBloomFilter
getMaxInsert, getMaxInsert, getMaxInsert, getMaxInsert, merge, merge, merge, merge, remove, remove, remove, remove, uniqueIndices
-
Constructor Details
-
ArrayCountingBloomFilter
Constructs an empty counting Bloom filter with the specified shape.- Parameters:
shape- the shape of the filter
-
-
Method Details
-
add
Description copied from interface:CountingBloomFilterAdds the specified CellExtractor to this Bloom filter.Specifically all cells for the indexes identified by the
otherwill be incremented by their corresponding values in theother.This method will return
trueif the filter is valid after the operation.- Specified by:
addin interfaceCountingBloomFilter- Parameters:
other- the CellExtractor to add.- Returns:
trueif the addition was successful and the state is valid- See Also:
-
asIndexArray
Description copied from interface:IndexExtractorReturn a copy of the IndexExtractor data as an int array.Indices ordering and uniqueness is not guaranteed.
The default implementation of this method creates an array and populates it. Implementations that have access to an index array should consider returning a copy of that array if possible.
- Specified by:
asIndexArrayin interfaceIndexExtractor- Returns:
- An int array of the data.
-
cardinality
Description copied from interface:BloomFilterGets the cardinality (number of enabled bits) of this Bloom filter.This is also known as the Hamming value or Hamming number.
- Specified by:
cardinalityin interfaceBloomFilter<CountingBloomFilter>- Returns:
- the cardinality of this filter
-
characteristics
Description copied from interface:BloomFilterGets the characteristics of the filter.Characteristics are defined as bits within the characteristics integer.
- Specified by:
characteristicsin interfaceBloomFilter<CountingBloomFilter>- Returns:
- the characteristics for this bloom filter.
-
clear
Description copied from interface:BloomFilterClears the filter to by resetting it to its initial, unpopulated state.- Specified by:
clearin interfaceBloomFilter<CountingBloomFilter>
-
contains
Description copied from interface:BloomFilterReturnstrueif this filter contains the bits specified in the bit maps produced by the bitMapExtractor.- Specified by:
containsin interfaceBloomFilter<CountingBloomFilter>- Parameters:
bitMapExtractor- theBitMapExtractorto provide the bit maps.- Returns:
trueif this filter is enabled for all bits specified by the bit maps
-
contains
Description copied from interface:BloomFilterReturnstrueif this filter contains the indices specified IndexExtractor.Specifically this returns
trueif this filter is enabled for all bit indexes identified by theIndexExtractor.- Specified by:
containsin interfaceBloomFilter<CountingBloomFilter>- Parameters:
indexExtractor- the IndexExtractor to provide the indexes- Returns:
trueif this filter is enabled for all bits specified by the IndexExtractor
-
copy
Creates a new instance of thisArrayCountingBloomFilterwith the same properties as the current one.- Specified by:
copyin interfaceBloomFilter<CountingBloomFilter>- Returns:
- a copy of this BloomFilter.
-
getMaxCell
Description copied from interface:CountingBloomFilterGets the maximum allowable value for a cell count in this Counting filter.- Specified by:
getMaxCellin interfaceCountingBloomFilter- Returns:
- the maximum allowable value for a cell count in this Counting filter.
-
getMaxInsert
Description copied from interface:CountingBloomFilterDetermines the maximum number of times the Cell Extractor could have been added.- Specified by:
getMaxInsertin interfaceCountingBloomFilter- Parameters:
cellExtractor- the extractor of cells.- Returns:
- the maximum number of times the CellExtractor could have been inserted.
-
getShape
Description copied from interface:BloomFilterGets the shape that was used when the filter was built.- Specified by:
getShapein interfaceBloomFilter<CountingBloomFilter>- Returns:
- The shape the filter was built with.
-
isValid
Returnstrueif the internal state is valid.This flag is a warning that an addition or subtraction of cells from this filter resulted in an invalid cell for one or more indexes. For example this may occur if a cell for an index was set to negative following a subtraction operation, or overflows the value specified by
getMaxCell()following an addition operation.A counting Bloom filter that has an invalid state is no longer ensured to function identically to a standard Bloom filter instance that is the merge of all the Bloom filters that have been added to and not later subtracted from this counting Bloom filter.
Note: The change to an invalid state may or may not be reversible. Implementations are expected to document their policy on recovery from an addition or removal operation that generated an invalid state.
Implementation note
The state transition to invalid is permanent.
This implementation does not correct negative cells to zero or integer overflow cells to
Integer.MAX_VALUE. Thus the operation that generated invalid cells can be reversed by using the complement of the original operation with the same Bloom filter. This will restore the cells to the state prior to the invalid operation. Cells can then be extracted usingprocessCells(CellPredicate).- Specified by:
isValidin interfaceCountingBloomFilter- Returns:
trueif the state is valid
-
processBitMaps
Description copied from interface:BitMapExtractorEach bit map is passed to the predicate in order. The predicate is applied to each bit map value, if the predicate returnsfalsethe execution is stopped,falseis returned, and no further bit maps are processed.If the extractor is empty this method will return true.
Any exceptions thrown by the action are relayed to the caller.
- Specified by:
processBitMapsin interfaceBitMapExtractor- Parameters:
consumer- the function to execute- Returns:
trueif all bit maps returnedtrue,falseotherwise.
-
processCells
Description copied from interface:CellExtractorPerforms the given action for eachcellwhere the cell count is non-zero.Some Bloom filter implementations use a count rather than a bit flag. The term
Cellis used to refer to these counts.Any exceptions thrown by the action are relayed to the caller. The consumer is applied to each cell. If the consumer returns
falsethe execution is stopped,falseis returned, and no further pairs are processed.- Specified by:
processCellsin interfaceCellExtractor- Parameters:
consumer- the action to be performed for each non-zero cell.- Returns:
trueif all cells return true from consumer,falseotherwise.
-
processIndices
Description copied from interface:IndexExtractorEach index is passed to the predicate. The predicate is applied to each index value, if the predicate returnsfalsethe execution is stopped,falseis returned, and no further indices are processed.Any exceptions thrown by the action are relayed to the caller.
Indices ordering and uniqueness is not guaranteed.
- Specified by:
processIndicesin interfaceCellExtractor- Specified by:
processIndicesin interfaceIndexExtractor- Parameters:
consumer- the action to be performed for each non-zero bit index.- Returns:
trueif all indexes return true from consumer,falseotherwise.
-
subtract
Description copied from interface:CountingBloomFilterAdds the specified CellExtractor to this Bloom filter.Specifically all cells for the indexes identified by the
otherwill be decremented by their corresponding values in theother.This method will return true if the filter is valid after the operation.
- Specified by:
subtractin interfaceCountingBloomFilter- Parameters:
other- the CellExtractor to subtract.- Returns:
trueif the subtraction was successful and the state is valid- See Also:
-