Releases: apache/datasketches-cpp
datasketches-cpp-5.1.0
- implemented tdigest
- added get_serialized_size_bytes() and get_max_serialized_size_bytes() to compact Theta sketch
- fixed compressed Theta sketch stream serialization
- added Tuple sketch filter() method
5.0.2
This is a patch update. The original 5.0.0 release notes are presented next, followed by a cumulative list of patch update changes.
This is a major release because the Python part of the library was moved into its own repository, datasketches-python, which may be API-breaking for some users. We also took this opportunity to make some other potentially API-breaking cleanups.
- moved all Python-related code to new datasketches-python repository
- finished moving public constants to separate namespaces
- removed deprecated methods (such as get_quantiles())
- generalized array_of_doubles sketch as array_tuple_sketch
- implemented new EB-PPS sketch (exact PPS sampling with bounded sample size)
- fixed slowness in Theta intersection
- fixed incompatibility of serialized empty frequent items sketches with Java
The patch release fixes:
- a bug in KLL that could cause a self-move (undefined behavior) (5.0.1)
- a bug in EBPPS Sampling's to_string() method that could cause compilation failure for non-string types (5.0.1)
- use of a method in density sketch that was removed in C++17, breaking forward compatibility (5.0.2)
datasketches-cpp-5.0.1
This is a major release because the Python part of the library was moved into its own repository, datasketches-python, which may be API-breaking for some users. We also took this opportunity to make some other potentially API-breaking cleanups.
- moved all Python-related code to new datasketches-python repository
- finished moving public constants to separate namespaces
- removed deprecated methods (such as get_quantiles())
- generalized array_of_doubles sketch as array_tuple_sketch
- implemented new EB-PPS sketch (exact PPS sampling with bounded sample size)
- fixed slowness in Theta intersection
- fixed incompatibility of serialized empty frequent items sketches with Java
The patch release fixes:
- a bug in KLL that could cause a self-move (undefined behavior)
- a bug in EBPPS Sampling's to_string() method that could cause compilation failure for non-string types
datasketches-cpp-5.0.0
This is a major release because the Python part of the library was moved into its own repository, datasketches-python, which may be API-breaking for some users. We also took this opportunity to make some other potentially API-breaking cleanups.
- moved all Python-related code to new datasketches-python repository
- finished moving public constants to separate namespaces
- removed deprecated methods (such as get_quantiles())
- generalized array_of_doubles sketch as array_tuple_sketch
- implemented new EB-PPS sketch (exact PPS sampling with bounded sample size)
- fixed slowness in Theta intersection
- fixed incompatibility of serialized empty frequent items sketches with Java
datasketches-cpp-4.1.0
- HLL union speed improvement
- Fixed a bug in theta and tuple union base
- new density sketch
- new count min sketch
- thread local random generator
- generic quantile sketches in Python (KLL, REQ, classic quantiles)
- generic frequent items sketch in Python
- generic tuple sketch in Python
- added optional compression of serialized theta sketch
- iterators use new style (no inheritance from std::iterator)
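The last item refers to the fact that std::iterator was deprecated in C++17, so iterators are expected to declare their member types directly. A minimal sketch of that style follows; int_buffer and sum_items are hypothetical names used only for illustration and are not part of the library.

```cpp
#include <cstddef>
#include <iterator>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical container for illustration only; not a datasketches class.
class int_buffer {
public:
  explicit int_buffer(std::vector<int> items) : items_(std::move(items)) {}

  class const_iterator {
  public:
    // The five member types that inheriting from std::iterator used to supply.
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = const int&;

    const_iterator() : ptr_(nullptr) {}
    explicit const_iterator(const int* ptr) : ptr_(ptr) {}

    reference operator*() const { return *ptr_; }
    pointer operator->() const { return ptr_; }
    const_iterator& operator++() { ++ptr_; return *this; }
    const_iterator operator++(int) { const_iterator tmp(*this); ++ptr_; return tmp; }
    bool operator==(const const_iterator& other) const { return ptr_ == other.ptr_; }
    bool operator!=(const const_iterator& other) const { return ptr_ != other.ptr_; }

  private:
    const int* ptr_;
  };

  const_iterator begin() const { return const_iterator(items_.data()); }
  const_iterator end() const { return const_iterator(items_.data() + items_.size()); }

private:
  std::vector<int> items_;
};

// Sums the buffer through the custom iterator using a standard algorithm.
inline int sum_items(const int_buffer& buf) {
  return std::accumulate(buf.begin(), buf.end(), 0);
}
```

Because the member types are declared explicitly, std::iterator_traits and standard algorithms work with the iterator without any deprecated base class.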
datasketches-cpp-4.0.1
This is a patch release with only very minor code changes to address several small compiler warnings.
The main difference is that the associated Python wheels distributed as convenience binaries (and not included in git) are now produced for ARM64 architectures, which should provide increased compatibility with several major cloud computing providers.
datasketches-cpp-4.0.0
This is a major release with some API-breaking changes:
- Common sorted view used by all quantiles sketches with simultaneous support for both inclusive and exclusive modes
- The default mode for all methods for querying quantiles sketches was changed from exclusive to inclusive
- The mode is now a method parameter, not a template parameter
- Queries of empty quantiles sketches such as get_rank() and get_quantile() now throw an exception (previously they returned NaN for floating-point types)
- SerDe was removed from class templates and added to the relevant method templates (such as serialize and deserialize)
- Support for comparator instances in quantiles sketches
- Support for equality operator instance in frequent items sketch
- Added operator-> to iterators over quantiles sketches
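To make the inclusive and exclusive modes concrete, here is a minimal sketch that computes exact ranks over a sorted vector. It assumes that inclusive rank counts items less than or equal to the query and exclusive rank counts items strictly less than it. The class exact_sorted_view is a hypothetical stand-in, not the library's sorted view, and the exception type is also an assumption.

```cpp
#include <algorithm>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical exact stand-in for a sorted view; not the library's class.
class exact_sorted_view {
public:
  explicit exact_sorted_view(std::vector<double> items) : items_(std::move(items)) {
    std::sort(items_.begin(), items_.end());
  }

  // The mode is a function parameter rather than a template parameter.
  // inclusive = true:  fraction of items <= value
  // inclusive = false: fraction of items <  value
  double get_rank(double value, bool inclusive = true) const {
    if (items_.empty()) {
      // Mirrors the 4.0.0 behavior of throwing instead of returning NaN.
      throw std::runtime_error("operation is undefined for an empty sketch");
    }
    const auto it = inclusive
        ? std::upper_bound(items_.begin(), items_.end(), value)
        : std::lower_bound(items_.begin(), items_.end(), value);
    return static_cast<double>(it - items_.begin()) / static_cast<double>(items_.size());
  }

private:
  std::vector<double> items_;
};
```

For the items 1, 2, 2, 3, the inclusive rank of 2 is 0.75 and the exclusive rank is 0.25, so code that relied on the old exclusive default can see different results unless it passes the mode explicitly.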
v3.5.1
Patch release, no new features:
- Fix Python wheel build script to produce valid wheels for Apple Silicon Macs
- Fix a serialization bug for theta and tuple sketches when sketch had no entries but was not empty (e.g. the result of an intersection between disjoint sets)
datasketches-cpp-3.5.0
- Type converting constructors for KLL and REQ sketches
- Fixed KLL copy constructor (affects non-arithmetic types)
- Added internal check in CPC sketch compression to avoid problems with static analysis
v3.4.0
This release includes the following changes:
- addition of Quantiles sketch: the algorithm is largely superseded by KLL, but this provides compatibility with existing sketches
- support for serde instances in all relevant sketches; class-level templates are now marked deprecated
- greater API consistency across quantiles, KLL, and REQ
  - all three support a new public get_sorted_view() interface
  - all three support rank and quantile queries with an optional inclusive mode
- cmake minimum version bump to 3.16
- Kolmogorov-Smirnov test for KLL and classic Quantiles, also available in Python
- code cleanup and bugfixes
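For readers unfamiliar with the Kolmogorov-Smirnov test mentioned above, the two-sample statistic is the largest absolute difference between two empirical CDFs. The sketch below computes it exactly on raw samples. This is an assumption-laden illustration: the library version works on sketches rather than raw data, and ks_statistic is a hypothetical name, not a library function.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Exact two-sample Kolmogorov-Smirnov statistic on raw samples.
// Hypothetical helper for illustration; not part of datasketches.
inline double ks_statistic(std::vector<double> a, std::vector<double> b) {
  if (a.empty() || b.empty()) {
    throw std::invalid_argument("both samples must be non-empty");
  }
  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());

  std::size_t i = 0;
  std::size_t j = 0;
  double max_diff = 0.0;
  // Walk both sorted samples, evaluating the two empirical CDFs at each distinct value.
  while (i < a.size() && j < b.size()) {
    const double x = std::min(a[i], b[j]);
    while (i < a.size() && a[i] <= x) ++i;
    while (j < b.size() && b[j] <= x) ++j;
    const double cdf_a = static_cast<double>(i) / static_cast<double>(a.size());
    const double cdf_b = static_cast<double>(j) / static_cast<double>(b.size());
    max_diff = std::max(max_diff, std::fabs(cdf_a - cdf_b));
  }
  // Once one sample is exhausted its CDF is 1, so the gap can only shrink from here.
  return max_diff;
}
```

A statistic near 0 suggests the two samples come from similar distributions, and a value of 1 means the samples do not overlap at all. Turning the statistic into an accept or reject decision requires a threshold that depends on the sample sizes and the chosen significance level, which this sketch does not cover.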