Releases: man-group/ArcticDB

v4.5.1

18 Oct 12:12

🐛 Fixes

Uncategorized

Full Changelog: v4.5.0...v4.5.1


The wheels are on PyPI. Below are for debugging:

v4.5.0

16 Aug 13:05

🚀 Resampling (#1495)

We are pleased to introduce a first version of resampling into ArcticDB.
Using the QueryBuilder, you can resample timeseries data as part of the read. For example, the following code resamples a symbol into hourly buckets. Please see the QueryBuilder docs for more info.

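import arcticdb as adb

q = adb.QueryBuilder()
q = q.resample("h").agg({"to_sum": "sum"})  # hourly buckets, summing the "to_sum" column
lib.read("symbol", query_builder=q).data  # lib is an existing ArcticDB library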
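
To make the bucketing concrete, here is a small sketch in plain pandas rather than ArcticDB. It assumes the resample follows the familiar pandas conventions (left-closed, left-labelled hourly buckets), which these notes do not state. The toy data is invented, and the bucket width is written as "60min" so the snippet also runs on older pandas versions.

```python
import pandas as pd

# Toy frame (invented data); the column name "to_sum" mirrors the example above.
index = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 00:30", "2024-01-01 01:15"]
)
df = pd.DataFrame({"to_sum": [1, 2, 4]}, index=index)

# One-hour buckets: the first two rows fall in the 00:00 bucket,
# the third row falls in the 01:00 bucket.
hourly = df.resample("60min").agg({"to_sum": "sum"})

print(hourly)  # two rows: 00:00 -> 3, 01:00 -> 4
```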

🚀 Features

Remove Composite from processing pipeline (#1722)
Refactor VersionMap to include all combinations of LoadTypes and whether to include_deleted (#1443)
Multiple aggregators per column are now possible in QueryBuilder (#1468)
Add s3 custom ca support (#1338)
Add version_chain ASV benchmarks (#1508)
Support mongodb+srv connection strings (#1513)
Enhancement/515/support metadata in finalize staged data (#1525)
list_snapshots: Add option to skip loading metadata (#1531)
LibraryTool add read_index (#1533)
Make service alive checking more universal (#1607)
Enhancement/1610/method to force compact symbol list (#1624)
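
These notes do not show the QueryBuilder syntax for multiple aggregators per column (#1468), so the sketch below is not ArcticDB code. It uses plain pandas named aggregation, with invented data and output names, only to illustrate the idea of applying two aggregators to the same column in one pass.

```python
import pandas as pd

# Invented toy data: one grouping column and one value column.
df = pd.DataFrame({"group": ["a", "a", "b"], "value": [1, 5, 2]})

# Two aggregators applied to the same input column ("value"),
# each written to its own named output column.
summary = df.groupby("group").agg(
    value_sum=("value", "sum"),
    value_max=("value", "max"),
)

print(summary)
```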

🐛 Fixes

Add maybe_unused to fix MSVC compilation error by @vasil-pashov in #1739
Properly hash descriptor in v1 (master) by @willdealtry in #1682
Bugfix 1692: fix batch read with row range no query builder (#1710)
Bugfix 1720: Speedup processing pipeline with lots of columns (#1723)
Fix use after move in op_log.cpp (#1732)
Implement support for very old style symbol list entries. (#1630)
Fix ArcticDB reading streaming data (#1647)
Add validate_index arg to staged data finalization in both V1 and V2 APIs (#1694)
Fix sort_and_finalize_staged_data (#1617)
Bugfix 1609: Allow parallel appends when incomplete index values match last existing index value in the symbol (#1619)
Fix SYMBOL_LIST Segment index out of bounds bug (#1473)
Fix overflow bug on append and update (#1585)
Bugfix/1250/compact incompletes rationalisation (#1467)
Guard against numpy v2 to prevent breakage when it gets released (#1465)
Consistently handle timezones in QueryBuilder filtering (#1483)
Align azure timeout with s3 (#1487)
bugfix/1247: Don't create an lmdb library if the name is invalid. (#1481)
Cleanup arctic client after usage in local_quary_builder in ASV (#1522)
Bugfix/1518/get description date range incorrect after delete data in range (#1523)
Bugfix 1507+1509: dynamic schema different index append fixes (#1529)
Only write ref key once when writing with prune_previous_versions (#1560)
Fix #1148: Show clearer error message for mismatched types in QueryBuilder (#1557)
Bugfix 1615: Fix segfault in benchmark_clause.cpp (#1616)
Bugfix 1612: Raise a meaningful exception type/message when an empty string is provided as a symbol/snapshot name (#1618)
Adapt Raise to use format string as message directly if no args are given (#1606)
Restore mistake in #1315 (#1482)
Update docstring: added section to clarify that an empty df does not delete (#1592)

New Contributors

Full Changelog: v4.4.3...v4.5.0

v4.4.3

26 Jun 15:38

🐛 Fixes

  • Release GIL on update and append (#1559)
  • Only write ref key once when writing with prune_previous_versions (#1571)
  • Fix overflow bug on append and update (#1585)
  • Require numpy < 2 (#1465)

Uncategorized

The wheels are on PyPI. Below are for debugging:

v4.4.2

08 May 08:57

🐛 Fixes

  • Fix segmentation fault related to Ray package (by reverting to C++ 17 build) #1542
  • Fix get_description date range after using delete_data_in_range #1526
  • Fix tail segfault with empty data #1543

Please see v4.4.1 release notes for the previous changes

v4.4.1

16 Apr 10:01

🐛 Fixes

  • Bug fix in Azure Storage, clear to_delete list to allow deletion of more than 256 keys in a batch (#1482)

Please see v4.4.0 release notes for the main changes from the last release.

v4.4.0

05 Apr 13:24

🚀 Features

  • Prevent writing empty types by default (gives compatibility with v1.6.2 readers) in #1440
  • Improved resilience to external (out of order) replication in #1355
  • Support modifying library options, and introduce enterprise library options in #1457

You can now modify library options on an existing Arctic library:

from arcticdb import Arctic
from arcticdb.options import ModifiableLibraryOption

ac: Arctic = ...
lib = ac.create_library("lib")
ac.modify_library_option(lib, ModifiableLibraryOption.ROWS_PER_SEGMENT, int(1e6))

See arcticdb/options.py for a description of the modifiable options.

🐛 Fixes

  • Bugfix 902: Cannot filter on nans and nones in string and float columns #1276
  • Bugfix: Empty type in #1227
  • Bugfix 1334: optimise version ref key access #1345
  • Fix version map cache invalidation policies in #1351
  • Fix empty column default type in #1378
  • Bugfix #1388: Correctly check whether we are in ec2 in #1415
  • Bugfix 1423: Raise a meaningful error message when trying to use QueryBuilder with sparse data in #1435
  • Bug Fix Windows: Remove lmdb files when delete_library is called in #1437
  • Bugfix 1256: reject parallel appends to unsorted data in #1442
  • Bugfix 1209: Consistently return metadata from write, append, update, write_metadata, and batch versions thereof in #1444
  • Bugfix/935/match pandas behaviour when aggregating columns with nans in #1450
  • LZ4 decoding empty type: move the segment buffer forward by the compressed data size for empty types in #1463
  • Bugfix 1268: swap out xxhash for grouping in #1416
  • Bugfix 1260: allow broader range of numeric type promotion with dynamic schema #1426

💾 Storage Exception Normalization

Storage-related exceptions are now uniform across the different backend storage platforms, even though the underlying behaviour of each backend varies.

  • #447 LMDB Exceptions Normalization in #1285
  • #447 Memory Storage Exception Normalization in #1297
  • Adds a MockS3Client which can simulate s3 failures in #1281
  • #447 S3 Storage Exceptions Normalization in #1304
  • Add a MockAzureClient which can simulate azure failures in #1331
  • #447 Azure Storage Exceptions Normalization in #1344
  • #447 Exception normalization for RocksDB in #1360
  • Refactor: Move mongo client errors handling into mongo_storage.cpp before normalization in #1383
  • #447 Add a MockMongoClient which can simulate mongo failures in #1395
  • #447 Mongo Exceptions Normalization in #1411
  • LMDB Exception Normalization with mock client in #1414
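
As a rough illustration of the normalization pattern, here is a minimal pure-Python sketch. Every class and function name in it is hypothetical and invented for this example. None of them are ArcticDB's actual exception types or internals, which these notes do not list.

```python
# All names below are hypothetical; they are NOT ArcticDB's real exception types.

class StorageError(Exception):
    """Hypothetical common base class for every storage failure."""


class KeyNotFound(StorageError):
    """Hypothetical uniform error raised when a key is missing."""


class FakeS3NoSuchKey(Exception):
    """Stand-in for an S3-specific 'missing key' error."""


class FakeLmdbNotFound(Exception):
    """Stand-in for an LMDB-specific 'missing key' error."""


# Map each backend-specific error type to its uniform counterpart.
_NORMALIZED = {
    FakeS3NoSuchKey: KeyNotFound,
    FakeLmdbNotFound: KeyNotFound,
}


def normalize(exc: Exception) -> StorageError:
    """Translate a backend-specific error into the uniform hierarchy."""
    for backend_type, uniform_type in _NORMALIZED.items():
        if isinstance(exc, backend_type):
            return uniform_type(str(exc))
    # Anything unrecognised still surfaces as the common base type.
    return StorageError(str(exc))


def read_key(backend_read, key):
    """Call a backend read function, re-raising failures in uniform form."""
    try:
        return backend_read(key)
    except Exception as exc:
        raise normalize(exc) from exc
```

With a wrapper like this, calling code can catch a single error type (here the hypothetical `KeyNotFound`) regardless of which backend raised the original error.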

Uncategorized

  • Add a way to enable/disable silencing of errors when deleting a library in #1271
  • Feature flag to use WinInet client not WinHttp in #1284
  • Abstract an S3ClientWrapper out of details-inl.hpp in #1274
  • GitHub Workflows: Make can_merge run for all files to allow any docs changes to be mergeable. in #1292
  • Fix benchmarks in #1286
  • Reduce the hashes that we use to benchmark against in #1303
  • Refactor 1278: Column data dense forward iterator in #1301
  • Add contains to Arctic class, to support lib in arctic in #1309
  • Update BSL table for v4.3.0 in #1282
  • Utility to analyze the size of various key types in a library in #1291
  • Do not compile wheel build on EC2 in #1318
  • Docs: For top level imports use arcticdb.object instead of using full path to object in #1323
  • conda-build: Adaptations for folly in #1320
  • Align docstring with the behaviour in #1243
  • Added 142 new tests for empty/missing operations in #1319
  • Abstract AzureClientWrapper out of azure_storage.cpp in #1315
  • Stop using ec2 runners in the conda+linux workflow in #1302
  • Adds python tests which simulate s3 storage exceptions in #1330
  • S3 local delete failure raising meaningful error in #1329
  • Fix dynamic strings append to fixed strings issue in #1346
  • Minor improvement on analysis flow in #1290
  • Build time improvements in #1263
  • build: Disable compilers' extensions in #1335
  • maint: Fully specify fmt::format_to in #1333
  • Fixed pd_delete_replace + added single tests in #1342
  • Use type-deduced functor for all column iterating functions in #1347
  • Disable ec2 runners on PR in #1357
  • Use 14.39 toolset in #1359
  • Removed pytest dependency from arcticdb in #1350
  • build: Use C++20 in #1332
  • mark test_symbol_list_parallel_stress_with_delete as flaky in #1368
  • Docs: Increase CSS max-width and build docs from a branch in #1363
  • Print error msg in ExponentialBackoff exception in #1365
  • Disable missing key warnings when expected in #1379
  • Remove pin on civetweb in #1380
  • Multiple segments within the same block: storage and library refactoring in #1307
  • Not allowing snapshotting tombstoned versions in #1280
  • Clarify Intel/AMD build support in #1389
  • Fix debug formatting in #1397
  • maint: Replace robin_hood with unordered_dense in #1390
  • Test benchmarking improvements in #1326
  • maint: Remove dependency on some elements of folly in #1370
  • Allow different testing dependency version in pipeline in #1410
  • Add metadata extraction functions in library_tool in #1375
  • Add a global timeout for pytests in #1381
  • Set upper bound for supported protobuf version in #1421
  • 1 year and 1k stars readme banner in #1425
  • read_batch set include_deleted to false by default when reading a version in #1419
  • build: Update to fmt 10 in #1427
  • Make changes for prometheus metrics in #1418
  • maint: Replace use of folly getCurrentThreadId with STL in #1417
  • maint: Ignore the diff of #1263 in #1340
  • Roll back vcpkg version to fix failing abseil build in #1436
  • maint: Remove use of folly/portability/PThread.h in #1447
  • maint: Remove use of folly/system/ThreadName.h in #1446
  • Better error messaging around pickling in #1451
  • Update analysis_workflow.yml in #1455
  • Adding update, append and delete asv benchmarks in #1434
  • Support generators for metadata vectors again in #1456
  • Remove the brotli dep in #1458

The wheels are on PyPI. Below are for debugging:

v4.0.4

14 Mar 10:44

🚀 Features

  • Better backend storage retryable error handling and error message printout #1365

🐛 Fixes

  • Stop allowing snapshotting tombstoned versions #1280
  • Add a way to enable/disable silencing of errors when deleting a library #1271

The wheels are on PyPI. Below are for debugging:

v4.3.1

09 Feb 09:29

🐛 Fixes

  • Fix regression in round-tripping empty type for dynamic schema (#1313)

The wheels are on PyPI. Below are for debugging:

v4.3.0

07 Feb 13:47

Version 4.3.0 was pulled from PyPI and conda-forge due to a regression, and builds for 4.3.0 are no longer provided.
The regression is fixed in the 4.3.1 release. Please use 4.3.1 instead.

🚀 Features

  • Exposes existing regex filter in lib.list_symbols (#1123)
>>> from arcticdb import Arctic
>>> import pandas as pd
>>> ac = Arctic("lmdb://arcticdb_demo")  # assumed local LMDB URI; any supported storage URI works
>>> ac.create_library("test")
>>> lib = ac["test"]
>>> lib.write("sym0", pd.DataFrame())
>>> lib.write("sym1", pd.DataFrame())
>>> lib.list_symbols()
['sym0', 'sym1']
>>> lib.list_symbols(regex="1$")
['sym1']
  • Introduce jitter in symbol list compaction threshold (#1174)
  • Sorting speed improvements in SegmentInMemory (#1181)
  • Reduce log level from warn to debug for "Failed to find segment for key" message where appropriate (#1130)
  • Speed up writes by parallelising aggregator_set_data over data segments (#1065)
  • Support sortedness checks and maintenance with parallel writes and appends (#1251)
  • #1014 Introduce storage fixtures to easily test ArcticDB against various storage backends. See arcticdb.storage_fixtures package. (#1054)

🐛 Fixes

  • Release the symbol list's storage lock if it has existed for longer than its TTL (#1134)
  • Ensure that the version chain is always updated atomically (#1104)
  • Return empty pd.DataFrame with MultiIndex if originally provided (#1126)
  • conda-build: Explicitly depend on openssl and libcurl (#1244)
  • Reintroduce attrs as a runtime dependency (#1272)
  • Speedup reading wide dataframes that have no empty columns (#1225)
  • Bugfix 1046: Prevent appending/updating numeric columns with non-identical types with static schema (#1205)
  • Bugfix 1173: Correctly apply sortedness checks when calling update with date_range argument (#1238)
  • Fix non-deterministic hashing in Linux conda builds (#1261)
  • Improve date range returned by get info for unordered and range indexed dataframes (#1241)
  • Bugfix 1248 and 1249: compact_incomplete reject incomplete segments that overlap each other, or existing segments in the case of append (#1255)
  • Detailed error in case of S3's libcurl network failure (#1265)

Uncategorized

  • [Aggregation tests] Replace non_zero_numeric_type_strategies with numeric_type_strategies (#968)
  • Fixes reuse_name for azure storage #1061 (#1115)
  • small getting-started-docs tweaks (#1103)
  • Improve fixture reliability (#1116)
  • maint: Define arcticdb::proto::logger in log.hpp (#1117)
  • maint: Remove unneeded includes (#1113)
  • [Column] Move some definitions to cpp file (#1100)
  • maint: Move implementations in memory_segment_impl.hpp to memory_segment_impl.cpp (#1092)
  • Update git blame file (#1118)
  • Flaky test hypothesis mean agg (#496) (#1125)
  • Use same region for S3 and EC2 to avoid data transfer costs (#1128)
  • build: Remove attrs from the dependencies (#1135)
  • Only build on pull request events (#1127)
  • More fixture robustness improvements (#1132)
  • Remove releasing docs as they are now in GitHub wiki (#1136)
  • Update README.md (#1141)
  • Remove test parallelism, and speed up test bottleneck (#1143)
  • Fix support for shared/unique S3 prefixes (#1140)
  • maint: Remove headers in types.hpp (#1121)
  • Skip flaky pytests which check log messages (#1161)
  • Update README.md (#1156)
  • Refactor: Move DataError method implementations into cpp (#1155)
  • Update .git-blame-ignore-revs for DataError implementation move (#1165)
  • Add MSVC 2022 preset. Tweak MSVC CMake settings. (#1133)
  • Build-time improvements: allocator.hpp, log.hpp, buffer.hpp (#1152)
  • Fix publish.yml workflow (#1167)
  • README - put third party tools in alphabetical order (#1172)
  • Fix persistent tests (#1147)
  • Introduce sorting and merging google benchmarks (#1138)
  • Skip array type tests due to occasional segfaults (#1187)
  • build: Remove some adherence to folly (#1144)
  • Add equity options notebook + data (#1178)
  • maint: Ignore some references (#1190)
  • Added equity options notebook to index (#1193)
  • Use vcpkg for gbench (#1189)
  • Forward port internal PR #1082 (#1180)
  • Bugfix 1191: Propagate storage failures in version map batch methods to calling code (#1194)
  • Link against python explicitly in order to make MSVS builds work (#1192)
  • Final version of equity opts notebook (#1196)
  • Bugfix 1182: Unskip test that is no longer flaky (#1197)
  • Docs that StorageFailureSimulator is not used in all stores (#1203)
  • Clean and reorganize OffsetString and StringPool (#1137)
  • build: Do not depend on protobuf-lite (#1212)
  • docs: Fix documentation links (#1038)
  • Fix recurse_segment forward declaration to match the signature of its implementation (#1217)
  • Update git blame file for OffsetString and StringPool implementation move (#1211)
  • Add frequently used items at the top level of arcticdb (#1219)
  • Switch from arcticdb to adb in the demo Notebooks (#1228)
  • Add a way to handle non-string values for index names (#1170)
  • Pass unmodified argument by const& to FieldCollection::add_field (#1234)
  • Switch from arcticdb to adb in python docstrings (#1236)
  • Remove obsolete test log level environment variable (#1231)
  • Update incorrect docs for validate_index (#1233)
  • Bugfix 1207: Use pandas.Timestamp.max - 1 day in test_read_ts. Remove pointless snapshot. Improve error message when index key reading fails. (#1235)
  • Bugfix invalid library name (#1206)
  • Enhancement/1253/skip temporary allocation when decoding dynamic schema columns (#1259)
  • Expose headers for consumers via arcticdb_core_static (#1257)
  • Update WarnVersionTypeNotHandled::warn() warning message (#1273)
  • Update README correcting spelling (#1275)
  • build: Adapt protobuf compilation (#1199)
  • Enable skipped test_partial_write_hashed (#1215)

The wheels are on PyPI. Below are for debugging:

v4.2.1

15 Dec 14:23

This is a patch release to version 4.2.0 which fixes Issue #1157 regarding the defragment_symbol_data method.

🐛 Fixes

  • Defragmenting a symbol no longer invalidates previous versions (#1163)

The wheels are on PyPI. Below are for debugging: