Releases: kamu-data/kamu-cli

Release v0.202.1

21 Sep 03:33

[0.202.1] - 2024-09-20

Fixed

  • OpenTelemetry integration fixes

Release v0.202.0

20 Sep 09:36

[0.202.0] - 2024-09-20

Changed

  • Major dependency upgrades:
    • DataFusion 42
    • HTTP stack v1
    • Axum 0.7
    • latest AWS SDK
    • latest versions of all remaining libs we depend on
  • Outbox refactored towards true parallelism via Tokio spawned tasks instead of futures

Fixed

  • Failed flows should still propagate finishedAt time
  • Eliminated span.enter, replaced with instrument everywhere

Release v0.201.0

18 Sep 21:14
0c0d554

[0.201.0] - 2024-09-18

Added

  • REST API: New /verify endpoint allows verification of query commitment as per documentation (#831)

Changed

  • Outbox main loop was revised to minimize the number of transactions:
    • split outbox into planner and consumption jobs components
    • planner analyzes the current state and loads a batch of unprocessed messages within a single transaction
    • consumption jobs invoke consumers and detect their failures
  • Detecting concurrent modifications in flow and task event stores
  • Improved and cleaned handling of flow abortions at different stages of processing
  • Revised implementation of flow scheduling to avoid in-memory time wheel:
    • recording FlowEventScheduledForActivation event (previously, the moment of placement into the time wheel)
    • replaced binary heap based time wheel operations with event store queries
    • Postgres/SQLite event stores additionally track activation time for the waiting flows
    • in-memory event store keeps prepared map-based lookup structures for activation time
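
The scheduling change above can be illustrated with a small sketch. It is not kamu's implementation: the table name `flow_activations`, its columns, and the helper functions are hypothetical, and Python's built-in `sqlite3` stands in for the Postgres/SQLite event stores. The idea is that the store records an activation time for each waiting flow, so finding due flows becomes an indexed query rather than a pop from an in-memory heap.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    # In-memory database for illustration; a real store would be persistent.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flow_activations ("
        " flow_id INTEGER PRIMARY KEY,"
        " activation_time INTEGER NOT NULL)"  # hypothetical schema, epoch seconds
    )
    conn.execute(
        "CREATE INDEX idx_activation_time ON flow_activations (activation_time)"
    )
    return conn


def schedule_for_activation(
    conn: sqlite3.Connection, flow_id: int, activation_time: int
) -> None:
    # Analogous to recording a "scheduled for activation" event for a waiting flow.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO flow_activations VALUES (?, ?)",
            (flow_id, activation_time),
        )


def take_due_flows(conn: sqlite3.Connection, now: int) -> list[int]:
    # Replaces "pop everything due from the heap" with a query on the store.
    with conn:
        rows = conn.execute(
            "SELECT flow_id FROM flow_activations"
            " WHERE activation_time <= ? ORDER BY activation_time, flow_id",
            (now,),
        ).fetchall()
        due = [row[0] for row in rows]
        conn.executemany(
            "DELETE FROM flow_activations WHERE flow_id = ?",
            [(flow_id,) for flow_id in due],
        )
    return due
```

Because the activation times live in the store rather than in process memory, a restarted process can run the same query and pick up where it left off.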

Release v0.200.0

13 Sep 23:39

[0.200.0] - 2024-09-13

Added

  • Added first integration of Prometheus metrics starting with Outbox
  • Added --metrics CLI flag that will dump metrics into a file after command execution

Changed

  • Telemetry improvements:
    • Improved data collected around transactional code
    • Revised associating span objects with large JSON structures such as messages
    • Suppressed several noisy, but not very useful events
  • Improved Outbox stability when message consumers fail
  • Similarly, Task Executor keeps executing subsequent tasks when running a task results in an internal error
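
A minimal sketch of the failure-isolation behavior described in the last two items, assuming a simple sequential executor. `run_all` and the task callables are hypothetical names for illustration, not kamu's API: an error in one task is recorded and the loop moves on to the next task instead of stopping.

```python
from typing import Callable, Iterable


def run_all(tasks: Iterable[tuple[str, Callable[[], None]]]) -> dict[str, str]:
    """Run every task in order; record failures instead of aborting the loop."""
    outcomes: dict[str, str] = {}
    for name, task in tasks:
        try:
            task()
            outcomes[name] = "ok"
        except Exception as err:  # an internal error in one task
            # Record the failure and keep going with the remaining tasks.
            outcomes[name] = f"failed: {err}"
    return outcomes


def failing() -> None:
    raise RuntimeError("boom")


if __name__ == "__main__":
    # Task "b" fails, yet "c" still runs.
    print(run_all([("a", lambda: None), ("b", failing), ("c", lambda: None)]))
```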

Release v0.199.3

11 Sep 17:39
f11d308

[0.199.3] - 2024-09-11

Fixed

  • An error raised during transformation of a derived dataset is now associated with the correct input dataset that was hard compacted

Release v0.199.2

09 Sep 12:22
b632ce1

[0.199.2] - 2024-09-09

Added

  • REST API: The /query endpoint now supports response proofs via reproducibility and signing (#816)
  • REST API: New /{dataset}/metadata endpoint for retrieving schema, description, attachments etc. (#816)

Fixed

  • Fixed unguaranteed ordering of events when restoring event sourcing aggregates
  • Enqueuing and cancelling future flows should be done with transactions taken into account (via Outbox)
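
To see why event ordering matters when restoring an event-sourced aggregate, consider the sketch below. The event shape `(event_id, kind)`, the status names, and the `restore_status` function are hypothetical and not taken from kamu's code. Folding events in whatever order a store happens to return them can yield the wrong final state; sorting by a monotonically increasing event ID before folding makes the result deterministic.

```python
def restore_status(events: list[tuple[int, str]]) -> str:
    """Rebuild an aggregate's status by replaying events in event-ID order."""
    status = "unknown"
    # Sort explicitly: a store query without an ordering clause guarantees no order.
    for _event_id, kind in sorted(events, key=lambda event: event[0]):
        if kind == "created":
            status = "waiting"
        elif kind == "started":
            status = "running"
        elif kind == "finished":
            status = "finished"
    return status


# Events as they might come back from an unordered query.
shuffled = [(3, "finished"), (1, "created"), (2, "started")]
```

Without the sort, replaying `shuffled` as given would end on the "started" event and report a finished flow as still running.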

Release v0.199.1

06 Sep 19:38
9d1700f

[0.199.1] - 2024-09-06

Fixed

  • Fixed crash when a derived dataset is manually forced to update while an existing flow
    for this dataset is already waiting for a batching condition

Release v0.199.0

06 Sep 16:28
43de6c2

[0.199.0] - 2024-09-06

Added

  • Persistence has been enabled for Task and Flow domains.
    Both TaskExecutor and FlowExecutor now fully support transactional processing mode,
    and save state in a Postgres or SQLite database.
  • Tasks now support attaching metadata properties. The task->flow association is stored as this type of metadata.
  • Flows and Tasks now properly recover the unfinished requests after server restart

Changed

  • Simplified database schema for flow configurations and minimized number of migrations
    (breaking change of the database schema)
  • Introduced pre_run() phase in flow executor, task executor & outbox processor to avoid startup races
  • Explicit in-memory task queue has been eliminated and replaced with event store queries
  • Get Data Panel: use SmTP for pull & push links
  • GQL API method setConfigCompaction allows setting metadataOnly configuration for both root and derived datasets
  • GQL API method triggerFlow allows triggering a HARD_COMPACTION flow in metadataOnly mode for both root and derived datasets

Release v0.198.2

30 Aug 14:06
c2bbe8c

[0.198.2] - 2024-08-30

Added

  • Container sources allow string interpolation in env vars and command
  • Private Datasets, changes related to Smart Transfer Protocol:
    • kamu push: added --visibility private|public argument to specify the created dataset visibility
    • Send the visibility attribute in the initial request of the push flow

Changed

  • Schema propagation improvements:
    • Dataset schema will be defined upon first ingest, even if no records were returned by the source
    • Schema will also be defined for derivative datasets even if no records are produced by the transformation
    • This ensures that datasets that produce no data for a long time will not block data pipelines
  • Smart Transfer Protocol:
    • Use CreateDatasetUseCase when a dataset is created during a pull
    • Now requires the x-odf-smtp-version header, which is used to compare client and server versions to prevent issues with outdated clients
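
The version check can be sketched as follows. Only the header name `x-odf-smtp-version` comes from the notes above; the integer version format, the `SERVER_SMTP_VERSION` constant, the `VersionError` type, and the exact-match comparison rule are assumptions for illustration, and kamu's actual rule may differ.

```python
SERVER_SMTP_VERSION = 1  # hypothetical value, for illustration only


class VersionError(Exception):
    """Raised when the client's protocol version is missing or unacceptable."""


def check_smtp_version(headers: dict[str, str]) -> int:
    """Validate the client's protocol version header; return the parsed version."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("x-odf-smtp-version")
    if raw is None:
        raise VersionError("missing x-odf-smtp-version header")
    try:
        client_version = int(raw)  # assumption: the version is a plain integer
    except ValueError:
        raise VersionError(f"malformed version: {raw!r}") from None
    if client_version != SERVER_SMTP_VERSION:
        raise VersionError(
            f"client version {client_version} does not match "
            f"server version {SERVER_SMTP_VERSION}"
        )
    return client_version
```

Rejecting a mismatched client at the first request gives it a clear error up front instead of an obscure failure partway through a transfer.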

Release v0.198.1

28 Aug 11:22
d89741a

[0.198.1] - 2024-08-28

Added

  • Private Datasets, ReBAC integration:
    • ReBAC properties are updated based on DatasetLifecycleMessage messages
    • kamu add: added hidden --visibility private|public argument, assumed to be used in multi-tenant case
    • GQL: DatasetsMut:
      • createEmpty(): added optional datasetVisibility argument
      • createFromSnapshot(): added optional datasetVisibility argument