Skip to content
This repository has been archived by the owner on Oct 15, 2024. It is now read-only.

[decision]: Changetracking (round 2) #4714

Merged
merged 13 commits into from
Feb 3, 2023
155 changes: 0 additions & 155 deletions doc/decisions/0_drafts/change_tracking.md

This file was deleted.

4 changes: 2 additions & 2 deletions doc/decisions/0_drafts/operation_sequences.md
Original file line number Diff line number Diff line change
Expand Up @@ -227,7 +227,7 @@ As can be seen, the change tracking within the `dbus` and `logchange` plugins wr
This way, we have only one such algorithm in a central place, and plugin authors don't have to think about the sequences their plugins are called by developers.

This approach can also be paired with [COW semantics](../3_decided/internal_cache.md), so that memory toll will be kept low.
A separate [decision for change tracking](change_tracking.md) is currently in progress.
A separate [decision for change tracking](../1_in_discussion/change_tracking.md) is currently in progress.

Should we observe this problem with use cases other than change tracking, we can provide general frameworks for those too.

Expand All @@ -243,7 +243,7 @@ As can be seen, the change tracking within the `dbus` and `logchange` plugins wr

## Related Decisions

- [Change Tracking](change_tracking.md)
- [Change Tracking](../1_in_discussion/change_tracking.md)
- [Internal KeySet Cache](../3_decided/internal_cache.md)
- []()

Expand Down
181 changes: 181 additions & 0 deletions doc/decisions/1_in_discussion/change_tracking.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,181 @@
# Change Tracking

## Problem

Currently, a range of Elektra plugins are implementing some sort of change tracking for configuration data.
This includes, but is not limited to, the [internalnotification](/src/plugins/internalnotification/README.md) and [dbus](/src/plugins/dbus/README.md) plugins.
In the near future, Elektra shall also be extended with session recording.

KDB itself also has some rudimentary change tracking (via the `keyNeedsSync` flag) to determine whether `kdbSet` needs to actually do something.

These competing change tracking implementations create multiple problems:

1. pointless waste of resources, as data is duplicated in each plugin,
2. multiplication of code, generating a maintenance burden.
3. various subtle differences in change tracking behavior, e.g., `kdbSet` might write a config file but notification is not sent out
4. the current approach to change tracking in plugins is fragile, which is outlined in [a separate decision about valid kdbGet/kdbSet sequences](../0_drafts/operation_sequences.md).

For `KeySet` we need to track which of the keys:

- have been removed
- have been added

For `Key` we need to track:

- original value of the key
- size of the original value (for binary keys)
- metadata, which is a combination of the tracking of keysets and keys
- tracking should only be done on the following namespaces:
- `system:/`
- `user:/`
- `dir:/`
- `meta:/`
- `spec:/`

## Constraints

Change tracking must:

- be transparent for applications using the public Elektra API
- be transparent to users changing configuration data
- if overhead is not negligible: only do tracking as required, i.e., a plugin specifically requests it
- have negligible overhead if disabled
- not duplicate data for each plugin that wants change tracking
- work with all allowed sequences of `kdbGet` and `kdbSet` as [per this decision](../0_drafts/operation_sequences.md)

We only want to track changes that are done by the user, not changes that are done by plugins.
I.e. the scope of change tracking is on what happens _outside_ of `kdbGet` and `kdbSet`.

The library `libelektra-core` must be kept minimal.

## Assumptions

- It is possible to do change tracking with reasonable memory and computation overhead
- It is possible to design a single change tracking API that is useful for all existing and future plugins
- False positivies are okay

- this may happend when some keys have been changed by the user, but have subsequentially been "unchanged" by a transformation plugin
Scenario: plugin that converts `false`<->`0` and `true`<->`1`
- `system:/background` is stored with value `false`
- user gets key `system:/background` with value `0` (after conversion by plugin)
- user changes it to `false`
- changetracking detects that value has been changed, because `false` != `0`
- plugin changes value from `false` to `0`
- consumers of the changetracking API will now get a false positive if they query whether `system:/background` has been changed
atmaxinger marked this conversation as resolved.
Show resolved Hide resolved
- We assume consumers of the changetracking API have access to both the previous and the new value of a changed key.
Therefore, they could detect these false positive cases, if the need to.

- False negatives (missed changes) are not okay

## Solutions - Storage

### Utilize already existing `backendData->keys`

We already store which keys have been returned by `kdbGet` for each backend within KDB.
Currently, those are shallow copies.
To be useful for changetracking, we need to perform deep copies instead.
As keys and keysets are already utilizing copy-on-write, this would not add too much memory overhead.

A problem with this approach is that the internally stored keys are recreated as new instances every time `kdbGet` is called.

### Combine with internal cache

We already decided that we want to have an internal deep-duped keysets of all the keys we returned.
See [internal cache decision](../3_decided/internal_cache.md).

The difference to `backendData->keys` is that this cache is not recreated each time `kdbGet` is called.

### Have a seperate storage for changetracking

Have a global keyset with deep-duped keys that is purely used for changetracking and nothing else.

### Store changes as meta keys

When something changes for the first time, store the original value as a metakey to every key.
Not yet clear how we handle changes to metadata then.
Also not really possible for keysets.

### Implement changetracking as a mechanismn on the `Key` and `KeySet` datastructures

The idea here is that we extend the `KeySet` and `Key` structs with additional fields.

The tracking itself would be done within the `ks*` and `key*` methods, after checking if it is enabled.
It would also transparently work for metadata, as metadata itself is implemented as a keyset with keys.

Downsides of this approach:

- It adds functionality to `libelektra-core` which may violate the constraint above.
It may, however, be debatable whatever 'minimal' means in this context.
- Adding fields to the structs causes a slight memory overhead, even with tracking turned off.
While negligible for `KeySet` due to the low amount of keysets in typical applications, it may be noticable for `Key`.
On a 64-bit system we'd add 16 bytes to it.
8 bytes for the pointer to the original value, 8 bytes for the size of the original value.
To put this in perspective, the current size of the `Key` struct is 64 bytes, so we'd add 25% overhead to an empty key.
However, this percentage will be much lower in a real-world application, as the usefulness of an empty key is very low.
- Another downside here is that it is not so easy to determine what the "original" value is.
Some part of `libelektra-kdb` would need to mark the keys as original (after transformations etc.)

## Solutions - Implementation

### Implement directly within `libelektra-kdb`

Implement all the logic for changetracking directly within `libelektra-kdb`.

### Implement as a seperate plugin

Implement changetracking as a hooks plugin that will be called within `kdbGet` and `kdbSet` accordingly.

The following hooks will be needed:

- `tracking/get`: will be called at the end of `kdbGet`, directly before the result is returned.
- `tracking/set`: will be called at the beginning of `kdbSet`.
- `tracking/changeset`: compute the changeset for the requested parent key and return it.

As every hook plugin can define its own contract, in theory all storage forms mentioned in the previous chapter should be possible to implement.
We could just point the plugin to `backendsData->keys` or the internal cache if we go down that route.

## Solutions - Query

### Provide an API within `libelektra-kdb`

The API should be useable both by plugins and applications utilizing ELektra.
The API may look something like this:

```c
bool elektraChangeTrackingIsEnabled (KDB * kdb);
ChangeTrackingContext * elektraChangeTrackingGetContext (KDB * kdb, Key * parentKey);

KeySet * elektraChangeTrackingGetAddedKeys (ChangeTrackingContext * context);
KeySet * elektraChangeTrackingGetRemovedKeys (ChangeTrackingContext * context);
KeySet * elektraChangeTrackingGetModifiedKeys (ChangeTrackingContext * context);

bool elektraChangeTrackingValueChanged (ChangeTrackingContext * context, Key * key);
bool elektraChangeTrackingMetaChanged (ChangeTrackingContext * context, Key * key);

KeySet * elektraChangeTrackingGetAddedMetaKeys (ChangeTrackingContext * context, Key * key);
KeySet * elektraChangeTrackingGetRemovedMetaKeys (ChangeTrackingContext * context, Key * key);
KeySet * elektraChangeTrackingGetModifiedMetaKeys (ChangeTrackingContext * context, Key * key);

Key * elektraChangeTrackingGetOriginalKey (ChangeTrackingContext * context, Key * key);
const Key * elektraChangeTrackingGetOriginalMetaKey (ChangeTrackingContext * context, Key * key, const char * metaName);
atmaxinger marked this conversation as resolved.
Show resolved Hide resolved
```

## Decision

## Rationale

## Implications

-
-
-

## Related Decisions

- [Valid kdbGet/kdbSet sequences](../0_drafts/operation_sequences.md) from [#4574](https://github.com/ElektraInitiative/libelektra/pull/4574).
- [Copy On Write](../2_in_progress/copy_on_write.md)

## Notes

- Issue [#4514](https://issues.libelektra.org/4514) uncovered a problem in the current change tracking approach
- Issue [#4520](https://issues.libelektra.org/4520) already explored some of the considered alternatives
4 changes: 2 additions & 2 deletions doc/decisions/2_in_progress/copy_on_write.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ Currently, there are many places within Elektra where keys and keysets are dupli
Most of those copied keys are never modified, but are required to be detached from the lifetime of the original instance.
We want to introduce an in-memory copy-on-write mechanism to lower our memory usage.

In the near future, Elektra will also gain facilities for [change tracking](../0_drafts/change_tracking.md) and session recording, both of which will potentially again duplicate keys.
In the near future, Elektra will also gain facilities for [change tracking](../1_in_discussion/change_tracking.md) and session recording, both of which will potentially again duplicate keys.
There are also aspirations to create a new, simple [internal cache](../3_decided/internal_cache.md) that would also benefit from a copy-on-write mechanism.

## Constraints
Expand Down Expand Up @@ -525,7 +525,7 @@ Implement the full-blown COW approach.

## Related Decisions

- [change tracking](../0_drafts/change_tracking.md)
- [change tracking](../1_in_discussion/change_tracking.md)
- [internal cache](../3_decided/internal_cache.md)

## Notes
2 changes: 1 addition & 1 deletion doc/news/_preparation_next_release.md
Original file line number Diff line number Diff line change
Expand Up @@ -492,7 +492,7 @@ This section keeps you up-to-date with the multi-language support provided by El
- decided [decision process](https://www.libelektra.org/decisions/decision-process) _(Markus Raab)_
- draft for [man pages](../decisions/0_drafts/man_pages.md) _(Markus Raab)_
- <<TODO>>
- Add decision for [change tracking](../decisions/0_drafts/change_tracking.md) _(Maximilian Irlinger @atmaxinger)_
- Add decision for [change tracking](../decisions/1_in_discussion/change_tracking.md) _(Maximilian Irlinger @atmaxinger)_
- <<TODO>>
- Create [decision](../decisions/0_drafts/operation_sequences.md) for allowed and prohibited operation seqences _(Maximilian Irlinger @atmaxinger)_
- <<TODO>>
Expand Down