This repository has been archived by the owner on Oct 15, 2024. It is now read-only.

[new-backend] docs #4484

Merged
merged 7 commits into from
Oct 7, 2022
4 changes: 2 additions & 2 deletions doc/AUTHORS.md
Expand Up @@ -19,11 +19,11 @@ maintainer, development of the cache and mmapstorage plugins

## Klemens Böswirth

development of the highlevel API and code-generation; various other things
a bit of everything

- email: k.boeswirth+git@gmail.com
- github user: [kodebach](https://github.com/kodebach)
- devel/test on: Fedora
- devel/test on: Manjaro

## Robert Sowula

12 changes: 8 additions & 4 deletions doc/dev/algorithm.md
@@ -1,9 +1,13 @@
# Algorithm

You might want to read
[about architecture](architecture.md)
and
[data structures](data-structures.md) first.
You might want to read [about architecture](architecture.md) and [data structures](data-structures.md) first.

## Outdated

<!-- TODO [new_backend]: Update the text below using the docs listed in the warning. -->

> **Warning** Many of the things described below (especially about `KDB` and the `kdb*` functions) are outdated.
> See [`kdb-operations.md`](kdb-operations.md) and [`kdb-contracts.md`](kdb-contracts.md) for more up-to-date information.

## Introduction

9 changes: 8 additions & 1 deletion doc/dev/architecture.md
Expand Up @@ -14,14 +14,21 @@ To help readers to understand the algorithm that glues together the
plugins, we first describe some details of the
[data structures](data-structures.md). Full
knowledge of the [algorithm](algorithm.md) is not presumed to be able to develop
most plugins (with the exception of [the resolver](/src/plugins/resolver/)).
most plugins.

Further important concepts are explained in:

- [bootstrapping](/doc/help/elektra-bootstrapping.md)
- [granularity](/doc/help/elektra-granularity.md)
- [sync-flag](/doc/help/elektra-sync-flag.md)

## Outdated

<!-- TODO [new_backend]: Update the text below using the docs listed in the warning. -->

> **Warning** Many of the things described below (especially in relation to backends and mountpoints) are outdated.
> See [`kdb-operations.md`](kdb-operations.md), [`backend-plugins.md`](backend-plugins.md) and [`mountpoints.md`](mountpoints.md) for more up-to-date information.

## API

The aim of the Elektra Initiative is to design and implement a powerful
115 changes: 93 additions & 22 deletions doc/dev/backend-plugins.md
@@ -1,22 +1,100 @@
# Proposal: Backend Plugins
# Backend Plugins

## Backend Contract

There exists a _backend contract_ between `libelektra-kdb` and any plugin acting as a backend plugin.
This contract sets the order of the phases described above and defines the interaction between a backend plugin and `libelektra-kdb`.

TODO: update get diagram; add cachecheck phase
```mermaid
sequenceDiagram
actor user
participant kdb as libelektra-kdb
participant backend as backend
participant other as other-backend
participant storage
participant validation
participant sa as storage-unit-a (e.g. file)
participant sb as storage-unit-b (e.g. database)

user->>+kdb: kdbGet
kdb->>backend: init
kdb->>other: init
kdb->>+backend: resolver
backend->>-kdb: "storage-unit-a", update needed
kdb->>+other: resolver
other->>-kdb: "storage-unit-b", no update needed
kdb->>backend: prestorage
kdb->>+backend: storage
backend->>+storage: &nbsp;
storage->>+sa: &nbsp;
sa-->>-storage: &nbsp;
storage-->>-backend: &nbsp;
backend-->>-kdb: &nbsp;
kdb->>+backend: poststorage
backend->>+validation: &nbsp;
validation-->>-backend: &nbsp;
backend-->>-kdb: &nbsp;
kdb->>kdb: merge backends
kdb-->>-user: &nbsp;
```

```mermaid
sequenceDiagram
actor user
participant kdb as libelektra-kdb
participant backend as backend
participant other as other-backend
participant storage
participant validation
participant sa as storage-unit-a (e.g. file)

user->>+kdb: kdbSet
kdb->>kdb: check backends initialized
kdb->>+backend: resolver
backend-->>-kdb: "storage-unit-a"
critical try to store
kdb->>+backend: prestorage
backend->>+validation: &nbsp;
validation-->>-backend: &nbsp;
backend-->>-kdb: &nbsp;
kdb->>+backend: storage
backend->>+storage: &nbsp;
storage->>+sa: write to temp
sa-->>-storage: &nbsp;
storage-->>-backend: &nbsp;
backend-->>-kdb: &nbsp;
kdb->>backend: poststorage
kdb->>backend: precommit
backend->>sa: make changes permanent
kdb->>+backend: commit
backend-->>-kdb: &nbsp;
kdb->>backend: postcommit
option on failure
kdb->>backend: prerollback
kdb->>+backend: rollback
backend->>sa: revert changes
backend-->>-kdb: &nbsp;
kdb->>backend: postrollback
end
kdb-->>-user: &nbsp;
```

The diagrams above show possible sequences of phases during a `get` and a `set` operation.
For each of the phases of a `get` operation `libelektra-kdb` calls the backend plugin's `elektra<Plugin>Get` function once.
Similarly, for the phases of a `set` operation `elektra<Plugin>Set` is called.
The backend plugin can also (optionally) delegate to other plugins.

![Sequence of phases in `get` operation](kdbGet.svg)
The current phase is communicated to the backend plugin (and any other plugin) via the global keyset.
It can be retrieved via the `elektraPluginGetPhase` function.
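
As a rough illustration of the "one call per phase" idea described above, the following self-contained sketch shows a single entry point that branches on the current phase.
It does not use the Elektra headers: `Phase`, `MockPlugin`, `mockGetPhase`, `mockBackendGet` and `mockRunGet` are hypothetical stand-ins, and the real `elektraPluginGetPhase` function and its return type may look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT Elektra's real types. */
typedef enum
{
	PHASE_INIT,
	PHASE_RESOLVER,
	PHASE_CACHECHECK,
	PHASE_PRE_STORAGE,
	PHASE_STORAGE,
	PHASE_POST_STORAGE,
	PHASE_COUNT
} Phase;

typedef struct
{
	Phase currentPhase;	/* Elektra communicates this via the global keyset */
	int calls[PHASE_COUNT]; /* how often each phase was handled */
} MockPlugin;

/* Stand-in for a phase lookup such as `elektraPluginGetPhase` (real signature not assumed). */
static Phase mockGetPhase (const MockPlugin * plugin)
{
	assert (plugin != NULL);
	return plugin->currentPhase;
}

/* Single entry point, called once per phase. Returns 0 on success, -1 for an unknown phase. */
static int mockBackendGet (MockPlugin * plugin)
{
	Phase phase = mockGetPhase (plugin);
	switch (phase)
	{
	case PHASE_INIT:
	case PHASE_RESOLVER:
	case PHASE_CACHECHECK:
	case PHASE_PRE_STORAGE:
	case PHASE_STORAGE:
	case PHASE_POST_STORAGE:
		/* a real backend would do phase-specific work here or delegate to another plugin */
		plugin->calls[phase]++;
		return 0;
	default:
		return -1;
	}
}

/* Plays the role of the caller: sets the phase, then calls the same function again. */
static int mockRunGet (MockPlugin * plugin)
{
	for (int p = PHASE_INIT; p < PHASE_COUNT; ++p)
	{
		plugin->currentPhase = (Phase) p;
		if (mockBackendGet (plugin) != 0) return -1;
	}
	return 0;
}
```

The point of the sketch is only the shape of the control flow: the caller never selects a different function per phase, it changes the phase and calls the same entry point again.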

![Sequence of phases in `set` operation](kdbSet.svg)
### `parentKey`

The diagrams above show the possible sequences of phases during a `get` and a `set` operation.
For each of the phases of a `get` operation `libelektra-kdb` calls the backend plugin's `elektra<Plugin>Get` function once.
Similarly, for the phases of a `set` operation `elektra<Plugin>Set` is called.
The key `parentKey`, which is given to the backend plugin as an input at various points, must be treated carefully.
Currently, _all_ modifications to this key will be propagated to the `parentKey` that was used to call `kdbGet`.

The current phase is communicated to the backend plugin via the global keyset.
The value of the key `system:/elektra/kdb/backend/phase` is always set to the current phase.
The name of the `parentKey` is marked read-only and therefore cannot be changed.
The value and metadata can, and in some cases must, be changed.
In the future, this may be restricted further to ensure more structured communication.

### Operation `get`

Expand All @@ -30,10 +108,10 @@ During the `init` phase the backend plugin is called with:
The key name and value of this key are read-only.
The name of `parentKey` is chosen to make it easier for the plugin to produce good error messages.
- A keyset `definition` containing the mountpoint definition.
To make things easier for the plugin, keys in `definition` are translated into cascading keys relative to `parentKey`.
For example, if the key `system:/elektra/mountpoints/system:\/hosts/path` is set in the KDB, then `definition` will contain a key `/path`.

TODO: how does the backend plugin get access to the other plugins? Maybe, just add a plugins/# array into `ks` containing keys with `Plugin *` values?
To make things easier for the plugin, keys in `definition` are renamed to be below `system:/`.
For example, if the key `system:/elektra/mountpoints/system:\/hosts/path` is set in the KDB, then `definition` will contain a key `system:/path`.
- Additionally, the plugins for the current mountpoint are opened by `libelektra-kdb` and provided to the backend plugin via the global keyset.
They can be accessed via the `elektraPluginFromMountpoint` function.
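
The renaming of keys in `definition` can be pictured with a small string-level sketch.
It is a simplification under stated assumptions: Elektra operates on `Key` objects rather than raw strings, and `definitionKeyName` and `MOUNTPOINTS_ROOT` are hypothetical names introduced only for this example.
The sketch assumes that a backslash escapes the following character inside the mountpoint name, as in `system:\/hosts`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper constant: the root below which mountpoint definitions are stored. */
#define MOUNTPOINTS_ROOT "system:/elektra/mountpoints/"

/*
 * Maps "system:/elektra/mountpoints/<mountpoint>/<rest>" to "system:/<rest>".
 * Returns 0 on success, -1 if the name is not below a mountpoint or `out` is too small.
 */
static int definitionKeyName (const char * fullName, char * out, size_t outSize)
{
	assert (fullName != NULL && out != NULL);

	size_t rootLen = strlen (MOUNTPOINTS_ROOT);
	if (strncmp (fullName, MOUNTPOINTS_ROOT, rootLen) != 0) return -1;

	/* skip the (escaped) mountpoint name up to the next unescaped '/' */
	const char * p = fullName + rootLen;
	while (*p != '\0' && *p != '/')
	{
		if (*p == '\\' && p[1] != '\0') ++p; /* jump over the escaped character */
		++p;
	}
	if (*p != '/') return -1; /* nothing below the mountpoint */

	int written = snprintf (out, outSize, "system:/%s", p + 1);
	return (written < 0 || (size_t) written >= outSize) ? -1 : 0;
}
```

With the example from the text, the input `system:/elektra/mountpoints/system:\/hosts/path` (written as `"system:\\/hosts"` in a C string literal) yields `system:/path`.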

The backend plugin then:

Expand All @@ -49,7 +127,7 @@ This phase exists purely for the backend plugin to initialize and configure itse

> **Note**: This phase is only executed _once per instance of `KDB`_.
> Only the first `kdbGet()` call will result in `libelektra-kdb` executing this phase; all future calls to `kdbGet()` (and `kdbSet()`) start with the `resolver` phase.
> The backend plugin must store the information contained in the mountpoint definition internally to accommodate this.
> The backend plugin must store the necessary information contained in the mountpoint definition internally to accommodate this.

#### Resolver Phase

Expand Down Expand Up @@ -81,8 +159,7 @@ During the `cachecheck` phase the backend plugin is called with:

- The exact `parentKey` that was returned by the `resolver` phase of this `get` operation.
The key name and value of this key are read-only.
Additionally, the metakey `internal/kdb/cachetime` is set to a value indicating the update time of the cache entry.
TODO: exact format of `internal/kdb/cachetime` TBD
Additionally, the metakey `internal/kdb/cachehandle` is set to a value indicating the cache handle (usually modification time) of the cache entry.
- An empty keyset `ks`.

The backend plugin then:
Expand Down Expand Up @@ -130,8 +207,6 @@ It is where validation, generation of implicit values and similar tasks happen.

Finally, `libelektra-kdb` merges the keyset returned by the `poststorage` phase with the ones returned by other backend plugins for different mountpoints and then returns it to the user.

TODO: Are modifications to `parentKey` visible to the user?

### Operation `set`

The `set` operation is optional.
Expand Down Expand Up @@ -237,8 +312,6 @@ This makes the `postcommit` phase mostly useful for logging.

Finally, `libelektra-kdb` merges the keyset returned by the `postcommit` phase (which is still the same one that was returned by the `prestorage` phase) with the ones returned by other backend plugins for different mountpoints and then returns it to the user.

TODO: Are modifications to `parentKey` visible to the user?

#### Rollback Phases (`set` only)

If any of the phases `prestorage`, `storage`, `poststorage`, `precommit` or `commit` fails, `libelektra-kdb` will continue with the rollback phases.
Expand Down Expand Up @@ -267,5 +340,3 @@ In the `rollback` phase the backend plugin:
- **MAY** act differently depending on which phase failed.

Finally, `libelektra-kdb` will restore `ks` to the state in which the user provided it and return.
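
The failure handling described in this section can be sketched as a small driver loop.
This is not the `libelektra-kdb` implementation: `SetPhase`, `PhaseFn`, `runSetPhases`, `Recorder` and `recordPhase` are hypothetical names, the `resolver` phase is left out, and ignoring the result of `postcommit` is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phase identifiers for the `set` operation (resolver omitted). */
typedef enum
{
	SET_PRE_STORAGE,
	SET_STORAGE,
	SET_POST_STORAGE,
	SET_PRE_COMMIT,
	SET_COMMIT,
	SET_POST_COMMIT,
	SET_PRE_ROLLBACK,
	SET_ROLLBACK,
	SET_POST_ROLLBACK,
	SET_PHASE_COUNT
} SetPhase;

/* Hypothetical callback: returns 0 on success, non-zero on failure. */
typedef int (*PhaseFn) (SetPhase phase, void * userData);

/* Runs the phases in order; on the first failure it runs the rollback phases and returns -1. */
static int runSetPhases (PhaseFn callBackend, void * userData)
{
	static const SetPhase critical[] = { SET_PRE_STORAGE, SET_STORAGE, SET_POST_STORAGE, SET_PRE_COMMIT, SET_COMMIT };
	static const SetPhase rollback[] = { SET_PRE_ROLLBACK, SET_ROLLBACK, SET_POST_ROLLBACK };

	assert (callBackend != NULL);
	for (size_t i = 0; i < sizeof critical / sizeof critical[0]; ++i)
	{
		if (callBackend (critical[i], userData) != 0)
		{
			for (size_t j = 0; j < sizeof rollback / sizeof rollback[0]; ++j)
			{
				(void) callBackend (rollback[j], userData);
			}
			return -1;
		}
	}
	(void) callBackend (SET_POST_COMMIT, userData); /* assumption: result ignored */
	return 0;
}

/* Test double that records the phases it sees and fails at a chosen phase. */
typedef struct
{
	int failAt; /* phase that should fail, or -1 for none */
	SetPhase seen[SET_PHASE_COUNT];
	size_t count;
} Recorder;

static int recordPhase (SetPhase phase, void * userData)
{
	Recorder * rec = userData;
	if (rec->count < SET_PHASE_COUNT) rec->seen[rec->count++] = phase;
	return (int) phase == rec->failAt ? -1 : 0;
}
```

If `recordPhase` is told to fail at `SET_STORAGE`, the recorded sequence is `prestorage`, `storage`, `prerollback`, `rollback`, `postrollback`, which mirrors the `option on failure` branch of the `kdbSet` diagram.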

TODO: Are modifications to `parentKey` visible to the user?
112 changes: 1 addition & 111 deletions doc/dev/data-structures.md
Expand Up @@ -37,7 +37,7 @@ It performs well for lookup, but needs more memory allocations.

Currently the `KeySet` is implemented as a sorted array.
It is fast on appending and iterating, and has nearly no size-overhead.
To improve the lookup-time, an additional **hash** will be used.
To improve the lookup-time, an additional **hash** is used, see [OPMPHM](#order-preserving-minimal-perfect-hash-map-aka-opmphm) below.

### ABI compatibility

Expand Down Expand Up @@ -194,40 +194,6 @@ indicating where the new value should be inserted when the key
is not found.
Elektra now also uses this trick internally.
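
A minimal sketch of such a search over a sorted array follows.
It assumes that the trick referred to above is the convention of returning `-(insertion point) - 1` when the key is not found (as Java's `Arrays.binarySearch` does).
The function name `searchSorted`, the use of plain strings instead of `Key` objects, and the `strcmp` ordering are simplifications for illustration and do not reflect Elektra's internal code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Binary search over a sorted array of names.
 * Returns the index if `name` is found, otherwise -(insertion point) - 1,
 * so every negative result still tells the caller where to insert.
 */
static long searchSorted (const char * const * names, size_t size, const char * name)
{
	assert (names != NULL || size == 0);

	size_t low = 0;
	size_t high = size;
	while (low < high)
	{
		size_t mid = low + (high - low) / 2;
		int cmp = strcmp (names[mid], name);
		if (cmp == 0) return (long) mid;
		if (cmp < 0)
			low = mid + 1;
		else
			high = mid;
	}
	return -(long) low - 1;
}
```

A caller can recover the insertion point from a negative result `r` with `-r - 1`, which avoids a second search when a missing key should be appended in order.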

### Internal Cursor

`KeySet` supports an
**external iterator**
with the two functions
`ksRewind()` to go to the beginning and `ksNext()` to
advance the _internal cursor_ to the next key.
This side effect is used to indicate a position for
operations on a `KeySet` without any additional parameter.
This technique is comfortable to see which
key has caused an error after an unsuccessful key database operation.

Elektra only has some functions to change the cursor of a key set.
But these allow the user to compose powerful functions.
Plugins do that extensively as we will see later
in `ksLookupRE()`.
The user can additionally write more such functions for
his or her own purposes.
To change the internal cursor, it is
sufficient to iterate
over the `KeySet` and stop at the wanted key.
With this technique, we can, for example, realize
lookup by value, by specific metadata and by
parts of the name.
Without an additional index, it is not possible that
such operations perform more efficiently
than by a linear iteration key by key.
For that reason, Elektra’s core does not provide
such functions.
The function `ksLookupByName()`, however,
uses the more efficient binary search
because the array inside the `KeySet`
is ordered by name.

### External Cursor

External cursor is an alternative to the approach explained above.
Expand Down Expand Up @@ -282,82 +248,6 @@ For example,
for every key in a `KeySet` without having null pointer or
out of range problems.

## Trie vs. Split

Up to now,
we have discussed external data structures visible to the user of the
library.
The application and plugin programmer needs them
to access configuration.
Last, but not least,
we will show two internal data structures.
The user will not see them.
To understand the algorithm, however,
the user needs to understand them as well.

### Trie

A _Trie_ or prefix tree is an ordered tree
data structure.
In Elektra,
it provides the information
to decide
in which backend a key resides.
The algorithm, presented in [algorithm](algorithm.md),
also needs a list of all backends.
The initial approach was to iterate over the `Trie`
to get a list of all backends.
But the transformation of a `Trie` to a list of backends contained
many bugs caused by corner cases in connection with the default backend
and cascading mount points.

### Split

So, instead of transforming the trie to a list of backends,
we introduced a new data structure called `Split`.
The name `Split` comes from the fact that
an initial key set is split into many key sets.
These key sets are stored in the `Split` object.
`Split` advanced to the central data structure for the algorithm:

```c
typedef struct _Split Split;

struct _Split {
	size_t size;
	size_t alloc;
	KeySet **keysets;
	Backend **handles;
	Key **parents;
	int *syncbits;
};
```

The data structure `Split` contains the following fields:

- **size**: contains the number of key sets currently in `Split`.

- **alloc**: allows us to allocate more items than currently in use.

- **keysets** represents a list of key sets.
The keys in one of the key sets are known to belong to a specific
backend.

- **handles**: contains a list of handles to backends.

- **parents**: represents a list of keys.
Each `parentKey` contains the
root key of a backend. No key of the respective key set is above the
`parentKey`.
The key name of `parentKey` contains the mount point of a backend.
The resolver writes the file name into the value of the `parentKey`.

- **syncbits**: are some bits that can be set for every backend.
The algorithm uses the `syncbits` to decide if the key set needs to be
synchronized.

Continue reading [with the error handling](error-handling.md).

## Order Preserving Minimal Perfect Hash Map (aka OPMPHM)

The OPMPHM is a non-dynamic randomized hash map of the Las Vegas type, that creates an index over the elements,