-
Notifications
You must be signed in to change notification settings - Fork 123
[New Backend] Discussion: Record Elektra tooling #4371
[New Backend] Discussion: Record Elektra tooling #4371
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks very promising to me, but there are a few problems I can see.
The recording on a per-user basis could cause problems with sudo
, but I don't really have a good solution for this. Maybe you could store the whole recording data outside the KDB in e.g. $HOME
. The env-var should be preserved for sudo command
, but that obviously complicates your plugin.
The kdb record export
feature is also a bit problematic. I think you should start with a simpler kdb record replay
. Maybe also consider creating a kdb record diff
/kdb record apply
to create and apply proper patch files (in a somewhat custom format) instead of relying on export/import.
The kdb export
/kdb import
tools were designed to export/import state snapshots. But I think you want a proper diff/patch system. Where the patch file can only applied, if the KDB is in a compatible state (like with git).
|
||
The `kdb record` tooling enables the user to record configuration sessions. It also allows exporting and importing the configuration changes. | ||
|
||
Recording sessions are stored on a per-user basis. This implies that a recording session can only track changes made by a single user. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about cases where a sudo kdb
is required, either because you need to change system:/
or spec:/
keys, or because of other permission issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could mount user:/elektra/record
with a (custom) resolver that only uses the $HOME
to detect the user directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such a resolver would fail, if $HOME
isn't set. It also kinda defeats the point of user:/
.
I'm not sure what the correct solution is but, storing information about system:/
and/or spec:/
keys in the user:/
namespace was weird from the beginning. You could store recordings in system:/elektra/record
, but then you get into permissions issues. With a single mountpoint you either always need sudo
, or anybody can write into any session. So for proper permissions, you'd need to create a new mountpoint for every session, then the file could be owned by the user that started the session. But creating mountpoints again requires sudo
.
Maybe we should store the recordings outside the KDB entirely. I'm think of something like this:
- Every session is stored in a single file. The path of the file serves as the unique ID (no session_name needed)
kdb record start [session_file]
takes the path to the session file, if none is given it creates one in the current directory. If the given file exists the session will be resumed, otherwise a new file is created.- The session file uses a line-based format (something custom or, e.g. JSON lines). That way adding a new entry to a session is easy, just append a line. It also avoids the need for sequence numbers or unique timestamps, the ordering is given by the order of the lines.
- The active session is defined by
/elektra/record/session
(normal cascading lookup).kdb record start
writes touser:/elektra/record/session
by default, but to cover thesudo
case you can usesudo kdb record start --system
with writes tosystem:/elektra/record/session
which starts a system-wide session. Alternatively, you could callkdb record start
andsudo kdb record start
with the same file to start the session for both the current and the root user, as along as both users have access to the file this should work. System-wide (or even root) sessions could of course be problematic on multi-user systems, because a different user could modify the KDB (possibly withsudo
) simultaneously which would interfere with the recording. So as a final option, I would suggest that theELEKTRA_RECORD_SESSION
environment variable is also read (*) and takes precedence. Then you can simply set the env-var (has to be done manually) and as long as you stay in the same shellsudo
changes will be recorded correctly.
By storing recordings outside the KDB you also don't need to make sure you exclude the user:/elektra/record
keys from the recording.
Item 4 from above, could be applied to your current setup as well. But then you still need to solve the problem of where to store the recording.
(*) We could use gopts
for that, but IMO that would be overkill here. Directly reading ELEKTRA_RECORD_SESSION
and putting it into proc:/elektra/record/session
(ensures precedence) either during kdbOpen
or during kdbGet
, is much simpler and also does the job.
If you really want to store everything in the KDB, there is another option. It's not pretty, but it would work. Let the env-var ELEKTRA_RECORD_USER
define the user for which the changes should be recorded, if the root user causes a change. In other words, if you set ELEKTRA_RECORD_USER=kodebach
before calling sudo kdb set system:/foo 123
then the change will be recorded in the active session of user kodebach
instead of the root user's session. I say this is not pretty, because it requires special case handling for when the root users changes something (not so hard), and writing to another user's user:/
namespace (quite a bit harder).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it looks like storing the session externally is the best option.
Not sure though that JSON is the best format for this, as storing binary data is pretty cumbersome. I should probably evaluate some other formats like BSON or protobuf as well.
Question is whether @markus2330 approves of adding a new dependency to the kdb
utility. We could make it optional at build-time though and only include the kdb record
commands only when it is included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure though that JSON is the best format for this, as storing binary data is pretty cumbersome.
That's true.
I should probably evaluate some other formats like BSON or protobuf as well.
To make the whole "just append to a file" idea work, you need a format that has streaming support. If you need to wrap the whole file into something like a JSON array, then "just append to a file" doesn't work.
If you want to look at protobuf, maybe also check out flatbuffers. It's similar but designed to require minimal parsing/unpacking. The format can even be mmap
ed.
adding a new dependency to the
kdb
utility.
I see a few options:
- Use a sufficiently simple format that we can implement the parsing ourselves.
- Use a library that can be linked statically (e.g. I think for flatbuffers this can be done)
- Make
kdb record
optional as you suggest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@atmaxinger Can you please summarize pros/cons of external serialization in a decision?
I think it is very elegant to have everything within KDB:
- we can use all our plugins as storage format
- users can read/manipulate the recordings with their own tooling easily
But if there are compelling reasons against it (sudo alone imho is not, this could be fixed by having "system" sessions that are also accessible by a user, e.g. because it is mounted to /tmp), we'll look for alternatives. Currently the only feature in Elektra that has external data is the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by having "system" sessions that are also accessible by a user
That doesn't work in a true multi-user environment. The system session would be active for all users and that would be nightmare. The environment variable trick, was the only solution I could come up with for this problem.
I think it is very elegant to have everything within KDB [...]
I guarantee you we will again run into the limitations of the KeySet
structure. For one you are limit to those formats that can store a KeySet
1:1 with all metadata, non-leaf values, binary data, arbitrary names etc.
Some other reasons:
- It's kind of an abuse of the KDB. The KDB is meant to store configuration data. This is quite clearly not configuration data.
- Users can't choose the storage plugin anyway, since we store data below
/elektra
where adding mountpoints is prohibited. - I don't think reading/manipulating the recordings with existing tooling would really be easy, if we force the recording into a
KeySet
format. The format with the key names as basenames needs an extra parsing step to actually know what's going on. You need to first parse the storage file as normal, but then you need to handle the key names as basenames before you actually have the parsed the data fully. - Using something like "JSON lines" would actually be incredibly easy to handle with existing tools. Just using the
jq
tool already gets you very far. The problem is of course binary data, but Base64 is always an option, or we could support a text-based and a binary format.
One thing we could do is a mixed solution. We use a non-KDB wrapper format for the log entries, but each entry contains an old
and a new
field which are encoded by the plugin specified in format
. This would also take care of the issue of changes to multiple keys, since we are recording a full KeySet
for old
and new
.
Binary values will still be a problem, if the wrapper format is text-based. But again we can use Base64, or we use a binary wrapper format. The wrapper format could even be implemented in plugins. Although those would have to be separate from the normal storage plugins.
Here are some examples with different wrapper formats:
JSON Lines
{ "parent": "user:/foo", "format": "dump", "old": "kdbOpen 2\n$end\n", "new": "kdbOpen 2\n$key string 0 3\n\n123\n$end\n"}
{ "parent": "user:/foo", "format": "dump", "old": "kdbOpen 2\n$key string 0 3\n\n123\n$end\n", "new": "kdbOpen 2\n$key string 0 4\n\n1234\n$end\n"}
YAML
parent: "user:/foo"
format: dump
old: |
kdbOpen 2
$end
new: |
kdbOpen 2
$key string 0 3
123
$end
---
parent: "user:/foo"
format: dump
old: |
kdbOpen 2
$key string 0 3
123
$end
new: |
kdbOpen 2
$key string 0 4
1234
$end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get good results in a reasonable time we must take one step after the other:
- Requirements+Description of user-facing parts
- Major decisions (structure of log format, internal/external, global plugins, tooling, etc) and architecture
- Details like which extra libraries, how to implement etc.
Currently we are at step 1, slowly progressing to step 2. Now is not yet the time to discuss syntax, formats etc. (Step 3) They easily bring us astray, as happened in this discussion. Let us focus again on requirements.
The question is which of the following statements need to be captured by the same session:
kdb set dir:/some/key
kdb set user:/some/key
sudo -u other_user kdb set user:/some/key
sudo kdb set system:/some/key
kdb set user:/some/key
executed by another user
also answering the question:
Do we want the record tooling to work different to Elektra, which knows system, user and dir scopes?
I would answer the second question with no and would not expect 3-5 to end up in a user-session. I would expect all of the calls to end up in a system-session.
From your earlier statements, related to these requirements:
But creating mountpoints again requires sudo.
Not if we finally implement #1074. Maybe it is related to how recording should look like, as also for recording there might be use cases to have recordings for a system, for a user and for a specific directory.
Users can't choose the storage plugin anyway, since we store data below /elektra where adding mountpoints is prohibited.
I don't see a fundamental problem, we could easily allow well-defined mountpoints within /elektra, in specific for recording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this is quickly becoming one of those discussions with 1000 questions and no answers, where all arguments go straight past each other. I will address some of your concerns, but my main response is at the bottom and in a separate comment outside this thread..
You did not address my concerns, about system sessions in multi-user environments. Let's say Alice needs to record a sequence that modifies both user:/
end system:/
keys. To achieve her goal, Alice starts a session alice1
in both user and in system mode.
kdb record start alice1 # creates and activates the session for current user
sudo kdb record start alice1 # creates and activates the session system-wide
Now all changes that Alice makes are recorded in the user/system session. But what happens if at the same time, Bob also uses the machine and makes some changes to system:/
keys. These would be recorded in the alice1
session, since the session is active system-wide. Even worse, what if Bob also wants to start a system session
sudo kdb record start bob1
Now some of Alice's changes will be in alice1
(maybe with some stuff from Bob), but the rest will go into bob1
.
Unless I misunderstood what you mean by "system sessions", this cannot work like this. There is only one system:/elektra/record/config/session/current/name
key for the whole system and we can't use a key in user:/
to define the active system session, because that once again won't be accessible to the root user.
This problem totally negates your question "which of the following statements need to be captured by the same session". If the whole concept doesn't work, there is no need to decide what is recorded in which session.
I think this "system session" concept could only work, if we use an environment variable to define the active session instead. That wouldn't be shared between users. But you didn't even react to this suggestion at all...
To address some other things:
But creating mountpoints again requires sudo.
Not if we finally implement #1074.
IIRC the decision there was that user:/elektra/mountpoints
can only defined user:/
or dir:/
mountpoints, but not system:/
ones. So that's not gonna help.
I don't see a fundamental problem, we could easily allow well-defined mountpoints within /elektra
True, that issue might not actually be a problem.
With all of that out of the way, I agree with you that we need to go step-by-step here. However, IMO we should at least partially consider everything from the beginning. It is useless, if we come up with a perfect description of requirements, only to discover that those requirements cannot be put into practice.
* `kdb set user:/my/app/hello max` | ||
* `kdb set user:/my/app/hello susi` | ||
|
||
Only the value `susi` will be exported. Additionally, if you execute `kdb set user:/my/app/hello world` the key will not be exported at all, as there are no changes compared to the beginning of the recording. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there should also be a kdb record replay
that simply applies one change after another exactly as recorded. Firstly, this is much simpler to implement, so you'll have something to test much sooner. And secondly, there could be cases where because of complex plugin interactions you need some intermediate state to get the correct final result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, a "replay" and "undo" (go back to what was before the session started) sound much more intuitive. "replay" is probably export and then import with merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"replay" is probably export and then import with merge.
I think that's unnecessarily complex. Directly applying all the recorded changes one-by-one and then doing kdbSet
at the end (to make the abort in case of conflicts easy), seems much simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear: should replay
export something so that it can be used on another machine/kdb instance, or is it only a "redo" if you "undo" the changes of this session?
If we want to export something I can envision that it just outputs a shell script with the exact kdb
commands that have been entered in this session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about replay
applying the changes immediately, but writing to a file could be even better. But my main point was that kdb record export
seems to produce the diff between start and end. So something like this
kdb set user:/foo/bar abc
kdb record start
kdb set user:/foo/bar 123
kdb set user:/foo/bar abc
kdb record export
produces an empty export file, since user:/foo/bar
had the same value at the start and at the end.
With kdb record replay
the intermediate change should still happen, because it could have side effects (depending on what plugins do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear: should replay export something so that it can be used on another machine/kdb instance, or is it only a "redo" if you "undo" the changes of this session?
Yes, we need to distinguish these use cases clearly. The minimum local functionality is undo
(redo
, replay
is probably not needed so often). The more interesting part is the export+import functionality, which must continue to work after modifications, ergo 3-way merge yields better results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this really depends on what export
actually does. If it ignores intermediate changes, some other command that doesn't is also essential. It could also be an option for the kdb record export
command. But if we cannot export the full log of changes, the tool is just a diff tool and not a proper recording tool.
Added system:/hosts/ipv4/www.microsoft.com with value 4.4.4.4 | ||
Removed system:/hosts/ipv4/www.apple.com | ||
$ kdb record export --format ansible | ||
<output corresponding to ansible-libelektra plugin (https://galaxy.ansible.com/elektra_initiative/libelektra)> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please provide the full output
doc/dev/kdb-record.md
Outdated
$ kdb set system:/hosts/ipv4/www.google.com 8.8.8.8 | ||
$ kdb set system:/hosts/ipv4/www.microsoft.com 4.4.4.4 | ||
$ kdb rm system:/hosts/ipv4/www.apple.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on your installation these may not work. It is quite likely that sudo
is required to modify system:/
keys, since /etc/hosts
normally is owned by root:root
.
doc/dev/kdb-record.md
Outdated
Every modification will include its UTC timestamp and the modified key in its path. | ||
For example, `user:/record-elektra/sessions/MySession/12345678/system:\/hello`. The value of this key will be the modification action, i.e. `created`, `modified`, `deleted`. | ||
|
||
Based on which modification took place (creation, modification or deletion), the old and new value for the key will also be recordes as `.../old` and `.../new`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, seems like I forgot about meta-data. However it should not be too hard to fit that in using the same mechanism.. will take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need to clarify what exactly gets recorded.
|
||
## Importing configuration changes | ||
|
||
The output of the `kdb record export` command can be imported with the existing `kdb import` functionality. Depending on the specified storage plugin, other steps may be necessary, e.g. the Ansible plugins creates an Ansible playbook. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don' think this is as easy as you make it out to be. I'm pretty sure you need at least a special kdb import --strategy
, if not a separate kdb record import
.
See below (Tooling) for more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there should be also an explicit statement which --strategy must be used. For the user, however, it is better to provide a replay
, even if it is only a 2 line shell script.
The tooling will be implemented as part of the `kdb` command line utility. | ||
|
||
The most complex part of it will be the `kdb record export` command. | ||
When invoked, it must compute the minimal changeset of the session. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't record the full initial state of a session, which is fine, since making a copy of the full KDB for every recording would be pretty expensive. But this means that you only know about keys that were modified or deleted, unmodified keys will simply be missing from the changeset (which is totally normal).
But this means you need a special format for the changeset. The standard storage formats (like dump
) have no way to record deleted keys. A deleted key would simply not be present. But as stated above, unmodified keys would also be missing. So there is no way to distinguish them.
With the standard kdb import
you then only have two options:
- Keep the KDB as is and only overwrite what is contained in the imported KeySet
- Delete the part of the KDB where data will be imported and write the imported KeySet there
Therefore, with option 1 you might keep keys that should be deleted and with option 2 you might delete keys that should be kept unmodified.
So, you need a somewhat special export format, that can either be imported with a certain plugin, or with a separate kdb record import
plugin. Your export format needs to record deleted keys and that is not something the normal storage formats are designed to do.
However, it might be enough to use a format that supports metadata (like dump
or quickdump
) and add a special metakey (e.g. meta:/elektra/deleted
) to keys that should be deleted. You'll still need a separate kdb record import
or a special kdb import --strategy
to correctly apply these changes though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
I'm thinking we should define two new functions that plugins can implement - diffImport
and diffExport
(names are up for discussion).
This would allow us to implement various formats in the future.
kdb record export
and kdb record import
would then just call the respective function of the plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is a good approach. I think you should either use a KeySet
based format (hard to do, but may be possible with heavy metadata use), which could then be stored with the existing plugins, OR you should just hard code a custom format somewhere for now.
The diffImport
and diffExport
functions would have entirely different requirements than the normal get
/set
functions of storage plugins, so it wouldn't really make sense to add them to the existing plugins.
Personally, I don't really see the point of having different patch formats, so I would just create one hardcoded format.
|
||
## Exporting configuration changes | ||
|
||
`kdb record export [session_name] [--format <format>]`: Will dump the changed keys of the specified session onto `stdout`. If no session name is provided, the current session will be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should probably provide an example before this, since the current order makes it sound like kdb record
doesn't actually record all changes. Until you get to the example, it sounds like you just get a diff between start and stop states.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree, it is better to explain first and then give examples. The explanation probably needs an update. Simply saying "dump the changed keys" is vague. What about removed keys?
"onto stdout
. If no session name is provided, the current session will be used." -> again details for the man page
doc/dev/kdb-record.md
Outdated
$ kdb record changes | ||
Changed system:/hosts/ipv4/www.google.com from 1.2.3.4 to 8.8.8.8 | ||
Added system:/hosts/ipv4/www.microsoft.com with value 4.4.4.4 | ||
Removed system:/hosts/ipv4/www.apple.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe there should also be a way to mark the initial state of a keys as important.
$ kdb record mark system:/hosts/ipv4/www.libelektra.org
$ kdb record changes
Unchanged system:/hosts/ipv4/www.libelektra.org must have value 7.7.7.7
This would ensure that system:/hosts/ipv4/www.libelektra.org
is recorded and exported with the current value.
In the diff/patch system I mentioned, this could be used to force compatible KDB states. In the export/import system it is less important since you could just change and then change back the key to get it into the changeset. But sometimes there may be keys that can't be changed without side-effects that you don't want during the session recording. For example, you might need to record the port a server uses, to make sure the rest of your session is compatible. But if this server is currently running, changing the port could break things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced if such a feature is needed. Wouldn't someone simply edit the exported file in such cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't someone simply edit the exported file in such cases?
Depends how easy it is to edit the file. But it's also a matter of discoverability. Users need to know that they can add such things to the exported file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this feature could come in really handy and it doesn't look like it's going to be too much of a hassle to implement. I'll consider it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Editing the files obviously has a big drawback if you are in a round-trip workflow. So yes, some explicit statement of that you need the default as it is now makes sense.
It would bring quite a number of new cases, though: also marked keys could be changed, added or removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would bring quite a number of new cases, though: also marked keys could be changed, added or removed.
I don't think so. IMO "marking" a key could internally just be implemented as recording a "Changed user:/foo from A to A", i.e. with the /old
and /new
keys being equal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kodebach I don't understand your sentence.
I mean cases like that a marked key could be removed and a later time reappear, then it still should be marked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could restrict marking keys to keys that have not been modified yet in the current session. Before an import we then check that those marked keys have the exact value.
Alternatively, we could just "mark" all changed keys implicitly. We already record the old values of modified keys. We could just check that the "old" values of all changed keys that the will be imported equal the values in the target database. This way we don't need an explicit mark command. Additionally, we can provide a command line paramter like --dont-check
that skips this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just check that the "old" values of all changed keys that the will be imported equal the values in the target database
I would've done this anyway. If you don't check for the old value, the whole thing isn't really different from just making your modifications and doing a normal kdb export
and a normal kdb import
on the other end.
Additionally, we can provide a command line paramter like
--dont-check
that skips this.
Yes, either that, or there should be away to do this on a key by key level. But the default should the that the old value is checked. Otherwise, you accidentally replace something important.
This way we don't need an explicit mark command.
We still do, because maybe you want to mark an unchanged key. You cannot mark all of those, because then you're just doing an export of the whole KDB.
I think before we can discuss this properly, we need to distinguish between a diff and a log.
kdb set user:/foo 123
kdb record start
kdb record mark user:/foo
kdb rm user:/foo
kdb set user:/foo 123
After this sequence there will be a log of everything that happend somewhere in user:/elektra/record
. This log will contain something like:
Started session
Marked key user:/foo with value 123
Removed key user:/foo with value 123
Added key user:/foo with value 123
In this log there is no problem at all. My proposal here was that instead of introducing a new "Marked key ... with value ..." type of log entry, we just let the log look like this (even if it looks a bit silly):
Started session
Changed key user:/foo from value 123 to value 123
Removed key user:/foo with value 123
Added key user:/foo with value 123
Of course this is just the internal representation. On the user-facing side, we should use a more sensible text.
Now all the problems you are thinking of only arise, when you try to collapse this log into a single diff. I don't know whether user:/foo
should still be marked in this case. IMO it is a matter of definition, if removing a key removes it's marked status. Things become even more complicated, when you also add a final change:
Started session
Marked key user:/foo with value 123 | Changed key user:/foo from value 123 to value 123
Removed key user:/foo with value 123
Added key user:/foo with value 123
Changed key user:/foo from value 123 to value 456
Now what? Is the key now marked with value 456? Has it now lost the marking?
This is actually, why I think we really need a way to export the log and not just the diff. That's also why I wanted kdb record replay
(which could also take an exported log as input). The replay
command would replay the log one step at a time, instead of just applying a diff. So in our example a replay would:
- Check that
user:/foo
has the value123
- Check that
user:/foo
has the value123
and remove it - Add the key
user:/foo
with value123
- (Check that
user:/foo
has value123
and change it's value to456
)
That's where the "Changed user:/foo from A to A" idea came from. We need to check that the current value is A
anyway (otherwise there is a conflict) so the fact that we then set it to A
doesn't change anything, but it achieved the goal of having a check.
Final note: So far this discussion is only about changing one key at a time. But it must of course be possible to have "changed/added/removed/marked" for multiple keys in a single log entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we must clarify logs/recordings and diffs/merges. @atmaxinger please also extend doc/help/elektra-glossary.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several features not yet described clearly and the overall structure should be changed (tutorial, man pages and architecture split in different files).
Overall: Looks amazing, I am very much looking forward to this functionality. 🚀
* `kdb record stop`: Stops the current recording session. | ||
* `kdb record delete [session_name]`: Deletes the specified recording session. If no session name is provided, the current session will be deleted. Deleting a session will not stop the recording. | ||
* `kdb record list-sessions`: Lists all sessions by their name | ||
* `kdb record changes [session_name]`: Shows all changes for the specified session. If no session name is provided, the current session will be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this some kind of export, too? I don't think this command is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on what the actual format of record export
looks like, it might be hard for a human to read. I think a simple listing of all changes like this could be quite helpful. We should, however, make it very clear that the output is not intended to be machine readable and might change at any time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as @kodebach mentioned the data displayed by this command is intended to be human-readable, as opposed to the machine-readable output of record export
.
This feature is intended as a helpful reminder when debugging the currently changed configuration. This should be documented more clearly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is kdb record status
then? Changes, also in human-readable or statistical formats could be done by plugins, then it is also usable outside of kdb record
.
|
||
## Exporting configuration changes | ||
|
||
`kdb record export [session_name] [--format <format>]`: Will dump the changed keys of the specified session onto `stdout`. If no session name is provided, the current session will be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree, it is better to explain first and then give examples. The explanation probably needs an update. Simply saying "dump the changed keys" is vague. What about removed keys?
"onto stdout
. If no session name is provided, the current session will be used." -> again details for the man page
doc/dev/kdb-record.md
Outdated
$ kdb record changes | ||
Changed system:/hosts/ipv4/www.google.com from 1.2.3.4 to 8.8.8.8 | ||
Added system:/hosts/ipv4/www.microsoft.com with value 4.4.4.4 | ||
Removed system:/hosts/ipv4/www.apple.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced if such a feature is needed. Wouldn't someone simply edit the exported file in such cases?
<output corresponding to ansible-libelektra plugin (https://galaxy.ansible.com/elektra_initiative/libelektra)> | ||
``` | ||
|
||
## Architecture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in the place where it should be! The file can be renamed to doc/dev/record.md
as it describes the whole recording, not only the user-facing tools.
|
||
All recording sessions are located under `user:/record-elektra/sessions/<SESSION_NAME>`. | ||
|
||
Every modification will include its UTC timestamp and the modified key in its path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about actions occurring within a second? Sequence numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! This is only a problem if the same key was to be changed multiple times in the same time interval.
We could use a timestamp with nanosecond resolution or ticks, that should be good enough.
Sequence numbers would also work, if we can generate them reliably.
I think we need to take a step back here. We need to answer the fundamental question: What is the intended use case for Not just, how can I use it, or what does it do. But actually why would I use it? @atmaxinger @markus2330 Please both answer this question. Let's postpone all further discussion until then, so that we can compare, if we are even trying to achieve the same goal here. When we have our answers, we can find some common ground and add a use-case definition to the document. For me the intended use case of The core of all these is that I want to record all changes that are made to the whole KDB over a certain period of time and get a detailed log of every change. In that sense, (*) could be a sequence of Note: The creation of a single diff between start and end of recording, is bonus to me. IMO it would be easier to export at the start and end of the modifications and do a diff between those exports, than to record everything just to collapse it into a single diff. |
@kodebach I would use it for iteratively prototyping configuration, similar to how we build Dockerfiles. In this case I am not yet sure which changes to the configuration are necessary. During those iterations it is quite possible that multiple keys will be changed only to then be restored to their original value. When I'm finished I want to export those changes and apply them to other systems. For me the log of how exactly the new state was reached is a byproduct, although a useful one. As you mentioned, it would enable undo functionality and audit logs. But from my point of view this is the bonus. The thing is - even though both use cases are different, a huge part of the code and user interface (the actual recording and the session management) is exactly the same. We just have to provide two separate "export"/"import" formats/commands:
I really don't see a big problem with implementing both. |
- When using an "append-only" format there is no need to create sequence numbers | ||
|
||
Cons: | ||
- Need to implement reliable saving ourself |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resolver
plugin could still be reused, if one of the directories associated with an Elektra namespace can be used. Otherwise, we could just extra the parts that are concerned with atomic writes into a separate library that can be reused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. On the other hand we just need to copy to a temporary file, append something to it, and then use rename
to replace the old file in the postcommit
stage. Shouldn't be too hard to implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's basically want resolver
does (when we ignore the actual resolving part). In the resolver
phase at the beginning of set
it resolves the real path, but actually returns the path for a temporary file. Then in commit
it does a rename
to actually apply the changes, assuming no other changes where applied in the meantime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't be too hard to implement.
Looking forward to see fixes in the resolver plugin then #1470 😉 It is actually quite hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resolver
is quite generic that is a part of what makes it so hard. If the use case for recording is more limited or uses a different API (e.g. an append
that atomically reads and updates the file), it might be easier. But I agree, it is probably not as simple as just using rename
.
Iteratively building a config is already possible (as far as applications support it) with But, if you just want to start with an empty config and then try different options by setting and unsetting values and in the end you want to have a config file that can be copied/applied to other machines, then you don't need any new features. You can already create a new mountpoint for your config, play around with
If the diff is the only goal, I'm not sure recording everything in between is the best approach.
That is true to some extent. But IMO as soon as you actually implement a record/replay feature, this becomes the main feature. I say this, because record/replay requires recording all changes, maintaining correct the order, specialized export formats, etc. And then you need to squash the separate change entries into a single diff, which is kinda complicated with lots of edge cases. So yes, you can implement both features with shared code, but it is still more complicated than just doing one of them.
Yes, having both features would be nice and seems quite possible. But if you indeed implement both, I would suggest you focus on the record/replay feature first. It is the basis for everything and you can always replay the whole recording to achieve the final config. It's not as efficient as a single direct diff, but it should still work. (*) More precisely you may still need to use the existing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review.
|
||
We need a way to store session recordings. | ||
Sometimes it is required to run the `kdb` application with `sudo`, especially when changing system-wide configuration. | ||
Session recordings should be on a per-user basis. Storing them in the `user:/` namespace is not compatible with `sudo`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first sentence is a requirement, the second sentence contradicts the first (see man page of sudo: "execute a command as another user").
- Can query recordings with existing tooling | ||
|
||
Cons: | ||
- Still needs workaround to start session (aka mount a session in `system:/`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean here that kdb record start
would need to be run as root? If it needs to mount something by design, I do not see a "workaround".
|
||
Cons: | ||
- Still needs workaround to start session (aka mount a session in `system:/`) | ||
- Semantic mismatch: Elektra is built for configuration data; session recordings are not configuration data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we need details what "session recordings" are (link to description).
Cons: | ||
- Still needs workaround to start session (aka mount a session in `system:/`) | ||
- Semantic mismatch: Elektra is built for configuration data; session recordings are not configuration data | ||
- Need to model session recordings to fit into Elektra data model (KeySet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as the one before?
- When using an "append-only" format there is no need to create sequence numbers | ||
|
||
Cons: | ||
- Need to implement reliable saving ourself |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't be too hard to implement.
Looking forward to see fixes in the resolver plugin then #1470 😉 It is actually quite hard.
|
||
The same principle applies to the creation and deletion of keys within a session. If you create a key and delete it in the same session, it will not be exported. | ||
|
||
By default, the elektra dump format is used. The optional --format parameter can be used to specify which storage plugin should be used to format the output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also look at how merging works 😉
The base doesn't contain user:/foo (assuming kdb record
defines the base), ours doesn't contain user:/foo, so theirs wins without conflict: user:/foo will stay present with 456
.
$ kdb set system:/hosts/ipv4/www.microsoft.com 4.4.4.4 | ||
$ kdb rm system:/hosts/ipv4/www.apple.com | ||
$ kdb record stop | ||
$ kdb record changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do fixes or create separate PRs, it is getting confusing here. I stop the review here.
Regarding the "might need a different plugin API" issue: I couldn't find an issue for it (if one exists), but we did want to move from the current This could be helpful here. Moving everything over to the new API would be a quite a bit of effort, but as long as the new API uses a different name for the exported function, the fallback should be seamless (just try the new one first, then the old one). We could even use the hierarchy of the
Details should probably be discussed elsewhere, but I just wanted to mention that this is probably the way to go, if you need a different plugin API. |
DO NOT MERGE
This is the first draft of what I envision the session recording tooling to be. The proposed solution should work with all the use cases I could think of, but someone more senior with the usage of Elektra will probably find some features lacking.
As for the tooling, we could probably also write a C library for it as proposed in #4208. That would also require a rewrite of the existing import and export tooling in C.
@markus2330 Please take a look at it and offer your valuable insights. We could also discuss this next week if there are any big topics to discuss.
@kodebach I'd also really appreciate your feedback.