Support for Kubernetes, ephemeral OTOBO containers #1148
The current Docker support for OTOBO 10.1 relies on the Docker volume /opt/otobo. This volume provides persistence to the Docker-based OTOBO installation. Implicitly the volume is also a means for communicating events, e.g. via the "mtime of a file has changed" mechanism. However, things are different when OTOBO runs under Kubernetes or in another high availability setup. In Kubernetes the default is to have ephemeral PODs, where /opt/otobo is local to each POD.
In this discussion we investigate what has to be considered in a high availability setup, specifically what it means when /opt/otobo is not shared across nodes. The alternative approach, using a shared /opt/otobo, has been discarded. It has been tried with glusterfs, but there were severe performance problems.
Recap of Kubernetes Basics
Just some definitions.
Doing away with the shared and persistent file system
If the data in /opt/otobo were static, then there would be no need for sharing between PODs. So, let's start by making a list of things that are dynamic in /opt/otobo.
Essential features:
Non-essential features:
A special case is the usage of shared memory for the AdminLog frontend module.
A note regarding Redis
Redis is not safe data storage. Everything that has to persist must be put either into the database or into a storage like S3.
Bootstrapping
Some of the approaches below need some data in order to work; for example, the database connection and the location of the shared storage must be known beforehand. Maybe put the essential config into a ConfigMap and pass it to the containers via the environment, as sketched below.
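As an illustration, a minimal sketch of a Kernel/Config.pm that takes its bootstrap settings from the environment, e.g. from a Kubernetes ConfigMap or Secret mapped into the container. The OTOBO_* variable names and the S3 setting name are assumptions made up for this sketch.

```perl
# Hedged sketch of a minimal Kernel/Config.pm that reads its bootstrap
# settings from the environment. The OTOBO_* variable names and the
# S3 setting name are assumptions.
package Kernel::Config;

use strict;
use warnings;
use utf8;

sub Load {
    my $Self = shift;

    # database connection from the environment, with local fallbacks
    $Self->{DatabaseHost} = $ENV{OTOBO_DB_HOST}     // 'db';
    $Self->{Database}     = $ENV{OTOBO_DB_NAME}     // 'otobo';
    $Self->{DatabaseUser} = $ENV{OTOBO_DB_USER}     // 'otobo';
    $Self->{DatabasePw}   = $ENV{OTOBO_DB_PASSWORD} // '';
    $Self->{DatabaseDSN}  = "DBI:mysql:database=$Self->{Database};host=$Self->{DatabaseHost};";

    # location of the shared object storage; the setting name is made up
    $Self->{'Storage::S3::Bucket'} = $ENV{OTOBO_S3_BUCKET} // 'otobo';

    return 1;
}

# keep the usual boilerplate from Config.pm.dist: fall back to the defaults
use parent qw(Kernel::Config::Defaults);

1;
```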
Manual changes to Kernel/Config.pm
In the current use case it is fairly common to make manual adaptations to Kernel/Config.pm. In an environment where containers are started frequently and /opt/otobo is not shared, this becomes less manageable. Some control can be exerted via environment variables, but this isn't very convenient either.
The currently recommended approach is to use locally built images in which Kernel/Config.pm is changed. But is there another, more convenient, option? An alternative is to provide a customisation package that contains an Autoload module. If that becomes the new recommended approach, then it must be documented properly.
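For illustration, a hedged sketch of what such an Autoload module might look like. Files below Kernel/Autoload/ are loaded automatically at startup, so a customisation package can ship code that extends or redefines existing classes without touching Kernel/Config.pm. The module and method names below are made up.

```perl
# Hedged sketch of an Autoload module shipped by a customisation package.
# The module name and the added method are made up for illustration.
package Kernel::Autoload::MyCustomization;

use strict;
use warnings;

use Kernel::System::Ticket;

# add (or redefine) a method on an existing class
sub Kernel::System::Ticket::MyCustomTicketCheck {
    my ( $Self, %Param ) = @_;

    # custom logic would go here
    return 1;
}

1;
```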
*.pm files in Kernel/Config/Files
There are only three files that need to be considered here: the generated ZZZ*.pm files.
Usually there are no other *.pm files in Kernel/Config/Files. An OTOBO package might be doing weird stuff here, but in that case the service needs to be restarted anyways.
These Perl modules in Kernel/Config/Files constitute the major part of the current configuration of OTOBO. Whenever an instance of `Kernel::Config` is created, `Load()` is called for each of these modules, passing the package name and the instance of `Kernel::Config`.
The daemon modules do not check whether the ZZZ*.pm files have changed. Instead they force a reload of these files in each iteration. Console commands are not expected to run for a long time; they simply read the ZZZ*.pm files when starting up.
The method `Load()` is free to do whatever it wants with the instance of `Kernel::Config`. Usually values are merged into the nested data structure. In most cases this could also be done by merging YAML files, but the interface is wide open by design. The *.pm files change fairly frequently; an example are the config adaptations done during unit tests.
This approach works fairly well for multiple servers when Kernel/Config/Files is located on a shared file system. Keeping this proven approach is a goal here.
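A minimal example of such a config module might look as follows. The package name and the settings are made up; the point is the `Load()` interface that receives the package name and the `Kernel::Config` instance and merges values into the nested data structure.

```perl
# Minimal, made-up example of a config module in Kernel/Config/Files.
package Kernel::Config::Files::ZZZExample;

use strict;
use warnings;

sub Load {
    my ( $File, $Self ) = @_;

    # merge a flat value into the config object
    $Self->{'Example::Setting'} = 'some value';

    # merge into a nested data structure
    $Self->{'Example::Nested'}->{SubKey} = [ 'a', 'b' ];

    return 1;
}

1;
```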
Things do become a bit more complicated when Kernel/Config/Files is no longer shared. In the most simple case there is a single source of truth. This reference must be synchronised to each POD before the list of modules is determined. The synchronisation should happen fairly quickly whenever a new instance of `Kernel::Config` is created. It suffices to add the check in `Kernel::Config`, because otobo.psgi creates a new config object for every request and the Daemon creates a new config object for every task. The requirements for the synchronisation are:
The reverse direction must be supported too. Any changes to the files in Kernel/Config/Files must initially be made in the reference. The Daemon and the web server are responsible for ensuring that they work with the newest version of the configuration.
It is not obvious how this can be implemented efficiently. The simplest approach is to store the *.pm files under a specific S3 prefix. Daemon and web server then need to fetch the current versions from S3. This can be done by getting an object listing and comparing object size and modification time.
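A hedged sketch of that idea, using Paws for the S3 access. The bucket name, the prefix, and the naive size-only comparison are assumptions; OTOBO 10.1 also ships its own S3 helper code that could be used instead, and localstack or MinIO would additionally need a custom endpoint configuration.

```perl
#!/usr/bin/env perl
# Hedged sketch: sync the *.pm files from an S3 prefix into the local
# Kernel/Config/Files directory. Bucket, prefix, and the size-only check
# are assumptions for illustration.

use strict;
use warnings;

use Paws;

my $Bucket = 'otobo';                           # assumed bucket name
my $Prefix = 'OTOBO/Kernel/Config/Files/';      # assumed prefix
my $Local  = '/opt/otobo/Kernel/Config/Files';

my $S3 = Paws->service( 'S3', region => $ENV{AWS_REGION} // 'eu-central-1' );

my $Listing = $S3->ListObjectsV2( Bucket => $Bucket, Prefix => $Prefix );

OBJECT:
for my $Object ( @{ $Listing->Contents // [] } ) {

    my ($Basename) = $Object->Key =~ m{([^/]+\.pm)\z} or next OBJECT;
    my $Target = "$Local/$Basename";

    # naive check: fetch only when the local file is missing or the sizes
    # differ; a real implementation would also compare modification times
    my @Stat = stat $Target;
    next OBJECT if @Stat && $Stat[7] == $Object->Size;

    my $Response = $S3->GetObject( Bucket => $Bucket, Key => $Object->Key );

    open my $FH, '>', $Target or die "Can't write $Target: $!";
    print {$FH} $Response->Body;
    close $FH;
}
```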
It is not obvious whether the above approach is the best solution. Some other ideas are:
Kernel::System::Daemon::DaemonModules::SystemConfigurationSyncManager
Note that there already is the Daemon module `Kernel::System::Daemon::DaemonModules::SystemConfigurationSyncManager`. The current understanding is that this module can't help with distributing the configs. The OTOBO Daemon does not, and should not, know which PODs are running.
Loader Files, Minified and Concatenated CSS and JS files
The safest solution is to simply turn off the loader. This can always be done when problems crop up.
If minification alone is enough, then something like https://metacpan.org/release/IDOPEREL/Plack-App-MCCS-1.000000 could be used.
Keeping the current approach to minification is doable too. Currently minified files are written into the file system. This occurs when the server generates HTML content. The browser then requests these files when it renders the HTML pages. These generated files are often served by otobo.psgi, but they can also be served by any reverse proxy.
When running in K8s we have no shared file system. Instead we write the minified files into S3. The storage in S3 is the reference. When a loader file is requested, the following actions take place:
Always serving from the file system has the advantage that the files can be streamed. Streaming is supported by https://metacpan.org/pod/Plack::App::File.
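A minimal Plack sketch of that setup; the document root under /opt/otobo/var/httpd/htdocs, the /otobo-web mount point, and the placeholder app are assumptions, and in a real deployment the OTOBO PSGI application would be mounted instead of the placeholder.

```perl
#!/usr/bin/env perl
# Minimal sketch (app.psgi): serve the generated CSS/JS loader files from the
# local file system via Plack::App::File so that they can be streamed.
# Paths and mount points are assumptions.

use strict;
use warnings;

use Plack::Builder;
use Plack::App::File;

# stand-in for the real OTOBO PSGI application
my $PlaceholderApp = sub {
    return [ 200, [ 'Content-Type' => 'text/plain' ], ["the OTOBO app would be mounted here\n"] ];
};

builder {
    mount '/otobo-web' => Plack::App::File->new( root => '/opt/otobo/var/httpd/htdocs' )->to_app;
    mount '/'          => $PlaceholderApp;
};
```

Run with e.g. `plackup app.psgi` for a quick check.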
For further optimisation the minified files could also be served by the K8s load balancer Ingress directly from S3. Local caching is possible in this case too.
NodeID
The config setting NodeID is set to 1 by default. The intention was that it is set to different values on different nodes.
But currently there is no way to automatically assign NodeIDs to Daemon instances. Furthermore, it is not obvious whether it is really necessary to provide different NodeIDs. Let's list where NodeID is used in OTOBO.
The cache value `DaemonRunning` is kept per NodeID. It is set by the Daemon itself, in the file https://github.com/RotherOSS/otobo/blob/rel-10_1/Kernel/System/Daemon/DaemonModules/SchedulerTaskWorker.pm. It is used by the plugin `DaemonRunning` of the support data collector and by the notification plugin `DaemonCheck`. Another use is in the package manager frontend, https://github.com/RotherOSS/otobo/blob/rel-10_1/Kernel/Modules/AdminPackageManager.pm, where "Upgrade All Packages" can only be done when the Daemon is running.
NodeID is also used in Kernel/Modules/AdminRegistration.pm, but this frontend module is no longer used.
Most of these usages might be broken as they check only the specific node of the web server. But things should be fine when the Daemon runs on any node.
Daemon module SystemConfigurationSyncManager
The NodeID seems to be only validated, but not used.
Daemon module SyncWithS3
The NodeID is currently not used in that plugin. But maybe it should be, for synchronisation purposes.
Task scheduler within the OTOBO Daemon
This is a bit involved. It looks like the Daemon determines which scheduled tasks, or cron tasks, need to be executed and adds these tasks to a task queue. The scheduling is specific to the combination of the process ID of the Daemon and the NodeID. But this assumes that the NodeID differs between hosts. This means that the current approach, where NodeID is always 1, might be broken.
This concerns:
Also, the PID file contains the NodeID. But that should be fine, as the PID file usually lives in /opt/otobo/var/run.
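If distinct NodeIDs turn out to be necessary, one option could be to derive them from the POD name. A hedged sketch, assuming the PODs are managed by a StatefulSet so that the hostname ends in a stable ordinal; the snippet would go into Load() of Kernel/Config.pm.

```perl
# Hedged sketch: derive a per-POD NodeID from the hostname, assuming a
# StatefulSet naming scheme like otobo-daemon-0, otobo-daemon-1, ...
# NodeID must be an integer between 1 and 999.
use Sys::Hostname qw(hostname);

my ($Ordinal) = hostname() =~ m/-(\d+)\z/;
$Self->{NodeID} = defined $Ordinal ? $Ordinal + 1 : 1;
```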
Kernel::System::SysConfig
The NodeID is part of a temporary setting in the database attribute sysconfig_deployment.comments. This is likely not critical.
Kernel::System::Ticket::NumberBase
The NodeID is part of ticket_number_counter.counter_uid. Not critical.
Installing OTOBO packages
In a Kubernetes context, the cleanest way would be to deliver new or updated packages via new Docker images. But the OTOBO way is to use the package definitions in the database. This approach also allows operation in non-Kubernetes settings.
My impression is that up to now cluster support for OTRS has usually been done via a shared file system. There seem to be no readily available solutions for distributing changed files. This implies that we need to roll our own watchdogs. There already is support for detecting changes in the ZZZ*.pm files. Unfortunately the web server and the Daemon do these checks in different ways. It would be nice to unify these checks, as both use cases have basically the same requirements. In both cases it would be nice to have:
There are several options for how package updates are communicated to the watchdogs. One thing that can be tried is to abuse the S3 prefix OTOBO/Kernel/Config/Files and add or update a file like repository_list.json. An update to that file would indicate an added, updated or deleted package.
TODO: What about removed packages?
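For illustration, a hedged sketch of how the content of such a marker file could be generated from the installed packages. The file name, the JSON layout, and the hard-coded paths are assumptions; uploading the result to S3 is left out.

```perl
#!/usr/bin/env perl
# Hedged sketch: build the proposed repository_list.json content from the
# list of installed OTOBO packages. The JSON layout is made up.

use strict;
use warnings;

use lib '/opt/otobo';                  # assumed installation directory
use lib '/opt/otobo/Kernel/cpan-lib';

use Kernel::System::ObjectManager;

local $Kernel::OM = Kernel::System::ObjectManager->new();

my @Packages = $Kernel::OM->Get('Kernel::System::Package')->RepositoryList( Result => 'short' );

my $JSON = $Kernel::OM->Get('Kernel::System::JSON')->Encode(
    Data => [ map { { Name => $_->{Name}, Version => $_->{Version} } } @Packages ],
);

print $JSON, "\n";
```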
Using cron is not really a good option. Usually cron needs to run as root, which is not best Kubernetes practice. Alternatives to cron, like https://github.com/aptible/supercronic or https://github.com/mcuadros/ofelia, are not available as Debian packages.
For every web server and Daemon we probably need a watchdog that checks for OTOBO package updates. Adding the watchdog to Gazelle and otobo.Daemon.pl is kind of hard, so let's go for something like cron. The watchdog could send SIGHUP to the watched daemon. Maybe better: halt the service, update the files, then resume the service. It looks like this can be done with otobo.Daemon.pl; it is not clear whether the same works for the web server.
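A hedged sketch of such a watchdog: it polls a local copy of the marker file and sends SIGHUP to the Daemon when the marker changes. The marker path, the PID file name, and the assumption that the Daemon reinitialises on SIGHUP are all made up for illustration.

```perl
#!/usr/bin/env perl
# Hedged watchdog sketch: poll a marker file that is kept in sync with the
# proposed S3 key and nudge the OTOBO Daemon via SIGHUP when it changes.
# Paths and the PID file name are assumptions.

use strict;
use warnings;

use Digest::MD5 qw(md5_hex);

my $Marker     = '/opt/otobo/var/tmp/repository_list.json';   # assumed local copy
my $PIDFile    = '/opt/otobo/var/run/Daemon-NodeID-1.pid';    # assumed name
my $LastDigest = '';

while (1) {

    if ( open my $FH, '<', $Marker ) {
        my $Content = do { local $/; <$FH> };
        close $FH;

        my $Digest = md5_hex( $Content // '' );

        if ( $LastDigest && $Digest ne $LastDigest ) {

            # tell the Daemon to reinitialise; alternatively stop it,
            # update the files, and start it again
            if ( open my $PIDFH, '<', $PIDFile ) {
                chomp( my $PID = <$PIDFH> || '' );
                close $PIDFH;
                kill 'HUP', $PID if $PID =~ m/^\d+$/;
            }
        }

        $LastDigest = $Digest;
    }

    sleep 60;
}
```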
When a POD is starting up, we can use initContainers. These containers perform the update actions as defined in bin/docker/entrypoint.sh.
Rolling restart of the PODs should be avoided.
$User.pm files
Regenerating user-specific configurations can be done the same way as for the ZZZ*.pm files. But I suggest disallowing $User.pm files in the initial Kubernetes support.
Articles and attachments
Articles and attachments can be persistently stored in S3.
New communication paths between PODs
In order to keep things simple, no new communication infrastructure should be introduced.
Logging
Investigate and decide how logging should be done. Maybe with Log::Log4perl to a logging server, or by letting Kubernetes pick up the syslog.
Health checks
It might be that K8s does not rely on Docker health checks but instead probes the containers itself. See https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/.
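As an illustration, a hedged sketch of a tiny PSGI endpoint that a Kubernetes httpGet liveness or readiness probe could call. The /health route, the database ping, and the paths are assumptions; OTOBO's own otobo.psgi could of course expose something equivalent instead.

```perl
#!/usr/bin/env perl
# Hedged sketch (health.psgi): report 200 when the database is reachable,
# 503 otherwise. Route and check are assumptions.

use strict;
use warnings;

use lib '/opt/otobo';
use lib '/opt/otobo/Kernel/cpan-lib';

use Plack::Builder;
use Kernel::System::ObjectManager;

my $HealthApp = sub {

    # a fresh object manager per probe keeps the handler self-contained
    local $Kernel::OM = Kernel::System::ObjectManager->new();

    my $OK = $Kernel::OM->Get('Kernel::System::DB')->Ping();

    return $OK
        ? [ 200, [ 'Content-Type' => 'text/plain' ], ["OK\n"] ]
        : [ 503, [ 'Content-Type' => 'text/plain' ], ["database not reachable\n"] ];
};

builder {
    mount '/health' => $HealthApp;
};
```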
Related issues:
Related questions and TODOs
See also
Conclusion
Aim for strategy 1 without a persistent volume. Tackle the critical points one after the other.
Current implementation
Some support for S3 is implemented, but testing is very basic and documentation is non-existent. Here are the beginnings of a HOWTO.
Development and testing are done with localstack, but for production MinIO is recommended. The steps are: