Skip to content
This repository was archived by the owner on Nov 17, 2020. It is now read-only.

Move the location of the 'enabled_plugins' file #369

Closed
wants to merge 3 commits into from

Conversation

HoloRin
Copy link
Contributor

@HoloRin HoloRin commented Mar 30, 2020

In response to rabbitmq/rabbitmq-server#2234

Move the location of the enabled_plugins file from /etc/rabbitmq to /var/lib/rabbitmq on unix, but fallback to the original location if there is no file at the new location, and the file at the old location is both readable and writable.

From /etc/rabbitmq to /var/lib/rabbitmq on unix, but fallback to
the original location if there is no file at the new location, and
the file at the old location is both readable and writable.
@HoloRin
Copy link
Contributor Author

HoloRin commented Mar 30, 2020

It's worth noting that this triggered the warning about the deprecated location in the current docker images. We are enabling the plugins with rabbitmq-plugins enable --offline rabbitmq_management in those cases, so I would have expected it to have picked up the new location. I will investigate before this can be merged.

If the 'enabled_plugins' file is present at the legacy location, but not
writable, we now copy it to the modern location first, rather than ignore it

The intention is to avoid a change in behavior for users who configured
plugins this way for startup, but never changed them.
@lukebakken
Copy link
Contributor

There was one issue in rabbitmq-server-release related to enabled_plugins: rabbitmq/rabbitmq-server-release#30

I'm pretty sure there is no action necessary, I just remembered the issue is all.

HoloRin added a commit to rabbitmq/rabbitmq-server-release that referenced this pull request Apr 1, 2020
One a VOLUME has been declared, changes to that location are subsequently dropped during the building of the Dockerfile. With the changes, in rabbitmq/rabbitmq-common#369 applied, this meant that the enabling of plugins no longer worked. We now make the VOLUME declaration the last line of the Dockerfile, assuming that we would not want to throw out any changes.

Additionally, we use gosu to execute plugin enablement, so that the 'enabled_plugins' file is owned by the rabbitmq user when the broker goes to update it.
Copy link
Contributor

@dumbbell dumbbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is the wrong location for the moved enabled_plugins logic; see my comments in the review.

I didn't look at the testsuite itself because it should be moved and adapted as well.

.gitignore Outdated
@@ -21,6 +21,7 @@
/sbin/
/sbin.lock
/test/ct.cover.spec
/test/*_SUITE_data/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These directories, if they exist, are meant to be committed as they are supposed to contain data used by testsuites. They are not meant to contain temporary content during testing.

If a testsuite needs to write temporary data, it can use ?config(priv_dir, Config).

[LegacyLocation, ModernLocation]),
LegacyLocation;
{false, {ok, #file_info{access = read}}} ->
{ok, _} = file:copy(LegacyLocation, ModernLocation),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module can't have side effects like creating files. Its purpose is only to interpret the environment and return a computed context.

In fact this entire logic should live in the rabbit_plugins module in rabbitmq-server, which is responsible for the enabled_plugins file.

The context can store the old and new locations to communicate them to rabbit_plugins.

Copy link
Contributor Author

@HoloRin HoloRin Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally did cross my mind, but I didn't think it was worth the greater complexity of changes. I willing to be convinced, though.

So, I think the cleanest way to do that is to move the notion of the default all the way out of the context, otherwise the context has to communicate either:

  • a location that should be used unequivocally, because it came directly for the env var
  • two locations that should be handled exactly with the described fallback behavior
    The context could return somelike {certainly, "some user specified location"} or {with_fallback, ["/var/lib/rabbitmq/enabled_plugins", "/etc/rabbitmq/enabled_plugins"]}, but the coupling between the modules/repos doesn't feel right to me.

So instead, I think the context can communicate only the env var based location, and rabbit_plugins can take care of all the rest that is here. However, that leaves a remaining question: How does the cli adopt the same behavior, if you run rabbitmq-plugins and encounter the same situation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dumbbell Is there a reasonable possibility of the context returning the location as a kind of lazily executed, but memoized deferred value, that the caller can resolve at the time of its choosing (I'm thinking of https://clojuredocs.org/clojure.core/delay)? Then the broker and cli would only need to trigger resolution, instead of both having a the fallback mechanism. I suspect the answer is no, since gen_server seems to fulfill this kind of role generally in erlang.

check_log_process_env/1,
check_log_context/1
]).
-compile(export_all).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the list of explicitly exported functions, otherwise there is no way to know when there is dead code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had wondered why the export was explicit, since in other tests, we just export all. How does this help us find dead code?

@dumbbell
Copy link
Contributor

dumbbell commented Apr 1, 2020

Beside the location of the code, I like the general logic.

I wonder if we should always move the existing file to the new location (and thus always use the new location). I'm thinking of troubleshootings, it might be easier if the default location is the same for everyone.

@HoloRin
Copy link
Contributor Author

HoloRin commented Apr 1, 2020

I wonder if we should always move the existing file to the new location

While I agree that would make things simpler moving forward, if someone did have /etc read and writable, and relied on the contents elsewhere, this could cause issues for them. A mitigation could be to move the file rather than copy, and replace the original location with a symlink to the new location (a course of action that is only possible if the old location is readable and writeable). It's a little more complex, but might be worth it for troubleshooting as your said.

One more aggressive option would be to fail to start if the old file is present, which would effectively force them to migrate the file, but that isn't an approach I would prefer to take.

@michaelklishin
Copy link
Contributor

If nodes refuse to start because of this file we will be flooded with support cases and users won't upgrade.

@HoloRin
Copy link
Contributor Author

HoloRin commented Apr 1, 2020

@michaelklishin I am in total agreement that we don't want to break startup for anyone, for all the reasons you mention.

What about the move+symlink option rather than the copy? Would you consider that sufficiently safe?

It is now undefined with respect to the context, as it will later determined later on by the rabbit_plugins module, potentially with side effects.

Additionally, we rename the default group in the tests for clarity, since they were not actually run in parallel.
@HoloRin
Copy link
Contributor Author

HoloRin commented Apr 2, 2020

I've moved the logic over to rabbitmq/rabbitmq-server#2298

A few changes were still required to common which are present in a new commit. Once we feel satisfied with the approach, these commits can be squashed down.

@lukebakken
Copy link
Contributor

Closing due to the migration to the monorepo. This PR can be re-opened there when necessary.

@lukebakken lukebakken closed this Nov 16, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants