Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor how media_repository is configured #72

Open
wants to merge 11 commits into
base: unraid_develop
Choose a base branch
from

Conversation

realtyem
Copy link
Owner

@realtyem realtyem commented Nov 21, 2022

Refactor how media_repository is configured

The code that enables the media repository functions in repository.py is really confusing. Let's clean that up by refactoring(while preserving backwards compatibility), then move it to the worker section of config instead of server. After that, docs to show how to make synapse.app.media_repository be a synapse.app.generic_worker.

  • Refactor how the conditional decides if it is going to enable enable_media_repo to be easier to read. Also easier to remove reference to synapse.app.media_repository.
  • Move the yaml read and configuration setup from server.py to workers.py. Don't forget to move the WorkerConfig to just above ContentRepositoryConfig. Remove orphaned code.
  • Update docs with enable_media_repo being able to be set in worker.yaml file to true if disabled in homeserver.yaml or shared.
  • Optionally, add the config set to the same style of pusher and federation_sender, which can only be done in the worker config file.
  • Update docs to show new functionality

# Only enable the media repo if either the media repo is enabled or the
# current worker app is the media repo.
if (
self.root.server.enable_media_repo is False
and config.get("worker_app") != "synapse.app.media_repository"
):
self.can_load_media_repo = False
return
else:
self.can_load_media_repo = True

Is the code that references synapse.app.media_repository. As I understand it, this is the only piece that is standing in the way of removing the last reference to this worker_type. Splitting the conditional and nesting it will allow for simple line removal when it's time to rip it out. A side effect of how this is written is that if enable_media_repo is set to false in homeserver.yaml or a shared configuration, it can be set to true in a worker.yaml file to enable the functionality correctly for that worker, even if the worker_app is NOT declared to be synapse.app.media_repository. That took me over a week to work out in my head, and is a fact not spelled out in the docs.

The code that checks what settings were read from yaml is here:

# whether to enable the media repository endpoints. This should be set
# to false if the media repository is running as a separate endpoint;
# doing so ensures that we will not run cache cleanup jobs on the
# master, potentially causing inconsistency.
self.enable_media_repo = config.get("enable_media_repo", True)

Why? There doesn't seem to be a good reason, although there is one other reference to this in the server config section at
if name in ["media", "federation", "client"]:
if self.config.server.enable_media_repo:
media_repo = self.get_media_repository_resource()
resources.update(
{
MEDIA_R0_PREFIX: media_repo,
MEDIA_V3_PREFIX: media_repo,
LEGACY_MEDIA_PREFIX: media_repo,
}
)
elif name == "media":
raise ConfigError(
"'media' resource conflicts with enable_media_repo=False"
)

ServerConfig is loaded before ContentRepositoryConfig, then WorkerConfig. Checking the code paths didn't yield any issues with moving this around so I'll put it in WorkerConfig, provided it's moved up to just above ContentRepositoryConfig so it would be available. This feels more "right".

While this is all being fixed, update how it's read so we can reuse the backwards compatilbity function that is used by federation_sender and pusher to give us the capability to use a media_repo_instances map for free.(This part is waiting on the acceptance of the PR in the matrix-org repo, # 14496). It should look something like:

media_repo_instances:
  - media_repository1

Questions:

Full set of conditionals

  1. Homeserver is monolith, enable_media_repo defaults to true, no media_worker, true
  2. Homeserver is polylith, enable_media_repo is set to false, media_worker is app.media_repo, true
  3. Homeserver is polylith, enable_media_repo is set to false, media_worker is generic, false and no worker handles media
  4. Homeserver is polylith, enable_media_repo is set to false, media_worker is generic and has enable_media_repo set to true, true

This was educational when played with at https://www.w3schools.com/python/trypython.asp?filename=demo_variables4

worker_app = None
enable_media_repo = True
can_load_media_repo = True
# Only disable the media repo if it is explicitly disabled in yaml.
if enable_media_repo is False:
  # However, if we are still using the legacy app name on this instance,
  # it needs to stay enabled.
  if worker_app == "media_worker":
    # This is our backwards compatibility bypass
    can_load_media_repo = True
    
  else: can_load_media_repo = False


print(can_load_media_repo)

And leads to the question: If there is a deployment with a master and one generic_worker, and the assumption(based on docs) that enable_media_repo is left out so it should be on the main process, does it? enable_media_repo defaults to true, so it should by default be enabled on all instances and workers. This sounds dangerous and expensive, but in practice isn't a big deal because workers by default won't be receiving data over their /_matrix/media endpoints. However:

# Whether this instance should be the one to run the background jobs to
# e.g clean up old URL previews.
self.media_instance_running_background_jobs = config.get(
"media_instance_running_background_jobs",
)

Jumps to
# We run the background jobs if we're the instance specified (or no
# instance is specified, where we assume there is only one instance
# serving media).
instance_running_jobs = hs.config.media.media_instance_running_background_jobs
self._worker_run_media_background_jobs = (
instance_running_jobs is None
or instance_running_jobs == hs.get_instance_name()
)

Which passes to
if self._worker_run_media_background_jobs:
self._cleaner_loop = self.clock.looping_call(
self._start_expire_url_cache_data, 10 * 1000
)

Where it sets up a looping call to run the "clean up old url_preview media" function, which then deletes files and directorys in the "media_store" based on whatever worker(or master) got there first. My issue with this is that it does this every 10 seconds. On every worker. Again, doesn't seem to be a big deal since no one seems to have complained about it yet. Should probably make sure that this is explicitly the main process unless it's declared otherwise. I believe that changing
self.media_instance_running_background_jobs = config.get(
"media_instance_running_background_jobs",
)

to

 self.media_instance_running_background_jobs = config.get( 
     "media_instance_running_background_jobs",
     "master",
 ) 

will cleanly take care of always setting this to the main process, which also means the conditional check

instance_running_jobs is None

can be removed as it will never be None.

Oddity number 37:
Comparing

if name in ["media", "federation", "client"]:
if self.config.server.enable_media_repo:
media_repo = self.get_media_repository_resource()
resources.update(
{
MEDIA_R0_PREFIX: media_repo,
MEDIA_V3_PREFIX: media_repo,
LEGACY_MEDIA_PREFIX: media_repo,
}
)
elif name == "media":
raise ConfigError(
"'media' resource conflicts with enable_media_repo=False"
)

and
elif name == "media":
if self.config.media.can_load_media_repo:
media_repo = self.get_media_repository_resource()
# We need to serve the admin servlets for media on the
# worker.
admin_resource = JsonResource(self, canonical_json=False)
register_servlets_for_media_repo(self, admin_resource)
resources.update(
{
MEDIA_R0_PREFIX: media_repo,
MEDIA_V3_PREFIX: media_repo,
LEGACY_MEDIA_PREFIX: media_repo,
"/_synapse/admin": admin_resource,
}
)
else:
logger.warning(
"A 'media' listener is configured but the media"
" repository is disabled. Ignoring."
)

  • Line 211 in homeserver.py and Line 251 in generic_worker.py are not comparing the same type of config section. Shouldn't these match? Either one should work™️.

@github-actions github-actions bot deployed to PR Documentation Preview November 21, 2022 09:07 Active
@@ -1584,7 +1584,8 @@ Config options related to Synapse's media store.
### `enable_media_repo`

Enable the media store service in the Synapse master. Defaults to true.
Set to false if you are using a separate media store worker.
Set to false if you are using a separate media repository worker(and remember to set to
Copy link
Owner Author

@realtyem realtyem Nov 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only place that got changed in this file. Everything else is my editor happily removing whitespace without asking 🙄. It should be easier to look through if you disable whitespace changes.

@github-actions github-actions bot deployed to PR Documentation Preview November 21, 2022 09:11 Active
@github-actions github-actions bot deployed to PR Documentation Preview November 21, 2022 09:28 Active
@realtyem realtyem marked this pull request as ready for review November 21, 2022 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant