
Final piece to remove for generic_workers blanket use. #69

Open
realtyem opened this issue Nov 11, 2022 · 6 comments

Comments

@realtyem
Owner

realtyem commented Nov 11, 2022

Make media_repo be a generic_worker

Quoted from the modernize configure workers script PR:

  1. For Details 1: synapse.app.media_repository is a synapse.app.generic_worker; however, because of this line, enable_media_repo won't work with synapse.app.generic_worker. As such, this won't be changed now. I feel like this logic should be in the workers.py file with similar-type options (like run_background_tasks_on, notify_appservices_from_worker and update_user_directory_from_worker), and possibly adapted to use _should_this_worker_perform_duty(). Ideally, this should all be on a map instance that hashes by worker_name in the shared config.

Having looked into this a bit more, it's slightly more involved than a simple removal.
Things we know:

  1. In:

    # whether to enable the media repository endpoints. This should be set
    # to false if the media repository is running as a separate endpoint;
    # doing so ensures that we will not run cache cleanup jobs on the
    # master, potentially causing inconsistency.
    self.enable_media_repo = config.get("enable_media_repo", True)

    setup looks for enable_media_repo in the yaml, defaults it to True, and adds it to the server section of ConfigObject.

  2. It is then passed into /config/repository.py at:

    # Only enable the media repo if either the media repo is enabled or the
    # current worker app is the media repo.
    if (
        self.root.server.enable_media_repo is False
        and config.get("worker_app") != "synapse.app.media_repository"
    ):
        self.can_load_media_repo = False
        return
    else:
        self.can_load_media_repo = True

    where can_load_media_repo is set for that single instance (in this case a worker). If set to True, it enables the endpoint processing for the media bits and, per the comments in the source (included in "thing 1" above), also prevents cache cleanup from running inappropriately on the master. It is then added to the media section of ConfigObject.

  3. can_load_media_repo (specifically, (hs.config.|self.config|config.)media.can_load_media_repo) is then used in potentially four other files. I think that's far enough for the purpose of this issue. This will be the last place to search for "True" or "False".

  4. Further, there is the option at this line that looks for media_instance_running_background_jobs and assigns that worker_name to handle those tasks (which I believe is only URL previews at present?). A sketch of how that option gets used follows this list.
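Roughly, the pattern that option feeds into looks like this (a minimal sketch, not the exact Synapse source; the attribute paths here are assumptions and should be double-checked):

    # Minimal sketch (assumed attribute paths, not the exact Synapse source):
    # how a "run the media background jobs on instance X" option gets checked.
    def should_run_media_background_jobs(hs) -> bool:
        designated = hs.config.worker.media_instance_running_background_jobs
        if designated is None:
            # No instance nominated: the main process keeps the jobs.
            return hs.config.worker.worker_app is None
        return hs.get_instance_name() == designated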

After pondering this for a few days, I've come to the conclusion that there are two possible paths forward to remove synapse.app.media_repository from the source in "Thing 2" above. Either way, it will still need a yaml setting of some kind so that the worker knows it will be responsible for handling the media repo.

  1. There is an existing setting that can be placed in the worker yaml to handle this: enable_media_repo. Right now, this is used to explicitly disable the media functions on the master (and other workers). I think putting it into the worker yaml fragment would be appropriate. Logic will have to be added to look for it; I recommend in repository.py, next to the existing logic, for backwards compatibility.
  2. Build a new instance style map, similar to pusher_instances and federation_sender_instances. A quick mockup would look like:
media_repo_instances:
  - media_repository1

A bonus of this approach is that there would be no need for an enable_media_repo setting in potentially more than two other yaml files; it could live in either homeserver.yaml or shared.yaml. There is also an opportunity to fold the media_instance_running_background_jobs behaviour into the same map at the same time and kill two birds with one stone. A hypothetical sketch of this is below.
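Purely as a sketch of option 2 (media_repo_instances does not exist today, it is the proposal above; the body is assumed to sit in WorkerConfig.read_config alongside the existing instance lists):

    # Hypothetical parsing of the proposed media_repo_instances list,
    # mirroring how pusher_instances is read from the shared config.
    media_repo_instances = config.get("media_repo_instances") or []

    # This instance loads the media repo if it is named in the list, or if
    # the list is absent/empty (legacy behaviour: everything on the master).
    self.should_load_media_repo = (
        not media_repo_instances or self.instance_name in media_repo_instances
    )

    # "Two birds with one stone": default the background jobs to the first
    # listed instance unless media_instance_running_background_jobs overrides it.
    self.media_instance_running_background_jobs = config.get(
        "media_instance_running_background_jobs",
        media_repo_instances[0] if media_repo_instances else None,
    )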

Additional questions:

  1. Why is enable_media_repo in the server config section? Would it be more consistent to have it in workers?
@reivilibre
Contributor

  1. Why is enable_media_repo in the server config section? Would it be more consistent to have it in workers?

because, confusingly, this flag controls the master process.

I like your approach of defining a media_repo_instances list, similar to what we do for federation senders and the rest. (And then we deprecate enable_media_repo — there's a reusable function somewhere for dealing with this style of config that we're moving away from.)
If you need an easy approach to nominate one worker to do the media background jobs: why not just pick the first one on the list? :)

@realtyem
Owner Author

Not completely done with it all, still brainstorming about it. BUT,

If you need an easy approach to nominate one worker to do the media background jobs: why not just pick the first one on the list? :)

Is exactly what I do in my own repo lol

A more professional answer: because the first one may not necessarily be the one best suited for the job. In an ideal world, these might be on different machines (currently I understand this not to be the case; something about storage providers and HDD access). Future-proofing.

@realtyem
Owner Author

In the hunt for the 'why?' of config.server.enable_media_repo:

    if name in ["media", "federation", "client"]:
        if self.config.server.enable_media_repo:
            media_repo = self.get_media_repository_resource()
            resources.update(
                {
                    MEDIA_R0_PREFIX: media_repo,
                    MEDIA_V3_PREFIX: media_repo,
                    LEGACY_MEDIA_PREFIX: media_repo,
                }
            )
        elif name == "media":
            raise ConfigError(
                "'media' resource conflicts with enable_media_repo=False"
            )

shows this can be switched to the worker section with no issues. Its counterpart,
    elif name == "media":
        if self.config.media.can_load_media_repo:
            media_repo = self.get_media_repository_resource()

            # We need to serve the admin servlets for media on the
            # worker.
            admin_resource = JsonResource(self, canonical_json=False)
            register_servlets_for_media_repo(self, admin_resource)

            resources.update(
                {
                    MEDIA_R0_PREFIX: media_repo,
                    MEDIA_V3_PREFIX: media_repo,
                    LEGACY_MEDIA_PREFIX: media_repo,
                    "/_synapse/admin": admin_resource,
                }
            )
        else:
            logger.warning(
                "A 'media' listener is configured but the media"
                " repository is disabled. Ignoring."
            )

looks at config.media.can_load_media_repo instead, which feels inconsistent.

All these sections do is add the endpoint resources to the current instance, be it worker or master.
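For illustration only (nothing like this exists in the source yet): if both branches keyed off the same flag, the inconsistency would disappear, e.g.:

    # Hypothetical unification: master and worker both consult the same flag
    # (can_load_media_repo here) when deciding whether to mount the endpoints.
    if name in ["media", "federation", "client"]:
        if self.config.media.can_load_media_repo:
            media_repo = self.get_media_repository_resource()
            resources.update(
                {
                    MEDIA_R0_PREFIX: media_repo,
                    MEDIA_V3_PREFIX: media_repo,
                    LEGACY_MEDIA_PREFIX: media_repo,
                }
            )
        elif name == "media":
            raise ConfigError(
                "'media' resource requested but the media repo is disabled"
            )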

@realtyem
Owner Author

realtyem commented Nov 13, 2022

because, confusingly, this flag controls the master process.

I'm trying to find where. The only reference to server.enable_media_repo inside the /config/server.py file is shown above (in the OP); anywhere else, it would be referenced from the config object's server section. Running

    git grep -a -n --full-name "server.enable_media_repo"

shows:

    synapse/app/homeserver.py:238:            if self.config.server.enable_media_repo:
    synapse/config/repository.py:123:            self.root.server.enable_media_repo is False

I'll be doing some source hacking to test this out, but I'm pretty sure just moving it into the worker section will be safe. Since the worker section IS populated even on the master process, this should be a simple switch-out.
Edit: as an additional side note in favor of this, the worker section already contains the instance_map of the replication HTTP workers and related things (like pusher_instances, etc.).
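A rough, untested sketch of what the switch-out would touch (the worker-section attribute names here are assumptions, not verified against the source):

    # Hypothetical: read the flag in synapse/config/workers.py instead of
    # server.py (same key name and default, different config section).
    self.enable_media_repo = config.get("enable_media_repo", True)

    # The two call sites found by the grep above would then become:
    #   synapse/app/homeserver.py:
    #       if self.config.worker.enable_media_repo:
    #   synapse/config/repository.py:
    #       if (
    #           self.root.worker.enable_media_repo is False
    #           and config.get("worker_app") != "synapse.app.media_repository"
    #       ):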

@realtyem
Owner Author

realtyem commented Nov 13, 2022

As an oddity that I'm hoping can be explained:

    self.root.server.enable_media_repo

Doesn't the root part of this act like a .. in a directory listing, going up one level (in essence), and therefore isn't it irrelevant here since self is already at the top level of the object?

Never mind, I figured it out. When setting up a specific subsection (like "media" in this case), you need to reference another, already-established section through "root", because it obviously isn't reachable via "self" until everything is fully set up and passed through.
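A much-simplified illustration of that shape (these are not the actual Synapse classes, just the pattern):

    # Each section only has its own attributes on `self`, so it reaches
    # sibling sections that were set up earlier via `self.root`.
    class ServerConfig:
        def __init__(self, root: "RootConfig") -> None:
            self.root = root
            self.enable_media_repo = True

    class MediaConfig:
        def __init__(self, root: "RootConfig") -> None:
            self.root = root
            # The server section already exists on root at this point:
            self.can_load_media_repo = self.root.server.enable_media_repo

    class RootConfig:
        def __init__(self) -> None:
            self.server = ServerConfig(self)  # set up first
            self.media = MediaConfig(self)    # set up second, can see server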

@realtyem
Owner Author

realtyem commented Nov 15, 2022

Fun fact: the configure_workers_and_start.py file has had enable_media_repo set to true for two years now (7e460ec). Pretty sure this wasn't intended, but rather a happy accident that happened to work. So it's possible this worked all along, and the worker app could have just been set to synapse.app.generic_worker and it would have been fine.

(Which means most of what I did above was pointless. Sigh. I suppose it'll be useful for refactoring to deal with the new possible instance_map thingy.)
