Server explains the reasons to the worker if no tasks are available #1582

maximmasiutin · 2023-03-19T00:22:41Z

For example, if the worker's machine had too little memory, the server did not assign that worker any task and gave no explanation. Now the explanation is given.

zungur · 2023-03-19T00:45:01Z

I think SRI hash and worker version updates should also accompany any changes made to worker.py.

dubslow · 2023-03-19T04:01:38Z

I think it would be better to use a StrEnum (https://docs.python.org/3/library/enum.html#enum.StrEnum) than a scattering of arbitrary variables. It makes the code more readable and less error-prone in future coding.
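As a rough illustration of that suggestion, here is a minimal sketch of string-valued reason codes. The class name, member names, and `describe` helper are hypothetical and not taken from the PR. It subclasses `str` and `Enum` so that it also runs on interpreters older than Python 3.11, where `enum.StrEnum` was introduced.

```python
from enum import Enum


class SkipReason(str, Enum):
    """Hypothetical reason codes; on Python 3.11+ the base could be enum.StrEnum."""

    LOW_MEMORY = "low_memory"
    LOW_THREADS = "low_threads"
    MACHINE_LIMIT = "machine_limit"


def describe(reason: SkipReason) -> str:
    # Each member carries its string value, so it serializes without a lookup table.
    return f"task skipped: {reason.value}"
```

Because every member is also a `str`, a value received over the wire can be turned back into a member with `SkipReason("low_memory")`, and a misspelled code raises `ValueError` instead of passing silently.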

vdbergh · 2023-03-19T04:09:43Z

There are already the info and error fields in the reply. Please use the info field. It will be handled correctly.

vdbergh · 2023-03-19T04:53:02Z

There is also the issue that the messages depend on the run that is being handled, which is suppressed in the feedback.

In general I think most messages are uninteresting to the worker. If we want to give some feedback then I think we should restrict ourselves to messages for maxmemory and minthreads, since these depend on settings the worker has control over.

dubslow · 2023-03-19T05:01:37Z

Oh I see, the PR intent is to include all related reasons for skipping a task, not just one...

dubslow · 2023-03-19T05:06:13Z

> In general I think most messages are uninteresting to the worker. If we want to give some feedback then I think we should restrict ourselves to messages for maxmemory and minthreads, since these depend on settings the worker has control over.

Most of these look like worker-controllable problems. The main exception is the per-run core limit, and the lesser exception is the STC limit. But even the STC limit can be controlled by the worker's owner duplicating the worker, and in the per-run core limit, it's still nice to know that it's the server's fault, so to speak, not the worker.

vdbergh · 2023-03-19T05:47:23Z

I think there are only two serious misconfigurations a worker can do that may cause it to get no runs. Setting minthreads > 1 or setting maxmemory too low. To me it seems most useful to give feedback for these.

Anyway, if we really want reasons for all runs, then I would not use booleans but counts, so that we can see how many runs were affected by a particular reason.
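A minimal sketch of the counting idea, assuming each run and the worker are plain dictionaries. The keys `threads`, `min_memory`, `max_memory`, and `min_threads`, and the function name, are hypothetical stand-ins rather than the actual fishtest fields.

```python
from collections import Counter


def count_skip_reasons(runs, worker):
    """Tally, per reason, how many runs this worker had to skip."""
    counts = Counter()
    for run in runs:
        # The run needs more memory than the worker offers.
        if run["min_memory"] > worker["max_memory"]:
            counts["max_memory"] += 1
        # The run uses fewer threads than the worker insists on.
        if run["threads"] < worker["min_threads"]:
            counts["min_threads"] += 1
    return counts
```

A boolean report can still be derived from this (a reason applies when its count is non-zero), while the counts additionally show whether a setting blocks one run or all of them.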

vdbergh · 2023-03-19T05:52:18Z

I would also make the server generate the info message. In that way the worker does not need to be changed.

dubslow · 2023-03-19T06:54:16Z

I had mostly completed this code before seeing the two prior comments, and decided to finish it, if for no other reason than as exercise:

maximmasiutin#4

Refactors the present PR with a Flag Enum, greatly improving both maintainability and readability, imo, while more or less preserving the same functionality as desired by Maxim.

> I think there are only two serious misconfigurations a worker can do that may cause it to get no runs. Setting minthreads > 1 or setting maxmemory too low. To me it seems most useful to give feedback for these.

It seems to me that every one of the following messages has considerable utility to a worker-operator:

__messages = {MachineLimit: "This user has reached the max machines limit.",
              LowThreads:   "An available run requires more than CONCURRENCY threads.",
              HighThreads:  "An available run requires less than MIN_THREADS threads.",
              LowMemory:    "An available run requires more than MAX_MEMORY memory.",
              NoBinary:     "This worker has exceeded its GitHub API limit, and has no local binary for an available run.",
              SkipSTC:      "An available run is at STC, requiring less than CONCURRENCY threads due to cutechess issues. Consider splitting this worker. See Discord.",
              ServerSide:   "Server error or no active runs. Try again shortly.",
             }

(Obviously some are more common than others, e.g. MIN_THREADS, but that just makes clear messages all the more important in such rare cases.)
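For readers unfamiliar with the Flag Enum approach, the following is a minimal sketch of how such a message table can be combined with `enum.Flag`. It is not the code from the PR: the class name `SkipReason`, the `explain` helper, and the reduced set of three members are assumptions made for illustration.

```python
from enum import Flag, auto


class SkipReason(Flag):
    """Hypothetical flag set; member names follow the message table above."""

    MachineLimit = auto()
    LowThreads = auto()
    LowMemory = auto()


# Illustrative subset of the messages, keyed by flag.
MESSAGES = {
    SkipReason.MachineLimit: "This user has reached the max machines limit.",
    SkipReason.LowThreads: "An available run requires more than CONCURRENCY threads.",
    SkipReason.LowMemory: "An available run requires more than MAX_MEMORY memory.",
}


def explain(reasons: SkipReason) -> list:
    """Return the message for every flag set in `reasons`, in table order."""
    return [text for flag, text in MESSAGES.items() if flag in reasons]


# Flags accumulate with |= while the server loops over candidate runs.
reasons = SkipReason(0)
reasons |= SkipReason.LowMemory
reasons |= SkipReason.LowThreads
```

Here `explain(reasons)` yields the LowThreads and LowMemory messages. A single integer, `reasons.value`, is enough to transmit the whole set, which is what makes the flag form compact, though it only records whether a reason occurred, not how often.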

> Anyway, if we really want reasons for all runs, then I would not use booleans but counts, so that we can see how many runs were affected by a particular reason.

Hm.... interesting idea. I don't really see the added benefit of having a count, rather than simply a boolean, but it could certainly be done.

> I would also make the server generate the info message. In that way the worker does not need to be changed.

Well, I wrote my PR-to-this-PR before seeing this comment, but, at any rate, the error messages refer to worker-side command line options, so I'm not certain that referring to worker command line options in the server code is ideal, from a maintenance perspective. (On the other hand, I explicitly duplicated the Flag enum in both files, which is similar in terms of duplicating on either side, but still better imo because the duplicated code is meaningful to both sides, whereas the worker command line options are only meaningful to the worker, not the server.)

Use an Enum to greatly improve both maintainability and readability Untested

vdbergh · 2023-03-19T09:07:32Z

> Well, I wrote my PR-to-this-PR before seeing this comment, but, at any rate, the error messages refer to worker-side command line options, so I'm not certain that referring to worker command line options in the server code is ideal, from a maintenance perspective. (On the other hand, I explicitly duplicated the Flag enum in both files, which is similar in terms of duplicating on either side, but still better imo because the duplicated code is meaningful to both sides, whereas the worker command line options are only meaningful to the worker, not the server.)

The info messages refer to variables which are both meaningful for the worker and for server and which the worker can set via command line options or directly via the config file.

I think from a software engineering point of view it is much better that the worker does not control the info messages. Otherwise any server change requires a coordinated worker change.

dubslow · 2023-03-19T09:12:48Z

> Otherwise any server change requires a coordinated worker change.

At the present time, I'm not sure that can be avoided? And I guess your general principle here is that the server changes more often than the worker?

vdbergh · 2023-03-19T09:30:54Z

> > Otherwise any server change requires a coordinated worker change.
>
> At the present time, I'm not sure that can be avoided? And I guess your general principle here is that the server changes more often than the worker?

Yes. Moreover worker changes are more difficult since they need to be tested on different architectures (@ppigazzini does a great job on that).

I think if the server sets the info field then the worker will do the right thing (that's the intention anyway).

If the worker needs to be adapted then it is preferable that the adaptation is generic so that it only has to be done once.

maximmasiutin · 2023-03-19T09:48:52Z

@vdbergh Thank you very much for accepting my proposal and writing proper code! Your objections were more than reasonable; I had also thought about implementing them, since I didn't like the many ifs and Booleans. I will not touch my code, since you have already written the correct code.

maximmasiutin · 2023-03-19T10:01:19Z

Yes, I agree. Thank you, @vdbergh, for your observation that we should aim to avoid worker changes. If the worker is already able to display a full message received from the server, then we should proceed this way. However, I did not see how the worker could display that message, so I have also updated the worker. For this particular case, worker updates are not a problem, since existing workers already work. The issue that the current pull request addresses is new installations, which may, for example, have low memory. These new installations will likely be a git clone of the latest master, so updating the worker in this particular case is OK.

Refactor maxim's request task error messages

dubslow · 2023-03-19T10:03:51Z

> I think if the server sets the info field then the worker will do the right thing (that's the intention anyway).

It does not do anything except print a generic "no dice try again" message. Printing anything from the server will require updating the worker.

That said, as you say, it can certainly be done once in a generic way: simply print whatever messages the server sends.

maximmasiutin · 2023-03-19T10:04:20Z

Now this pull request contains the updates made by @dubslow via https://github.com/maximmasiutin/fishtest/pull/4/commits

maximmasiutin · 2023-03-19T10:14:41Z

There are pros and cons of formatting the message on the worker vs formatting it on the server.

If the message is formed by the server, then the worker will not need updates when new messages are added. If the message is formed by the worker, it can determine the exact reason, can display language-specific messages, and has various options on how to proceed.

Therefore, I propose to combine both methods. The server sends a list of dictionaries, each combining a "code", a "message", and optional values, like:

[
  {"code": "worker_low_memory", "message": "The MAX_MEMORY parameter is too low"},
  {"code": "worker_core_limit_reached", "message": "Exceeded the limit of 2048 cores set by the server", "core_limit": 2048}
]

At the very least, the client will just combine the messages and display them, but it will also be able to handle them individually if needed. The only thing to be careful about is deserialization: we should write the code so that it cannot be affected by deserialization attacks.
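A minimal sketch of the worker side of this proposal, assuming the payload arrives as JSON text shaped like the example above. The function name is hypothetical. It relies on `json.loads`, which only builds plain Python data types and does not execute code (unlike `pickle`), and it ignores any entry that lacks a string "message".

```python
import json


def format_server_messages(payload: str) -> list:
    """Extract displayable messages from a JSON list of reason entries."""
    try:
        entries = json.loads(payload)
    except json.JSONDecodeError:
        return []
    if not isinstance(entries, list):
        return []
    lines = []
    for entry in entries:
        # Unknown codes and extra keys are tolerated; only "message" is required.
        if isinstance(entry, dict) and isinstance(entry.get("message"), str):
            lines.append(entry["message"])
    return lines
```

With this shape, a new reason added on the server is displayed by an unchanged worker, and a newer worker could still branch on "code" where that becomes useful.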

dubslow · 2023-03-19T10:26:52Z

Duplicating my discord comment:

> If the message is formed by the worker, it can figure exactly the reason, can display language-specific messages, and has various options on how to proceed.

That's exactly the stuff that has to be updated in the worker every time we add a new reason or change existing reasons serverside. Vdbergh specifically wants to not update the worker when the server is updated. And I kind of agree, with more thought.

For now, I'll convert my recent commit to sending pre-formatted strings from the server to the worker. If we truly find something in the future that needs worker-side interpretation, it would be easy enough to add that then.

dubslow · 2023-03-19T10:47:00Z

I've pushed a new version which moves all message interpretation from the worker to the server. Now the worker looks almost like master, only now it prints whatever string the server sends.
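The division of labour described here can be sketched roughly as follows. This is not the code from the linked PR: both function names and the fallback text are assumptions, and only the `info` reply field is taken from the discussion above.

```python
def build_no_task_reply(reasons):
    """Server side (sketch): fold the collected reasons into one string."""
    if reasons:
        info = "No task assigned. " + " ".join(reasons)
    else:
        info = "No active runs. Try again shortly."
    return {"info": info}


def describe_reply(reply):
    """Worker side (sketch): show whatever the server sent, verbatim."""
    # The worker does not interpret the text, so new server-side reasons
    # need no worker update.
    return reply.get("info") or "No tasks available at this time."
```

The worker would print the result of `describe_reply(reply)`; all wording lives in the server.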

maximmasiutin#5

It still only does boolean error reporting, but it's not so hard (more or less trivial, I daresay) to add counting on top of that if we really want it.

Per vdbergh's comments Still untested

Remove all message interpretation from worker to server

maximmasiutin · 2023-03-19T11:17:25Z

> That's exactly the stuff that has to be updated in the worker every time we add a new reason or change

No, @dubslow, the worker can just enumerate (foreach) and display all the messages. With the approach I proposed in #1582 (comment), no worker update is needed when a new message is added.

Server explains the reasons to the worker if no tasks are available

0762816

maximmasiutin force-pushed the no-tasks-reasons branch from 6adc4de to 0762816 Compare March 19, 2023 01:04

Refactor maxim's request task error messages

58d42bd

Use an Enum to greatly improve both maintainability and readability Untested

Merge pull request #4 from dubslow/no-tasks-reasons

e3f105e

Refactor maxim's request task error messages

dubslow and others added 2 commits March 19, 2023 05:51

Remove all message interpretation from worker to server

8f5b567

Per vdbergh's comments Still untested

Merge pull request #5 from dubslow/server-messages-worker

532cadb

Remove all message interpretation from worker to server

maximmasiutin added 4 commits April 9, 2023 01:12

updated sri.txt hashes

e829cd9

fixed syntax errors; used pylint

91cb772

Used black

fe52d55

updated sri.txt hashes

d9d0c5e
