allowedUnmatched not working as expected. #1854

Open
hartsock

We have an unusual bug in our buildfarm deploy. Some product teams send us the platform property "OSFamily" with the value "Linux", and this does NOT match the "OSFamily" / "Linux" property in the queue configuration used by servers or workers.

Examples:

  queues:
    - name: "*"
      allowUnmatched: true
    - name: "default"
      allowUnmatched: true
      properties:
        - name: "OSFamily"
          value: "*"
    - name: "linux"
      allowUnmatched: true
      properties:
        - name: "min-cores"
          value: "*"
        - name: "max-cores"
          value: "*"
        - name: "OSFamily"
          value: "Linux"

Even with all three queues defined, the client is seeing

(INVALID) The `Platform` of the `Command` was invalid.: properties are not valid for queue eligibility: [name: "OSFamily"
value: "Linux"
].

Conjectures:

  • Is a hidden character in the "Linux" string making it into an upstream OSS project's actions from the Bazel side of the conversation?
  • Is there a charset difference between Bazel and the BuildFarm service that causes a mismatch?
  • Is the * character used in the service/worker-side configuration from a different alphabet than the one in the JDK?
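One way to test these conjectures is to list the code points of the value the client sent next to the value from the configuration. The following is a minimal sketch, not part of Buildfarm; the helper names `describe_chars` and `first_difference` are hypothetical, and the sample strings are made up for illustration.

```python
import unicodedata
from typing import List, Optional


def describe_chars(value: str) -> List[str]:
    """Return one 'U+XXXX NAME' entry per code point in value."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in value]


def first_difference(a: str, b: str) -> Optional[int]:
    """Return the index of the first differing code point, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One string is a strict prefix of the other.
        return min(len(a), len(b))
    return None


if __name__ == "__main__":
    from_config = "Linux"
    from_client = "Linux\u200b"  # hypothetical value with a trailing zero-width space
    print(first_difference(from_config, from_client))
    for line in describe_chars(from_client):
        print(line)
```

Two strings that look identical on screen but differ here would confirm a hidden or look-alike character; the same check applies to the `*` wildcard, which has full-width and other look-alike variants.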

Either way, rearranging the order of the checks produces the desired effect, and the client no longer receives the error that "Linux" does not equal "Linux". Still, it would be nice to specify "Linux" and have it match "Linux".

Is this as intended? Should allowUnmatched: true refuse to match when the properties don't match?


Using "*" as the OSFamily value doesn't work either. When we change the logic to skip the property check, the problem goes away.

@werkt
Collaborator

werkt commented Sep 20, 2024

I'll admit I'm not following a couple of things:
Minor: The invalid message string you pasted is missing some copy at the end, based on "properties are not valid for queue eligibility: %s. If you think your queue". If you truncated it to be more brief, that's understandable.

Major: The error message above, which is the only instance of that copy, occurs on the server, not the worker, but your change only affects worker activity. I don't see how it could possibly cause the error to not occur based on where that error is generated.

The Command's definition is conveniently stored in a blob in this case, which must be deserialized by the stock protobuf parser. If you can retrieve the Command's bytes (bf-cat File will retrieve this for the digest), parse those bytes into a Command message, and compare it to a config-parsed queue definition, we should be able to root out any differences due to encodings.

Happy to do that comparison myself if you want to present a pathological copy of your Command blob.
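The comparison described above could be sketched as follows, for cases where the generated protobuf classes are not at hand. This is a rough illustration, not Buildfarm code: it walks the protobuf wire format directly and assumes `platform` is field 5 of `Command`, `properties` is field 1 of `Platform`, and `name`/`value` are fields 1 and 2 of `Property`. Those field numbers should be checked against the remote_execution.proto in use, and the function names are hypothetical. Values are returned as raw bytes so that any encoding difference stays visible.

```python
from typing import Iterator, List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def length_delimited_fields(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (field_number, payload) for each length-delimited field in buf."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:  # varint: skip the value
            _, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit: skip
            pos += 8
        elif wire_type == 2:  # length-delimited: string, bytes, or sub-message
            length, pos = read_varint(buf, pos)
            yield field_number, buf[pos:pos + length]
            pos += length
        elif wire_type == 5:  # fixed 32-bit: skip
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")


def platform_properties(command_bytes: bytes) -> List[Tuple[bytes, bytes]]:
    """Return (name, value) byte pairs from the Command's platform properties.

    Field numbers are assumptions; see the paragraph above this block.
    """
    props = []
    for number, platform in length_delimited_fields(command_bytes):
        if number != 5:  # assumed: Command.platform
            continue
        for prop_number, prop in length_delimited_fields(platform):
            if prop_number != 1:  # assumed: Platform.properties
                continue
            fields = dict(length_delimited_fields(prop))
            props.append((fields.get(1, b""), fields.get(2, b"")))
    return props
```

Printing each returned value with `.hex()` next to `"Linux".encode("utf-8").hex()` from the queue configuration would show whether the two differ at the byte level.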
