Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query: expose is_executable #15948

Closed

Conversation

illicitonion
Copy link
Contributor

It can be useful to know this, and currently the best proxy appears to
be to look for attributes.$is_executable (which isn't documented), and
or-ing that with a match on rule_class.ends_with("_test").

Instead, let's have a clearly documented API here.

@illicitonion illicitonion requested a review from lberki as a code owner July 22, 2022 13:24
@lberki lberki requested review from comius and removed request for lberki July 22, 2022 18:28
@sgowroji sgowroji added team-Build-Language awaiting-review PR is awaiting review from an assigned reviewer labels Jul 25, 2022
@KushalP
Copy link

KushalP commented Sep 6, 2022

Any ETA on merging this change?

@brandjon brandjon added team-Core Skyframe, bazel query, BEP, options parsing, bazelrc and removed team-Build-Language labels Nov 5, 2022
@brandjon
Copy link
Member

brandjon commented Nov 5, 2022

Not sure whether to route this to Core (@haxorz), Configurability (@gregestren), or Rules-API (@comius), so CCing TLs from all three.

@lberki
Copy link
Contributor

lberki commented Nov 7, 2022

Since this is in the twilight zone between all these teams, I'll step in: why is this logic useful for you? This at least doesn't work with aliases (and can't, since bazel query cannot resolve them) and doesn't match the logic in RunCommand.isExecutable().

So I don't immediately see how this is superior to looking at the $is_executable attribute.

@illicitonion
Copy link
Contributor Author

So I don't immediately see how this is superior to looking at the $is_executable attribute.

It's mostly superior in terms of being documented (and, by being a proto-field, being self-documenting). As a consumer, it's currently unclear whether $is_executable is a stable public API, or an accidental implementation details.

Also, being logic-free - right now the check is that a rule either has the $is_executable attr or that its rule_class ends up _test, which is logic which may need writing multiple times.

I'd be happy to instead document $is_executable as a known attribute folks can rely on, and ideally send out a change setting that attr of test targets too (to avoid needing to write the or-ing logic), if that'd be preferable

@haxorz
Copy link
Contributor

haxorz commented Nov 7, 2022

I'd prefer the core technical parts of internal issues 28639562, 119498716, and 31180215 first get resolved then the binaries query function can be open-sourced in Bazel.

Here's the [proposed] spec of binaries (iirc from the technical discussion in 119498716):

If E evaluates to S, then binaries(E) evalutes to the subset of S containing all the "binary" targets. A target is considered a "binary" target if it's an instance of a native rule class whose name ends in "_binary" or an instance of a Starlark rule class with executable = True.

@illicitonion wdyt about this binaries function? Would it obsolete this PR or at least complement it?


Due to some internal systems, implementing this binaries function is not as trivial as it sounds. The core technical issue is discussed in #6 - #20 in issue 119498716. This issue was never handled when the binaries function was added in ~2016; the implementation doesn't do the 'an instance of a Starlark rule class with executable = True' part of the spec, instead opting for the hacky-but-easy approach of hardcoding some names of common binary-ish Starlark rule classes.

This query function was never open-sourced in Bazel because of this tech debt. I prefer we resolve this tech debt.


[@lberki ] This at least doesn't work with aliases (and can't, since bazel query cannot resolve them)

Yeah, and ditto for the proposed binaries query function. See comment #7 in internal issue 28639562. But I don't think we should support alias; see comment #8.

@illicitonion
Copy link
Contributor Author

wdyt about this binaries function? Would it obsolete this PR or at least complement it?

Yes, a binaries function would obviate my need for this PR.


I don't have access to the internal bugs, but I'd be interested in the context of not wanting to support alias (in particular, for not wanting to support alias in cquery - I can see why it's hard to support in the context of query). It's cquery where I actually want this information, where I think we should be able to resolve through an alias...

@haxorz
Copy link
Contributor

haxorz commented Nov 7, 2022

Yes, a binaries function would obviate my need for this PR.

Cool, thanks for the info! I added a note to the internal issues.

... (in particular, for not wanting to support alias in cquery... It's cquery where I actually want this information, where I think we should be able to resolve through an alias...

Yeah cquery is technically capable of resolving alias chains.

I would be fine with cquery having different binaries semantics than query. Might be best to do this as a followup FR to the internal issues though.

I don't have access to the internal bugs, but I'd be interested in the context of not wanting to support alias... I can see why it's hard to support in the context of query

Yeah sorry for the lack of transparency. I gave those links mostly for @lberki and @brandjon.

The context is that if binaries followed alias chains (in query-land at least) then there's an internal system at Google whose incremental mode of operation would become unsound due to it not being technically feasible to know when an alias chain has changed. This system isn't doing top-down things like binaries(//p/...) (that is "please tell me all the binary targets in all packages under directory p") but instead doing bottom-up incremental things like "for a given repo diff D, tell me if any of the top-down things in my universe I care about have changed"). It's not technically feasible for this system to have access to sufficient information for doing this for alias chains.

@illicitonion
Copy link
Contributor Author

This system isn't doing top-down things like binaries(//p/...) (that is "please tell me all the binary targets in all packages under directory p") but instead doing bottom-up incremental things like "for a given repo diff D, tell me if any of the top-down things in my universe I care about have changed"). It's not technically feasible for this system to have access to sufficient information for doing this for alias chains.

Interesting - the thing that prompted me to file this PR is actually a very similar-sounding system: https://github.com/bazel-contrib/target-determinator - its goal is to be able to tell you which targets (that you care about) changed between two git commits.

Roughly the form this tools operation takes is: Given some query pattern:

  1. Run a cquery deps($query)
  2. Run a cquery $query
  3. Come up with a hash for every target in the matching set, including via digesting input files and their deps

We do this at both the "before" and the "after" commit, and compare hashes to find the affected targets.

I've seen a number of tools try to do similar analysis by crafting careful queries without actually winding back to the prior commit and performing the queries there too, and they have all missed important possible changes. Is that roughly what's going on in your internal tool too? I'd be curious as to ways we can generally making solving this kind of problem easier (either inside or outside of Bazel)...

@haxorz
Copy link
Contributor

haxorz commented Nov 7, 2022

Interesting - the thing that prompted me to file this PR is actually a very similar-sounding system

Yes, very interesting!

Roughly the form this tools operation takes is: Given some query pattern...

[Since I know the alias discussion wasn't your central point, this is an aside...]

If cquery's binaries follows alias chains, then I think you're in trouble in that niche situation.

Imagine //p1:t1 is an alias to //p2:t2. Imagine the pattern of interest is binaries(//p1:t1). Then step 2's cquery "deps(binaries(//p1:t1))" will be exactly the same as cquery "deps(//p2:t2)" and step 3's cquery "binaries(//p1:t1)" will be exactly the same as cquery "//p2:t2" i.e. steps 2 & 3 will not be helpful for detecting changes to the alias target of //p1:t1.

I've seen a number of tools try to do similar analysis by crafting careful queries without actually winding back to the prior commit and performing the queries there too, and they have all missed important possible changes.

Yeah agree that this is using an algorithmic smell.

Is that roughly what's going on in your internal tool too?

Similar goal but we take a bottom-up approach. Scale is too large for us to do the top-down approach of deps(T) for all T at every commit.

@comius comius removed their request for review September 6, 2023 14:09
@meteorcloudy meteorcloudy requested a review from a team as a code owner November 13, 2023 10:19
@meteorcloudy meteorcloudy requested review from sdtwigg and removed request for a team November 13, 2023 10:19
@katre katre added the team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts label Nov 13, 2023
@katre
Copy link
Member

katre commented Nov 13, 2023

I've deleted my previous comments because they were inaccurate.

This PR has no impact on Configurability and is a request for existing data to be exposed via the ctx API, so it should be handled by the Rules API.

I see that @comius un-assigned this a few months ago: it should either be assigned to someone on that team or closed because we don't plan to merge it.

Sorry for the confusion, @illicitonion.

@katre katre requested review from comius and removed request for sdtwigg November 13, 2023 17:08
@comius
Copy link
Contributor

comius commented May 15, 2024

Is there still interest to get this in?

@comius comius added awaiting-user-response Awaiting a response from the author and removed awaiting-review PR is awaiting review from an assigned reviewer labels May 15, 2024
@illicitonion
Copy link
Contributor Author

Yeah, I'd happily get this merged - let me know if I should rebase

@comius
Copy link
Contributor

comius commented May 16, 2024

Yes, please go ahead with rebase.

@comius comius removed the awaiting-user-response Awaiting a response from the author label May 16, 2024
@comius comius added awaiting-review PR is awaiting review from an assigned reviewer and removed team-Core Skyframe, bazel query, BEP, options parsing, bazelrc labels May 16, 2024
@comius
Copy link
Contributor

comius commented May 16, 2024

cc @mai93 (I believe IDEs need this data as well)

@illicitonion illicitonion force-pushed the query-is-executable branch from 02d5ade to e760a13 Compare May 16, 2024 14:58
@illicitonion
Copy link
Contributor Author

@comius This is rebased - ready for review/merge :)

@comius
Copy link
Contributor

comius commented May 17, 2024

cc @tetromino, do you know if this change has any effect on stardoc?
cc @LuckyGeck, @sophie4869 I know you've been asking for something like this a couple months ago :) ? Is this still usable?

It can be useful to know this, and currently the best proxy appears to
be to look for `attributes.$is_executable` (which isn't documented), and
or-ing that with a match on `rule_class.ends_with("_test")`.

Instead, let's have a clearly documented API here.
@illicitonion illicitonion force-pushed the query-is-executable branch from d8d5ee5 to 31d43ad Compare May 17, 2024 09:52
@illicitonion
Copy link
Contributor Author

All addressed - thanks!

@comius comius added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels May 17, 2024
@iancha1992
Copy link
Member

@illicitonion @comius looks like src/main/java/com/google/devtools/build/lib/rules/starlarkdocextract/ModuleInfoExtractor.java file is no longer in the master. Can you please resolve this?

@tetromino
Copy link
Contributor

@illicitonion @comius looks like src/main/java/com/google/devtools/build/lib/rules/starlarkdocextract/ModuleInfoExtractor.java file is no longer in the master. Can you please resolve this?

@iancha1992 - mea culpa, I had moved ModuleInfoExtractor.java to a different package in 9af4e69

I just now merged master into the pr's branch to fix the problem.

@comius
Copy link
Contributor

comius commented May 22, 2024

Update: Not stuck, I'm still fixing internal tests.

@comius
Copy link
Contributor

comius commented May 29, 2024

After importing the PR, I realised the blast radius of the denormalization of build.proto (having information on target instead on rule class) is much bigger that what I previously thought, the field is always there, and the change is not reversible. I gathered additional opinion from other internal reviewers and we don't feel ok going forward with changes to build.proto.

One alternative was already mentioned by @haxorz in #15948 (comment)

Information is available from stardoc already: https://cs.opensource.google/bazel/bazel/+/master:src/main/protobuf/stardoc_output.proto;l=103;drc=f4cfc846dbdf5f6c19d0a716fccd2ddcdae0d609

We plan to do a safer modification to expose more information about the rules, eta start of Q3.

Bazel rules were quite different when the PR was proposed. Nowadays as far as I know only genrule remains, that can be either executable or not based on an attribute. For all other rule classes, executable is now a static property.

I'll import the part of the PR, without the change of build.proto, which is a nice improvement.

@copybara-service copybara-service bot closed this in 0c8cb4b Jun 4, 2024
@github-actions github-actions bot removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Jun 4, 2024
@sophie4869
Copy link

cc @tetromino, do you know if this change has any effect on stardoc? cc @LuckyGeck, @sophie4869 I know you've been asking for something like this a couple months ago :) ? Is this still usable?

Yes we rely on $is_executable to decide if a target is runnable. I don't see how this new API may have an effect on the Go interpreter for Starlark though? We don't interact with the Java implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts
Projects
None yet
Development

Successfully merging this pull request may close these issues.