-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: introduce mechanism to implant generated code from the bazel sandbox back into the source tree #58018
Comments
I believe this mechanism should be folded into #56965. |
Part of #57801. Generating these files within the sandbox was a bit tricky seeing as how the optgen language compiler uses itself to generate its AST expressions. This is a form of compiler bootstrapping[1], and needed some manual overrides to fit into how Bazel wants things to be. To this end, we define a second `go_library` target[2] that sources pre-generated `og.go` files that are already checked into the source tree. These files are then used to compile `langgen`, which is in turn used to generate newer revisions of those same generated files. All dependants of [3] will need to depend on the og.go files generated in the Bazel sandbox, which is achieved by [4]. Conversely, to instruct gazelle/Bazel to resolve langgen's import of [3] appropriately, we added a custom resolve directive. We'll eventually want to be able to implant the sandbox generated files back into the source tree. This will be necessary for IDEs, but also for a sane workflow for the engineers working on langgen. If they were only using Bazel today, langgen dependent builds would misbehave as we wouldn't be updating the checked-in og.go files used to build langgen itself (assuming any diffs there would result in different code/compilation behaviour). See #58018. [1]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) [2]: //pkg/sql/opt/optgen/lang:bootstrap [3]: github.com/cockroachdb/cockroach/pkg/sql/opt/optgen/lang [4]: See the "gazelle:resolve" directive in the top-level BUILD.bazel file Release note: None
Part of #57801. Generating these files within the sandbox was a bit tricky seeing as how the optgen language compiler uses itself to generate its AST expressions. This is a form of compiler bootstrapping[1], and needed some manual overrides to fit into how Bazel wants things to be. To this end, we define a second `go_library` target[2] that sources pre-generated `og.go` files that are already checked into the source tree. These files are then used to compile `langgen`, which is in turn used to generate newer revisions of those same generated files. All dependants of [3] will need to depend on the og.go files generated in the Bazel sandbox, which is achieved by [4]. Conversely, to instruct gazelle/Bazel to resolve langgen's import of [3] appropriately, we added a custom resolve directive. We'll eventually want to be able to implant the sandbox generated files back into the source tree. This will be necessary for IDEs, but also for a sane workflow for the engineers working on langgen. If they were only using Bazel today, langgen dependent builds would misbehave as we wouldn't be updating the checked-in og.go files used to build langgen itself (assuming any diffs there would result in different code/compilation behaviour). See #58018. [1]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) [2]: //pkg/sql/opt/optgen/lang:bootstrap [3]: github.com/cockroachdb/cockroach/pkg/sql/opt/optgen/lang [4]: See the "gazelle:resolve" directive in the top-level BUILD.bazel file Release note: None
Part of #57801. Generating these files within the sandbox was a bit tricky seeing as how the optgen language compiler uses itself to generate its AST expressions. This is a form of compiler bootstrapping[1], and needed some manual overrides to fit into how Bazel wants things to be. To this end, we define a second `go_library` target[2] that sources pre-generated `og.go` files that are already checked into the source tree. These files are then used to compile `langgen`, which is in turn used to generate newer revisions of those same generated files. All dependants of [3] will need to depend on the og.go files generated in the Bazel sandbox, which is achieved by [4]. Conversely, to instruct gazelle/Bazel to resolve langgen's import of [3] appropriately, we added a custom resolve directive. We'll eventually want to be able to implant the sandbox generated files back into the source tree. This will be necessary for IDEs, but also for a sane workflow for the engineers working on langgen. If they were only using Bazel today, langgen dependent builds would misbehave as we wouldn't be updating the checked-in og.go files used to build langgen itself (assuming any diffs there would result in different code/compilation behaviour). See #58018. [1]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) [2]: //pkg/sql/opt/optgen/lang:bootstrap [3]: github.com/cockroachdb/cockroach/pkg/sql/opt/optgen/lang [4]: See the "gazelle:resolve" directive in the top-level BUILD.bazel file Release note: None
Part of #57801. Generating these files within the sandbox was a bit tricky seeing as how the optgen language compiler uses itself to generate its AST expressions. This is a form of compiler bootstrapping[1], and needed some manual overrides to fit into how Bazel wants things to be. To this end, we define a second `go_library` target[2] that sources pre-generated `og.go` files that are already checked into the source tree. These files are then used to compile `langgen`, which is in turn used to generate newer revisions of those same generated files. All dependants of [3] will need to depend on the og.go files generated in the Bazel sandbox, which is achieved by [4]. Conversely, to instruct gazelle/Bazel to resolve langgen's import of [3] appropriately, we added a custom resolve directive. We'll eventually want to be able to implant the sandbox generated files back into the source tree. This will be necessary for IDEs, but also for a sane workflow for the engineers working on langgen. If they were only using Bazel today, langgen dependent builds would misbehave as we wouldn't be updating the checked-in og.go files used to build langgen itself (assuming any diffs there would result in different code/compilation behaviour). See #58018. [1]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) [2]: //pkg/sql/opt/optgen/lang:bootstrap [3]: github.com/cockroachdb/cockroach/pkg/sql/opt/optgen/lang [4]: See the "gazelle:resolve" directive in the top-level BUILD.bazel file Release note: None
58019: lang{,gen},bazel: generate more files within the sandbox r=irfansharif a=irfansharif Part of #57801. Generating these files within the sandbox was a bit tricky seeing as how the optgen language compiler uses itself to generate its AST expressions. This is a form of compiler bootstrapping [1], and needed some manual overrides to fit into how Bazel wants things to be. To this end, we define a second `go_library` target [2] that sources pre-generated `og.go` files that are already checked into the source tree. These files are then used to compile `langgen`, which is in turn used to generate newer revisions of those same generated files. All dependants of [3] will need to depend on the og.go files generated in the Bazel sandbox, which is achieved by [4]. Conversely, to instruct gazelle/Bazel to resolve langgen's import of [3] appropriately, we added a custom resolve directive. We'll eventually want to be able to implant the sandbox generated files back into the source tree. This will be necessary for IDEs, but also for a sane workflow for the engineers working on langgen. If they were only using Bazel today, langgen dependent builds would misbehave as we wouldn't be updating the checked-in og.go files used to build langgen itself (assuming any diffs there would result in different code/compilation behaviour). See #58018. ``` [1]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) [2]: //pkg/sql/opt/optgen/lang:bootstrap [3]: github.com/cockroachdb/cockroach/pkg/sql/opt/optgen/lang [4]: See the "gazelle:resolve" directive in the top-level BUILD.bazel file ``` Release note: None Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
* Add a target to build `docs/generated/logging.md` in the Bazel sandbox. * Add some machinery to run `pkg/util/fsm/gen` in the sandbox, and add a target in `pkg/sql` to generate the `txnstatetransitions` files. * A couple build targets that were already using `//pkg/util/log/gen` had it in their `srcs` list instead of the `exec_tools` list, so fix that in a few places. * Augment the `check-genfiles` script to check that no new `go:generate` comments are added. This will help us ensure the shape of the build doesn't change under our feet too much. The following work items still remain to be done: * Start building this generated code in CI to ensure that there aren't regressions. * Start consuming generated file targets from `dev` and stage them in the working directory (cockroachdb#58018). * The following files contain `go:generate` comments that are not bazelfied: `pkg/ccl/sqlproxyccl/admitter/service.go`, `pkg/ccl/sqlproxyccl/tenant/directory.go`, `pkg/security/certmgr/cert.go`, and `pkg/server/api_v2.go`. Release note: None
* Add a target to build `docs/generated/logging.md` in the Bazel sandbox. * Add some machinery to run `pkg/util/fsm/gen` in the sandbox, and add a target in `pkg/sql` to generate the `txnstatetransitions` files. * A couple build targets that were already using `//pkg/util/log/gen` had it in their `srcs` list instead of the `exec_tools` list, so fix that in a few places. * Augment the `check-genfiles` script to check that no new `go:generate` comments are added. This will help us ensure the shape of the build doesn't change under our feet too much. The following work items still remain to be done: * Start building this generated code in CI to ensure that there aren't regressions. * Start consuming generated file targets from `dev` and stage them in the working directory (cockroachdb#58018). * The following files contain `go:generate` comments that are not bazelfied: `pkg/ccl/sqlproxyccl/admitter/service.go`, `pkg/ccl/sqlproxyccl/denylist/service.go`, `pkg/ccl/sqlproxyccl/tenant/directory.go`, `pkg/security/certmgr/cert.go`, and `pkg/server/api_v2.go`. Release note: None
64658: bazel: build some generated files in the bazel sandbox r=rail a=rickystewart * Add a target to build `docs/generated/logging.md` in the Bazel sandbox. * Add some machinery to run `pkg/util/fsm/gen` in the sandbox, and add a target in `pkg/sql` to generate the `txnstatetransitions` files. * A couple build targets that were already using `//pkg/util/log/gen` had it in their `srcs` list instead of the `exec_tools` list, so fix that in a few places. * Augment the `check-genfiles` script to check that no new `go:generate` comments are added. This will help us ensure the shape of the build doesn't change under our feet too much. The following work items still remain to be done: * Start building this generated code in CI to ensure that there aren't regressions. * Start consuming generated file targets from `dev` and stage them in the working directory (#58018). * The following files contain `go:generate` comments that are not bazelfied: `pkg/ccl/sqlproxyccl/admitter/service.go`, `pkg/ccl/sqlproxyccl/denylist/service.go`, `pkg/ccl/sqlproxyccl/tenant/directory.go`, `pkg/security/certmgr/cert.go`, and `pkg/server/api_v2.go`. Release note: None Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
quoth irfan who poked at this during breather week: this is too difficult so I’ll give up. the basic version through dev where it searches through bazel to find all generated pb.go files is easy enough, but I don’t think it’s the right approach. It doesn’t take advantage of any sort of lazy evaluation (“don’t copy any files if nothing’s changed”), and to generate the proto files, we’d basically have to re-build the entire source tree (there’s no easy way to bazel build a go_proto_library target). |
Some relevant links: bazel-contrib/rules_go#2111. I think the long-term strategy the bazel/rules_go folks are thinking about with |
I haven't studied the https://github.com/dropbox/dbx_build_tools too closely. @otan or @AlexTalks, by any chance do you know where this "copy over generated files" code lives? Or know a friend who knows? |
gazel was done here: https://github.com/dropbox/dbx_build_tools/blob/master/build_tools/bzl_lib/gazel.py note we didn't really allow go directives (e.g. |
One might be expected that this command be implemented as `dev generate protobuf` or `dev generate execgen` or similar, but that's not really practical for a couple reasons: 1. Building all the protobuf in-tree isn't really possible on its own: see cockroachdb#58018 (comment). 2. For non-proto generated code, one can imagine that we do a bunch of `bazel query`s to list all the generated files in tree and to find which files are generated. Given that there are hundreds of generated files in tree, and a single `bazel query` can take ~1s, this is not tractable. The implementation here is ad-hoc and a bit hacky. Basically we search `bazel-bin` for files ending in `.go` and apply some heuristics to figure out whether they should be copied and where those files should go. This is gated behind a feature flag for now because actually copying these files into the workspace right now dirties a ton of files. (The diffs are benign -- mostly comments.) We can flip this option to be on by default when it's safe (probably after the `Makefile` is deleted). Closes cockroachdb#68565. Release note: None
One might be expected that this command be implemented as `dev generate protobuf` or `dev generate execgen` or similar, but that's not really practical for a couple reasons: 1. Building all the protobuf in-tree isn't really possible on its own: see cockroachdb#58018 (comment). 2. For non-proto generated code, one can imagine that we do a bunch of `bazel query`s to list all the generated files in tree and to find which files are generated. Given that there are hundreds of generated files in tree, and a single `bazel query` can take ~1s, this is not tractable. The implementation here is ad-hoc and a bit hacky. Basically we search `bazel-bin` for files ending in `.go` and apply some heuristics to figure out whether they should be copied and where those files should go. This is gated behind a feature flag for now because actually copying these files into the workspace right now dirties a ton of files. (The diffs are benign -- mostly comments.) We can flip this option to be on by default when it's safe (probably after the `Makefile` is deleted). Closes cockroachdb#68565. Release note: None
One might be expected that this command be implemented as `dev generate protobuf` or `dev generate execgen` or similar, but that's not really practical for a couple reasons: 1. Building all the protobuf in-tree isn't really possible on its own: see cockroachdb#58018 (comment). 2. For non-proto generated code, one can imagine that we do a bunch of `bazel query`s to list all the generated files in tree and to find which files are generated. Given that there are hundreds of generated files in tree, and a single `bazel query` can take ~1s, this is not tractable. The implementation here is ad-hoc and a bit hacky. Basically we search `bazel-bin` for files ending in `.go` and apply some heuristics to figure out whether they should be copied and where those files should go. This is gated behind a feature flag for now because actually copying these files into the workspace right now dirties a ton of files. (The diffs are benign -- mostly comments.) We can flip this option to be on by default when it's safe (probably after the `Makefile` is deleted). Closes cockroachdb#68565. Release note: None
68102: storage/storageccl: Add an option to break export mid key r=pbardea,sumeerbhola a=aliher1911 Previously when exporting ranges with very large number of versions export could hit the situation where single key export would exceed maximum export byte limit. To resolve this, safety threshold has to be raised which could cause nodes going out of memory. To address this, this patch adds an ability to stop and resume on the arbitrary key timestamp so that it could be stopped mid key and resumed later. Release note: None Fixes #68231 68390: ui: rebuild the databases pages r=matthewtodd a=matthewtodd Previously, the databases pages featured an older, outdated design. This change aligns their UX with that of the other modern pages in the DB console, statements, transactions, and sessions. This is a first pass at the job. We will return to layer on more features, including search, and more data, including node & region information. Addresses #65737. Before | After --- | --- <img width="1372" alt="Screen Shot 2021-08-05 at 4 54 31 PM" src="https://user-images.githubusercontent.com/5261/128420405-7a096c2f-32a1-42d9-a8f0-186f699fb612.png"> | <img width="1372" alt="Screen Shot 2021-08-03 at 2 50 12 PM" src="https://user-images.githubusercontent.com/5261/128070069-28d0f461-69dd-47cb-a04b-bc6cd0126ca4.png"> <img width="1372" alt="Screen Shot 2021-08-05 at 4 54 57 PM" src="https://user-images.githubusercontent.com/5261/128420491-71d03570-1c0d-413b-ab40-09b45cf614ee.png"> | <img width="1372" alt="Screen Shot 2021-08-03 at 2 50 21 PM" src="https://user-images.githubusercontent.com/5261/128070090-bd62d125-a1b2-4c83-b0a5-33d0723849eb.png"> <img width="1372" alt="Screen Shot 2021-08-05 at 5 03 37 PM" src="https://user-images.githubusercontent.com/5261/128420959-338ea79b-abf0-4773-91bf-cecac64fec5f.png"> | (This overview of database grants no longer exists.) (This rolled-up view of table grants did not previously exist.) | <img width="1372" alt="Screen Shot 2021-08-03 at 2 50 27 PM" src="https://user-images.githubusercontent.com/5261/128070100-1305c94f-c017-40a5-ba2d-f19fd61bdecf.png"> <img width="1372" alt="Screen Shot 2021-08-05 at 4 55 21 PM" src="https://user-images.githubusercontent.com/5261/128420596-2d9f04ad-718c-4dce-b045-b5647f718339.png"> | <img width="1372" alt="Screen Shot 2021-08-03 at 2 50 35 PM" src="https://user-images.githubusercontent.com/5261/128070103-831fd5c3-38da-4864-9d82-9091f3209606.png"> <img width="1372" alt="Screen Shot 2021-08-05 at 4 55 28 PM" src="https://user-images.githubusercontent.com/5261/128420658-4042c631-b283-4b0c-ac4a-3235bd3bcf39.png"> | <img width="1372" alt="Screen Shot 2021-08-03 at 2 50 41 PM" src="https://user-images.githubusercontent.com/5261/128070113-73528f2c-56bf-4fcf-a8d8-fcc3f100695f.png"> Release note (ui change): The databases pages in the DB console were updated to bring them into alignment with our modern UX. 68619: storage: add file storing min version needed for backwards compatibility r=jbowens a=andyyang890 A new STORAGE_MIN_VERSION file containing the minimum version that the storage engine needs to maintain backwards compatibility with will be added to the directory for each storage engine. This will be used in the encryption-at-rest registry migration. Release note: None 68621: dev: implement `dev generate docs` r=rail a=rickystewart Do some refactoring so that we don't have to repeat code in a bunch of places. The implementation is pretty straightforward and just involves building the docs then copying them over to the workspace. Closes #68563. Release note: None 68663: dev: hoist generated code out of sandbox into workspace r=rail a=rickystewart (Only the most recent commit applies for this review.) One might be expected that this command be implemented as `dev generate protobuf` or `dev generate execgen` or similar, but that's not really practical for a couple reasons: 1. Building all the protobuf in-tree isn't really possible on its own: see #58018 (comment). 2. For non-proto generated code, one can imagine that we do a bunch of `bazel query`s to list all the generated files in tree and to find which files are generated. Given that there are hundreds of generated files in tree, and a single `bazel query` can take ~1s, this is not tractable. The implementation here is ad-hoc and a bit hacky. Basically we search `bazel-bin` for files ending in `.go` and apply some heuristics to figure out whether they should be copied and where those files should go. This is gated behind a feature flag for now because actually copying these files into the workspace right now dirties a ton of files. (The diffs are benign -- mostly comments.) We can flip this option to be on by default when it's safe (probably after the `Makefile` is deleted). Closes #68565. Release note: None 68806: bazel: set `--include_source_info` when generating protobuf code r=rail a=rickystewart This flag [instructs Bazel](bazelbuild/rules_proto#56 (comment)) to set a [command-line flag](protocolbuffers/protobuf#7623 (comment)) when invoking `protoc` that causes the generated proto descriptor sets to contain extra info: ``` --include_source_info When using --descriptor_set_out, do not strip SourceCodeInfo from the FileDescriptorProto. This results in vastly larger descriptors that include information about the original location of each decl in the source file as well as surrounding comments. ``` Setting this solves two problems: 1. We need the descriptor sets to have comments for #65814. 2. Without this change, generated `.pb.go` files from the sandbox won't contain comments. This makes the files more difficult to read and dirties the files in your checkout if you copy those `.pb.go` files to your workspace. Also delete an unnecessary `--symlink_prefix=_bazel/` from the `test` configuration (it's inherited from the `build` configuration so it's redundant). Release note: None Co-authored-by: Oleg Afanasyev <oleg@cockroachlabs.com> Co-authored-by: Matthew Todd <todd@cockroachlabs.com> Co-authored-by: Andy Yang <ayang@cockroachlabs.com> Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
boom baby |
noiceee |
For bazel generated files, especially pb.go ones once we make that happen (#56067), we’ll want a way for it to exist outside the sandbox and in the source tree. The reason being that IDEs would want to know where certain definitions exist, and it’s difficult to reach individual IDEs/editors to look within the sandbox. (Related: bazel-contrib/rules_go#512)
See https://github.com/dropbox/dbx_build_tools, the folks over at dropbox have a way to explicitly pull out bazel generated files out of the sandbox and into the working directory for what I think is this exact reason.
Epic CRDB-8036
The text was updated successfully, but these errors were encountered: