Use Bazel for Kibana builds #69706

Closed
19 of 23 tasks
mistic opened this issue Jun 23, 2020 · 8 comments
Labels: build, chore, epic, ReleaseStatus, Team:Operations

Comments

@mistic
Member

mistic commented Jun 23, 2020

We are planning to gradually introduce Bazel and use it as our main build tool.
This issue summarises the steps we plan to take during the process.

Phase 1 (COMPLETED) #79757 #92220

Goals

  • Introduce the first building blocks needed to use Bazel in the project
  • Build packages/** using Bazel
  • Share built code on the CI for packages/** with subsequent CI builds and local developer installations

Expected Benefits

  • A small reduction in CI build times, because most of the time packages won't need to be rebuilt
  • Reduced bootstrap time on local development installations, especially when switching between branches
  • Introduce the benefits of Bazel for packages/**: reproducible and correct builds that only rebuild when the inputs change

Rough Planned Timeline

We are planning to land Phase 1 between 7.13 and 7.14

Steps

Phase 2 (ONGOING) #104519

Rough Planned Timeline

We are planning to land Phase 2 between 8.0 and 8.1

Goals

  • Improve the developer experience of the features introduced in Phase 1
  • Enable the benefit of the remote cache for the majority of the developers
  • Remove legacy code from the previous bootstrap process
  • Simplify Windows development

Expected Benefits

  • Further reduction in the bootstrap time for packages
  • Improve running time of unit tests for packages

Steps

Phase 3

Goals

  • Migrate production build system to Bazel (rpm, deb, Docker, tar, zip)
  • Migrate plugins to build with Bazel (keep both legacy and Bazel build systems running in parallel). We will continue to use the legacy build system until all plugins have been migrated, and then cut over.
  • Builds will no longer copy to target directories, as was done in Phase 1.
  • babel/register can be completely removed since server-side code will be pre-built.
  • Server restarts can be optimized to only restart on non-type changes as identified by Bazel.
  • Open questions:
    • Should we move scripts to run through Bazel, or update the existing scripts to reference Bazel dist?

Phase 4

Goals

  • Improve identification of when tests should be run - no more need to explicitly check docs.
  • Remove kbn/pm, or simplify enough to remove build step
  • Implement CI automated merging queue to run unit tests prior to merging
@mistic mistic added chore, Team:Operations, build labels Jun 23, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

@mistic
Member Author

mistic commented Sep 1, 2020

The details below describe and make public all the progress we have made in our Bazel exploration so far.
Feedback and ideas are very welcome! The current POC work is available in my bazel-poc branch at https://github.com/mistic/kibana/tree/bazel-poc

Where are we and our current assumptions

We have been developing a POC in order to investigate and learn about Bazel, as well as to understand how we could benefit from adding it to our stack of tools. We realised that our local packages are one area we could change and use as a first step in our future Bazel integration.

  • Bazel is set up to be installed using a .bazeliskversion and a .bazelversion file, which are the files that configure the Bazel version management tools.
  • A Bazel workspace file, a root build file and a build file for each package were created.
  • @kbn/pm was changed, for now, to only run bazel build //packages:build instead of running bootstrap scripts for each package. The approach used so far is to produce targets similar to the ones previously created by the npm build scripts, plus always generating sourcemaps and types when possible. The only exception is the kibana project itself, because it needs to register the pre-commit hook (still awaiting another solution).
  • ts_project is used to build TypeScript projects instead of ts_library. The latter can offer more performance but introduces technical trade-offs that we don't want to take on right now. A worker mode is under development for the ts_project rule, which will bring the performance benefits of ts_library to ts_project in the future.
  • Each package produces a js_library target whose label has the same name as the package; that target should be used to reference the package inside the Bazel build.
  • A pkg_npm target named npm_module is always created for each package, as a point of interaction between bazel-bin and the legacy build system, which relies only on the workspace state itself.
  • The repo was migrated to rely only on the dependencies listed in the top-level package.json, as Bazel doesn't support yarn workspaces.
  • @kbn/pm clean was reworked. A clean now only deletes the packages and plugins target folders plus bazel-bin and other soft Bazel caches, running bazel clean under the hood. A new command called destroy was introduced, which does everything clean does and also deletes node_modules and runs a more in-depth bazel clean using the --expunge flag. yarn kbn destroy should be used every time you switch from a non-Bazel branch to a Bazel branch during development.
  • @kbn/pm was also changed to support running npm scripts in packages and plugins after the migration to a single package.json, by changing linkProjectExecutables to always link the root node_modules/.bin into every @kbn/pm project.
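The last point, linking the root node_modules/.bin into every project, could be sketched roughly as below. This is only an illustration and not the actual @kbn/pm code: the Project shape, the planBinLinks name and the exact link location inside each project are assumptions, and the function only computes which links would be created instead of touching the filesystem.

```typescript
// Hypothetical shape of a @kbn/pm project; only the directory matters here.
interface Project {
  name: string;
  dir: string; // path relative to the repo root, "." for the root project
}

interface BinLink {
  from: string; // assumed symlink location inside the project
  to: string; // the root node_modules/.bin it points at
}

// Plan one link per non-root project so that npm scripts run inside a
// project can resolve binaries that are installed only at the root.
function planBinLinks(projects: Project[]): BinLink[] {
  const rootBin = "node_modules/.bin";
  return projects
    .filter((project) => project.dir !== ".") // the root already owns rootBin
    .map((project) => ({
      from: `${project.dir}/node_modules/.bin`,
      to: rootBin,
    }));
}

const plannedLinks = planBinLinks([
  { name: "kibana", dir: "." },
  { name: "@kbn/pm", dir: "packages/kbn-pm" },
]);
console.log(plannedLinks);
```

Keeping the planning step separate from the filesystem work makes the behaviour easy to check before any symlink is created.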

Pending details

The following items describe topics that still need more thought, or at least agreement on the current solution, before they can be moved into next steps.

Redesign pre-commit hook

The pre-commit hook as it is now presents a problem for yarn kbn bootstrap in a Bazel context. Previously we ran the pre-commit hook installation process every time the bootstrap ran. Once Bazel is in place, the bootstrap command will be responsible for running the packages build with Bazel (clearly knowing and tracking the dependency graph), copying the bazel-bin outputs back into the workspace, and managing the node_modules dependencies across the workspace. If it also has to install the pre-commit hook, we always add at least 4 seconds to the bootstrap process, where the added time could have been 0.

Possible solutions:

1 - Follow the elasticsearch approach and give developers the option to use the pre-commit hook or not. In our case that is achieved simply by no longer running the pre-commit hook installation during the bootstrap, as the hook can already be registered manually with node scripts/register_git_hook.

2 - I'm not 100% certain we can do this, but it seems likely that we can write a repository rule to be called in the WORKSPACE file during Bazel's loading phase, just like yarn_install. If that is achievable, it will run the first time the workspace installs and then every time any of the files passed as data change.

Packages

Currently, for the packages build with Bazel, I've set up Bazel rules to mimic the production targets plus sourcemaps and types (if applicable).

Problem 1:

Will we want to keep supporting both development and production builds? If so, the options are listed below (please be aware that if we want to have 2 different targets for packages, both should be built on CI in order to update the remote cache, so that both the CI and the developers can benefit from successful cache hits):

Problem 1 possible solutions:

1 - Keep the approach described above, which generates only one build type (the production one) plus sourcemaps and types where needed.

2 - Develop two sets of rules (some intermediate targets could be shared, I think) and expose two top-level targets, //packages:build and //packages:build_dev.

3 - Use the bazel build flag --compilation_mode, which would inject an env var process.env['COMPILATION_MODE'] and allow the use of select() in Bazel rules. That is not an option yet, as the Bazel rules for node_js still need to be changed a little to support that behaviour. I'm just listing it here for future reference.

Problem 2:

Right now, a local kbn package that has a build step and also needs to be published to the external npm repository would cause problems. During the build or the workspace setup, Bazel generates BUILD files for the node_modules. We are linking kbn packages in the package.json in order to keep the legacy build system working, and as such BUILD files will be generated for local kbn packages (acting as node_modules) based on their current contents. Where we only reference those kbn packages inside the Bazel build, we can solve the problem by using the packages' js_library targets as the dependency point for other packages inside the Bazel build. However, once a package gets published it can be used by other node_modules that depend on it and that we also install. That causes a problem, because the BUILD file contents those packages read at install time will not be up to date with the content generated later, during the Bazel packages build step.

Problem 2 possible solutions:

1 - We can re-implement the way the Bazel rules for node_js generate the BUILD files for node_modules. The problem can be solved if we create a symlink to the package source BUILD.bazel in the npm workspace instead of generating a file. I've validated that potential solution with the Bazel team and it seems promising and doable. If we agree that this is a future requirement for us, we can move it into next steps and go ahead and implement that feature for Bazel.

Remote cache

The remote cache is one of the biggest benefits we can get once we start using Bazel to build our project. Currently in my POC I've set up a Google Cloud Storage bucket and integrated it. My idea is that our CI will build caches at every merge on tracked branches, so that other CI builds and the development team can all benefit from them. These are my findings so far:

Findings:

  • Every time the local build gets remote cache hits, the ongoing build is pretty fast (~40% faster in a cold yarn kbn bootstrap without any node_modules). It is worth mentioning that the local Bazel instance also caches the remote cache entries locally, so even the download time is not noticeable in future builds.
  • Changing between branches with Bazel doesn't need any kind of cleaning, as Bazel takes the needed actions according to the inputs that have changed and checks against the cache accordingly.
  • Changing between non-Bazel branches and Bazel branches (or the other way around) requires a yarn kbn destroy, which cleans local caches, the Bazel local cache, plugins and packages targets and also the node_modules tree.

Problem 1:

The cache mechanism generates a different action output tree for different platforms or rule sets (for example, if we are using dev and production builds depending on the situation).

Problem 2:

Node native modules generate non-hermetic outputs when they generate or compile their .node files at install time on each machine. As a result, rules depending on those node_modules would never be cached correctly, because their hash is not reproducible.

Problem 1 possible solutions:

1 - We should generate caches on CI for every platform where we want to use them (Windows, Linux and macOS) and then, inside each platform, for every different rule set we have, if any (dev and prod, for example). I think a very good idea is to use CircleCI to generate those caches for us while we keep going with the other CI tasks.
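As a rough sketch of what that implies for CI, the cache-warming jobs form a platform by rule-set matrix. The snippet below is illustrative only: the CacheJob shape and cacheJobs name are hypothetical, and the mapping of the dev rule set to //packages:build_dev assumes the two-target option from Packages Problem 1 were adopted.

```typescript
type Platform = "windows" | "linux" | "macos";
type RuleSet = "dev" | "prod";

// Hypothetical description of one cache-warming CI job.
interface CacheJob {
  platform: Platform;
  ruleSet: RuleSet;
  target: string;
}

// One job per (platform, rule set) pair, because each pair produces its own
// action output tree and therefore its own set of cache entries.
function cacheJobs(platforms: Platform[], ruleSets: RuleSet[]): CacheJob[] {
  const jobs: CacheJob[] = [];
  for (const platform of platforms) {
    for (const ruleSet of ruleSets) {
      jobs.push({
        platform,
        ruleSet,
        // Assumed target names, taken from the two-target proposal.
        target: ruleSet === "dev" ? "//packages:build_dev" : "//packages:build",
      });
    }
  }
  return jobs;
}

// With a single (prod) rule set this is one job per platform.
console.log(cacheJobs(["windows", "linux", "macos"], ["prod"]));
```

The number of jobs grows multiplicatively, which is one more argument for keeping a single rule set to begin with.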

Problem 2 possible solutions:

1 - The most problematic native module so far was fsevents 1.x. The solution in that case was to force a resolution to chokidar 3.x and fsevents 2.x, because the newer version compiles and ships the .node files at npm publish, so the file is the same for every installation. If we find other problematic native node modules in the future, we might need to fork them, build them for every platform we need, and then publish the .node files along with any other files at npm publish.
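To illustrate why a .node file compiled per machine defeats the cache (Problem 2), here is a deliberately simplified model of a content-derived cache key. It is not how Bazel computes its keys: the toy FNV-1a hash stands in for a real cryptographic digest, and actionKey and the file names are hypothetical.

```typescript
// Toy 32-bit FNV-1a hash; a stand-in for a real cryptographic digest.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Simplified cache key: the sorted list of "path=digest" pairs of all inputs.
// Any change in the bytes of any input changes the key.
function actionKey(inputs: Record<string, string>): string {
  return Object.keys(inputs)
    .sort()
    .map((path) => `${path}=${fnv1a(inputs[path] ?? "")}`)
    .join(";");
}

const source = { "src/index.js": "module.exports = 1;" };

// Prebuilt binary shipped in the npm tarball: identical bytes everywhere.
const ciPrebuilt = actionKey({ ...source, "native.node": "prebuilt bytes" });
const devPrebuilt = actionKey({ ...source, "native.node": "prebuilt bytes" });

// Binary compiled at install time: bytes differ from machine to machine.
const ciCompiled = actionKey({ ...source, "native.node": "compiled on machine A" });
const devCompiled = actionKey({ ...source, "native.node": "compiled on machine B" });

console.log(ciPrebuilt === devPrebuilt); // keys match, so the remote cache can hit
console.log(ciCompiled === devCompiled); // keys differ, so the remote cache misses
```

Shipping the prebuilt .node file makes the input bytes identical on every machine, which is what allows the key computed on CI to match the one computed locally.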

Bazel Kibana Integration

Currently I envision @kbn/pm encapsulating some logic to install and manage the Bazel version (according to the dot config files .bazelversion and .bazeliskversion) and also running bazel build //packages:build as part of yarn kbn bootstrap, bazel clean as part of yarn kbn clean, and bazel clean --expunge as part of yarn kbn destroy. I believe that is the minimal required integration.
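That minimal integration amounts to a small lookup from kbn commands to Bazel invocations. The sketch below only builds the argument lists described above; the KbnCommand type and bazelArgsFor name are made up for illustration, and actually spawning bazel is left out.

```typescript
type KbnCommand = "bootstrap" | "clean" | "destroy";

// Bazel arguments that each yarn kbn command would run under the hood.
const BAZEL_ARGS: Record<KbnCommand, string[]> = {
  bootstrap: ["build", "//packages:build"],
  clean: ["clean"],
  destroy: ["clean", "--expunge"],
};

// Return a copy so callers cannot mutate the shared table.
function bazelArgsFor(command: KbnCommand): string[] {
  return [...BAZEL_ARGS[command]];
}

console.log(["bazel", ...bazelArgsFor("destroy")].join(" "));
```

A table like this is also a natural place to decide which Bazel commands a future top-level CLI would expose.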

Problem 1:

Starting from that minimal required integration, how do we want to expose Bazel in the future?

Problem 1 possible solutions:

1 - Keep integrating underlying bazel commands on kbn/pm or another new cli

2 - Give full control of bazel to devs because they will be able to just call bazel in the shell as there is a node_modules/.bin/bazel?

Webpack and Babel under Bazel

Currently webpack and babel don't offer a proper integration under Bazel with worker mode to persist some state and speed up incremental builds. While I don't think that is a big problem for babel, because we only use it in a couple of packages, for webpack the story could be quite different.

Problem 1:

Webpack doesn't have a worker implementation under bazel yet

Problem 1 possible solutions:

1 - Implementing that feature on bazel (it would take some time but is definitely possible I think)

2 - Try to build plugins with rollup instead which already has worker mode integration under bazel?

Next steps

Single package.json

I think this is the only next step whose design we can start thinking about right now in order to go ahead and make it a reality. I believe we would want to have it in the future, with or without Bazel. The following is a copy of what can be found in #76412

Design

The proposal is to use a single top-level package.json (in the kibana folder) to manage all the dependencies we need in the workspace. Sub-directory package.json files will remain for now, in order to keep defining and informing boundaries around plugins and packages from the core, and also to allow running specific npm scripts. In the future those sub-directory package.json files could perhaps be removed and replaced with BUILD.bazel files, which can be used both to inform boundaries and to define nodejs scripts to be run with Bazel instead of npm.

Benefits

  • Ability to easily and quickly have an overview about all the dependencies we are using across the project
  • Avoids installs of multiple versions of the same dependency
  • Helps avoid installing unwanted or unneeded dependencies
  • Makes security patches easier
  • Makes dependency upgrades easier
  • Possibly enforces a continuous work across teams to keep dependencies up to date and their code working with the latest versions
  • Removes the lock-in to a specific package manager
  • Possibly makes node_modules installation faster as it simplifies the flattening algorithm
  • Improves the ability to cache node_modules on the CI independently of the package manager being used

Cons

  • Bigger root package.json
  • Makes it difficult to test a new package version in isolation
  • Could potentially introduce extra work on solution teams to work with the supported versions of a given package across the workspace

Steps

  • Condense every workspace dependency and devDependency under the root package.json
  • Remove dependencies and devDependencies declarations from sub directories package.json
  • Remove yarn workspaces support from @kbn/pm
  • Delete .yarnrc
  • Remove workspaces key from package.json
  • Local packages should be declared with link: instead of version in the root package.json
  • Rework link_project_executables.ts on @kbn/pm to link the root ./node_modules/.bin for every @kbn/pm project so we can run npm scripts on them
  • Implement build step to understand what are the oss dependencies so we can delete them from the OSS distributable build
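The first steps (condensing dependencies into the root and declaring local packages with link:) could be prototyped along these lines. This is a sketch under stated assumptions, not the migration script that was used: condenseDependencies and its types are hypothetical, dependencies and devDependencies are merged into one map for brevity, and conflicting versions are only reported, not resolved.

```typescript
// Minimal view of a sub-directory package.json.
interface PackageJson {
  name: string;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

interface CondenseResult {
  dependencies: Map<string, string>; // what the root package.json would list
  conflicts: string[]; // names requested with more than one version
}

// localPackages maps a local package name to its directory in the repo.
function condenseDependencies(
  subPackages: PackageJson[],
  localPackages: Map<string, string>
): CondenseResult {
  const dependencies = new Map<string, string>();
  const conflicts: string[] = [];
  for (const pkg of subPackages) {
    const declared = { ...pkg.dependencies, ...pkg.devDependencies };
    for (const dep of Object.keys(declared)) {
      const version = declared[dep];
      if (version === undefined) continue;
      // Local packages are declared with link: instead of a version.
      const localDir = localPackages.get(dep);
      const wanted = localDir !== undefined ? `link:${localDir}` : version;
      const existing = dependencies.get(dep);
      if (existing === undefined) {
        dependencies.set(dep, wanted);
      } else if (existing !== wanted && conflicts.indexOf(dep) === -1) {
        conflicts.push(dep); // needs a manual decision on a single version
      }
    }
  }
  return { dependencies, conflicts: conflicts.sort() };
}

const result = condenseDependencies(
  [
    { name: "plugin-a", dependencies: { lodash: "^4.17.0", "@kbn/pm": "1.0.0" } },
    { name: "plugin-b", devDependencies: { lodash: "^3.10.0" } },
  ],
  new Map([["@kbn/pm", "packages/kbn-pm"]])
);
console.log(result.dependencies, result.conflicts);
```

Reporting conflicts separately reflects one of the cons listed above: teams have to agree on a single supported version of each dependency.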

@mistic
Member Author

mistic commented Sep 1, 2020

I'll leave my feedback and favourite solutions for each problem mentioned above.

Redesign pre-commit hook

Solution 1 to begin with, and re-evaluate Solution 2 later

Packages

Problem 1 -> Solution 1, as I don't think changes to those are very frequent. We can implement solution 2 later if we need it.
Problem 2 -> For now we won't have problems because we don't have any local and published package with a build step, but I think we should go ahead and implement the solution I proposed in the Bazel repo as soon as we can to avoid future problems.

Remote Cache

Problem 1 -> I believe we should use CircleCI to build the caches. To begin with, and in line with my favourite solution to Packages problem 1, we can just build caches by running the build on Windows, macOS and Linux.
Problem 2 -> For our current node_modules I think we will be fine with the resolutions, and for future problems I like the idea of forking and publishing .node files compiled at npm publish time where we need it. It will also work as a prompt to rethink whether we need to introduce a new native module or not.

Bazel Kibana Integration

Problem 1 -> I will advocate for solution 2: a minimal required integration with @kbn/pm and then just use the bazel CLI. That said, I also like the idea of creating a top-level CLI that calls the bazel CLI under the hood. It would give us more control over what we make easily available to everyone.

Webpack and Babel under Bazel

Problem 1 -> I think Solution 2 is only worth it in case we can do it without problems. However due to our advanced bundling usage I don't think it would be easy enough or even possible. I think we should go for Solution 1.

Single package.json

Let's do it!

@stacey-gammon stacey-gammon added the ReleaseStatus label Sep 17, 2020
@mistic mistic changed the title from "New build toolchain POC" to "New build toolchain" Oct 1, 2020
@mshustov
Contributor

@elastic/kibana-operations at what stage are you going to add support for running functional tests from a plugin Bazel package? #92758 (comment)

@tylersmalley tylersmalley added 1 and removed 1 labels Oct 11, 2021
@exalate-issue-sync exalate-issue-sync bot added impact:low, loe:small labels Oct 12, 2021
@alexh97 alexh97 added the epic label Dec 21, 2021
@tylersmalley tylersmalley removed loe:small, impact:low, EnableJiraSync labels Mar 16, 2022
@tylersmalley
Contributor

Closing as this issue is no longer providing value for tracking the project.


7 participants