@arohner
Contributor

@arohner arohner commented Aug 7, 2025

Significantly improve compile time, CPU usage and memory usage. Significant rework of internals.

The previously observed memory "leak" was not a leak. To avoid race conditions and to get parallelism, we were creating a classloader per thread and compiling in every thread. This wasn't leaking memory, just using more memory than a dev laptop has available. Wall clock time was also poor because reloading the same classes once per core (8-10x) is inefficient.

We instead use a single classloader shared by all threads. We avoid the previous need to discard a classloader (causing more wasted work!) by instead reloading downstream dependents, the same way a human would at a REPL. Parallelism is handled via agents. We create an agent for each namespace in the project and send commands to it. Before compiling a namespace, we send require to all of its dependencies. If the namespace contains deftype, definterface or defprotocol, we send reload to every namespace downstream of it, to avoid AOT hell.
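A minimal sketch of the agent-per-namespace idea described above, for readers unfamiliar with agents. The names `ns-agents`, `ensure-agent` and `send-command` are illustrative assumptions, not the PR's actual code, and the "commands" here are plain keywords recorded in the agent state rather than real require/compile calls.

```clojure
;; Hypothetical registry: namespace symbol -> agent.
(defonce ns-agents (atom {}))

(defn ensure-agent
  "Returns the agent for ns-sym, creating it on first use."
  [ns-sym]
  (get (swap! ns-agents
              (fn [m]
                (if (contains? m ns-sym)
                  m
                  (assoc m ns-sym (agent [])))))
       ns-sym))

(defn send-command
  "Queues a command for ns-sym. Actions sent to one agent from a single
  thread run one at a time, in the order sent; actions on different
  agents may run in parallel."
  [ns-sym command]
  (send-off (ensure-agent ns-sym) conj command))

;; Queue work for two namespaces, then block until both agents are idle.
(send-command 'app.core :require)
(send-command 'app.core :compile)
(send-command 'app.util :compile)
(await (ensure-agent 'app.core) (ensure-agent 'app.util))
```

After `await` returns, `@(ensure-agent 'app.core)` holds `[:require :compile]`. A script that uses agents should call `(shutdown-agents)` before exiting, otherwise the agent thread pool can delay JVM shutdown.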

@arohner arohner force-pushed the speed branch 3 times, most recently from 7e9d37d to 3b4d55c Compare August 12, 2025 16:56
@arohner
Contributor Author

arohner commented Aug 12, 2025

Oh, for posterity: this has known issues with potemkin, manifold and byte-streams. Those libraries use the funky definterface+ et al., which contain

(defmacro definterface+ [name & body]
  (when-not (resolve name)
    `(definterface ~name ~@body)))

If the namespace is already present on the classpath, calling compile on this code will not output .class files to be placed in the jar. The problem is exacerbated by bazel sometimes compiling jars twice, once for compile-time deps, and once for runtime. The two solutions are: use the griffinbank/* forks of these libraries, or in the future, use bazel's path-mapping feature to avoid the double compile.
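The effect of the `when-not (resolve name)` guard can be seen at a REPL. The sketch below copies the macro from the comment above and uses a hypothetical interface name, `IExample`. It only shows the macro expansion, under the assumption that an empty expansion is what leaves `compile` with nothing to emit; it does not reproduce the bazel double-compile.

```clojure
;; The macro as quoted in the comment above.
(defmacro definterface+ [name & body]
  (when-not (resolve name)
    `(definterface ~name ~@body)))

;; IExample is a hypothetical name used only for illustration.
;; First expansion: the name does not resolve yet, so the macro
;; expands to a real definterface form.
(def first-expansion
  (macroexpand-1 '(definterface+ IExample (size []))))

;; Evaluate the definition so that IExample now resolves in this ns.
(definterface+ IExample (size []))

;; Second expansion: the name resolves, so the macro expands to nil
;; and there is no definterface form left to generate a class from.
(def second-expansion
  (macroexpand-1 '(definterface+ IExample (size []))))
```

Here `first-expansion` is a `clojure.core/definterface` form and `second-expansion` is `nil`.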

@miridius miridius mentioned this pull request Aug 13, 2025
@miridius miridius requested a review from a team August 13, 2025 06:45

@miridius miridius left a comment

first pass complete, 4 files left to review, will finish them later today

Comment on lines 125 to 128
(do
  (doseq [p classpath]
    (add-url classloader p))
  classloader)

Is there a risk that this will cause some things to successfully compile even though they depend on something no longer in the classpath?

@miridius

first pass complete, 4 files left to review, will finish them later today

oh yeah I finished this yesterday but didn't have any more comments to add :)

@arohner arohner force-pushed the speed branch 9 times, most recently from f25ed9f to 74574f8 Compare August 18, 2025 20:39

@miridius miridius left a comment

looking much cleaner now! very nice

;; files. If we reload after compiling, that breaks some OSS code
;; doing janky things. Therefore, cache every compile request, and if
;; bazel asks for the same ns again, just copy the files. We
;; fingerprint namespaces using the SHA of the file contents

👍
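The caching idea in the code comment above could look roughly like this. It is a sketch only: `compile-cache`, `sha256-hex` and `compile-once` are hypothetical names, SHA-256 is an assumed choice (the comment only says "SHA"), and the step that copies cached class files into the destination directory is left out.

```clojure
(import '(java.security MessageDigest))

(defn sha256-hex
  "Hex-encoded SHA-256 of a string's UTF-8 bytes."
  [^String s]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes s "UTF-8"))]
    (apply str (map #(format "%02x" %) digest))))

;; Hypothetical cache: [ns-sym content-fingerprint] -> compile result.
(defonce compile-cache (atom {}))

(defn compile-once
  "Calls (compile-fn ns-sym) the first time a given namespace/contents
  pair is seen and returns the cached result on later requests.
  Assumes compile-fn returns a non-nil value. Not safe against two
  threads racing on the same key; a real implementation would need to
  serialize requests per namespace."
  [ns-sym contents compile-fn]
  (let [k [ns-sym (sha256-hex contents)]]
    (if-let [hit (get @compile-cache k)]
      hit
      (let [result (compile-fn ns-sym)]
        (swap! compile-cache assoc k result)
        result))))
```

Keying on the contents rather than the file path means an edited file gets a new fingerprint and is compiled again, while a repeated request for unchanged contents is served from the cache.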


(defn agent-compile [all-ns-decls dest-dir ns]
((ensure-agent ns)
(fn [_]

@miridius miridius Aug 18, 2025

since we're never using the agent state, we're using it to do 3 things:

  1. run thunks one at a time per ns
  2. run thunks in different nses in parallel
  3. wait for those to be done

Could we:

  1. create a lock per ns
  2. require and compile would be (future (locking ns-lock (do-stuff)))
  3. deref the returned futures

^ lock + future works if we don't care about ordering of the submitted requests, which maybe we don't (because of sha?). But if we do, I think a better idea (also mentioned in Slack) is to create one (Executors/newSingleThreadExecutor) per ns and then (.submit ns-exec ^Runnable (bound-fn [] ...)). That will return a future that you can deref to await.
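A rough sketch of the executor-per-namespace suggestion, to make the ordering guarantee concrete. `ns-executors`, `ensure-executor` and `submit-to-ns` are hypothetical names. One deliberate difference from the comment: the task is hinted as `Callable` rather than `Runnable`, so that deref returns the thunk's value instead of nil.

```clojure
(import '(java.util.concurrent ExecutorService Executors))

;; Hypothetical registry: namespace symbol -> single-threaded executor.
(defonce ns-executors (atom {}))

(defn ensure-executor
  "Returns the executor for ns-sym, creating it on first use."
  [ns-sym]
  (locking ns-executors
    (or (get @ns-executors ns-sym)
        (let [exec (Executors/newSingleThreadExecutor)]
          (swap! ns-executors assoc ns-sym exec)
          exec))))

(defn submit-to-ns
  "Runs thunk on ns-sym's executor and returns a java.util.concurrent.Future.
  Tasks for one namespace run in submission order; different namespaces
  run on different threads. bound-fn* conveys the caller's dynamic bindings."
  [ns-sym thunk]
  (let [^ExecutorService exec (ensure-executor ns-sym)
        ^Callable task (bound-fn* thunk)]
    (.submit exec task)))

;; Usage: submit two ordered steps for one namespace and wait for both.
(def log (atom []))
(def futures
  (doall (for [step [:require :compile]]
           (submit-to-ns 'app.core #(do (swap! log conj step) step)))))
(def results (mapv deref futures))

;; Executor threads are non-daemon, so shut them down when finished.
(doseq [^ExecutorService e (vals @ns-executors)]
  (.shutdown e))
```

Both `results` and `@log` end up as `[:require :compile]`, because a single-threaded executor runs its queue in FIFO order.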

@arohner arohner force-pushed the speed branch 10 times, most recently from 4264172 to 65c1132 Compare August 21, 2025 21:46
@arohner arohner force-pushed the speed branch 3 times, most recently from c45708e to d0eb1ea Compare August 29, 2025 14:56
(assert (loaded? ns) (print-str ns "should have already been compiled"))
(copy-classes (fs/->path cache-dir) (fs/->path dest-dir))))

(def no-aot '#{;; specs are too big to AOT
Contributor Author

This moved from banksy's deps.edn because any require on the source tree can trigger it. The proper fix is to move it into a bazel toolchain, but that's a lot more work.

@arohner arohner force-pushed the speed branch 7 times, most recently from fce9c70 to 7c5bc2f Compare August 30, 2025 00:50
@arohner arohner force-pushed the speed branch 6 times, most recently from 09dcc95 to 8458beb Compare October 9, 2025 19:45
@arohner arohner force-pushed the speed branch 5 times, most recently from eb58e0b to 5eee1f5 Compare November 13, 2025 16:44
Vendor tools.reader and java.classpath dependencies
@arohner arohner merged commit 8efa263 into main Nov 13, 2025
1 check was pending
@arohner arohner mentioned this pull request Dec 2, 2025
arohner added a commit that referenced this pull request Dec 2, 2025
- When determining whether two classloaders are compatible?, also check for java class names, not just clj namespaces. This identifies JarHell as a major source of compile errors and non-determinism
- gen_build: Make clojure_library targets depend on clojure.core.specs.alpha rather than clojure.core. This fixes a bug where clojure.core.specs.alpha gets compiled into user jars, causing false positives in compatibility checks.
- minor comments and code golfing