-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build inference libs as independent opaque plugins and remove deduplication of their deps #177
Comments
poc project here: https://github.com/alpaca-core/poc-plugins |
Some thoughts, decisions and implications: Breaking up the monorepoNow, we could keep the inference libs in the monorepo, but they will still have to be configured separately. Almost all benefits of the monorepo are gone in this case and the negative aspects remain:
And... the same can be said about the language wrappers. Having them here was decided only because of monorepo purity. So, I decided to ditch the monorepo and move inference libs and language wrappers to separate ones. (Alas, as anticipated, we went full circle with the wrappers and #113 is no more. We're keeping CMake, though) API-level clashes and ODR violationsHaving inference libs as plugins removes clashes and the risk of ODR violations between themselves, but not with ac-local itself. They are expected to link with ac-local's C++ interface. So, if any inference libs does use nlohmann::json, it will (subtly and nastily) clash with Dict. That's true on two levels:
So, we are to wrap But that's not all. We also expose jalog. We do so only implicitly by relying on a common shared library using the same global storage. There is no risk of ODR violations or crashes here, but if not handled, inference libs won't be able to attach to the logger. To solve this, we should wrap jalog too, but it's too much work at this point, so we'll add it to an issue for the future. For now we will optionally allow exposing jalog publicly so that inference libs may use it if they want. Currently we don't expose anything other third party library implicitly or explicitly (well, besides the standard library, but let's just not go there, please. If we need to go there, we should just ditch C++ for the language api altogether) But! Now exposing third party libraries via the language api should be guarded and each and every possible entry should be closely evaluated and a mitigation should be decided for it. |
Moving inference libs to separate repos is tracked here. As #184, this also includes deleting issue labels and transferring issues. |
Exposing jalog as discussed previously is not an option. It has dependencies of its own which can't be managed safely. Delving deeper into this. The underlying problem is package/dependency consolidation. Unfortunately C++ has no standard package manager to speak of. Many projects use vcpkg, conan, and CPM, many use ad-hoc solutions like submodules or direct splicing. Even though a proper package manager would be able to consolidate the dependencies properly, we can't impose it on third party code. If we were big and strong, we possibly could: If you want to be an Moreover the examples in #166 are one of the worst offenders in dependency management. Saying "If you want to be an Our voice is not strong enough to influence bigger projects. Imposing a package manager is not an option. The only way to consolidate dependencies at this point is to have no dependencies. For this reason, we're merging jalog and parts of itlib in our codebase, just like we did with Sad |
Renaming the targets sounds very appealing:
However it also increases the load for new inference plugins. Clashing targets need to be renamed. This goes potentially deep into the dep trees. This does turn into an endless support features. Granted this should be a rare occurence. GGML is the only known bad violater of package purity. The above on its own would have been an acceptable sacrifice had it not been for the following problem: Options will also clash. GGML has a lot of options. Some of them might be set something in a fork and other to another thing. Even if the target is renamed, the options will remain the same, so supporting the This is a risk we're not willing to take. Now, this doesn't mean that using (As usual, this is subject to change) |
Another variant which was considered (a brutal hack): write a script which renames all symbols (cmake, c, and c++) in a project, to produce another, renamed project. Thus no duplications would exist and everything can be added in a single build with But:
|
Yet another variant which was considered (a not-so-brutal hack): rename the symbols at object level. If successful, this would allow static library builds, and, combined with renaming CMake targets (and all of its cons) would also allow This is only a viable option for C codebases. While renaming C symbols is relatively easy, C++ mangling is super hard. Renaming C++ symbols seems close to impossible (or at least worthy of a super long project on its own). So you can only rename C symbols, which basically means a whitelist. Composing such a whitelist is almost as hard as the previous variant (not JUST as hard, because macros are exempt in this case). But that's not all, now this doesn't take care of clashing C++ symbols, so they will have to be taken care of somehow... It's a mess. Moreover even if we did have a C codebase, this will completely break the debugger. So if you want to debug something, you will have to do it in isolation, without renames. And even more... er... overer this will absolutely break the build system. You either rename in place, in which case renamed output doesn't match the build output and would seem out of sync, or rename to another file, in which case you can't use target dependency management, and will have to resort to CMake 2.6 techniques, like it's 2005. |
Dropping CMake was also considered. Semi-manually building with make (or rake, or nix, or many other options) will make the build and integration process more straight-forward. But first it doesn't fix the clashes. Plugins will still have to be used (and the main cons of plugins are deployment on restricted systems like mobile and browser). But mainly this will make supporting multiple platforms super hard. CMake may be lacking in some regards, but it's the de-facto build system standard for C++. Practically every project uses it. Practically every IDE or external build system (like gradle) has CMake support. Dropping CMake will make us invest a considerable amount of toolchain-specific effort, which we now get for free. Dropping CMake will mean that we have to recreate the build of practically every third party lib we use, which we now get for free, by just using their CMakeLists. |
This is done, further plugin development to be tracked separately |
Basically follow #166 : Don't use third party code
The text was updated successfully, but these errors were encountered: