Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-load patching capability #279

Open
gotmachine opened this issue Oct 24, 2024 · 3 comments
Open

Pre-load patching capability #279

gotmachine opened this issue Oct 24, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@gotmachine
Copy link
Contributor

gotmachine commented Oct 24, 2024

I've been giving some thought about implementing some form of pre-load patching ability for KSPCF.

Main motivations :

  • Lifting runtime-patching limitations : unreliable generics support, performance overhead, generated member name conflicts when using a publicizer. We already have multiple cases of stuff we can't fix because patching generic methods with harmony is ranging from unreliable to impossible.
  • Avoiding being stuck with stock implementation details. While we should be careful about changing (private/internal) implementations, there are many cases were this would be relatively safe to do and would help a lot. And just having the ability to add new members to existing stock types would be a game-changer, especially on the performance optimization front.
  • Being able to patch stuff earlier. This would notably allow us to optimize the initial config parsing, which is awfully slow, and the main remaining bottleneck in the stock loader.

The base requirement would be to use an injector, giving us the ability to run code before any dll in KSP_x64_Data\Managed is loaded.
There are multiple off the shelf options for this, the most stable and well maintained one seems to be UnityDoorstop .

Another option would be BepInEx, but the scope of BepInEx goes well beyond what we need. It is built upon UnityDoorstop, but also has a full fledged mod loader (that we have no real use for), is packaging a custom version of Harmony (which is always somewhat lagging behind the original and has a few compatibility quirks), and comes with a lot of extra bloat that we have no use for.

Moving KSPCF to BepInEx would be a somewhat breaking change for the ecosystem, as we would have to retire the current HarmonyKSP plugin and swap it for BepInEx, which would become the new dependency for mods using Harmony. This being said, this would allow other plugins to have pre-load patching capabilities as well, but in the end, pre-load patching is a very non-cooperative technique and it would be very likely other plugins modifications it would end up conflicting with KSPCF own modifications, plus there are probably not that many real use cases for it, other than in a few "foundational" mods like Kopernicus or ModuleManager, and for such cases, it would likely be worthwhile to put the changes in KSPCF anyway.

One issue is that UnityDoorstop is relying on a platform specific native dll, so we would have to redistribute a separate version for every platform (windows, linux, macos), and this is especially problematic in the context of CKAN, which AFAIK doesn't provide any way to selectively install stuff based on the current platform.

This injection related stuff put aside, there are other considerations as to how exactly we would implement Assembly-CSharp patching.

The usual way (like it exist in BepInEx) is to use either Mono.Cecil to edit the assembly at the IL level, which is extremely cumbersome, or to use MonoMod.Patcher, which is a helper/wrapper built on top of Mono.Cecil and that makes the whole process a bit more usable and practical, a bit like Harmony for runtime patching.

There are a few caveats to the MonoMod.Patcher approach :

  • As the modified code is living in a separate assembly that is never loaded and just exist as a IL source for MonoMod/Cecil, it isn't debuggable from the IDE. The only option is to debug it through DnSpyEx.
  • Plugins (including KSPCF itself) wouldn't be able to statically use new members / APIs that we define in the stock classes, which is really annoying.
  • Looking around some example usages, things can become quite messy, for example when needing to reference a patch-added member from another patch. Since the pattern involve defining new/overridden members in a derived class, publicizing the patched class members is required, and there are real limitations with that as well, like name conflicts with compiler-generated members.
  • Documentation is very barebones, not to say non-existent.

Which is why I'm tempted to look into a more radical approach : doing all our changes in a (stripped from all stock code) decompiled assembly-csharp source. The idea would be to have every stock method body replaced with throw new NotImplementedException(), and every type made partial. This would allow us to make our changes in separate source files, although replaced methods would likely have to be deleted from the original source (full support for partial methods is implemented in C# 9 / .NET 5, but it's doubtful we can use such a target).

Added and modified members would have to be annotated with an attribute. Then, on pre-loading, we could inject the changes identified by those attributes into the original Assembly-CSharp (or maybe the other way around, inject the original Assembly-CSharp into our own assembly), then load the modified version. In theory, it should be possible to merge the debug symbols as well. Options for doing all this are either Mono.Cecil or the newer and much better documented AsmResolver

Doing this would obviously require quite a lot of low level infrastructure work, but there are several benefits :

  • The codebase and changes would be clearly separated, the overall project would be pretty clean.
  • Assuming merging the debug symbols can actually be done, the source could be debuggable without having to rely on DnSpyEx.
  • The stripped/modified assembly-csharp could be used as a reference by other plugins, allowing them to use the API we add to it
  • We could also start adding xml comments (and merging the existing ones from various sources) to the our stripped source and redistribute them alongside the assembly, which would be helpful for the community at large.
  • Looking at the future, this feel like a necessary step to make possible more ambitious KSP codebase improvement and modernization projects.
@gotmachine gotmachine added the enhancement New feature or request label Oct 24, 2024
@gotmachine
Copy link
Contributor Author

gotmachine commented Oct 26, 2024

So, did some prototyping, this was actually pretty fun.

Writing here what I did, as I'm not sure if and when I will push this further.

First step was the usual de4dot --dont-rename, then to put together a functional, compilable but stripped from all code C# project for Assembly-CSharp.

This was actually more difficult than anticipated. There are plenty of off the shelf assembly strippers (and often publiciziers), but they are meant to produce a reference assembly that can be used in another project, they don't particularly care about producing a valid assembly that is somewhat close to the original source once decompiled. They also tend to mess around with member visibility and attributes.

I ended up putting together a custom command line utility for that, using Mono.Cecil, which :

  • Replace method bodies with a throw null.
  • Remove all compiler generated types and members (state machines, enumerators, etc), at the notable exception of event methods and auto-properties backing fields, as this is necessary so they can be decompiled back by ILSpy.
  • As a bonus, "simple" properties (getters / setters mapping to a field directly) are preserved too (instead of being replaced by a throw null).
  • Preserve base constructor calls. This is the part that gave me the hardest time, as this involve selectively preserving the matching part of the constructor IL, and is necessary as a derived type constructor won't compile is there is no default constructor in the base type. There are still many holes in what I did, but that solved the vast majority of the issues, the rest I just fixed manually in the decompiled source.

Once I got my somewhat clean stripped DLL, I decompiled it back to C# with ILSpy, with the C# 7.3 profile. A bit of compiler error fixing latter, I had my re-compilable-but-stripped Assembly-CSharp source project. There are probably a few things that didn't round-trip perfectly identically like base constructor calls, and I suspect some event declarations might be messed up too, but that wasn't strictly necessary anyway.

Now the fun part could begin.

The pre-loader and pre-patcher prototype is based on UnityDoorstop and AsmResolver respectively. Getting AsmResolver to run wasn't trivial, as it depends on various nuget packages, but more problematic, is targeting netstandard 2.0. But a nice thing about preloading is that it gives total control over which assemblies are loaded, so I took up the task of adding all the missing stuff in the KSP BCL distribution.

The easiest way to find out what was need was to make a 2019.4 unity project, import AsmResolver with NugetForUnity, build the result and grab everything missing from the KSP distribution :
image

The only Mono/BCL dll I removed was System.Windows.Forms. System.Drawing should likely be removed too, as it is dependent on the mono GDI+ native libraries (libgdiplus.so) on non-windows systems, so having them functional in KSP would require shipping them as well (and acquiring them should be possible by building the mac/linux players for the unity project).

The key point here is having the netstandard.dll facade, which is type-forwarding everything to the relevant implementations. This mean I now have a .Net Standard 2.0 compatible KSP, at least in theory (Unity does some type-forwarding rewriting in libraries to adjust things to work with its Frankenstein mono, but I'm not sure to which extent).

UnityDoorstop only loads mscorlib, so from my managed entry point, I manually loaded the few BCL libraries shipped with KSP from KSP_x64_Data\Managed, and the rest from a game root subfolder, and finally the AsmResolver libraries.

So back the actual patching, I took my stripped source project, made all classes partial, moved all sources files to a subfolder, made a new Part.cs file :
image

I'm then building the assembly, putting it somehwere in a KSP sub-dir, and using AsmResolver, I loaded the original Assembly-CSharp, my stripped/modified assembly, and proceeded to basically copy-paste the modified method in place of the original. The [ModifiedType] and [ModifiedMember] attribute are there to identify what should be patched, likely [AddedType] and [AddedMember] ones will be needed too as we definitely want to support the capability, but I haven't tested it yet.

Fortunately, AsmResolver provide a very neat MemberCloner thing handling most of the heavy lifting of rewiring everything properly. What we are trying to do (replacing a method body by another) isn't exactly what it was designed for. The purpose of that tool is to rewire all type/member cross references when moving stuff from an assembly to another.

But in our case, we not only need cross references to be rewired, but also any reference to a type/member in Assembly-CSharp from our stripped version to the original version. This can be achieved by feeding the MemberCloner instance with a custom CloneContextAwareReferenceImporter. The importer provide various overridable methods for post-processing a reference import. So when we see a reference from the stripped Assembly-CSharp being passed, we just return the matching reference in the real Assembly-CSharp. For example, for types :

protected override ITypeDefOrRef ImportType(TypeDefinition type)
{
    if (type.Module == patchAssembly)
        foreach (TypeDefinition kspType in kspAssembly.GetAllTypes())
            if (Comparer.Equals(kspType, type))
                return kspType;

    return base.ImportType(type);
}

The final step is to call Clone() on our MemberCloner and to grab the modified methods, then to simply replace the method body, and finally to write our modified assembly somewhere to load it :
image

I love this AsmResolver thing. The code is fully documented, the manual is extensive and super clear. The absolute opposite of Mono.Cecil and MonoMod. I guess things might get a bit more complicated in some cases (generics ? compiler generated members ?), but so far this seems to be working nicely with a minimal amount of intervention.

A few remarks / ideas :

  • I'm enthusiastic about this, but this would need a lot more work and validation before going public.
  • For testing, I was writing the modified Assembly-CSharp with a different name to KSP_x64_Data\Managed. This isn't really necessary, the only reason is that if a plugin is looking at the assembly Location / Codebase and the assembly is loaded from elsewhere, things might get ugly (I vaguely remember that stock might be doing that too, actually). I'm not sure what the best option is, or if a workaround can be found. Putting stuff in KSP_x64_Data\Managed is a bit dirty...
  • Unfortunately, I couldn't get debugging to work. AsmResolver doesn't (yet) support PDB rewriting, only some relatively barebone parsing is supported, and after giving a brief look, it seems would be awfully complex to achieve anyway. I don't know if Mono.Cecil can do better, probably not. Debugging the modified assembly with DnSpyEx works, but it's less convenient, I guess I just have to become more accustomed to it.
  • Putting together some basic infrastructure allowing other mods to take advantage of pre-patching doesn't feel too difficult, they would use the base stripped source we provide and we would process the compiled modifications. The main practical issue is that I feel their changes would quickly end up conflicting with ours, or with each other. In the end, if we want to provide the capability for other mods, it might be a better option to integrate MonoMod.Patcher as a pre-patcher working on top of our modified assembly, MonoMod.Patcher provide some restricted cooperative patching ability similar to harmony prefixes/postfixes, in the form of being able to insert a call to the original method in the middle of the replacement method.
  • This is a lot of reinventing the wheel when MonoMod.Patcher (not to mention BepInEx) exists. The incentive behind the custom approach is that we could start doing deep modifications in a relatively clean way, and to expose new / improved APIs for mods to consume, which wouldn't be possible with MonoMod.Patcher. Looking at the current KSPCF codebase, maintainability is starting to become a real issue. We have a mess of scattered patches and files, sometimes targeting the same stock code, the idea of just being in control of the full source like if we were simply developing the game ourselves is very appealing.
  • I would like to avoid creating a ModuleManager like problem with loading times, so keeping an eye on how much time it takes to perform the pre-patching would be a good idea. The prototype patch I've put together is taking about 3 seconds, but that include the JITing of basically the whole BCL, so I'm not too worried, and AsmResolver looks like much better optimized than Cecil/MonoMod, but the operation is nevertheless a heavy one in any case. On a side note, runtime patching time by KSPCF is also starting to get quite significant, so that's another reason to move things to pre-patching, as at least it only has to be done once when the mod set is updated/changed.

For reference :

@gotmachine
Copy link
Contributor Author

gotmachine commented Oct 27, 2024

So, update on this journey (this is kinda starting to feel like a story, no ?)

I refactored the whole thing to get it a bit more streamlined, looks much better than the previous mess.

I also implemented the additional patching options, in addition to modifying stock methods, it's now possible to add new methods and fields to stock types, and to add whole new types as well.

I was suspecting things might not go so smoothly with adding fields top stock types, and indeed my intuition turned out to be true.

Adding fields to classes deriving from UnityEngine.Object imply changing the class layout, and that make Unity (de)serialization very unhappy. I'm not yet sure to what extent a limitation this is, but I can say for sure that modifying the field layout of an object that is serialized in an assetbundle as a prefab will cause an error upon KSP attempting to load the assetbundle :

The file '.../KSP_x64_Data/sharedassets0.assets' is corrupted! Remove it and launch unity again!
[Position out of bounds!] 

This is quite unfortunate, as I believe more or less every type where adding fields would has been useful is coming from a prefab at some point. And even for other types, I'm not sure Unity will handle serialization/deserialization properly. Unity serialization doesn't work on types not defined in Assembly-CSharp, so likely there is some bundled cache about the layout of every type Unity knows. And even if we somehow manage to hack that type layout information, that would still leave us with the issue that the serialized assets don't match the new layout...

Edit : Ha stupid me !
So, it doesn't work when the added fields are candidates for serialization, which makes perfect sense.
However, if they aren't serialized, either by making them private or by putting the [NonSerialized] attribute of them, everything seems to be working fine.

@gotmachine
Copy link
Contributor Author

gotmachine commented Oct 29, 2024

Okay, so I have a somewhat finalized implementation, but I've been giving some thought about the more general picture.

We need to rely on pre-loading / UnityDoorstop. The implicit consequence is that we will be doing things in a new way that while standard in other games modding scenes, is quite new and unusual for the KSP modding scene :

  • Manual installation steps would mean copying an additional dll and configuration file in the game root directory, as well as copying a directory, either in GameData, or in the root directory. As of writing, the latter isn't supported by CKAN. One solution to make the user experience a bit more streamlined would be to have a single root-installed dll packaging everything we need as in-file resources, and to auto-extract them in GameData. I think this could still work with CKAN, we would just have to tell CKAN to create/track that GameData directory as well, and it would make manual installs a lot less prone to user error, the instructions would just be "put all 3 files in the game root".
  • On Linux/Mac, users will have to run the game through a bash script installed/copied in the root directory as well. This has various implications, like not being able to launch the game through Steam without additional manual configuration, and launching through CKAN would also not work.
  • For users running the Windows version under Wine/Proton/SteamOS, additional configuration steps are needed
  • Everything we want to do during pre-patching obviously won't be able to be configured with ModuleManager.

The prototyped Assembly-CSharp patching infrastructure is based on a stripped from every implementation copy of its source code, where all stock types are made partial :

  • This allow to implement new members or to re-implement existing members of a stock type in a separate file, but as if we were directly editing the source code for that type.
  • This is a quite clean and maintainable way of handling modifications to the stock code, with a minimal amount of boilerplate : added / modified members just need to be marked with an attribute to identify the nature of the modification, and in the case of modified members, the name of the original member. For methods and property getters/setters, it is possible to inject a call to the original stock method inside the replacement method, which is functionally identical as harmony prefixing/postfixing, but without all the boilerplate.
  • This allow other mods to be built against our modifications, which mean we can implement additional public APIs for them to consume.
  • This could be a good way to provide/maintain XML docs for the stock assembly, which could be a continuation / merge of previous efforts.
  • A limitation with this approach (compared to one similar to the MonoMod.Patcher one) is that we cannot freely bypass the original KSP private/protected access modifiers like we could with a publicizer. This might end up being annoying in some cases, and while I can think of a few workarounds, they would have various limitations and side effects. On the other hand, one goal of such a project is to do things in a more maintainable and less hacky way, so if we need some stock type or member to not be private/protected, we should likely just make it public or implement a public API for it. At the moment, the current prototype doesn't support modifying access modifiers of the stock members, but that is something that could be done.
  • Another limitation, compared to runtime patching, is that we loose the ability to do some conditional patching, and notably the ability to have MM patches being able to enable or disable patches. Still, we might want to implement config file based switches (as this has proven to be quite useful for issue analysis), and maybe something very basic allowing other mods to redistribute config overrides.
  • I'm not too worried about the loading delay. On my 5800x3D, there is a more or less fixed cost to rewrite the assembly taking around 2 seconds, and actual patch processing seems reasonably fast, patching a relatively complex method takes between 10 and 30ms, other member types are a lot faster. I haven't done rigorous measurements, but I think this is broadly similar to doing the equivalent modifications at runtime with Harmony, and contrary to runtime patching, this doesn't have to be done every time, we can just re-load the patched assembly on the next KSP launch.

Moving forward with this, one major consideration that we would impose our solution as the sole pre-loader solution for the KSP modding ecosystem, excluding notably BepInEx. AFAIK, there isn't currently any KSP mod using BepInEx or any other preloader, but this mean that we need to consider what the modding ecosystem might want to do, and which capabilities we want to expose :

  • We are currently unable to upgrade Harmony to the 2.3 version due to a combination of Harmony not IL-packing its dependencies anymore, and KSP coming with an horribly outdated version of Mono.Cecil. Forking Harmony to build an IL-packed version is non-trivial and a pretty bad solution anyway. Our preloader would allow to solve this issue by simply loading the right version of Mono.Cecil instead of the KSP-provided one, and while we're at it, it would likely be just as good to provide Harmony directly as well.
  • We probably want to expose some pre-patching capability to other mods. Our stripped-source approach doesn't lend itself well to this, but we could provide a few things on top of the current prototype :
    • Something similar to MonoMod.Patcher : ship an assembly were you define a type deriving from the stock type you want to edit, in which you can either add new members or re-implement existing members with the new C# keyword. Access to the patched type private members (or to the type itself) having to be done through a publicizer. There are a few holes and quirks with this approach, like not being able to edit structs or sealed classes (since it's not possible to define a derived type), or not being able to patch constructors (since the replacement would call the base constructor), but nothing too limiting in practice. If that really become a problem, we could implement an alternate patching mode to lift those limitations, were you re-define the target class with every needed member, in which case this is the same patching scheme as the stripped-source one. This capability could actually be complementary to the main one for cases were we really need to bypass access modifiers.
    • Direct exposure of the modified Assembly-CSharp AsmResolver.ModuleDefinition, which would allow low level editing through the AsmResolver library. AsmResolver is definitely much nicer to work with than Mono.Cecil for such tasks, even if you put MonoMod into the mix, so I would personally call that a win compared to using BepInEx, but I have a personal grudge against Mono.Cecil, so take my words with a grain of salt.
    • The ability to pre-patch other assemblies (typically the UnityEngine ones) would be nice to have, both in the context of KSPCF and for other mods.
  • Either way, we want to think through how we expose all of this. Stuff like inter-mod configuration or dependency / conflict resolution doesn't feel like that much necessary here, but maybe we can provide a minimal thing anyway.

Finally, we need to think through how to handle the transition from the current ecosystem. We have two separate plugins, HarmonyKSP, and KSPCF having a dependency on it. Through the modding ecosystem, there are various mods having a dependency on either or both. These dependencies are expressed at two levels : in the CKAN metadata and through the KSPAssembly attributes.

If we go through with this, we would have up to four somewhat separate components instead of two :

  1. The preloader and prepatcher base infrastructure.
  2. The KSPCF pre-patches, with a dependency on (1), and possibly on (4).
  3. Harmony with a dependency on (1), assuming we want to be able to update Harmony to 2.3+.
  4. The KSPCF runtime patches with a direct dependency on (2) and (3), and indirectly on (1).

Given the level of cross dependency, it would make sense to have a single monolithic distribution as a major 2.0 update of KSPCF. We would retire the existing HarmonyKSP mod, having KSPCF 2.0 be a provider for the identifier at the CKAN metadata level, and we would load "fake" assemblies with the KSPAssembly attribute we currently define in KSPHarmony (which is what we are doing already, more or less).

However, the major consequence would be effectively forcing every mod currently depending on Harmony but not on KSPCF to depend on it, but also to depend on the preloader, which comes with caveats for non-windows end users (see at the beginning of this post).

This is the current list of (CKAN tracked) mods having a direct dependency on HarmonyKSP :
image

This might not seem much, but Kopernicus, CC and Shabby are themselves very common dependencies for many other mods. In practice, I'd say HarmonyKSP is likely installed for the overwhelming majority of modded KSP end users.

The middle ground solution could be :

  • Keep HarmonyKSP as a separate plugin, and be stuck with the outdated 2.2.x version (or investigate making an IL-packed 2.3 fork, but my previous attempts failed miserably).
  • Make a KSPCF 2.0 that include the preloader, the assembly-csharp pre-patches and runtime patches, and has a dependency on HarmonyKSP.

In this scenario, this would avoid the caveats for non-windows end users that don't have KSPCF but have mods depending on HarmonyKSP. In the end, this is likely a small subset of users and Linux users are usually tech-savvy enough for this not to be a problem, I would be more worried about the average Mac end user.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

No branches or pull requests

1 participant