[monodroid] Embedded assemblies store (#6311)
What do we want? Faster (Release) App Startup!

How do we get that? Assembly Stores!

"In the beginning", assemblies were stored in the `assemblies` directory within the `.apk`. App startup would open the `.apk`, traverse all entries within the `.apk` looking for `assemblies/*.dll`, `assemblies/*.dll.config`, and `assemblies/*.pdb` files. When a "supported" `assemblies/*` entry was encountered, the entry would be **mmap**(2)'d so that it could be used; see also commit c195683.

Of particular note is:

 1. The need to enumerate *all* entries within the `.apk`, as there is no guarantee of entry ordering, and

 2. The need for *N* `mmap()` invocations, one per assembly included in the app, *plus* additional `mmap()` invocations for the `.pdb` and `.dll.config` files, if present.

Useful contextual note: a "modern" AndroidX-using app could pull in dozens to over 200 assemblies without really trying. There will be *lots* of `mmap()` invocations.

Rather than adding (compressed! d236af5) data for each assembly separately, add a small set of "Assembly Store" files which contain the assembly & related data to use within the app:

  * `assemblies/assemblies.blob`
  * `assemblies/assemblies.[ARCHITECTURE].blob`

`assemblies.[ARCHITECTURE].blob` contains architecture-specific assemblies, e.g. `System.Private.CoreLib.dll` built for x86 would be placed within `assemblies.x86.blob`. `ARCHITECTURE` is one of `x86`, `x86_64`, `armeabi_v7a`, or `arm64_v8a`; note the use of `_` instead of `-`, which is different from the `lib/ARCHITECTURE` convention within `.apk` files. This is done because this is apparently what Android and `bundletool` do, e.g. creating `split_config.armeabi_v7a.apk`.

Once the architecture-neutral `assemblies.blob` and the appropriate (singular!) `assemblies.[ARCHITECTURE].blob` for the current architecture are found and `mmap()`'d, `.apk` entry traversal can end. There is no longer a need to parse the entire `.apk` during startup.
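The naming rule and the early exit described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the monodroid implementation: `arch_store_entry()`, `scan_for_stores()`, and `ScanResult` are hypothetical names, and the `.apk` is modeled as a plain list of entry names rather than a real zip directory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turn an Android ABI name such as "arm64-v8a" into the
// store entry name, replacing '-' with '_' as described in the text.
std::string arch_store_entry(std::string abi) {
    assert(!abi.empty());
    std::replace(abi.begin(), abi.end(), '-', '_');
    return "assemblies/assemblies." + abi + ".blob";
}

// Hypothetical result type for the scan below.
struct ScanResult {
    bool found_common = false;       // assemblies/assemblies.blob seen
    bool found_arch = false;         // assemblies/assemblies.[ARCHITECTURE].blob seen
    std::size_t entries_visited = 0; // how many entries were inspected
};

// Walk the entry names in archive order and stop as soon as both the
// architecture-neutral store and the store for `abi` have been seen.
ScanResult scan_for_stores(const std::vector<std::string>& entries,
                           const std::string& abi) {
    const std::string common = "assemblies/assemblies.blob";
    const std::string arch = arch_store_entry(abi);
    ScanResult result;
    for (const std::string& name : entries) {
        ++result.entries_visited;
        if (name == common) {
            result.found_common = true;
        } else if (name == arch) {
            result.found_arch = true;
        }
        if (result.found_common && result.found_arch) {
            break; // nothing else in the archive needs to be inspected
        }
    }
    return result;
}
```

In this model, stores for other architectures are simply skipped, and how much of the archive is visited depends only on where the two wanted entries sit in the entry order.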
The reduction in the number of `mmap()` system calls required can have a noticeable impact on process startup, particularly with .NET SDK for Android & MAUI; see below for timing details.

The assembly store format uses the following structures:

    struct AssemblyStoreHeader {
        uint32_t magic, version;
        uint32_t local_entry_count;     // Number of AssemblyStoreAssemblyDescriptor entries
        uint32_t global_entry_count;    // Number of AssemblyStoreAssemblyDescriptor entries in entire app, across all *.blob files
        uint32_t store_id;
    };
    struct AssemblyStoreAssemblyDescriptor {
        uint32_t data_offset, data_size;                // Offset from beginning of file for .dll data
        uint32_t debug_data_offset, debug_data_size;    // Offset from beginning of file for .pdb data
        uint32_t config_data_offset, config_data_size;  // Offset from beginning of file for .dll.config data
    };
    struct AssemblyStoreHashEntry {
        union {
            uint64_t hash64;    // 64-bit xxhash of assembly filename
            uint32_t hash32;    // 32-bit xxhash of assembly filename
        };
        uint32_t mapping_index, local_store_index, store_id;
    };

The assembly store format is roughly as follows:

    AssemblyStoreHeader                 header {…};
    AssemblyStoreAssemblyDescriptor     assemblies [header.local_entry_count];

    // The following two entries exist only when header.store_id == 0
    AssemblyStoreHashEntry              hashes32[header.global_entry_count];
    AssemblyStoreHashEntry              hashes64[header.global_entry_count];

    uint8_t                             data[];

Note that `AssemblyStoreFileFormat::hashes32` and `AssemblyStoreFileFormat::hashes64` are *sorted by their hash*. Further note that assembly *filenames* are not present. `EmbeddedAssemblies::blob_assemblies_open_from_bundles()` will hash the filename, then binary search the appropriate `hashes*` array to get the appropriate assembly information.
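The hash-then-binary-search lookup can be sketched as below. This is an illustration only, not the code in `EmbeddedAssemblies`: `HashEntry` is a simplified 64-bit-only stand-in for `AssemblyStoreHashEntry`, and `toy_hash()`, `build_table()`, and `find_assembly()` are hypothetical names. In particular, `toy_hash()` is FNV-1a, used here only so the example has no external dependency; the real store uses xxHash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified, 64-bit-only stand-in for AssemblyStoreHashEntry.
struct HashEntry {
    std::uint64_t hash64;
    std::uint32_t mapping_index;
    std::uint32_t local_store_index;
    std::uint32_t store_id;
};

// Stand-in hash (64-bit FNV-1a). ASSUMPTION: the real format uses xxHash.
std::uint64_t toy_hash(const std::string& name) {
    std::uint64_t h = 14695981039346656037ull;
    for (unsigned char c : name) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

static bool by_hash(const HashEntry& a, const HashEntry& b) {
    return a.hash64 < b.hash64;
}

// Build a table sorted by hash, the way a build-time tool might.
// mapping_index records the position of the name in the input list.
std::vector<HashEntry> build_table(const std::vector<std::string>& names) {
    std::vector<HashEntry> table;
    for (std::size_t i = 0; i < names.size(); ++i) {
        const std::uint32_t index = static_cast<std::uint32_t>(i);
        table.push_back(HashEntry{toy_hash(names[i]), index, index, 0});
    }
    std::sort(table.begin(), table.end(), by_hash);
    return table;
}

// Hash the requested name, then binary search the sorted table.
// Returns nullptr when no entry has a matching hash.
const HashEntry* find_assembly(const std::vector<HashEntry>& table,
                               const std::string& name) {
    assert(std::is_sorted(table.begin(), table.end(), by_hash));
    const std::uint64_t wanted = toy_hash(name);
    auto it = std::lower_bound(
        table.begin(), table.end(), wanted,
        [](const HashEntry& entry, std::uint64_t value) {
            return entry.hash64 < value;
        });
    if (it == table.end() || it->hash64 != wanted) {
        return nullptr;
    }
    return &*it;
}
```

Because only hashes are stored, a lookup costs one hash computation plus O(log n) comparisons of fixed-size integers, with no string comparisons against stored names.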
As the assembly store format doesn't include assembly names, `.apk` and `.aab` files will also contain an `assemblies.manifest` file, which contains the assembly names and other information in a human-readable format; it is also used by `assembly-store-reader`:

    Hash 32     Hash 64             Blob ID  Blob idx  Name
    0xa2e0939b  0x4288cfb749e4c631  000      0000      Xamarin.AndroidX.Activity
    …
    0xad6f1e8a  0x6b0ff375198b9c17  001      0000      System.Private.CoreLib

Add a new `tools/assembly-store-reader` utility which can read the new `assemblies*.blob` files:

    % tools/scripts/read-assembly-store path/to/app.apk
    Store set 'base_assemblies':
      Is complete set? yes
      Number of stores in the set: 5

    Assemblies:
      0:
        Name: Xamarin.AndroidX.Activity
        Store ID: 0 (shared)
        Hashes: 32-bit == 0xa2e0939b; 64-bit == 0x4288cfb749e4c631
        Assembly image: offset == 1084; size == 14493
        Debug data: absent
        Config file: absent
      …
      16:
        Name: System.Private.CoreLib
        Store ID: 1 (x86)
        Hashes: 32-bit == 0xad6f1e8a; 64-bit == 0x6b0ff375198b9c17
        Assembly image: offset == 44; size == 530029
        Debug data: absent
        Config file: absent
      …

On a Pixel 3 XL (arm64-v8a) running Android 12 with MAUI 6.0.101-preview.10.1952, we observe:

~~ MAUI: Displayed Time ~~

| Before ms |  After ms |            Δ | Notes                                 |
| --------: | --------: | -----------: | ------------------------------------- |
|  1016.800 |   892.600 |    -12.21% ✓ | defaults; profiled AOT; 32-bit build  |
|  1016.100 |   894.700 |    -11.95% ✓ | defaults; profiled AOT; 64-bit build  |
|  1104.200 |   922.000 |    -16.50% ✓ | defaults; full AOT+LLVM; 64-bit build |
|  1102.700 |   926.100 |    -16.02% ✓ | defaults; full AOT; 32-bit build      |
|  1108.400 |   932.600 |    -15.86% ✓ | defaults; full AOT; 64-bit build      |
|  1106.300 |   932.600 |    -15.70% ✓ | defaults; full AOT+LLVM; 32-bit build |
|  1292.000 |  1271.800 |     -1.56% ✓ | defaults; 64-bit build                |
|  1307.000 |  1275.400 |     -2.42% ✓ | defaults; 32-bit build                |

Displayed time reduces by ~12% when Profiled AOT is used.
It is interesting to note that **Displayed time** is nearly identical for the default (JIT) settings case. This is most probably because of the amount of code JIT-ed between `OnCreate()` and the time when the application screen is presented; most likely the time is spent JIT-ing MAUI rendering code.

~~ MAUI: Total native init time (before `OnCreate()`) ~~

| Before ms |  After ms |            Δ | Notes                                 |
| --------: | --------: | -----------: | ------------------------------------- |
|    96.727 |    88.921 |     -8.07% ✓ | defaults; 32-bit build                |
|    97.236 |    89.693 |     -7.76% ✓ | defaults; 64-bit build                |
|   169.315 |   108.845 |    -35.71% ✓ | defaults; profiled AOT; 32-bit build  |
|   170.061 |   109.071 |    -35.86% ✓ | defaults; profiled AOT; 64-bit build  |
|   363.864 |   208.949 |    -42.57% ✓ | defaults; full AOT; 64-bit build      |
|   363.629 |   209.092 |    -42.50% ✓ | defaults; full AOT; 32-bit build      |
|   373.203 |   218.289 |    -41.51% ✓ | defaults; full AOT+LLVM; 64-bit build |
|   372.783 |   219.003 |    -41.25% ✓ | defaults; full AOT+LLVM; 32-bit build |

Note that "native init time" includes running `JNIEnv.Initialize()`, which requires loading `Mono.Android.dll` + dependencies such as `System.Private.CoreLib.dll`, which in turn means that the AOT DSOs such as `libaot-System.Private.CoreLib.dll.so` must *also* be loaded. The loading of the AOT DSOs is why JIT is fastest here (no AOT DSOs), and why Profiled AOT is faster than Full AOT (smaller DSOs).
~~ Plain Xamarin.Android: Displayed Time ~~

| Before ms |  After ms |            Δ | Notes                                 |
| --------: | --------: | -----------: | ------------------------------------- |
|   289.300 |   251.000 |    -13.24% ✓ | defaults; full AOT+LLVM; 64-bit build |
|   286.300 |   252.900 |    -11.67% ✓ | defaults; full AOT; 64-bit build      |
|   285.700 |   255.300 |    -10.64% ✓ | defaults; profiled AOT; 32-bit build  |
|   282.900 |   255.800 |     -9.58% ✓ | defaults; full AOT+LLVM; 32-bit build |
|   286.100 |   256.500 |    -10.35% ✓ | defaults; full AOT; 32-bit build      |
|   286.100 |   258.000 |     -9.82% ✓ | defaults; profiled AOT; 64-bit build  |
|   328.900 |   310.600 |     -5.56% ✓ | defaults; 32-bit build                |
|   319.300 |   313.000 |     -1.97% ✓ | defaults; 64-bit build                |

~~ Plain Xamarin.Android: Total native init time (before `OnCreate()`) ~~

| Before ms |  After ms |            Δ | Notes                                 |
| --------: | --------: | -----------: | ------------------------------------- |
|    59.768 |    42.694 |    -28.57% ✓ | defaults; profiled AOT; 64-bit build  |
|    60.056 |    42.990 |    -28.42% ✓ | defaults; profiled AOT; 32-bit build  |
|    65.829 |    48.684 |    -26.05% ✓ | defaults; full AOT; 64-bit build      |
|    65.688 |    48.713 |    -25.84% ✓ | defaults; full AOT; 32-bit build      |
|    67.159 |    49.938 |    -25.64% ✓ | defaults; full AOT+LLVM; 64-bit build |
|    67.514 |    50.465 |    -25.25% ✓ | defaults; full AOT+LLVM; 32-bit build |
|    66.758 |    62.531 |     -6.33% ✓ | defaults; 32-bit build                |
|    67.252 |    62.829 |     -6.58% ✓ | defaults; 64-bit build                |