-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Single-File: Process bundles in the framework #34274
Conversation
@@ -46,7 +46,7 @@ class CursorStreamWrapper : public GenericStreamWrapper<InputStream, Encoding> { | |||
// counting line and column number | |||
Ch Take() { | |||
Ch ch = this->is_.Take(); | |||
if(ch == '\n') { | |||
if (ch == '\n') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a harmless change, but I'd prefer if rapidjson were kept as a pristine copy from upstream, as it's a 3rd-party component. (If we need to actually fix things in there, I'd prefer that we also sent patches their way instead of maintaining a fork.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK @lpereira I'll back off the change from this PR.
I've submitted a PR for rapidjson: Tencent/rapidjson#1685
We should subsequently fix it in our code too.
I usually do if(
for(
while(
searches before submitting PRs, and it would be nice to not hit these cases every time.
realloc_buffer(location->size); | ||
memcpy(m_json.data(), data, location->size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually see the idiom of using the swap()
method to initialize a std::vector
from an array, so I was curious to see how Clang would generate the code. It's identical. I'm impressed. :)
However, you can avoid this copy/allocation altogether here if this file is mapped with MAP_PRIVATE
. In fact, this will simplify how this function works, since it won't need the #ifdef
. Pseudocode:
if (is_single_file) {
bundled_file_t bundled{path, location}; // RAII to map/unmap
return parse_json(bundled.data(), context);
}
pal::ifstream_t file{path};
return parse_stream(file, path);
Then you don't need to split parse_json()
into a parse/validate combo, and parse_json()
can always work with a mutable char*
. This also gets rid of duplicate, implementation-detail specific comments.
// On Linux, copy the input data, since m_document.ParseInsitu | ||
// requires a mutable data input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// On Linux, copy the input data, since m_document.ParseInsitu | |
// requires a mutable data input. | |
// On other platforms, copy the input data, since m_document.ParseInsitu | |
// requires a mutable data input. |
@@ -398,13 +401,13 @@ bool runtime_config_t::ensure_parsed() | |||
trace::verbose(_X("Did not successfully parse the runtimeconfig.dev.json")); | |||
} | |||
|
|||
if (!pal::file_exists(m_path)) | |||
json_parser_t json; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
json_parser_t json; |
{ | ||
// Not existing is not an error. | ||
return true; | ||
} | ||
|
||
json_parser_t json; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion:
json_parser_t json; | |
json_parser_t json; |
// * Files can only be memory mapped at page-aligned offsets, and in whole page units. | ||
// Therefore, mapping only portions of the bundle will involve align-down/round-up calculations, and associated offset adjustments. | ||
// We choose the simpler approach of rounding to the whole file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good idea to comment that this might be a problem in 32-bit systems and very large applications due to exhaustion of address space.
private: | ||
pal::string_t m_path; | ||
const location_t *m_location; | ||
} json_info; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
json_info
doesn't seem to be used from a quick glance.
{ | ||
struct config_t | ||
{ | ||
config_t() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't initialize m_location
, and config_t::matches()
will dereference that pointer. When info_t::m_deps_json
and info_t::m_runtimeconfig_json
are constructed, they use this constructor here.
@@ -12,7 +12,9 @@ manifest_t manifest_t::read(reader_t& reader, int32_t num_files) | |||
|
|||
for (int32_t i = 0; i < num_files; i++) | |||
{ | |||
manifest.files.emplace_back(file_entry_t::read(reader)); | |||
file_entry_t entry = file_entry_t::read(reader); | |||
manifest.files.emplace_back(entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Is there a benefit for using emplace_back()
here? Maybe file_entry_t
should have a move constructor and std::move
be used instead?
const int8_t* info_t::config_t::map(const pal::string_t& path, const location_t* &location) | ||
{ | ||
const bundle::info_t* app = bundle::info_t::the_app; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the_app
is static and initialized to nullptr
, might be a good idea to assert if it's not nullptr
where it's dereferenced
pal::string_t file_base = entry->needs_extraction() ? app->extraction_path() : app->base_path(); | ||
|
||
// Reserve space for the path below | ||
full_path.reserve(file_base.length() + relative_path.length() + 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the 3
for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is code copied from https://github.com/dotnet/runtime/blob/master/src/installer/corehost/cli/deps_entry.cpp#L32.
Two characters are required for DIR_SEPARATOR
and null-terminator. Not sure why the third spot is necessary.
pal::string_t pal_relative_path = asset.relative_path; | ||
if (_X('/') != DIR_SEPARATOR) | ||
{ | ||
replace_char(&pal_relative_path, _X('/'), DIR_SEPARATOR); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pal::string_t pal_relative_path = asset.relative_path; | |
if (_X('/') != DIR_SEPARATOR) | |
{ | |
replace_char(&pal_relative_path, _X('/'), DIR_SEPARATOR); | |
} | |
pal::string_t pal_relative_path = normalize_dir_separator(asset.relative_path); |
|
||
if(address == nullptr) | ||
if (address == nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (address == nullptr) | |
close(fd); | |
if (address == MAP_FAILED) |
|
||
if(address == nullptr) | ||
if (address == nullptr) | ||
{ | ||
trace::error(_X("Failed to map file. mmap(%s) failed with error %d"), path.c_str(), errno); | ||
close(fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
close(fd); |
(And also remove the other close(fd)
below, which GitHub doesn't allow me to comment on.)
{ | ||
if (input->app_bundle != nullptr) | ||
{ | ||
static bundle::runner_t bundle_runner(input->app_bundle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep reading this as Blade Runner. :)
@@ -2094,11 +2094,11 @@ class GenericValue { | |||
|
|||
const SizeType len1 = GetStringLength(); | |||
const SizeType len2 = rhs.GetStringLength(); | |||
if(len1 != len2) { return false; } | |||
if (len1 != len2) { return false; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have yet to look at tests. The most important feedback is about the UX when we run 5.0 single-file on machine which doesn't have 5.0. We need to get that right.
@@ -238,7 +238,8 @@ bool hostpolicy_resolver::try_get_dir( | |||
|
|||
// Resolve hostpolicy version out of the deps file. | |||
pal::string_t version = resolve_hostpolicy_version_from_deps(resolved_deps); | |||
if (trace::is_enabled() && version.empty() && pal::file_exists(resolved_deps)) | |||
if (trace::is_enabled() && version.empty() && | |||
(bundle::info_t::config_t::probe(resolved_deps)|| pal::file_exists(resolved_deps))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(bundle::info_t::config_t::probe(resolved_deps)|| pal::file_exists(resolved_deps))) | |
(bundle::info_t::config_t::probe(resolved_deps) || pal::file_exists(resolved_deps))) |
@@ -343,6 +343,12 @@ bool deps_resolver_t::probe_deps_entry(const deps_entry_t& entry, const pal::str | |||
else | |||
{ | |||
// Non-rid assets, lookup in the published dir. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we only handle bundle for non-rid assets as we expect the app to be rid-specific (since it's an apphost after all). But I think this is potentially dangerous. For framework depend apps it is in theory possible to have an app which is published for linux-x64 but carries different assets for different distros. I don't know how SDK would handle that, but it feels possible. At least add this scenario to your list of things to validate.
I would probably also prefer if we actually checked the path even in the rid-specific case above and maybe failed if we can't handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vitek-karas, regarding rid-specific assets:
- I've added the bundle-check in the rid-specific assets case too, as you've recommended.
- However, when an app is published specific to a RID, my understanding is that the SDK will never create assets for other RIDs. The SDK will find the best compatible RID assets and copy them to the publish directory. So, there won't be a
<rid>\publish\runtimes\<rid>
structure.
For example, if we have a package (libuv 1.9.0 is a very good example)
runtimes
|- debian-x64/native/lib.so
|- rhel-x64/native/lib.so
|- win-x64/native/lib.dll
dotnet publish
will generate the above directory structure within the bin/.net5/publish
directory.
However, dotnet publish -r debian-x64 --self-contained=false
will generate bin/.net5/debian-x64/publish
that only contains lib.so
from runtime/debian-x64/native
.
dotnet publish -r ubuntu-x64 --self-contained=false
will generate bin/.net5/ubuntu-x64/publish
that contains lib.so
from runtime/debian-x64/native
.
dotnet publish -r linux-x64 --self-contained=false
will generate bin/.net5/linux-x64/publish
without lib.so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dotnet publish -r linux-x64 --self-contained=false will generate bin/.net5/linux-x64/publish without lib.so
This sounds like a problem/bug. I assume the app would not run in that case at all... right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that there is an underlying problem, because the app may fail to run because of missing assemblies in this situation. However, I'm not sure it is an SDK bug, but rather a product issue.
The missing RID-asset (here linux-x64
) in a package may actually mean that the library (lib.so
) is unnecessary for that platform -- although it is weird that the lib is not necessary at a higher RID but necessary in a lower RID. This is somewhat similar to missing an assembly that is only loaded at runtime for reflection -- where the SDK cannot know the exact requirements.
@dsplaisted @wli3: Should we consider guards in the SDK to check for this case? (ex: missing assets at a parent RID)? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@swaroop-sridhar I have a feeling trying to add guards for this type of thing would be more complex and complicated than would be warranted for the benefit we would get out of it. But you can file an issue in the SDK repo if you want to discuss it further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change implements the host changes proposed in the [design](https://github.com/dotnet/designs/blob/master/accepted/2020/single-file/design.md#startup) The main changes for single-file bundles are: * Bundle processing code is moved from apphost to hostpolicy * HostFxr and HostPolicy process deps.json and runtimeconfig.json files directly from the bundle. * HostPolicy performs verification wrt deps.json based on the contents of the single-file bundle. * AppContext.BaseDirectory is set as explained [here](https://github.com/dotnet/designs/blob/master/accepted/2020/single-file/design.md#appcontextbasedirectory) Currently, all files except deps.json and runtimeconfig.json are extracted to disk. Once the runtime is able to processing assemblies directly from the bundle, they will no longer be extracted. Notable details: * The bundle driver (formarly runner.cpp) is divided into two parts: * bundle::info which describes basic information about the bundle available from the headers only * bundle::runner which has information about all embedded files, and the ability to drive extraction This facilitates linking only parts of the bundle handling code with hostfxr, while all code is linked with hostpolicy. * The AppHost only links with bundle_marker to identify itself as a single-file bundle. * If the AppHost is a single-file bundle, it notifies hostfxr using the new hostfxr_main_bundle_startup_info() API * The HostFxr comminucates the single-file-information with HostPolicy using the host_interface_t structure. Fixes dotnet#32821
The rename test would hang forever if AppWithWait() fails before creating the resume file. So, while waiting, check that the app-process is still running.
* Fix HostFxr startup function naming in CoreHost * deps_resolver: Lookup up bundle even for rid-specific assets within the app * Ensure CUI/GUI error output when expected framework is not found for single file apps. * Other minor fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I 'm struggling to understand what is the hostpolicy/coreclr interface. In the design doc https://github.com/dotnet/designs/blob/master/accepted/2020/single-file/design.md#dependency-resolution it says:
The assemblies bundled within the single-file are not enumerated in TRUSTED_PLATFORM_ASSEMBLIES. The absolute-path of assemblies on disk are listed in TRUSTED_PLATFORM_ASSEMBLIES as usual.
But if I read the code correctly it will populate TPA with full paths even for bundled assemblies - relative to the app's base.
Similar comment on resource paths.
src/installer/test/Assets/TestProjects/AppWithSubDirs/Program.cs
Outdated
Show resolved
Hide resolved
There is a small nuance to this:
|
Also update a few comments to clarify that assemblies are currently extracted to disk.
@@ -58,6 +59,7 @@ struct host_interface_t | |||
const pal::char_t* host_info_host_path; | |||
const pal::char_t* host_info_dotnet_root; | |||
const pal::char_t* host_info_app_path; | |||
const bundle::info_t *app_bundle; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bundle::info_t
does not look to be plain data, so using it cross-binary like this where hostpolicy reads the class allocated by hostfxr could lead to compat issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @elinor-fung . I've fixed this in 12b63ba
@@ -12,6 +12,7 @@ | |||
#include <utils.h> | |||
|
|||
#include "json_parser.h" | |||
#include "bundle/info.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good - the one pending question I have is the RID specific asset handling.
Host::info* is not POD, so pass the bundle-header offset instead. The deps-json and runtime-config json locations are recomputed in hostpolicy. Since hostpolicy needs to re-map the bundle, recomputing these locations is not expensive than passing them through the interface. This also keeps the host_interface cleaner.
Thanks a lot for the detailed feedback @vitek-karas
I've now answered this in #34274 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Please followup on the issue with RID-specific assets. I do agree that vast majority of the apps won't have the problem. If SDK will never add a RID-specific asset to a RID-specific build, then this is not an issue for host/runtime (yet), but it seems to be an SDK issue.
…sing extraction Customer Scenario Self-contained Apps published as a single-file fail at run-time. Problem Single-file bundles are now processed in the framework rather than the apphost (dotnet/runtime#34274). This means that hostpolicy and hostfxr DLLs are excluded from being bundled themselves. In the case of self-contained single-file apps, these files need to be separate files until static-apphost is available. This needs to be ensured by the SDK; otherwise app execution will fail. Solution This change fixes the problem by adapting the SDK to: * Pass the correct target-OS settings (otherwise cross-targetting builds will not work correctly) * Copy files excluded by the Bundler to the publish directory. The stage-0 netcoreapp is updated to preview4, because preview2 apphost is not compatible with preview4 bundler. Risk Low
This commit reverts: Revert "Single-File: Process bundles in the framework (dotnet#34274)" This reverts commit 78b303d. Revert "Single-File Bundler: Add a FileSize test (dotnet#35149)" This reverts commit 779588a.
This commit reverts: Revert "Single-File: Process bundles in the framework (#34274)" This reverts commit 78b303d. Revert "Single-File Bundler: Add a FileSize test (#35149)" This reverts commit 779588a. *Customer Scenario* Publishing apps as a self-contained single-file doesn't work as expected. * Publish needs to generate hostpolicy and hostfxr separate from the single file bundle * Cross-platform publishing is incorrect *Problem* Since Static-apphost is not yet ready, processing bundle content in hostpolicy means that hostpolicy and hostfxr DLLs need to be separate from the bundle. This causes self-contained single-file apps to not be a "single file" temporarily. The change also requires supporting changes from the SDK, to publish hostfxr and hostpolicy as separate files, and to invoke HostModel library with arguments that facilitate cross-platform publishing. *Solution* To solve these, problem, this change reverts: Revert "Single-File: Process bundles in the framework (#34274)" commit 78b303d. and a dependent test-only change: Revert "Single-File Bundler: Add a FileSize test (#35149)" commit 779588a. *Risk* Medium The change is contained to only host components: apphost, hostpolicy, and hostfxr. However, the change is big, and needs testing in runtime and SDK repos. *Testing* Manually tested the SDK by inserting apphost, hostfxr, hostpolicy, and hostmodel library from this build into the `dotnet/packs` preview-4 SDK from dotnet/sdk#11518 build. Verified that: * Singlefile apps can be published and run OK for { Windows, Linux, Osx } x {netcoreapp3.0, netcoreapp3.1, netcoreapp5.0} * Cross-targeting builds of single-file apps build and run OK (ex: built on Windos, run on Mac).
This change implements the host changes proposed in the design
The main changes for single-file bundles are:
Currently, all files except deps.json and runtimeconfig.json are extracted to disk.
Once the runtime is able to processing assemblies directly from the bundle, they will no longer be extracted.
Notable details:
This facilitates linking only parts of the bundle handling code with hostfxr, while all code is linked with hostpolicy.
Fixes #32821