Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash with big apks on assembly registration (fixes 1673) #2570

Merged
merged 7 commits into from
Jan 3, 2019

Conversation

mathieubourgeois
Copy link
Contributor

@mathieubourgeois mathieubourgeois commented Dec 24, 2018

Xamarin.Android, on launch, needs to do a pass through the files in the apk to register

  • assemblies (.dll/.exe)
  • dll configuration files (.config)
  • symbols (.mdb/.pdb)
  • bindings typemaps (.jm/.mj)

To do this pass, the process was to mmap the apk in the process' adress space, then grab the memory offsets for each file and register them on each system as needed. However, mmap-ing the whole apk has the consequence of taking potentially a lot of adress space.

Considering it is legal to have a big apk (even though you can't submit an apk of more than 100 Mb to the Play Store, you can still submit it somewhere else, or embed your data in the apk during the development process for simplicity's sake), having an apk of about ~800 Mb in size automatically crashes on launch in armeabi-v7a because it can't find a contiguous 800 Mb block of adress space. The following log is usually found when that issue hits:

I/monodroid-assembly(14915): start: 0xffffffff end: 0x3da16747 len: 1033987912 apk: /data/app/<package_name>-1/base.apk

However, we don't need to mmap the whole apk, only the files that we are actually registering. Therefore, refactor the registration code to not mmap the whole apk. Instead, open the apk regularly, then when we actually need one of the entries, mmap the area of the file in the apk from a page-aligned offset and use those mmap sections for registration instead.

I tested on an ~800 Mb application that would crash before, everything ran smoothly (containing every single extension type) and the application ran properly.

Fixes #1673

Xamarin.Android, on launch, needs to do a pass through the files in the apk to register assemblies (.dll/.exe), dll configuration files (.config), symbols (.mdb/.pdb) and bindings typemaps (.jm/.mj). To do this pass, the process was to mmap the apk in the process' adress space, then grab the memory offsets for each file and register them on each system as needed. However, mmap-ing the whole apk has the consequence of taking potentially a lot of adress space. Considering it is legal to have a big apk (even though you can't submit an apk of more than 100 Mb to the Play Store, you can still submit it somewhere else, or embed your data in the apk during the development process for simplicity's sake), having an apk of about ~800 Mb in size automatically crashes on launch in armeabi-v7a because it can't find a contiguous 800 Mb block of adress space. The following log is usually found when that issue hits:
`I/monodroid-assembly(14915): start: 0xffffffff end: 0x3da16747 len: 1033987912 apk: /data/app/<package_name>-1/base.apk`
However, we don't need to mmap the whole apk, only the files that we are actually registering. Therefore, refactor the registration code to not mmap the whole apk. Instead, open the apk regularly, then when we actually need one of the entries, mmap the area of the file in the apk from a page-aligned offset and use those mmap sections for registration instead.
Fixes dotnet#1673
md_mmap_info file_info;
};

MONO_API int monodroid_getpagesize (void);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should instead be declared in monodroid-glue.h, and add a #include "monodroid-glue.h" to this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 4a08ef9

uLong offsetPage = offset - offsetFromPage;
uLong offsetSize = size + offsetFromPage;

info.mmap_info.area = mmap(NULL, offsetSize, PROT_READ, MAP_PRIVATE, fd, offsetPage);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be some form of error detection around this; mmap(2) can fail, hence issue #1673. That said, the mmap() is required in order for anything to work, so we should check for the error and fail with a more appropriate error message than what we currently do: crash with a SIGSEGV.

Suggested change
info.mmap_info.area = mmap(NULL, offsetSize, PROT_READ, MAP_PRIVATE, fd, offsetPage);
info.mmap_info.area = mmap (NULL, offsetSize, PROT_READ, MAP_PRIVATE, fd, offsetPage);
if (info.mmap_info.area == MAP_FAILED) {
log_fatal (LOG_DEFAULT, "Could not `mmap` apk `%s` entry `%s`: %s", app, filename, strerror (errno));
exit (FATAL_EXIT_CANNOT_FIND_APK);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, good catch. Implemented and tested in 4a08ef9

- Added check to mmap with fatal log and forced exit when mmap fails
- moved forward declaration to it's proper place in monodroid-glue.h
uLong offsetPage = offset - offsetFromPage;
uLong offsetSize = size + offsetFromPage;

info.mmap_info.area = mmap(NULL, offsetSize, PROT_READ, MAP_PRIVATE, fd, offsetPage);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can/should we also use MAP_NORESERVE? Quoting the docs:

MAP_NORESERVE
Do not reserve swap space for this mapping. When swap space is reserved, one has the guarantee that it is possible to modify the mapping. When swap space is not reserved one might get SIGSEGV upon a write if no physical memory is available. See also the discussion of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before 2.6, this flag only had effect for private writable mappings.

This is plausibly moot for Android, as Android doesn't have swap, but this code could also be used by desktop apps (e.g. the Android designer), which could run on Linux...

However, macOS doesn't have MAP_NORESERVE...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My 2 cents :

Based on what I can see in the kernel code, this is for the overflow accounting, and based on https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/Documentation/vm/overcommit-accounting.rst, we have

For a file backed map
SHARED or READ-only - 0 cost (the file is the map not swap)
PRIVATE WRITABLE - size of mapping per instance

Since our mappings are file backed and read-only, it doesn't seem like that flag would make any difference (though, in theory, according to the documentation, it's possible to make an implementation that would need that flag).

I also took a quick look at Mono for an example and I only found one usage in libgc for Solaris on a read+write mapping. Though, again, it's possible in theory to have a kernel that needs it and then Mono would have their read-only mapping taken into account.

@@ -243,6 +243,35 @@ struct md_mmap_info {
size_t size;
};

struct md_apk_mmap_info {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this type actually necessary? md_apk_mmap_info::mmap_info appears to only be used is within md_mmap_apk_file(), so couldn't mmap_info be function-local variables, as that information doesn't appear to be needed elsewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure that could be done, the only reason I kept this info was that at the start I was thinking about equivalent calls to munmap (otherwise, you'd need to realign the adress before munmap-ing, but we don't even do a call to munmap on those, and if we start doing it, it's just a realign so it's not really needed indeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed at fe1236c

@@ -420,7 +430,7 @@ gather_bundled_assemblies_from_apk (
// assemblies must be 4-byte aligned, or Bad Things happen
if ((offset & 0x3) != 0) {
log_fatal (LOG_ASSEMBLY, "Assembly '%s' is located at a bad address %p\n", cur_entry_name,
((const unsigned char*) mmap_info.area) + offset);
offset);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message no longer makes sense, as offset isn't a process-global address, it's an offset within the file. The message should be updated accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed at 6dc1432

@jonpryor
Copy link
Member

jonpryor commented Jan 2, 2019

@grendello: related-to-this-pr-but-not-necessarily-part-of-it, I wonder if many of these md_mmap_OP_file functions could be replaced with member functions on a new type, lots-o-spitballing here:

class EmbeddedFileOps {
public:
    EmbeddedFileOps(int fd);
    void* open (const char *filename, int mode);
    uLong read (void *stream, void *buf, uLong size);
    // ..
private:
    int fd;
};

// ...
static int
gather_bundled_assemblies_from_apk (...)
{
    zlib_filefunc_def funcs;
    funcs.zopen_file = [](void *opaque, const char *filename, int mode) {
        return reinterpret_cast<EmbeddedFileOps*>(opaque)->open (filename, mode);
    };
    funcs.zopen_read = [](void *opaque, void *stream, void *buf, uLong size) {
        return reinterpret_cast<EmbeddedFileOps*>(opaque)->read (stream, but, size);
    };
    // ...
}

@grendello
Copy link
Contributor

@jonpryor yep, I'll work on converting this to a class and then we can probably use lambdas like so

We only use the file_info part since we never munmap those. Therefore, the whole type (which contains the info of the actual realigned mmap) is not needed. It can always be recomputed anyway by realigning to the page size
The assemblies, at that point, aren't mmap-ed anymore so we can't use a process space adress. Instead, mention that we're talking about an offset in the apk.
@jonpryor
Copy link
Member

jonpryor commented Jan 2, 2019

build

@jonpryor
Copy link
Member

jonpryor commented Jan 3, 2019

build

@jonpryor jonpryor merged commit c195683 into dotnet:master Jan 3, 2019
@mathieubourgeois mathieubourgeois deleted the fix/mmap-apk-parts branch January 3, 2019 21:36
jonpryor pushed a commit that referenced this pull request Jan 17, 2019
Fixes: #1673

During process startup, Xamarin.Android looks through the `.apk` for
various files to register within mono and itself, including:

  * Assemblies (`.dll`/`.exe`)
  * Assembly configuration files (`.dll.config`)
  * Debug symbols (`.mdb`/`.pdb`)
  * Type Map files (`.jm`/`.mj`)

Such files are stored *uncompressed* within the `.apk` and are
aligned on 4-byte boundaries via `zipalign`.

In order to *use* these files, the entire `.apk` would be
**mmap**(2)'d into the process address space, so that e.g. the
assemblies could be registered directly with mono without copying the
assembly contents "elsewhere" (e.g. disk, RAM), and instead would be
demand-paged *into* RAM from disk as needed.

There is an unfortunate downside to this approach: `.apk` files can
contain lots of content which *isn't* of interest to mono/etc., such
as Android Assets and Resources (large MP4 video files?), and the
"`mmap()` everything!" approach means that all this *unneeded* data
is *also* `mmap()`ed into the process address space.

Which is what Issue #1673 triggered: a 930MB `.apk` file with lots of
"game assets" could not load within a 32-bit address space, because
there wasn't 930MB of free contiguous address space to use:

	I/monodroid-assembly(14915): start: 0xffffffff end: 0x3da16747 len: 1033987912 apk: /data/app/<package_name>-1/base.apk
	A/libc(14915): Fatal signal 11 (SIGSEGV), code 1, fault addr 0x3da16343 in tid 14915

`mmap()`ing the whole `.apk`, while effective, is overkill.

Update `gather_bundled_assemblies_from_apk()` so that instead of
loading the entire `.apk` with one `mmap()` call, we instead use
separate `mmap()` calls for each distinct file of interest within the
`.apk` -- on OS page-aligned memory -- so that we don't excessively
use process address space.
@github-actions github-actions bot locked and limited conversation to collaborators Feb 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants