Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provide an option to handle unintroduced APIs as weak symbols #837

Closed
DanAlbert opened this issue Nov 5, 2018 · 43 comments
Closed

provide an option to handle unintroduced APIs as weak symbols #837

DanAlbert opened this issue Nov 5, 2018 · 43 comments
Assignees
Milestone

Comments

@DanAlbert
Copy link
Member

Tracking bug for the roadmap item "Weak symbols for API additions": https://android.googlesource.com/platform/ndk/+/master/docs/Roadmap.md#weak-symbols-for-api-additions

iOS developers are used to using weak symbols to refer to function that may be present in their equivalent of targetSdkVersion but not in their minSdkVersion. They use a run-time null check to decide whether the new function is available or not. Apparently clang also has some support for emitting a warning if you dereference one of these symbols without a corresponding null check.

This seems like a more convenient option than is currently available on Android, especially since no currently shipping version of Android includes a function to check which version of Android you're running on.

We might not want to make this the default (because it's such a break with historical practice, and might be surprising), but we should offer this as an option.

An interesting technical problem here will be dealing with the DT_NEEDED situation for “I need this library (but it might not exist yet)”

I spent a bit of last week investigating possible solutions to that last part and haven't found anything that will work on existing Android versions, so it may be that APIs added in new libraries either can't be covered by this or will need to be handled in a different manner (a wrapper library or something like that).

@DanAlbert DanAlbert added this to the unplanned milestone Nov 5, 2018
@alexcohn
Copy link

alexcohn commented Nov 6, 2018

Consider building a set of fallback shim libraries. If my system does not resolve some API, the dummy entry in shim library will be dynamically linked instead. You will need a set of static libs for each pair of (compileSdkVersion, minSdkVersion). Actually, it's enough to have one lib per (compileSdkVersion, minSupportedSdkVersion) for given NDK release.

Unlike weak symbols, this workaround does not require runtime check of nullptr, but still the app is expected to misbehave if it actually performs a call to a function that is not actually implemented. From the programmer's point of view, this works similar to the Java SDK.

As for DT_NEEDED, it can be resolved with similar dummy shim libraries. The lookup order on recent Androids favors system folder over the app folder, so if the APK supplies, e.g. libneuralnetworks.so.so, it will be loaded only on Nougat and below.

@DanAlbert
Copy link
Member Author

Yeah, something along those lines is more or less what we landed on after discussing this yesterday. One of the original motivations for trying to do this via weak symbols is that they have a few nice properties.

The first is that Clang's __attribute__((introduced=21)) can be configured to automatically create a weak symbol when the target API level predates the introduced API level.

The second is that using some maybe-not-available API would look like the following:

if (some_api != nullptr) {
  some_api();
} else {
  fallback_impl();
}

And supposedly Clang is able to generate a warning if a maybe-not-available API is used without a guard, which covers my concern that this is just going to give people a false sense of security when turning their build failures into run time failures on devices they don't test. Aside from my concerns that the warning would not be reliable enough (or alternatively too noisy and just disabled by developers), enh pointed out that it works nicely for single calls like the above but poorly for what is probably a more common use case:

if (new_api_1 && new_api_2 && new_api_3) {
  new_api_1();
  new_api_2();
  new_api_3();
} else {
  fallback();
}

For the latter it really would be preferable to just write the single if (device_api() >= required_api) and be done with it.

Plus, with weak symbols any error in checking for the existence of an API would just result in a segfault when the user tries to call a null function. With a shim library we can at least log something useful before aborting.

Given that (and the fact that we have yet to find a way to make the weak symbol option work at all with the case of unavailable libraries), I think a stub library is the way to go.

@wang-bin
Copy link

wang-bin commented Nov 7, 2018

The second is that using some maybe-not-available API would look like the following:

if (some_api != nullptr) {
some_api();
} else {
fallback_impl();
}

For apple platforms it's not recommended, instead we can use

if (__builtin_available(iOS 11.0, macOS 10.13, *)) {
   new_api();
} else {
  ...
}

Letting the macro support android should be a good choice.

@DanAlbert
Copy link
Member Author

DanAlbert commented May 19, 2020

Another issue I brought up on #1262 that isn't mentioned here is that there's no non-infectious way to set ANDROID_API_WEAK_AVAILABILITY or whatever we'd call it. Say you have your code (foo.cpp) and a dependency (bar.h) with the following:

// android/some_header.h
#if __ANDROID_API__ >= 30
void some_api();
#elif defined(ANDROID_API_WEAK_AVAILABILITY)
void some_api() __attribute__((weak));
#endif
// bar.h
#include <android/some_header.h>
// foo.cpp
#include <bar.h>
#define ANDROID_API_WEAK_AVAILABILITY
#include <android/some_header.h>

void foo() {
    if (android_get_device_api_level() >= 30) {
         some_api();
    } else {
        fallback_impl();
    }
}

When this is build for minSdkVersion 29 or lower, foo.h will not compile because android/some_header.h will not be re-included with the new context because of the header guard. It also wouldn't be possible to separate these into android/some_header.h and android/some_header_weak.h because that would result in redefinition errors when both are included in the same TU.

In this particular example the problem can be fixed by including bar.h after android/some_header.h, but that's not actually a solution, it just reverses the problem in a way that happens to work for this example.

I don't think this is a particularly unrealistic issue given the C++ community's favor for header-only libraries.

As I said on the other thread, I think the better solution here is to make the jetpack libraries that handle the dlopen/dlsym shims for you. That's something we'll be doing either way. Still, not going to close this until we're sure.

@cpsauer
Copy link

cpsauer commented Jan 29, 2021

Chiming in: This would indeed be super useful to me and my team.

Coming from iOS, I'd originally assumed that the NDK already had support for this as a basic backwards compatibility feature.

Their solution (support null checks+attributes, make library load weak, use available builtins, and let the compiler take care of enforcing the proper checks) is a really great developer experience.

@cpsauer
Copy link

cpsauer commented Jan 29, 2021

If anyone has gotten this working already by patching the header, I'd love to hear from you! I can imagine ways you might be able to redefine __INTRODUCED_IN to get the desired behavior.

@ringoz
Copy link

ringoz commented Sep 25, 2021

Looks like the issue is fixed?
I am using __ANDROID_UNAVAILABLE_SYMBOLS_ARE_WEAK__ with great success in r23.

@DanAlbert
Copy link
Member Author

I suppose we've technically done this because we added the option, but it's completely unsupported and untested. Until we're actually doing that I'm not comfortable even advertising the option (the option was added for a different use case than the NDK, with the expectation that it would also be used for the NDK once we had time to make it a tested configuration).

If you do end up running into issues later, please file them, but understand that they are inherently low priority as this isn't something we support yet.

@DanAlbert
Copy link
Member Author

@ZijunZhaoCCK is looking into this atm. Assuming it all goes well, we'll include this in r26. We did find at least one issue with the current headers where this doesn't work for libc symbols because they're preprocessed. We know how to fix that, but will need to see if anything breaks when we remove the preprocessing (it definitely has been a problem in the past, but I think it's finally obsolete).

@DanAlbert
Copy link
Member Author

This does have an annoying source compat implication for apps. Any code that defines its own implementation for an API that's not available in their minSdkVersion will fail to build with -Werror=unguarded-availability (and using this feature without that -Werror flag would be extremely foolish). We have an example of that failing to build when I tried to remove the #ifdef guards from libc so those symbols would be weak rather than omitted:

The build fails with:

external/giflib/dgif_lib.c:399:27: error: 'reallocarray' is only available on Android 29 or newer [-Werror,-Wunguarded-availability]
            (SavedImage *)reallocarray(GifFile->SavedImages,
                          ^~~~~~~~~~~~
external/giflib/gif_lib.h:248:1: note: 'reallocarray' has been marked as being introduced in Android 29 here, but the deployment target is Android 21
reallocarray(void *optr, size_t nmemb, size_t size);
^
external/giflib/dgif_lib.c:399:27: note: enclose 'reallocarray' in a __builtin_available check to silence this warning
            (SavedImage *)reallocarray(GifFile->SavedImages,
                          ^~~~~~~~~~~~

The advice in the error here is actually wrong. It should not use __builtin_available, since the definition is always available; it's provided by the same library.

Any app doing something like this (it's not unheard of, but I don't have any way of estimating how common it is) would need to fix that decl to read:

extern void * reallocarray(void *optr, size_t nmemb, size_t size)
    __attribute__((availability(android,introduced=1)));

Doing that won't be enough though, because it will cause -Wavailability to be thrown for the redecl with different availability:

external/giflib/gif_lib.h:248:68: error: availability does not match previous declaration [-Werror,-Wavailability]
reallocarray(void *optr, size_t nmemb, size_t size) __attribute__((availability(android,introduced=1)));
                                                                   ^
out/soong/ndk/sysroot/usr/include/malloc.h:73:106: note: previous attribute is here
void* reallocarray(void* __ptr, size_t __item_count, size_t __item_size) __BIONIC_ALLOC_SIZE(2, 3) __wur __INTRODUCED_IN(29);
                                                                                                         ^
out/soong/ndk/sysroot/usr/include/android/versioning.h:62:36: note: expanded from macro '__INTRODUCED_IN'
#define __INTRODUCED_IN(api_level) __BIONIC_AVAILABILITY(introduced=api_level)
                                   ^
out/soong/ndk/sysroot/usr/include/android/versioning.h:53:54: note: expanded from macro '__BIONIC_AVAILABILITY'
#define __BIONIC_AVAILABILITY(__what) __attribute__((__availability__(android,__what)))
                                                     ^

That warning can of course be disabled, but I'm not sure if that warning also covers more useful diagnostics. Besides, I don't like the source compat impact.

I think ideally we'd split that warning into -Wavailability-redefined so it's tightly scoped enough to be safely disabled, and also change clang so that any redeclaration of a function without __attribute__((availability)) clears any prior availability attributes. Those are changes that would need to be upstreamed to clang though, and I'm not familiar enough to know if the second change is viable.

@cpsauer
Copy link

cpsauer commented Aug 23, 2022

IMO still super valuable for most use cases! (Even if it'd require, e.g., renaming backport name collisions until a patch lands in clang.)

@DanAlbert
Copy link
Member Author

Oh, definitely. Just noting a hiccup we found on the way that we'll need to deal with before I'm happy calling this "complete"

@cpsauer
Copy link

cpsauer commented Aug 23, 2022

And an nonobvious one at that! Thanks for all you're doing, Dan!

@alexcohn
Copy link

For giflib or any other project that does backporting of system APIs or otherwise breaks when weak symbols are available, the best would be to have one switch that would disable this feature altogether. Or let them use older NDK until they fix it.

On the other hand, for libraries this introduces a new vulnerability, doesn't it? When a consumer of a binary lib is built with the new weak symbols support, and the library includes its own shim (like giflib above), would it not cause strange conflicts on different target platforms?

@DanAlbert
Copy link
Member Author

the best would be to have one switch that would disable this feature altogether

There are no plans to remove the current behavior, nor even to make weak symbols the default. Turning build failures into crashes by default would be a terrible move IMO :)

On the other hand, for libraries this introduces a new vulnerability, doesn't it? When a consumer of a binary lib is built with the new weak symbols support, and the library includes its own shim (like giflib above), would it not cause strange conflicts on different target platforms?

I don't think this would end up being all that different than the current behavior in the same situation? In either case the app ends up with a definition of the symbol, and I believe whether it or the one in the OS gets used depends on the OS version (because prior to M or so the behavior was non-standard).

@alexcohn
Copy link

I don't think this would end up being all that different than the current behavior in the same situation

I don't know. The question is whether adding such library to a build will have side-effects on the rest of the project (even if this means choosing a different shim for, e.g. reallocarray API).

@DanAlbert
Copy link
Member Author

I don't believe it does. Resolution order at load time is unaffected by the type of symbol, so that's unaffected. I believe the same is true for link time when using a shared libc. Anyone using a static libc would be affected, but there's no point in enabling this flag if you use a static libc.

@DanAlbert DanAlbert pinned this issue Sep 28, 2022
@DanAlbert
Copy link
Member Author

@pixelflinger found that this doesn't work for anything in opengl (and similarly, probably not vulkan, opensles, or zlib) because those headers aren't annotated with __INTRODUCED_IN (which is how this feature is implemented), because we don't want to modify third-party headers.

In the case of opengl, I think we can actually fix this, because it looks like we generate that header rather than shipping one from khronos. I believe that's generated by glgen.py from entries.in.

Vulkan headers come directly from Khronos, so I'm not sure what to do about those.

OpenSLES isn't really a problem, because if you want non-painful audio you're using oboe anyway.

zlib probably doesn't matter because if you care enough about zlib to want new features you're probably bundling your own anyway?

@ZijunZhaoCCK
Copy link
Collaborator

Repository owner moved this from Triaged to Merged in NDK r26 Nov 10, 2022
@DanAlbert DanAlbert unpinned this issue Nov 10, 2022
@cpsauer
Copy link

cpsauer commented Nov 10, 2022

So excited to use this. Thank you!

@cpsauer
Copy link

cpsauer commented Nov 17, 2022

Re the first of the known issues and limitations:

Only symbols are affected, not libraries. The only way to conditionally depend on a library that is not available in the app's minSdkVersion is with dlopen. We do not know how to solve this in a backwards compatible manner.

IIRC, Xcode has an optional/weak link feature that gets around this problem somehow. Any idea how Apple implemented that one?

@DanAlbert
Copy link
Member Author

They do use an entirely different binary format. ELF doesn't support this. Presumably mach-o does?

They also have the super power of their users upgrading much more frequently, so they can assume new OS features are present soon after adding them. If we change the behavior of the loader in the next Andrlid release it still wouldn't matter because you couldn't use it until that release was your minSdkVersion, and the whole point of this bug is that that's impractical.

@cpsauer
Copy link

cpsauer commented Nov 18, 2022

Ah. Yeah, something requiring loader changes for sure wouldn't be a short term fix. (On the flip side, the slower update cycle is why the work you all have done here is so extra valuable!)

Reading more, it looks like Apple's functionality is based around allowing libraries to be missing iff all symbols are weak. I remember this came up over here, but was then debated. Pretty much confidence that this behavior is indeed different on Android?

It strikes me that maybe this'd be patchable in the linker or build system, where linking against system libraries that are conditionally available is automatically replaced with dlopen (plus maybe wrappers) that implement that. Strikes me that the lazy loading flag might already be enough to get the no-crash-if-unused behavior. But something equivalent to {if should be available -> dlopen now} would, too.

Anyway, just poking around brainstorming in case there remains an interesting possibility somewhere. Hopefully this isn't too annoying.

@DanAlbert
Copy link
Member Author

Pretty much confidence that this behavior is indeed different on Android?

No one was able to back up that claim. If you mean different between Android and iOS (but the same between Android and every other ELF platform), I think that much is clear. It sounds like the person that made that claim was talking about ELF, but they didn't provide a source and we have quite a few ELF experts that have never heard of this. There aren't many things we love more than being proven wrong, so if anyone can point us at the spec for that behavior, we'd appreciate it. Like I said previously though, it wouldn't be useful until your minSdkVersion was V+, since that's the absolute earliest we'd be able to ship that in our loader.

@cpsauer
Copy link

cpsauer commented Nov 23, 2022

Totally. That loader behavior would sure be handy. I'll bump in case he has a lead. But otherwise a quick-and-dirty test might be worthwhile, unless you're fairly sure he's wrong?

If not, what'd you think about the linker/build system solution for back porting without loader changes?

@DanAlbert
Copy link
Member Author

It sounds significantly messier and harder to implement correctly than just providing an androidx library.

osspop pushed a commit to osspop/android-ndk that referenced this issue Jan 17, 2023
Bugs: android/ndk#837
Test: run_test.py
Change-Id: I8e353a2adc46dfd36bfc289ecb0ac0a7078b515c
osspop pushed a commit to osspop/android-ndk that referenced this issue Jan 17, 2023
Bug: android/ndk#837
Tests: ./run_tests.py –rebuild –filter weak_symbols
Change-Id: I2743698b0063c636200528d278d38fd27162a3c3
osspop pushed a commit to osspop/android-ndk that referenced this issue Jan 17, 2023
This is overly verbose atm because the parts relevant to app developers
will be migrated to DAC later on.

Bug: android/ndk#837
Test: None
Change-Id: I3508cdd17ba7dedffb49aa1bb76e64f7aee5e1a4
osspop pushed a commit to osspop/android-ndk that referenced this issue Jan 17, 2023
Bug: android/ndk#837
Tests: ./run_tests.py –rebuild –filter weak_symbols_build_support
Change-Id: I08b45b667a7480fc6e68d6edffeecdb14700cde0
osspop pushed a commit to osspop/android-ndk that referenced this issue Jan 17, 2023
Bugs: android/ndk#837
Test: ./run_tests.py –rebuild –filter weak_symbols_off_by_default
Change-Id: I99e15f666b157e1ec6a7eaeaf97e65f80676105b
osspop pushed a commit to osspop/android-ndk that referenced this issue Jan 17, 2023
Bug: android/ndk#837
Tests: ./run_tests.py –rebuild –filter
weak_symbols_unguarded_availability

Change-Id: Iedd55e9a81e5f61a59a922f453efafe29bb6e768
osspop pushed a commit to osspop/android-ndk that referenced this issue Jan 17, 2023
Bug: android/ndk#837
Test: ./run_tests.py –rebuild –filter "weak_symbols*"
Change-Id: I3c35195e0ddebeafc09139c29589a184d0040069
@smeenai
Copy link

smeenai commented Sep 14, 2023

Were there any fundamental changes to this feature between r23 and r26 that would make it not recommended to use in r23? We're stuck on r23 for a bit because of needing to support API levels older than 19 for a while longer, but it'd be nice to be able to use this in the meantime. We do have the ability to tweak things in the NDK if there's only a small number of changes needed to make it reliable on r23.

@smeenai
Copy link

smeenai commented Sep 14, 2023

We just found a pretty big landmine with this feature. In the example we just encountered this on:

if (__builtin_available(android 29, *)) {
  ATrace_beginAsyncSection("ndk::asyncBeginEndSection", 0);
}

The code is right. It includes a check for __builtin_available to make sure the app is running on a new enough version of the OS before making the maybe-available API call.

The part that was wrong isn't visible here (and therefore isn't visible to the compiler, so it can't diagnose this): the build script forgot to link libandroid. The loader saw the weak symbol, couldn't find a definition for it, and set ATrace_beginAsyncSection to nullptr. __builtin_available returned true though, because the device we were running on was API 32, so the call segfaulted.

Could you add a #pragma comment(lib, "android") in the relevant headers so that including the header also ensures that you link against the library? That started out as a Windows thing, but Clang and LLD have supported it for ELF for a while now, and I believe e.g. Fuchsia uses it pretty extensively. I guess it'd still need to be guarded by an __ANDROID_API__ check for libraries which were introduced in later API levels, so it's not foolproof, but it's still something.

@DanAlbert
Copy link
Member Author

Were there any fundamental changes to this feature between r23 and r26 that would make it not recommended to use in r23?

The only way to answer that question would be looking at two years of git history. We don't keep details of unsupported things in the changelog, so until r26 it won't have been tracked.

@alecazam
Copy link

alecazam commented Sep 26, 2023

Do these availability macros just not work, or is this work not finished? There are no tutorials on how to do this basic task that is so simple on iOS/macOS. I'm compiling NDK-based code where with a min targetSDK of 26 using NDK 25.2.xxxx. I'm assuming I need to set a flag to enable weak NDK symbols.

android/Debug_android.cpp:44:62: error: '__android_log_get_minimum_priority' is unavailable: introduced in Android 30
                if ( __android_log_is_loggable( androidPriority, data.tag, __android_log_get_minimum_priority() ) )
                
android/Debug_android.cpp:44:8: error: '__android_log_is_loggable' is unavailable: introduced in Android 30
                if ( __android_log_is_loggable( androidPriority, data.tag, __android_log_get_minimum_priority() ) )
                     ^
if (__builtin_available(android 30, *)) {
		useAndroidLogMessage = true;
		
		// have to test this manually
		// default priority is ANDROID_LOG_DEFAULT if not set
		if ( __android_log_is_loggable( androidPriority, data.tag, __android_log_get_minimum_priority() ) )
		{
			return;
		}
}

@DanAlbert
Copy link
Member Author

using NDK 25.2.xxxx.

image

You're using the wrong release.

sepehrst pushed a commit to spsforks/android-bionic-libc that referenced this issue Apr 22, 2024
Abandoned

Can't be done until we solve android/ndk#837 (comment)

Patch-set: 2
Status: abandoned
Tag: autogenerated:gerrit:abandon
Attention: {"person_ident":"Gerrit User 1043845 \u003c1043845@85c56323-6fa9-3386-8a01-6480fb634889\u003e","operation":"REMOVE","reason":"Change was abandoned"}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Merged
Development

No branches or pull requests

10 participants