Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for capturing errno across calls #38832

Open
mkustermann opened this issue Oct 11, 2019 · 29 comments
Open

Support for capturing errno across calls #38832

mkustermann opened this issue Oct 11, 2019 · 29 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-ffi

Comments

@mkustermann
Copy link
Member

When calling low level functions, e.g. libc wrappers around syscalls, one often gets -1 back and errno will be set to the actual error message.

When using dart:ffi to to do such calls there is no atomicity between the Dart -> C call and an access of errno (if we even support that). In fact, arbitrary VM code (e.g. GC), can run in between the two. That can cause errno to be clobbered.

An option would be to save errno into a slot on Thread::last_errno_ before returning from the call. We could then add an accessor for the last_errno_.

/cc @sjindel-google @dcharkes

@mkustermann mkustermann added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-ffi labels Oct 11, 2019
@ds84182
Copy link
Contributor

ds84182 commented Mar 15, 2020

This makes error handling tricky and potentially flaky when dealing with Win32 API calls. Many Win32 APIs return boolean return results (or handles) instead of an error code. It's a lot more widespread for Win32 since GetLastError is a general paradigm when working with any Windows API.

@Sunbreak
Copy link

Sunbreak commented Mar 1, 2021

This makes error handling tricky and potentially flaky when dealing with Win32 API calls. Many Win32 APIs return boolean return results (or handles) instead of an error code. It's a lot more widespread for Win32 since GetLastError is a general paradigm when working with any Windows API.

Encounter the same problem when debug https://github.com/Sunbreak/logic_conf.dart/blob/e4935e68399987d61262b3a7d20d0a2c38808471/lib/src/logic_conf_windows.dart#L54-L58

@basyskom-bmeier

This comment has been minimized.

@mraleph

This comment has been minimized.

@basyskom-bmeier

This comment has been minimized.

@krjw-eyev
Copy link

I think for desktop applications especially Windows (either dart or flutter), this would be really helpful. Are there any updates regarding this issue?

@dcharkes
Copy link
Contributor

GetLastError and errno are both available on Windows, and they are set for two different sets of functions. The first one for win32 and the latter one for c library functions. So we'll likely have to provide both on Windows.

@glanium
Copy link

glanium commented Aug 27, 2023

In .NET Framework.
There is attribute called SetLastError and function called Marshal.GetLastPInvokeError for this purpose.

.NET is designed well.

@ds84182
Copy link
Contributor

ds84182 commented Aug 27, 2023

You mean .NET made a workaround for a poorly designed API, right.

@ds84182
Copy link
Contributor

ds84182 commented Aug 28, 2023

Comment aside, just save errno when returning back from native code, and restore when calling native code. This is the method LuaJIT uses. https://luajit.org/ext_ffi_api.html#:~:text=Utility%20Functions

@tvolkert
Copy link
Contributor

Is there a way to access the errno in a platform-agnostic way? In https://github.com/tvolkert/libjpeg/blob/898aaf01bbf07619cb8a9047f7b732b1b544d01f/lib/libjpeg.dart#L197, the return code will be 0 if there was an error, and errno will contain the actual error -- and I'd like to be able to access it in a platform-agnostic way.

@dcharkes
Copy link
Contributor

Is there a way to access the errno in a platform-agnostic way? In https://github.com/tvolkert/libjpeg/blob/898aaf01bbf07619cb8a9047f7b732b1b544d01f/lib/libjpeg.dart#L197, the return code will be 0 if there was an error, and errno will contain the actual error -- and I'd like to be able to access it in a platform-agnostic way.

The idea with the native-assets feature is that the native code can be wrapped in C to catch the error in C and return a tagged union back to Dart. And then use FFIgen on the C wrapper function.

GetLastError and errno are both available on Windows, and they are set for two different sets of functions. The first one for win32 and the latter one for c library functions. So we'll likely have to provide both on Windows.

I believe all OSes will be errno, unless libjpeg is using win32 under the hood on Windows. In that case we'd need to use some ifdef to define a platform agnostic catch.

@dcharkes
Copy link
Contributor

Update on this issue: Our current thinking is that people should write wrapper C functions to capture the errno. (We should consider if we can ease that in FFIgen.)

Work on the native assets feature should enable bundling C code easily with Dart packages. (Currently in experimental.)

@mkustermann
Copy link
Member Author

We have a customer that has a valid use case for this where it's not about getting the wrong error message with some low probability but the actual application behavior requires checking the error numbers (see b/323386486).

There's a few things

  • Figure out how to get the error: Apart from calling the actual C apis, it may be sufficient to load a value from a segment register (i.e. close to one instruction to obtain it). We should investigate whether this is feasible on all OSes/archs.

  • Figure out what the overhead is if we always capture the error: For leaf and non-leaf calls. It may be that the overhead of non-leaf calls is big enough that doing this extra errno load (+stash on THR) wouldn't make them much slower. So we may consider capturing it always, e.g. in non-leaf calls. ( If the overhead is too big, we may provide a flag (similar to isLeaf) to opt into the capturing behavior.)

  • The actual captured error number could be exposed as a getter in dart:ffi.

  • We could have a compile-time solution: Generate C wrappers or runtime solution: Generate trampolines at runtime (can be done in Dart code, but only works in JIT-able environments, i.e. non-iOS)

We should validate what's the bets for package:win32.

@dcharkes Could you do some investigations here?

@dcharkes
Copy link
Contributor

If the overhead is too big, we may provide a flag (similar to isLeaf) to opt into the capturing behavior.
The actual captured error number could be exposed as a getter in dart:ffi.

We already built that once: https://dart-review.googlesource.com/c/sdk/+/240847

@mkustermann What are your thoughts on solving this with native assets instead? (And FFIgen. But win32 has it's own code gen.)

@mkustermann
Copy link
Member Author

mkustermann commented Feb 14, 2024

We already built that once: https://dart-review.googlesource.com/c/sdk/+/240847

I was hoping there's a way to do this more efficiently than what that CL does (e.g. load from segment register, store into THR). If it's a high-overhead + high-complexity solution it would certainly push the favor towards codegen.

@mkustermann What are your thoughts on solving this with native assets instead? (And FFIgen. But win32 has it's own code gen.)

Certainly open to it, you can prototype changing the win32 codegen and seeing how it would look like in the end. It may push it out until native assets are non-experimental (or some end-users can use pre-release versions that use native assets).

@matanlurey
Copy link
Contributor

Was both sad there wasn't an easy fix to this but appreciative that we're tracking the issue.

It is fantastic to be able to use dart:ffi directly without a build system or secondary toolchains, so I'm a tad sad that I might have to write (or generate) and compile custom C-code in order to access lower-level functions correctly - it's not the end of the world (the vast majority of code won't do this, and almost certainly no applications will do it directly) but it does feel like a hard limitation.

@dcharkes
Copy link
Contributor

Was both sad there wasn't an easy fix to this but appreciative that we're tracking the issue.

It is fantastic to be able to use dart:ffi directly without a build system or secondary toolchains, so I'm a tad sad that I might have to write (or generate) and compile custom C-code in order to access lower-level functions correctly - it's not the end of the world (the vast majority of code won't do this, and almost certainly no applications will do it directly) but it does feel like a hard limitation.

We have a larger set of features that we could add to dart:ffi which can also be solved by adding a thin C wrapper. The problem would be that we would have to build and maintain all these features in the Dart VM. (Unfortunately I can't scale myself in that way.) Instead, we're envisioning making it easy to bundle C code (native assets work), and making it easy to generate that thin wrapper (FFIgen or something similar). This would enabled an arbitrary set of features that can be achieved by having a generated layer of C code.

@brianquinlan
Copy link
Contributor

I wonder if it would be possible to have a special type that indicates that a call should include errno/GetLastError(). Like:

@Native<[Int, Errno] Function(Int, Pointer<Uint8>, Int)>(isLeaf: true)
external [int, Errno] read(int fd, Pointer<Uint8> buf, int count);

The idea would be that calls to read would also include a lookup to errno and errno would be included in the result.

The special types that I can imagine now are:
Errno
GetLastError
WSAGetLastError

@mraleph
Copy link
Member

mraleph commented Oct 16, 2024

I had a discussion about this with @mkustermann few months ago - but I have never wrote it down. I think we could just maintain an invariant that errno / GetLastError is preserved on the FFI transition boundary. We only destroy it whenever we go into runtime - which means we should be able to simply save and restore it on the runtime transition boundary - and that would be fairly cheap. This would mean that runtime never clobbers errno and it can be accessed.

We should then some form of errno getter to the dart:ffi... GetLastError and WSAGetLastError can be bound using normal mechanism, but errno is a more complicated beast.

@brianquinlan
Copy link
Contributor

We have a larger set of features that we could add to dart:ffi which can also be solved by adding a thin C wrapper. The problem would be that we would have to build and maintain all these features in the Dart VM. (Unfortunately I can't scale myself in that way.) Instead, we're envisioning making it easy to bundle C code (native assets work), and making it easy to generate that thin wrapper (FFIgen or something similar). This would enabled an arbitrary set of features that can be achieved by having a generated layer of C code.

Unless we bundle a tool chain with the SDK, it would mean that our users need to install their own.

@brianquinlan
Copy link
Contributor

I had a discussion about this with @mkustermann few months ago - but I have never wrote it down. I think we could just maintain an invariant that errno / GetLastError is preserved on the FFI transition boundary. We only destroy it whenever we go into runtime - which means we should be able to simply save and restore it on the runtime transition boundary - and that would be fairly cheap. This would mean that runtime never clobbers errno and it can be accessed.

Is there a single boundary point into the runtime?

@mraleph
Copy link
Member

mraleph commented Oct 21, 2024

Is there a single boundary point into the runtime?

There is a very limited amount of ways to transition from Dart code to native code.

  • FFI
  • CallToRuntime stub for non leaf calls (accessed via CallRuntime macroassembler helper).
  • For leaf calls: LeafRuntimeCall IL instruction and LeafRuntimeScope helper

I think that's it.

@brianquinlan
Copy link
Contributor

I'd like to try to figure this out.

@brianquinlan
Copy link
Contributor

An alternative approach, suggested by @aam, is to capture errno at the end of every FFI call and restore it before every FFI call.

As long as we never expose errno to the user directly from Dart, this should be semantically equivalent to saving errno on runtime entry but:

  1. it would not incur any cost except when executing FFI code
  2. it would work even if an isolate using ffi is scheduled across multiple threads e.g.
if (read(...) == -1) {
  await Future.delayed(const Duration());
  print(__errno());
}

I have a POC sketch here:
https://dart-review.googlesource.com/c/sdk/+/391843

@mraleph
Copy link
Member

mraleph commented Oct 25, 2024

@brianquinlan

it would not incur any cost except when executing FFI code

That's exactly what we wanted to avoid though. The idea has been considered before, as @mkustermann wrote all the way in 2019:

An option would be to save errno into a slot on Thread::last_errno_ before returning from the call. We could then add an accessor for the last_errno_.

But we were somewhat hesitant to implement this because it would add 4 memory moves for each FFI transition - which probably is invisible on non-leaf calls, but would likely be visible for leaf calls.

On the other hand maybe we overestimated the cost... So we could just benchmark it and see how big the overhead actually is.

I have a POC sketch here:

It looks reasonable but for non-leaf calls save needs to happen after safepoint transition (and restore before transitioning out of safepoint) - because it is the safepoint transition which destroys it.

@brianquinlan
Copy link
Contributor

I tested using this patch: https://dart-review.googlesource.com/c/sdk/+/391843/9

Here is a test to demonstrate the semantic change:

import 'dart:ffi';
import 'dart:io';
import 'package:ffi/ffi.dart';

@Native<Int Function(Pointer<Utf8>, Int)>(isLeaf: true)
external int open(Pointer<Utf8> buf, int flags);

@Native<Pointer<Int> Function()>(isLeaf: true)
external Pointer<Int> __error();

int get errno => __error().value;

main() async {
  print('About to call open (EPERM=1, ENOENT=2)');
  final openResult = open("/tmp/doesnotexist".toNativeUtf8(), 0);
  print('Open returned: $openResult, errno=$errno');
  try {
    Directory('/tmp').deleteSync();
  } on Exception {}
  print("Called `Directory('/tmp').deleteSync()`, errno=$errno");
}

Without the patch:

About to call open (EPERM=1, ENOENT=2)
Open returned: -1, errno=2
Called `Directory('/tmp').deleteSync()`, errno=1      <- the failed `deleteSync` changed errno

With the patch:

About to call open (EPERM=1, ENOENT=2)
Open returned: -1, errno=2
Called `Directory('/tmp').deleteSync()`, errno=2      <- the failed `deleteSync` did *not* change errno

I'm running some benchmarks now but, if there is a significant performance regression, I think that it would be fine to force the developer to specify whether they want to save errors or not e.g.

@Native<Int Function(Pointer<Utf8>, Int)>(isLeaf: true, saveErrors: true)
external int open(Pointer<Utf8> buf, int flags);

The documentation for saveErrors could explain why certain patterns won't work:

  final openResult = open("/tmp/doesnotexist".toNativeUtf8(), 0);
  await ...  // Isolate might be run on a different OS thread.
  if (errno == ...

and:

  final errnoPtr = __error();
  final openResult = open("/tmp/doesnotexist".toNativeUtf8(), 0);
  try {
    Directory('/tmp').deleteSync();
  } on Exception {}
  final errno = errnoPtr.value(); // not an FFI call, errno will not be restored

@Levi-Lesches
Copy link

Hm, if performance is a concern, what about a parameter to ffigen's config that lists which functions for which we are interested in preserving errno? That way maybe performance can be unaffected for members which we don't care about, while still being able to opt-into special treatment

@brianquinlan
Copy link
Contributor

Hm, if performance is a concern, what about a parameter to ffigen's config that lists which functions for which we are interested in preserving errno? That way maybe performance can be unaffected for members which we don't care about, while still being able to opt-into special treatment

The approach is still on the table if I can't get #38832 (comment) to work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-ffi
Projects
None yet
Development

No branches or pull requests