Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to generate layout tests in a different file. #1655

Open
Lokathor opened this issue Oct 23, 2019 · 33 comments
Open

Add an option to generate layout tests in a different file. #1655

Lokathor opened this issue Oct 23, 2019 · 33 comments

Comments

@Lokathor
Copy link
Contributor

Replying to #1651 (comment) as a new issue


My initial comment was "these generated tests sure don't seem to actually be useful" and then I was asked for an explanation.

So, I guess it depends on who these tests are for.

let's look at an example

#[repr(C)]
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct SDL_AssertData {
  pub always_ignore: winapi::ctypes::c_int,
  pub trigger_count: winapi::ctypes::c_uint,
  pub condition: *const winapi::ctypes::c_char,
  pub filename: *const winapi::ctypes::c_char,
  pub linenum: winapi::ctypes::c_int,
  pub function: *const winapi::ctypes::c_char,
  pub next: *const SDL_AssertData,
}
#[test]
fn bindgen_test_layout_SDL_AssertData() {
  assert_eq!(
    ::core::mem::size_of::<SDL_AssertData>(),
    48usize,
    concat!("Size of: ", stringify!(SDL_AssertData))
  );
  assert_eq!(
    ::core::mem::align_of::<SDL_AssertData>(),
    8usize,
    concat!("Alignment of ", stringify!(SDL_AssertData))
  );
  assert_eq!(
    unsafe { &(*(::core::ptr::null::<SDL_AssertData>())).always_ignore as *const _ as usize },
    0usize,
    concat!(
      "Offset of field: ",
      stringify!(SDL_AssertData),
      "::",
      stringify!(always_ignore)
    )
  );
  // and it goes on and on for all the other fields.
}

(We're ignoring the "derefing a null pointer" UB, which the other issue covers, we have to ask ourselves what these tests are intended to check.)

So the Rust types have defined widths and alignments, but the C types are implemented as aliases and they theoretically vary by platform (c_int isn't necessarily always i32, it just usually is).

Now maybe it's a useful thing to have a test where we assert the sizes we expect for the C types based on the local C types when we generated the bindings. You might care to know that the bindings were generated on one machine where C was like "x" and now you're on a machine where C is like "y".

If we do want to check that the local C environment is the same as the one bindgen saw when it ran, we should do that via a single test once at the top of the module:

#[test]
fn test_local_c_env_matches_bindgen_c_env(){
  // assert stuff here
}

We should not have tests per field per struct spread all across the module basically asserting that 2+2 is 4 for thousands of lines.

On the other hand, it's entirely possible that you don't care that the target C environment doesn't match the C environment at the time of running bindgen, because you expected that, and all the structs are defined in terms of the ctype aliases so with the local aliases they've already been corrected to the proper sizes. In this scenario, the current test suite system of being per field is also still mostly useless, since if you have just a single test_local_c_env_matches_bindgen_c_env function that tells you about the changes too.

Either way, we don't need to have size, alignment, and field offset tests per struct. I can't see what scenario that it's guarding against that having a single test of the local ctypes doesn't solve.

@emilio
Copy link
Contributor

emilio commented Oct 23, 2019

Not really, the key is that those offsets come from clang itself, not from bindgen, so we know them to be right.

Bindgen does a bunch of transformations, like adding padding when needed, adding custom types for bitfields, handle base classes, vtables, etc.

For trivial structs like your example obviously it's pretty hard for bindgen to get wrong, but for edge cases it does catch bugs and has caught bugs in the past. See #1007 #1516 etc...

Having a clear indication that something is wrong is much better than hitting weird memory corruptions.

@Lokathor
Copy link
Contributor Author

Alright, so those are more complicated. My experience has been colored by just binding various kinds of "plain C" headers I guess.

Would it be appropriate for "plain C" structs (not using bitfields or vtables) to cut down on all the tests being output?

@emilio
Copy link
Contributor

emilio commented Oct 23, 2019

You can do so yourself with --no-layout-tests or the equivalent builder option.

@Lokathor
Copy link
Contributor Author

Doesn't that block all layout tests though?

@emilio
Copy link
Contributor

emilio commented Oct 24, 2019

Yes, it does... You want the size and alignment tests but not the field offset tests?

@Lokathor
Copy link
Contributor Author

I would like a default where the things that really need tests (bitfields and vtables) get tests, but plain C structs don't get tests.

In my particular case of Xlib, there was a 50% file size difference (14k lines down to 7k lines), which actually caused my compile time to double as well because the bottleneck on bindings is just reading the disk and parsing it all.

@emilio
Copy link
Contributor

emilio commented Oct 24, 2019

Well, the issue is that issues are often found when those complex structs get used in "simple" structs, so this is not as easy as it may seem.

Maybe a good trade off for your case is generating tests only if #![cfg(test)] is set? This can be checked from a build script pretty trivially, IIRC.

@Lokathor
Copy link
Contributor Author

Lokathor commented Oct 24, 2019

What if it just made two module files, one for the real output and then a separate file for the tests, then the user can put that 2nd file in the tests/ folder separate from the src/ files (or wherever else they want).

@emilio
Copy link
Contributor

emilio commented Oct 24, 2019

That'd be a big-ish change, specially for the CLI, that could no longer output to stdout, or would need to have different commands to support both the old and the new behavior.

Not necessarily opposed, but not sure if the trade offs make it worth it.

@Lokathor
Copy link
Contributor Author

good point!

ah, here's an idea easy to implement: a flag for --sort-tests-at-end (or other name) which just puts all test entries at the end of the output, then people can easily just cut and paste that content to another location manually.

@emilio
Copy link
Contributor

emilio commented Oct 25, 2019

I think we could reasonably do that with or without a flag... But caveats below.

I think it would be bad recommendation to make people rely on the order we emit the bindings, if only because it depends on system headers for example...

So I thought it'd be reasonable to add a "test_destination" option of sorts (a Writer in the Builder API), so that if not specified all tests go at the end, and otherwise they go at the destination.

If that sounds good, it's not too hard. But we need to figure out what to do when C++ namespaces are enabled... Right now the tests rely on having the structure name available on their scope. With namespaces we'd need to absolutize them, or such...

@Lokathor
Copy link
Contributor Author

That sounds great!

I only use the CLI version, but I assume that this could be a CLI flag too?

@emilio
Copy link
Contributor

emilio commented Oct 25, 2019

Sure, cli could use it too, I guess.

@emilio emilio changed the title What benefit do all the generated tests provide exactly? Add an option to generate layout tests in a different file. Oct 28, 2019
@emilio
Copy link
Contributor

emilio commented Oct 28, 2019

Retitling to reflect the discussion.

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 4, 2019

I would like a default where the things that really need tests (bitfields and vtables) get tests, but plain C structs don't get tests.

Plain C structs do need tests though.

@Lokathor Can you show a single C struct for which a generated Rust struct would never need tests ?

@Lokathor
Copy link
Contributor Author

Lokathor commented Nov 4, 2019

I said "don't get".

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 4, 2019

Sure, and I understand that as a claim that these tests are not needed, since you also claimed:

I can't see what scenario that it's guarding against that having a single test of the local ctypes doesn't solve.

Hence why I asked if there is a single struct for which that (no tests needed) is actually the case.

@Lokathor
Copy link
Contributor Author

Lokathor commented Nov 4, 2019

Uh, weird request, I think that should be obvious but okay.

typedef struct SDL_Point
{
    int x;
    int y;
} SDL_Point;

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 4, 2019

So that code needs tests, because rust bindgen will generate, amongst other things:

#[repr(C)]
struct SDL_Point {
    x: user_provided::c_int,
    y: user_provided::c_int,
}

where user_provided::c_int can have the wrong size and the wrong alignment, resulting in SDL_Point having the wrong size and alignment as well (and well, resulting in y having the wrong offset).

Also, even if libc::c_int (or std::os::raw::c_int) was used, the C library that actually gets linked might have been compiled for a slightly different target (or different compiler options) by mistake resulting in the int in C mismatching the libc::c_int, or a different incompatible version of the library might have been linked, resulting in the C type having a completely different layout.

@Lokathor
Copy link
Contributor Author

Lokathor commented Nov 4, 2019

It's not the job of every single struct to check and recheck that the c_types declarations are correct. You only need to check them once at the top of the module. They won't change mid-module.

Also, the tests are generated in terms of hard values (eg: 4usize), not in terms of the C types (eg: size_of::<c_int>()), which makes them not even portable to other C runtimes where the sizes actually are different, even though the struct definition under test is properly portable.

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 4, 2019

You can have a header containing SDL_Point:

typedef struct SDL_Point
{
    int x;
    int y;
} SDL_Point;

and for which rust-bindgen generates:

#[repr(C)]
struct SDL_Point {
    x: user_provided::c_int,
    y: user_provided::c_int,
}

but when linking you can link against a different version of your library which has the following type as C layout:

typedef struct SDL_Point
{
    long x; long y;
    // or:
    int* xs; 
    // or....
} SDL_Point;

The current tests diagnose this problem, which is quite common (use previously-saved generated bindings while linking a newer version of the library). Checking c_int and c_long size and alignment in your module does not diagnose this problem.

@Lokathor
Copy link
Contributor Author

Lokathor commented Nov 4, 2019

Since the tests don't call into C and use the C sizeof operator on the current library version's definition of the struct I don't see how they'd catch that particular problem.

Either way, Emilio had already very kindly agreed to put this change into the plans so I don't care to continue the discussion.

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 4, 2019

Since the tests don't call into C and use the C sizeof operator on the current library version's definition of the struct I don't see how they'd catch that particular problem.

The tests should call into C (that's how they are able to use C's sizeof), and when they are generated they should use the header of the C library actually being used, which might differ from the one used to generate the Rust bindings.

@Lokathor
Copy link
Contributor Author

Lokathor commented Nov 4, 2019

Please look at this definition and the test that goes with it: https://github.com/Lokathor/fermium/blob/master/src/SDL2-v2.0.10-x86_64-pc-windows-msvc.rs#L2780

The generated tests don't call into C code at all.

  • If you move this code to a new C environment, the struct itself will continue to be correct, but the test could start being wrong even though nothing is really wrong.
  • If you change to a new version of the C library and it has a different header file definition the test doesn't check that at all and will pass despite the binding definition being incorrect.

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 4, 2019 via email

@emilio
Copy link
Contributor

emilio commented Nov 5, 2019

Bindgen generates tests calling into libclang, so yeah, the tests depend on the target you generate the bindings for.

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 13, 2019

@emilio is it possible for bindgen to generate the tests on the fly while testing ? That is, generate the bindings on one target, which can be committed, or extended to support many targets, and when testing, check out the headers, and generate the bindings for the target in which the tests are happening ?

@emilio
Copy link
Contributor

emilio commented Nov 13, 2019

Isn't that equivalent to running bindgen at build-time?

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 13, 2019

Not really.

You run bindgen at "binding generation time" to generate bindings, e.g., for a couple of architectures, and then just commit those files to your repo. At build time, you just build the .rs files, no need to use bindgen for anything. At test time, you'd run bindgen again to generate tests for your bindings, and then tests the "pre-generated" ones from the .rs files, but this only happens if someone tries to cargo test the bindings.

You could run bindgen at build-time only for architectures that your current bindings do not support, as a "best effort" kind of thing, but for common ones users wouldn't even need to build rust-bindgen, nor need libclang installed, etc.

@emilio
Copy link
Contributor

emilio commented Nov 13, 2019

Sure, call it "running bindgen at build time if #[cfg(test)] yields true", looks like the same to me :)

We could add a flag for bindgen to only generate tests, then you could do something like that without too much trouble in your build script?

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 13, 2019

I think that would go hand in hand with this feature (e.g. the ability to generate tests in a different file).

@emilio
Copy link
Contributor

emilio commented Nov 13, 2019

Right. I guess once we implement this, doing that would be something equivalent to setting non_test_output to /dev/null, or just a Write implementation that throws away its contents :)

@thomcc
Copy link
Member

thomcc commented Apr 10, 2020

I poked at this. thomcc@db0f7ad is a mostly working implementation I nerd-sniped myself into doing. Unsure how to test it except manually. Also not sure it actually solves my problem... Will PR if interested though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants