travis: Parallelize tests on Android #41575

alexcrichton · 2017-04-27T06:00:34Z

Currently our slowest test suite on android, run-pass, takes over 5 times longer
than the x86_64 component (~400 -> ~2200s). Typically QEMU emulation does indeed
add overhead, but not 5x for this kind of workload. One of the slowest parts of
the Android process is that compilation happens serially. Tests themselves
need to run single-threaded on the emulator (due to how the test harness works)
and this forces the compiles themselves to be single threaded.

Now Travis gives us more than one core per machine, so it'd be much better if we
could take advantage of them! The emulator itself is still fundamentally
single-threaded, but we should see a nice speedup by sending binaries for it to
run much more quickly.

It turns out that we've already got all the toos to do this in-tree. The
qemu-test-{server,client} that are in use for the ARM Linux testing are a
perfect match for the Android emulator. This commit migrates the custom adb
management code in compiletest/rustbuild to the same qemu-test-{server,client}
implementation that ARM Linux uses.

This allows us to lift the parallelism restriction on the compiletest test
suites, namely run-pass. Consequently although we'll still basically run the
tests themselves in single threaded mode we'll be able to compile all of them in
parallel, keeping the pipeline much more full hopefully and using more cores for
the work at hand. Additionally the architecture here should be a bit speedier as
it should have less overhead than adb which is a whole new process on both the
host and the emulator!

Locally on an 8 core machine I've seen the run-pass test suite speed up from
taking nearly an hour to only taking 5 minutes. I don't think we'll see quite a
drastic speedup on Travis but I'm hoping this change can place the Android tests
well below 2 hours instead of just above 2 hours.

Because the client/server here are now repurposed for more than just QEMU,
they've been renamed to remote-test-{server,client}.

Note that this PR does not currently modify how debuginfo tests are executed on
Android. While parallelizable it wouldn't be quite as easy, so that's left to
another day. Thankfull that test suite is much smaller than the run-pass test
suite.

rust-highfive · 2017-04-27T06:00:38Z

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

aturon · 2017-04-27T06:03:06Z

cc @TimNN @aidanhs, may be of interest to you fine folks :)

petrochenkov · 2017-04-27T08:11:13Z

src/tools/compiletest/src/main.rs

@@ -15,7 +15,7 @@
 #![feature(test)]
 #![feature(libc)]

-#![deny(warnings)]
+// #![deny(warnings)]


Nit: Looks like this need to be uncommented

alexcrichton · 2017-04-27T14:15:48Z

Locally a full build of the arm-android image took 1:47:24 before this change, and 0:39:56 afterwards. That's over an hour's time shaved off!

aidanhs · 2017-04-27T14:58:11Z

src/libstd/process.rs

@@ -1371,6 +1371,7 @@ mod tests {
            // `set` command.
            assert!((cfg!(windows) && k.starts_with("=")) ||
                    k.starts_with("DYLD") ||
+                    k == "PWD" ||


Why does this PR mean PWD needs to be excluded?

I didn't actually know why this was needed until just now! I think what's happening is:

We're shipping over all environment variables from the host to the target. One of these is PWD. This means that when this test is executed, then PWD is the same as it is on the host, but it doesn't match the target.

When the test runs, it executes env

On android, this looks like it uses a shell, which probably resets PWD

This means that the env command on Android is printing out a different value of PWD than the parent process has (b/c we reset it manually).

Now that I actually know what's happening I'll remove this change and alter the "ship env vars to the client" code to skip PWD (it already skips a few others)

aidanhs · 2017-04-27T15:09:34Z

This is pretty awesome. Also, this made me laugh:

As a final fix I discovered that the ARM and Android test suites were actually running all library unit tests (e.g. stdtest, coretest, etc) twice. I've corrected that to only run tests once which should also give a nice boost in overall cycle time here.

This is the removal of find_tests I assume (which I note has also been done for emscripten). Any particular reason you opted to keep the deps subdirectory rather than out_dir itself? The latter feels slightly more intuitive to me, but I don't know if it might miss some tests.

aidanhs · 2017-04-27T15:14:35Z

src/tools/compiletest/src/main.rs


+            // android debug-info test uses remote debugger
+            // so, we test 1 thread at once.
+            // also trying to isolate problems with adb_run_wrapper.sh ilooping


adb_run_wrapper.sh doesn't seem to exist any more, so at least this line can be removed

aidanhs · 2017-04-27T15:17:46Z

src/tools/remote-test-server/src/main.rs

+    // hopefully allow us to still identify what's running while staying under
+    // the filesystem limits.
+    let len = cmp::min(filename.len() - 1, 50);
+    let dst = dir.join(t!(str::from_utf8(&filename[..len])));


I might be being a bit dim, but why - 1? This seems like it will always truncate the last character if the length is below 50? (e.g. a filename of length 1 becomes length 0)

Ah that has to do with the weird protocol of the remote-test-{server,client}. The filename coming in has a trailing nul byte which we want to chop off.

So a funny story about this logic as well. I was getting a random failure in a test with a super long test name and I couldn't figure out why. If I ran only that one test it passed, but in the entire suite it failed. Turned out:

Running the test alone the path was 126 characters

Running the test after 10 other tests the path was 127 characters

Running the test after 100 other tests the pth was 128 characters

The test only ended up failing if it was run after 100 other tests. Sure enough each test claims itself a number and is run in a testXX directory, so after 100 tests we're in 3 character range, pushing us from 127 to 128.

Turns out Android has trouble with paths longer than 127 characters. Who knew!

All of this was because we used to run tests in /data/tmp, and I changed it in this PR to /data/tmp/work. Wheeeeeeeeeeeeeeee

That's horrifying. It's the kind of thing I could imagine lingering for months as a spurious failure if tests were run in parallel and a test flipped between number 99 and 100 to run 😞

alexcrichton · 2017-04-27T15:33:13Z

Any particular reason you opted to keep the deps subdirectory rather than out_dir itself? The latter feels slightly more intuitive to me, but I don't know if it might miss some tests.

Oh the deps dir I think is guaranteed to have everything, whereas the out_dir may or may not have everything. The location of all Cargo's artifacts is unknown to even me over time :). I suspect that either works for us, it's just that right now they're duplicated picking it up from both locations.

aidanhs · 2017-04-27T15:33:28Z

src/test/run-pass/vector-sort-panic-safe.rs

+        if !SILENCE_PANIC.with(|s| s.get()) {
+            prev(info);
+        }
+    }));


Are the changes to this test required for the new testing method, or is it just an unrelated improvement?

This was primarily just an optimization. Without this the test generated over a megabyte of output and I saw the run-pass test suite noticeably hang at the end with the collector trying to collect all that data. This just reduces the output quite a bit so the test is a bit more speedy to run.

alexcrichton · 2017-04-27T18:13:43Z

Ok updated with recent comments

aidanhs · 2017-04-27T21:10:37Z

Nothing else from me (though my look at the remote test server was superficial since I first saw it today).

Keen for the speedups!

TimNN

The changes look good to me.

TimNN · 2017-04-27T21:30:29Z

src/tools/remote-test-server/src/main.rs

    t!(fs::create_dir_all(work));

    let lock = Arc::new(Mutex::new(()));

    for socket in listener.incoming() {
        let mut socket = t!(socket);
        let mut buf = [0; 4];
-        t!(socket.read_exact(&mut buf));
+        if socket.read_exact(&mut buf).is_err() {


Ignoring the error here seems fine to me, although I'm curious why this has become necessary?

I've found that the way emulators treat network connections is super flaky. The server was crashing on startup on Android for a very weird reason, which I was later able to debug fully. The behavior I saw was:

The client would need to push 4 dylibs initially to the server

All 4 pushes finished immediately

Oddly enough, only the last one would actually get pushed

Everything was "fixed" when I forced the client to keep the TCP connection alive until the server sent an "ack" back acknowledging it received the whole file. I have no idea why the data wasn't actually sent from the client to the server when it was all reported as successfully being sent on the client.

Given all that behavior what I think was happening was:

The client created a TCP connection to the server (this goes through the emulation layer)

The client successfully sent all data

The client closed the TCP connection

The server accepted the connection

Somehow the emulator decided to not send all the data on the TCP connection and reported EOF

That EOF in turn previously caused a panic here. It ended up being much easier to debug if I ignored errors here and the server got farther. In general this seemed like a good idea because of how flaky the emulated networks seemed to be...

TimNN · 2017-04-27T21:50:18Z

src/tools/remote-test-client/src/main.rs

    t!(client.flush());
+
+    // Wait for an acknowledgement that all the data was received. No idea
+    // why this is necessary, seems like it shouldn't be!


Just to clarify: without the server sending and us waiting on the "ack " the data is not properly received?

My best guess is that the immediate close of the socket after sending the data could cause problems, which seems to indeed be the case see: http://blog.csdn.net/CPP_CHEN/article/details/29864509

By the way, would it make sense to check if the bytes read were actually "ack "?

Ah yeah I ended up describing this in a comment I just wrote, it's related to the "ignore errors on accept sorta" path as well.

Although I guess this is a facet of TCP and not of the emulated network! I'll check here that ack was sent though to be extra sure.

TimNN · 2017-04-27T22:12:02Z

src/tools/remote-test-client/src/main.rs

-            t!(client.write_all(&[0]));
-            t!(client.write_all(v.as_bytes()));
-            t!(client.write_all(&[0]));
+        match k {


This needs an .as_ref() or something, otherwise it causes a type mismatch (String vs &str)

Currently our slowest test suite on android, run-pass, takes over 5 times longer than the x86_64 component (~400 -> ~2200s). Typically QEMU emulation does indeed add overhead, but not 5x for this kind of workload. One of the slowest parts of the Android process is that *compilation* happens serially. Tests themselves need to run single-threaded on the emulator (due to how the test harness works) and this forces the compiles themselves to be single threaded. Now Travis gives us more than one core per machine, so it'd be much better if we could take advantage of them! The emulator itself is still fundamentally single-threaded, but we should see a nice speedup by sending binaries for it to run much more quickly. It turns out that we've already got all the tools to do this in-tree. The qemu-test-{server,client} that are in use for the ARM Linux testing are a perfect match for the Android emulator. This commit migrates the custom adb management code in compiletest/rustbuild to the same qemu-test-{server,client} implementation that ARM Linux uses. This allows us to lift the parallelism restriction on the compiletest test suites, namely run-pass. Consequently although we'll still basically run the tests themselves in single threaded mode we'll be able to compile all of them in parallel, keeping the pipeline much more full and using more cores for the work at hand. Additionally the architecture here should be a bit speedier as it should have less overhead than adb which is a whole new process on both the host and the emulator! Locally on an 8 core machine I've seen the run-pass test suite speed up from taking nearly an hour to only taking 6 minutes. I don't think we'll see quite a drastic speedup on Travis but I'm hoping this change can place the Android tests well below 2 hours instead of just above 2 hours. Because the client/server here are now repurposed for more than just QEMU, they've been renamed to `remote-test-{server,client}`. Note that this PR does not currently modify how debuginfo tests are executed on Android. While parallelizable it wouldn't be quite as easy, so that's left to another day. Thankfully that test suite is much smaller than the run-pass test suite. As a final fix I discovered that the ARM and Android test suites were actually running all library unit tests (e.g. stdtest, coretest, etc) twice. I've corrected that to only run tests once which should also give a nice boost in overall cycle time here.

alexcrichton · 2017-04-28T03:20:28Z

Updated!

alexcrichton · 2017-04-28T15:48:24Z

@bors: r=TimNN

bors · 2017-04-28T15:48:25Z

📌 Commit 7bc2cbf has been approved by TimNN

bors · 2017-04-28T16:05:08Z

⌛ Testing commit 7bc2cbf with merge 80c3226...

bors · 2017-04-28T16:06:03Z

💔 Test failed - status-appveyor

alexcrichton · 2017-04-28T16:08:18Z

@bors: retry * network error w/ chocolatey on windows - #40474

…

On Sat, Apr 29, 2017 at 12:06 AM, bors ***@***.***> wrote: 💔 Test failed - status-appveyor <https://ci.appveyor.com/project/rust-lang/rust/build/1.0.3094> — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#41575 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAD95O5qnWje9HhtRTlKNaYUoLXwJGHgks5r0g59gaJpZM4NJw-H> .

bors · 2017-04-28T16:10:14Z

⌛ Testing commit 7bc2cbf with merge ad1461e...

travis: Parallelize tests on Android Currently our slowest test suite on android, run-pass, takes over 5 times longer than the x86_64 component (~400 -> ~2200s). Typically QEMU emulation does indeed add overhead, but not 5x for this kind of workload. One of the slowest parts of the Android process is that *compilation* happens serially. Tests themselves need to run single-threaded on the emulator (due to how the test harness works) and this forces the compiles themselves to be single threaded. Now Travis gives us more than one core per machine, so it'd be much better if we could take advantage of them! The emulator itself is still fundamentally single-threaded, but we should see a nice speedup by sending binaries for it to run much more quickly. It turns out that we've already got all the toos to do this in-tree. The qemu-test-{server,client} that are in use for the ARM Linux testing are a perfect match for the Android emulator. This commit migrates the custom adb management code in compiletest/rustbuild to the same qemu-test-{server,client} implementation that ARM Linux uses. This allows us to lift the parallelism restriction on the compiletest test suites, namely run-pass. Consequently although we'll still basically run the tests themselves in single threaded mode we'll be able to compile all of them in parallel, keeping the pipeline much more full hopefully and using more cores for the work at hand. Additionally the architecture here should be a bit speedier as it should have less overhead than adb which is a whole new process on both the host and the emulator! Locally on an 8 core machine I've seen the run-pass test suite speed up from taking nearly an hour to only taking 5 minutes. I don't think we'll see quite a drastic speedup on Travis but I'm hoping this change can place the Android tests well below 2 hours instead of just above 2 hours. Because the client/server here are now repurposed for more than just QEMU, they've been renamed to `remote-test-{server,client}`. Note that this PR does not currently modify how debuginfo tests are executed on Android. While parallelizable it wouldn't be quite as easy, so that's left to another day. Thankfull that test suite is much smaller than the run-pass test suite.

bors · 2017-04-28T18:53:21Z

☀️ Test successful - status-appveyor, status-travis
Approved by: TimNN
Pushing ad1461e to master...

aidanhs · 2017-04-28T19:18:53Z

Nice, looks like 30mins off the build time - down to 1hr42min from 2hr15min.

alexcrichton · 2017-04-28T20:02:10Z

Now to deal with the 2.5hr test times in OSX ...

rust-highfive assigned aturon Apr 27, 2017

petrochenkov reviewed Apr 27, 2017

View reviewed changes

alexcrichton force-pushed the android-qemu-server branch from b66b135 to 1357065 Compare April 27, 2017 14:17

alexcrichton mentioned this pull request Apr 27, 2017

Run non-native tests on real device #41268

Merged

aidanhs reviewed Apr 27, 2017

View reviewed changes

alexcrichton force-pushed the android-qemu-server branch from 1357065 to ea732e5 Compare April 27, 2017 18:13

alexcrichton added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 27, 2017

TimNN approved these changes Apr 27, 2017

View reviewed changes

TimNN suggested changes Apr 27, 2017

View reviewed changes

alexcrichton force-pushed the android-qemu-server branch from ea732e5 to 193000d Compare April 28, 2017 03:19

alexcrichton force-pushed the android-qemu-server branch from 193000d to 7bc2cbf Compare April 28, 2017 03:20

TimNN approved these changes Apr 28, 2017

View reviewed changes

bors merged commit 7bc2cbf into rust-lang:master Apr 28, 2017

bors mentioned this pull request Apr 28, 2017

Update stage0 bootstrap compiler #41544

Merged

alexcrichton deleted the android-qemu-server branch April 28, 2017 20:02

travis: Parallelize tests on Android #41575

travis: Parallelize tests on Android #41575

Uh oh!

Conversation

alexcrichton commented Apr 27, 2017

Uh oh!

rust-highfive commented Apr 27, 2017

Uh oh!

aturon commented Apr 27, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexcrichton commented Apr 27, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aidanhs commented Apr 27, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aidanhs Apr 27, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexcrichton commented Apr 27, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexcrichton commented Apr 27, 2017

Uh oh!

aidanhs commented Apr 27, 2017

Uh oh!

TimNN left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

TimNN Apr 27, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexcrichton commented Apr 28, 2017

Uh oh!

alexcrichton commented Apr 28, 2017

Uh oh!

bors commented Apr 28, 2017

Uh oh!

bors commented Apr 28, 2017

Uh oh!

bors commented Apr 28, 2017

Uh oh!

alexcrichton commented Apr 28, 2017 via email

Uh oh!

bors commented Apr 28, 2017

Uh oh!

bors commented Apr 28, 2017

Uh oh!

aidanhs commented Apr 28, 2017

Uh oh!

alexcrichton commented Apr 28, 2017

Uh oh!

aidanhs commented Apr 27, 2017 •

edited

Loading

aidanhs Apr 27, 2017 •

edited

Loading

TimNN Apr 27, 2017 •

edited

Loading