
General CI discussion #707

Open
real-or-random opened this issue Jan 6, 2020 · 27 comments

Comments

real-or-random (Contributor) commented Jan 6, 2020

When thinking about #705 I realized it may be a good idea to take a step back and look at this in the context of a general CI revamp. There are multiple things that would be nice, and I think some of them probably have a higher priority than coverage reports:

Is there anything else you would like to see?

If we want to work on this, this is a good time to think about the CI platform. I don't mind Travis, and it's been reasonably stable in the past months, but if we think that a different (or additional) provider is a better fit, it's better to discuss this before we start working on these issues.

By the way, Bitcoin is currently looking into GitHub Actions for Windows at least:
bitcoin/bitcoin#17803

gmaxwell (Contributor) commented Jan 7, 2020

I think BE is basically solved in Travis since there is s390x (lol). It sounds like the ARM offering is arm64, which is useful but less interesting than 32-bit ARM.

The big gaps I see in travis:

  • You cannot usefully benchmark, so you can't watch for performance regressions-- the systems it runs on are wildly different in performance and wildly inconsistent in performance because they're VPSes which are concurrently running other loads.

  • Getting work-product output is not supported (though a workaround seems possible), meaning you can't get CI coverage analysis or static-analysis reports... except via integrating with other services, which creates an unnecessary disparity between the free software tools the developers can use locally and the (usually commercial) services that the project ends up beholden to.

  • Travis CPU time is extremely limited (the fact that the travis executors are slow disguises how limited it is, e.g. it takes minutes for what would be tens of seconds locally). This means that stuff like automated fuzz testing at a scale which is likely to find interesting bugs is essentially off the table, at least as part of CI tests. This also clearly distorts testing priorities, with stuff like not using valgrind or cutting down tests just to fit under travis' limits.

To me the obvious thing to do is to run a Jenkins instance. Jenkins is extremely simple to operate (in my experience much easier to use than Travis, because you're less subject to trial and error -- particularly because if you're the group running it, you can just ssh in and try the tests locally until they work). Jenkins supports remote executors (and you can confine jobs to only run on particular executors), so it's trivial to have an executor that runs on dedicated hardware that allows benchmarking... or on fast systems.

The big challenge I see is that the travis ecosystem has created an expectation that PRs from random people will get CI run against them. And having random people's code run on your own hosts/networks is super scary. One option would be to just run the jenkins on PRs by whitelisted people and on merged commits (this is apparently a standard feature of the jenkins PR integration module, which I've never used-- my experience has been w/ just running it on merged commits). I can offer to run some executors (and can do arm7, big endian systems, whatever) on an isolated network that just vpns back to the jenkins install and has no network access. But the limitation there is that if I get hit by a bus you might have issues finding someone else willing to volunteer a half rack of assorted old hardware.

An extension of the "can't benchmark" problem is that the library really should have regression tests for side-channel behaviour. Right now someone could flip a compiler flag, the cmovs would stop being constant-time, and you'd get an enormous side channel that no one would notice until it shows up in an attack (or, hopefully, an attack paper). There are a couple of different ways of dealing with that which would allow CI integration of a side-channel test, but most of them require being able to run the test on bizarre hardware (e.g. with a Jenkins remote executor). The one option that wouldn't would be running tests in a cycle-accurate CPU simulator -- essentially like running in valgrind, but you could get cycle counts and DRAM access traces as a side effect, which could be compared across different secrets to be sure they were constant. But when I went down that path a couple of years ago, none that I could find simulated DRAM. @tdaede might have more recent suggestions.

gmaxwell (Contributor) commented Feb 4, 2020

Travis hiding the output of ./tests failing really stinks -- did it use to do that?

real-or-random (Contributor Author) commented:

We cat the log files in the end but you need to "unfold" output by clicking the arrow.
https://travis-ci.org/bitcoin-core/secp256k1/jobs/641332511
Or are you talking about something else?

gmaxwell (Contributor) commented Feb 4, 2020

Ah, I missed the 'unfold'. Thanks.

fanquake (Member) commented Feb 4, 2020

By the way, Bitcoin is currently looking into GitHub Actions for Windows at least:
bitcoin/bitcoin#17803

FYI. We've removed the GitHub Actions workflow from our repository. See discussion in that issue and bitcoin/bitcoin#18031.

gmaxwell (Contributor) commented:

The CI should probably do a build on a system without libgmp to catch accidental header dependencies.

The CI should probably have a makefileless build (e.g. just a big GCC command line that sets all the required defines).
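A sketch of what such a "big GCC command line" could look like on a 64-bit host. The -D defines below are assumptions mirroring the autoconf options of the era, not a canonical list (the authoritative set is whatever ./configure emits for a given configuration), and only src/tests.c is compiled because it includes the library source directly:

```shell
# Hypothetical makefileless build of ./tests (defines are assumptions):
gcc -O2 -Wall -I. -Isrc -Iinclude \
    -DSECP256K1_BUILD -DHAVE___INT128 \
    -DUSE_NUM_NONE -DUSE_FIELD_INV_BUILTIN -DUSE_SCALAR_INV_BUILTIN \
    -DUSE_FIELD_5X52 -DUSE_SCALAR_4X64 \
    -DECMULT_WINDOW_SIZE=15 -DECMULT_GEN_PREC_BITS=4 \
    src/tests.c -o tests
./tests
```

Besides catching missing-define breakage, such a job would also catch accidental build-system dependencies that a pure autotools matrix would never notice.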

real-or-random (Contributor Author) commented:

related: bitcoin/bitcoin#18659

real-or-random (Contributor Author) commented:

We may want to look into Azure Pipelines: https://azure.microsoft.com/de-de/services/devops/pipelines/

real-or-random (Contributor Author) commented:

We have the ubsan sanitizer now on Travis (#839). Has anyone tried other sanitizers that we may want to enable?

sipa (Contributor) commented Nov 4, 2020

asan/lsan is generally as easy as ubsan to enable (and they can all be enabled simultaneously), though it has a higher performance impact. It can catch out-of-bounds accesses (also on the stack) and memory leaks.

tsan isn't useful for us as there is no threading.

msan is a pain to integrate (as it requires building libc and other dependencies with msan as well), and we already have valgrind which largely detects the same issues.
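For reference, a minimal sketch of enabling ASan, LSan and UBSan in one instrumented build. The -fsanitize spellings are standard clang/gcc options; the configure invocation itself is an assumption, not taken from the repo's CI scripts:

```shell
# Sketch: one build with ASan+UBSan (LSan piggybacks on ASan at process exit).
./configure CC=clang \
    CFLAGS="-fsanitize=address,undefined -fno-sanitize-recover=undefined -g -O1"
make -j"$(nproc)"
make check   # leak reports are printed when each test binary exits
```

-fno-sanitize-recover=undefined makes UBSan findings fatal so CI actually goes red instead of merely printing a warning.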

elichai (Contributor) commented Nov 4, 2020

msan is a pain to integrate (as it requires building libc and other dependencies with msan as well), and we already have valgrind which largely detects the same issues.

This isn't really correct for C: msan contains shims for the common libc uses, so there shouldn't be a need to rebuild libc to run msan here.

real-or-random (Contributor Author) commented Nov 11, 2020

Travis is effectively shutting down for open source:
bitcoin/bitcoin#17802

Options:

  • Be lazy and pay them: but I don't think we're happy with Travis in general.
  • migrate to Cirrus CI: Cirrus would work. We need to move s390x to qemu but native s390x is unreliable on Travis anyway
  • migrate to Shippable: I was playing around with this a few weeks ago (https://app.shippable.com/github/real-or-random/secp256k1/dashboard). This looked good at first glance and builds are fast. But it seems that the project will be dead soon (no activity in their GitHub repo, not even replies on Twitter). So I don't think we'll be happy with this for long.
  • do our own thing (e.g., Jenkins run by @gmaxwell )
  • look into other CIs, e.g., Azure Pipelines

Non-options:

edit: Azure is not good either:
[Screenshot from 2020-11-11 omitted]

gmaxwell (Contributor) commented:

Any public "free" solution is going to either be short lived or extremely stingy with cpu time for obvious economic reasons-- or, likely, both. Any "custom rolled" solution has the life limits from their sponsors not having time for maintenance, hardware failures, etc. but presumably wouldn't be cpu resources limited. Custom rolled can also guarantee uncontested access to systems, allowing for real benchmarking.

I suggest regardless: all testing should be set up as generic shell scripts (and/or Python if really needed) that can be invoked from any CI system -- they're pretty much there already, except for the Travis matrixing. Then it becomes easier to move things around.

The project isn't limited to just using one thing. If some free hosted thing works, then it would probably be good to run it in parallel with whatever else gets set up.

[I would already have a jenkins setup running but for ongoing power usage limitations here which are finally almost resolved.]

We need to move s390x to qemu but native s390x is unreliable on Travis anyway

I don't think having s390x is particularly important, especially since it's slightly broken for reasons out of our hands -- but having a BE target is nice. Debian on MIPS BE (with valgrind too) and SPARC run fine in qemu-system, though both are quite slow, and I assume BE PPC works but I haven't tried it. ...and each of these is way more important than s390x: they're all found in really cool spacecraft. I don't know where s390x machines are found, but certainly not in space. Armonk, New York, probably. Space is much better than Armonk: for one thing, space doesn't have any computer crackers.

real-or-random (Contributor Author) commented:

I don't think having s390x is particularly important, especially since it's slightly broken for reasons out of our hands -- but having a BE target is nice. Debian on MIPS BE (with valgrind too) and SPARC run fine in qemu-system, though both are quite slow, and I assume BE PPC works but I haven't tried it. ...and each of these is way more important than s390x:

Yes, sure. I don't care about s390x in particular; we should just have some BE architecture. If it's only for BE testing, we should probably take the one which is fastest on qemu. Of course, having valgrind is better than not having it, and the probability that the SPARC build is relevant for some users is larger than for s390x.

gmaxwell (Contributor) commented:

Well, I know MIPS with valgrind works, and I successfully got Debian on SPARC working, but there is no valgrind (which is too bad; SPARC otherwise tends to expose some issues due to register windowing).

real-or-random (Contributor Author) commented:

The project isn't just limited to just using one thing. If some free hosted thing works then it would probably good to run it in parallel with whatever else gets setup.

Indeed.

More on the free thing:
I changed my mind on GitHub Actions. I think it's acceptable, see bitcoin/bitcoin#17803 (comment). But Marko seems to disagree, and he has some valid points.

I think Cirrus CI is the best free/"cloud" option for now.

jonasschnelli (Contributor) commented:

There are plans to increase the stability and performance (as well as the supported architectures) of bitcoinbuilds.org. There is also a GitHub integration (running only here for now). The CI is completely custom built and easy to extend. I'm currently trying to set up a team of devs/admins.

Ideally it would support running tests on hardware provided by authenticated developers/users.
Probably not what you are looking for today but could be something for the long run.

real-or-random (Contributor Author) commented:

Probably not what you are looking for today but could be something for the long run.

Thanks for letting us know. This sounds great and is certainly an option for the long run.

apoelstra (Contributor) commented:

I am working on building a new x86 system with many cores and am happy to replicate any @gmaxwell script infrastructure, to reduce the bus number. But I don't have any oddball hardware, nor do I have plans to amass any.

JeremyRand commented:

edit: Azure is not good either:

@real-or-random The Azure docs claim that it can be used with generic Git repos (not requiring GitHub). Would that mode be suitable? I assume it wouldn't demand problematic GitHub permissions since it has no knowledge of GitHub in that mode, but I may be missing something.

kroggen commented Dec 16, 2020

This is how I am building my fork (with VRF support) for Windows using GitHub Actions and Visual C: here

real-or-random (Contributor Author) commented:

I'm working hard on Cirrus CI integration in https://github.com/real-or-random/secp256k1/tree/202012-cirrusci. It looks promising so far, but I want to continue to hack on this a little and clean it up before I open a PR here. (And anyway, a PR wouldn't be built currently because Cirrus is not yet enabled.)

real-or-random (Contributor Author) commented:

notes from irc:

20:38 <real_or_random> I didn't know it's so easy to run tests in valgrind: make check LOG_COMPILER=valgrind
20:38 <real_or_random> or if we add benchmarks, this would also work: make check LOG_COMPILER='./libtool --mode=execute valgrind'
20:39 <real_or_random> nice thing: it runs exactly like "make check", e.g., it uses multiple threads
20:41 <real_or_random> I believe this will be very helpful for simplifying the ci shell script

real-or-random (Contributor Author) commented Mar 8, 2021

By the way, I still don't like the way the results are shown in Cirrus. We run make check, which runs ./tests and ./exhaustive_tests in parallel and writes their outputs to log files. That means that when ./tests fails, the make check step is red, and when you click on it you don't see any real error message. Instead, the real error is in a green section below. (See this downstream example: https://cirrus-ci.com/task/6253397255913472?command=cat_tests_log#L4) Not a huge deal, but simply not great for people who don't know what's going on.

We could simply run the tests without make check, but then we don't test the automake part of make check. Maybe we can grep the log files for PASS or FAIL and make the "cat" script fail depending on this.
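A minimal sketch of that grep idea (the helper name and log layout here are made up for illustration): cat the log so the output appears in the failing step, then make the step itself fail unless the log reports PASS and contains no FAIL.

```shell
# Hypothetical helper: show a test log and propagate its PASS/FAIL status,
# so the red mark in the CI UI lands on the step that prints the error.
check_log() {
    cat "$1"
    if grep -q "FAIL" "$1"; then return 1; fi
    grep -q "PASS" "$1"
}

printf 'PASS: tests\n' > tests.log   # stand-in for the real test-suite log
check_log tests.log && echo "logs ok"
```

This keeps make check itself untouched while still surfacing the real error text in the red step.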

real-or-random (Contributor Author) commented:

Maybe the clang static analyzer (scan-build) is a nice tool we could include in the CI: #959 (comment)

real-or-random (Contributor Author) commented:

Cirrus CI now has ARM64 hardware, see cirruslabs/cirrus-ci-docs#905 and bitcoin/bitcoin#22749

real-or-random (Contributor Author) commented:

We could add a few more jobs with -O3 to increase the coverage of CI tests.
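Such a job could be as simple as the following sketch (how it gets wired into the CI matrix depends on the Cirrus config; the flag spelling is the standard autoconf way to override optimization):

```shell
# Sketch: an extra CI job building and testing at -O3.
./configure CFLAGS="-O3"
make -j"$(nproc)"
make check
```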
