Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable per-component builds when coverage is enabled #4902

Closed
wants to merge 2 commits into from

Conversation

ttuegel
Copy link
Member

@ttuegel ttuegel commented Nov 22, 2017

Disable per-component build when program coverage enabled

Creating the test program coverage report requires the test runner to know the
build directory for the library component. If per-component builds are enabled,
this is not possible because the library is not in-scope at the same time as the
test component. The only apparent solution that does not require rethinking
per-component builds entirely is to disable it when coverage is enabled.

Resolves: #4798

  • Patches conform to the coding conventions.
  • Any changes that could be relevant to users have been recorded in the changelog.
  • The documentation has been updated, if necessary.
  • If the change is docs-only, [ci skip] is used to avoid triggering the build bots.

cc @cartazio @cj285

@ttuegel ttuegel self-assigned this Nov 22, 2017
@ttuegel ttuegel requested a review from ezyang November 22, 2017 17:47
@cartazio
Copy link
Contributor

i guess this would be fine (even if its a shame that theres that tension atm)

i'm ok with the coverage checking we (should do) being under a different flag / build way than component stuff .... which isn't terribly documented at the moment as far as I can tell? aside from here?
https://www.haskell.org/cabal/users-guide/installing-packages.html?highlight=component#setup-configure

@ttuegel
Copy link
Member Author

ttuegel commented Nov 22, 2017

even if its a shame that theres that tension atm

I didn't mean to imply this is a temporary situation ☹️ . It's part of the essential tension of per-component builds. I don't know how to get the information from the library component into the test component without violating the separation of components. Maybe @ezyang knows more about it.

@ezyang
Copy link
Contributor

ezyang commented Nov 22, 2017

It does seem like you'd have to basically figure out how to do coverage across packages to solve this problem in general.

@ttuegel is another possibility to only disable per-component on the "top" package, not all of them? (Maybe this patch does that already)

@ttuegel
Copy link
Member Author

ttuegel commented Nov 23, 2017

is another possibility to only disable per-component on the "top" package, not all of them?

This should only disable per-component builds on the package(s) that have coverage enabled. However, because command-line arguments are currently broken, passing --enable-coverage will enable coverage for all packages.

@ttuegel
Copy link
Member Author

ttuegel commented Nov 23, 2017

It does seem like you'd have to basically figure out how to do coverage across packages to solve this problem in general.

@ezyang
The ElaboratedComponent knows its own dependencies; is there a way to determine which dependencies are from the same package? If so, we could probably construct the DistDirParams for each internal dependency and get the build directories that way.

@ezyang
Copy link
Contributor

ezyang commented Nov 23, 2017

The ElaboratedComponent knows its own dependencies; is there a way to determine which dependencies are from the same package? If so, we could probably construct the DistDirParams for each internal dependency and get the build directories that way.

Yes, this should be possible. TBH, I don't know what the idiomatic way to do this is, but reading elaborateSolverToComponents should tell you the answer (slash, be the place to add the information you need.)

@23Skidoo
Copy link
Member

The new test seems to be failing on CI.

@ttuegel
Copy link
Member Author

ttuegel commented Nov 25, 2017

The new test seems to be failing on CI.

Yeah, in the test I added, I didn't check that GHC ships with shared libraries before testing. I thought that was a given; obviously not! 😃

@ttuegel
Copy link
Member Author

ttuegel commented Nov 27, 2017

@cartazio
Once I fix the test cases, I'm going to merge this and backport to cabal-install-2.0 because I don't have an ETA on the proper fix; hopefully this will not make it into cabal-install-2.2, but if it does then at least program coverage will work!

@ttuegel ttuegel force-pushed the disable-per-component branch 2 times, most recently from 9995d90 to 84a07d8 Compare November 27, 2017 18:38
@cartazio
Copy link
Contributor

cartazio commented Nov 27, 2017 via email

@ttuegel
Copy link
Member Author

ttuegel commented Nov 27, 2017

For reasons currently unbeknownst to me, coverage is still broken with GHC 7.8 and GHC 7.10, but not with more recent versions. I suspect something has changed about how GHC handles HPC search paths, but I'm not sure yet.

@cartazio
Copy link
Contributor

cartazio commented Nov 28, 2017 via email

@ttuegel
Copy link
Member Author

ttuegel commented Nov 28, 2017

I have this quaint belief that it’s very hard to get bug fixes into old ghc releases so I should always use the current major version.

I agree, nobody should use or support old GHCs, but here we are.

My fear is that the problem actually persists with GHC 8.x but the test case doesn't catch it, so I won't be satisfied until it works with GHC 7.x also.

@cartazio
Copy link
Contributor

cartazio commented Nov 28, 2017 via email

@ttuegel ttuegel force-pushed the disable-per-component branch from 84a07d8 to 49704f6 Compare November 29, 2017 01:37
@ttuegel
Copy link
Member Author

ttuegel commented Nov 29, 2017

The Travis failures are

  1. A timeout, and
  2. an unrelated test failure,

but I don't have the permissions to restart the failed tests.

@23Skidoo
Copy link
Member

23Skidoo commented Nov 29, 2017 via email

@ttuegel
Copy link
Member Author

ttuegel commented Nov 30, 2017

I thought you were already a member of the haskell/cabal group.

Actually I was; I had gotten signed out of Travis and didn't notice. I can only restart the builds for haskell/cabal though. I can't restart the haskell-pushbot/cabal-binaries build. That was the timeout, though, so as long as the tests pass I'm satisfied.

@23Skidoo
Copy link
Member

@ezyang Can we add everyone in the haskell/cabal group to haskell-pushbot? Or maybe we can move pushbot to the haskell org?

@ttuegel
Copy link
Member Author

ttuegel commented Nov 30, 2017

The timeout problem I thought was transient is not. Between the new tests I added, GHC 7.10 being slow, and the Travis macOS machines being slow, that build consistently fails. Is there a preferred solution for slow tests?

@ezyang
Copy link
Contributor

ezyang commented Dec 1, 2017

@23Skidoo I'm happy to but we have to decide what to do; haskell-pushbot is a user, not an org.

One thing we could try is to move the cabal-binaries repo to an org (we'll have to invite haskell-pushbot to the org before we can do this) and then add other people to the repo.

@ezyang
Copy link
Contributor

ezyang commented Dec 1, 2017

The "build is too long" situation is a bit of a problem.

So, the reason we are running tests in the "build" (and not the spillover job) is because all of these tests require the Cabal library to be used when a test script is run, but all of the compiled libraries are like 5G and are too big to put in the cabal-binaries repo. One way we might be able to work around this is to somehow add a mode to cabal-tests where we compile in all of the test scripts into the test binary, so we can go ahead and run it directly. But this requires some nontrivial dev work.

Another possibility is to speed up the build. We've probably gotten most of the low hanging fruit but there might still be some possibilities.

BTW osx-7.8.4 build is a legit failure, not timeout.

@23Skidoo
Copy link
Member

23Skidoo commented Dec 2, 2017

@ezyang So @hvr tells me that haskell-pushbot can't be part of the haskell org because it needs its own Travis build quota. I think it makes sense to do what you suggested -- move cabal-binaries to a separate org and manually add everyone from the haskell/cabal team to collaborators.

Creating the test program coverage report requires the test runner to know the
build directory for the library component. If per-component builds are enabled,
this is not possible because the library is not in-scope at the same time as the
test component. The only apparent solution that does not require rethinking
per-component builds entirely is to disable it when coverage is enabled.

Resolves: haskell#4798
@ttuegel ttuegel force-pushed the disable-per-component branch from 49704f6 to 2592dc6 Compare December 4, 2017 19:23
@ttuegel
Copy link
Member Author

ttuegel commented Dec 5, 2017

Unfortunately, removing a few redundant tests did not speed up the build enough with GHC 7.8 and 7.10 (the slower versions). I fear this PR is DOA unless we can find a solution to the build time problem.

@23Skidoo
Copy link
Member

23Skidoo commented Dec 5, 2017

I need to finally sit down and finish #4231, that should help macOS build times a bit. Maybe we can also skip haddock and sdist steps on macOS.

Here is a table of build steps performed by one of the failed jobs ordered by the time they take:

Time (s) Build step
976 cabal new-build -j2 cabal-install:cabal
596 cabal new-build -j2 Cabal Cabal:unit-tests Cabal:check-tests Cabal:parser-tests Cabal:parser-hackage-tests --enable-tests
593 [...]/cabal-tests/cabal-tests --builddir=[...] -j3 --skip-setup-tests --with-cabal [...]/cabal-install-2.1.0.0/build/cabal/cabal --with-hackage-repo-tool [...]/hackage-repo-tool
238 [...]/cabal-testsuite-2.1.0.0/build/cabal-tests/cabal-tests --builddir=[...] -j3
107 [...]/cabal-install-2.1.0.0/build/cabal/cabal update
56 cabal new-build -j2 cabal-testsuite:cabal-tests
34 [...]/cabal-install-2.1.0.0/setup/setup haddock --builddir=[...]
27 cabal act-as-setup --build-type=Simple -- haddock --builddir=[..]
24 cabal update
15 cabal new-build -j2 hackage-repo-tool

A bit weird that cabal update is on the fourth place, may be a fluke. The first time we call it takes 1/4 as long. Other than removing haddock from macOS configurations (I think we never had macOS-specific haddock failures), the only thing we can do is move more of the test suite to cabal-binaries. @ezyang, how feasible would that be?

@ttuegel
Copy link
Member Author

ttuegel commented Dec 5, 2017

Is it actually necessary to run the package tests with setup and using cabal as setup? Wouldn't the latter tests exercise the same code paths as the former?

@23Skidoo
Copy link
Member

23Skidoo commented Dec 5, 2017

I think that the act-as-setup haddock thing is a leftover from the time when there was no new-haddock. Package tests are being run by the cabal-tests executable FWICT.

@ttuegel
Copy link
Member Author

ttuegel commented Dec 5, 2017

Package tests are being run by the cabal-tests executable FWICT.

Yes. Some of those tests are setupTests, some are cabalTests and some (many) are setupAndCabalTests. I'm asking if the setupAndCabalTests could just be setupTests.

@ezyang
Copy link
Contributor

ezyang commented Dec 6, 2017

You wouldn't have to search replace them all; just change the logic in setupAndCabalTests ;)

@23Skidoo
Copy link
Member

23Skidoo commented Dec 6, 2017

@ttuegel Not sure, but you can follow @ezyang's suggestion and disable cabalTests in setupAndCabalTests on macOS only, since on Linux this doesn't seem to be a problem right now.

@ttuegel
Copy link
Member Author

ttuegel commented Dec 6, 2017

You wouldn't have to search replace them all; just change the logic in setupAndCabalTests ;)

It wouldn't really be setup_And_CabalTests then, would it? ;-)

I think I will just add a flag to cabal-tests to enable/disable setupAndCabalTests and we can flip that flag on macOS for one set of tests.

@ezyang
Copy link
Contributor

ezyang commented Dec 7, 2017

Well, the intent for these tests is that they could be run for setup or cabal... not that we actually do ;) Your plan sounds great.

@ttuegel
Copy link
Member Author

ttuegel commented Dec 19, 2017

I think I will just add a flag to cabal-tests to enable/disable setupAndCabalTests and we can flip that flag on macOS for one set of tests.

My plan was to only run the setupAndCabalTests once, but it turns out we already do this! I have no idea how to get the test suite time down or how to make this merge-able. Please advise.

@23Skidoo
Copy link
Member

23Skidoo commented Dec 19, 2017

@ezyang Do you think it'd be possible to run more of time-consuming tests mentioned in #4902 (comment) on haskell-pushbot?

edit: Apparently, it's non-trivial, see #4902 (comment). See also #4462.

@ttuegel
Copy link
Member Author

ttuegel commented Dec 20, 2017

So, the reason we are running tests in the "build" (and not the spillover job) is because all of these tests require the Cabal library to be used when a test script is run, but all of the compiled libraries are like 5G and are too big to put in the cabal-binaries repo.

Since we are already caching $HOME/.cabal/store, could we start by splitting the Cabal build from the cabal-install build? For that matter, could we separate the cabal-tests runs from the builds, too?

EDIT: This would require some non-trivial fiddling with Travis' build stages feature.

@ttuegel
Copy link
Member Author

ttuegel commented Dec 20, 2017

OK, I think I have a pretty good idea of how to split up the Travis jobs so they don't time-out. I'll have a PR in a day or so.

@23Skidoo
Copy link
Member

@ttuegel If you want to experiment with build stages, you may want to take a look at my previous attempt (#4560 and #4556 (comment)).

If you won't be able to work around the macOS timeout issue, I can see the following paths forward:

a) Merge your patch into the (soon-to-be-created) 2.2 branch, but without the test.
b) For 2.4/3.0, move macOS builds away from Travis -- not sure where yet, maybe GHC HQ will let us share their CircleCI account. If there's no suitable free service, we'll have to find money/sponsorship. I'm willing to invest time into configuring and administering an alternative macOS build bot.

@cartazio
Copy link
Contributor

cartazio commented Dec 20, 2017 via email

@23Skidoo
Copy link
Member

23Skidoo commented Jan 5, 2018

This really should be in 2.2. One option (nuclear) is to temporarily mark the slowest macOS configurations allow-failure until we figure out a more permanent solution.

@ezyang
Copy link
Contributor

ezyang commented Jan 6, 2018

We clearly don't have the bandwidth at the moment to fix the OS X build situation, so I vote for nuking the slow test.

@23Skidoo
Copy link
Member

23Skidoo commented Jan 6, 2018

We can add the code for the test, but disable it and add a TODO.

@ttuegel
Copy link
Member Author

ttuegel commented Jan 6, 2018

How do I disable cabal-testsuite tests?

@23Skidoo
Copy link
Member

23Skidoo commented Jan 6, 2018

Good question. We still want to compile it (to prevent bit-rotting), but mark it as disabled for now. I believe that this is not supported right now, so we should add this feature to the test runner first. I may have time to look into it next week.

@ezyang
Copy link
Contributor

ezyang commented Jan 6, 2018

You can skip tests by calling skip in the test main. We'll still fire up the GHC interpreter on the test but then nothing else will happen.

@23Skidoo
Copy link
Member

23Skidoo commented Jan 7, 2018

Closing in favour of #5004.

@23Skidoo 23Skidoo closed this Jan 7, 2018
@ttuegel
Copy link
Member Author

ttuegel commented Jan 7, 2018

Thank you!

@ttuegel ttuegel deleted the disable-per-component branch January 7, 2018 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants