Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solver errors should be made more explicit #2

Closed
rmarquis opened this issue Jul 4, 2017 · 2 comments
Closed

Solver errors should be made more explicit #2

rmarquis opened this issue Jul 4, 2017 · 2 comments

Comments

@rmarquis
Copy link

rmarquis commented Jul 4, 2017

When the solver fails for some reason, the error message should help to locate the error quickly in order to ease manual intervention.

Consider the following:

$ auracle buildorder plasma-git-meta
no results found for botan-1.10
BUILD kde-gtk-config-git
BUILD extra-cmake-modules-git
BUILD kconfig-git
BUILD kcoreaddons-git

.. (~100 other dependencies) ...

BUILD plasma-mediacenter-git
BUILD kwallet-pam-git
BUILD plasma-git-meta

Indeed, finding the package that requires the faulty botan-1.10 dependency is not trivial.

A similar issue exists when using auracle download -r plasma-git-meta, but using grep to locate the faulty PKGBUILD can be used as a workaround.

Note that here I'm using plama-git-meta as I know this package chain is currently broken. It has a lot of AUR deps and is a somewhat extreme case, but a similar and much more common case is updating (sync) with multiple AUR outdated packages, with one of the dependency broken for some reason.

A better solution is to have a more explicit error message.

This is what I implemented in pacaur, with these objectives:

  • make it easy to locate the faulty PKGBUILD
  • determine which is the target requiring that dependency, so it can be --ignored allowing the update of the others packages
  • instead of stopping at the first detected error, find all errors at once then stops
$ pacaur -S plasma-git-meta
:: Package plasma-git-meta not found in repositories, trying AUR...
:: resolving dependencies...
:: no results found for botan-1.10 (dependency tree: plasma-git-meta plasma-nm-git qca-qt5-git botan-1.10)

Here we see that it is qca-qt5-git that requires botan-1.10, and qca-qt5-git is a second level dependency of the plasma-git-meta target.

The implementation is as follow:

for each erroneous dependency detected
  * check the previous dependency in the topologically sorted buildorder chain
  * if the (erroneous) dependency is a dependency of that package
       store that dep
       recurse with the new dependency
    else
       check the previous element in the buildorder chain 
  * stop when the dependency is a target and print the list of stored deps in the error message

This is somewhat slow, and maybe not robust enough, but gives good results in the majority of cases.

I'm not sure if finding the target requiring the faulty dependency is in the scope of auracle, but printing at least the direct faulty dependency could prove to be very useful for auracle users.

@AladW
Copy link

AladW commented Jul 4, 2017

In particular, if a dependency is not found the exit code is 0. Even the current behavior with an exit code greater than 0 may be helpful.

@falconindy
Copy link
Owner

Generally agree with this. The current implementation of buildorder was based on some opportunistic code reuse with the recursive download code for the sake of getting something that vaguely works. I see now that I'll need to unravel these a bit and go with a callback approach.

falconindy added a commit that referenced this issue May 25, 2020
Create a new std::set<DK> for each operation such that one test failing
early doesn't cause them all to potentially break (given that we're
using EXPECT and not ASSERT). Also add a PrintTo overload so that
failures are more easily understood:

Before:

  Expected: has 2 elements and there exists some permutation of elements such that:
   - element #0 is equal to 4-byte object <00-00 00-00>, and
   - element #1 is equal to 4-byte object <01-00 00-00>
    Actual: { 4-byte object <00-00 00-00> }, which has 1 element

After:

  Expected: has 3 elements and there exists some permutation of elements such that:
   - element #0 is equal to DependencyKind::Depend, and
   - element #1 is equal to DependencyKind::MakeDepend, and
   - element #2 is equal to DependencyKind::CheckDepend
    Actual: { DependencyKind::MakeDepend, DependencyKind::CheckDepend }, which has 2 elements
falconindy added a commit that referenced this issue Sep 4, 2020
Originally, I wanted to address the following problem:

When a user callback indicates failure, calling CancelAll() means that
we end up (re)invoking a curl socket callback from a socket callback,
leading to a double free (either in sd-event or curl). One possible
backtrace looks like:

    (gdb) bt
->  #0  aur::AurImpl::DispatchSocketCallback (this=0x618000000480, s=<optimized out>, action=4, io=<optimized out>) at ../src/aur/aur.cc:330
    #1  0x00007ffff74fa4e1 in singlesocket () from /usr/lib/libcurl.so.4
    #2  0x00007ffff74fe622 in curl_multi_remove_handle () from /usr/lib/libcurl.so.4
    #3  0x000055555570e833 in aur::AurImpl::FinishRequest (this=<optimized out>, curl=0x623000005500, result=<optimized out>, dispatch_callback=<optimized out>) at ../src/aur/aur.cc:462
    #4  0x0000555555708cc1 in std::__do_visit<std::__detail::__variant::__deduce_visit_result<void>, aur::AurImpl::Cancel(const value_type&)::Visitor, const std::variant<void*, sd_event_source*>&> (__visitor=...) at /usr/include/c++/10.2.0/variant:869
    #5  std::visit<aur::AurImpl::Cancel(const value_type&)::Visitor, const std::variant<void*, sd_event_source*>&> (__visitor=...) at /usr/include/c++/10.2.0/variant:1710
    #6  aur::AurImpl::Cancel (this=0x618000000480, request=...) at ../src/aur/aur.cc:291
    #7  0x00005555557090f6 in aur::AurImpl::CancelAll (this=0x618000000480) at ../subprojects/abseil-cpp-20200225.2/absl/container/internal/raw_hash_set.h:311
    #8  0x000055555570f08d in aur::AurImpl::CheckFinished (this=0x618000000480) at ../src/aur/aur.cc:485
->  #9  0x000055555570f341 in aur::AurImpl::DispatchSocketCallback (this=0x618000000480, s=<optimized out>, action=4, io=<optimized out>) at ../src/aur/aur.cc:332
    #10 0x00007ffff74fb092 in Curl_multi_closed () from /usr/lib/libcurl.so.4
    #11 0x00007ffff74cc631 in Curl_closesocket () from /usr/lib/libcurl.so.4
    #12 0x00007ffff74df551 in Curl_disconnect () from /usr/lib/libcurl.so.4
    #13 0x00007ffff74fc354 in multi_done () from /usr/lib/libcurl.so.4
    #14 0x00007ffff74fcc91 in multi_runsingle () from /usr/lib/libcurl.so.4
    #15 0x00007ffff74fe1d1 in multi_socket () from /usr/lib/libcurl.so.4
    #16 0x00007ffff74fe354 in curl_multi_socket_action () from /usr/lib/libcurl.so.4
    #17 0x000055555570f60d in aur::AurImpl::OnCurlTimer (userdata=0x618000000480) at ../src/aur/aur.cc:401
    #18 0x00007ffff7466b3e in ?? () from /usr/lib/libsystemd.so.0
    #19 0x00007ffff746821e in sd_event_dispatch () from /usr/lib/libsystemd.so.0
    #20 0x00007ffff746a6a9 in sd_event_run () from /usr/lib/libsystemd.so.0
    #21 0x0000555555704bd3 in aur::AurImpl::Wait (this=0x618000000480) at ../src/aur/aur.cc:495
    #22 0x000055555563ee8a in auracle::Auracle::GetOutdatedPackages (this=<optimized out>, args=std::vector of length 0, capacity 0, packages=<optimized out>) at /usr/include/c++/10.2.0/bits/unique_ptr.h:421
    #23 0x0000555555656816 in auracle::Auracle::Outdated (this=<optimized out>, args=..., options=...) at ../src/auracle/auracle.cc:564
    #24 0x0000555555596c7f in main (argc=<optimized out>, argv=<optimized out>) at ../subprojects/abseil-cpp-20200225.2/absl/container/internal/raw_hash_set.h:311

We could do that by making CancelAll() merely schedule another event
that performs the actual cancellation at a later point in order to avoid
the recursion. However, our cancellation logic is all sorts of weird and
makes assumptions about how events are dispatched (i.e. there might be
multiple at a time). Let's just get rid of all of this and use the
sd-event mechanism of sd_event_exit instead.

This does, however (as did the original proposed solution), have the
side effect of logging multiple times because we potentially open up to
5 connections to the AUR at once, e.g.

  $ build/auracle --baseurl http://129.168.255.1 outdated
  error: UNKNOWN: Connection timed out after 10000 milliseconds
  error: UNKNOWN: Connection timed out after 10000 milliseconds

I suppose one way to fix this would be to do response merging on the
backend to match the request splitting. That way, the frontend only gets
one response. I think that comes with a lot of weird potential behaviors
though (handling of partial failures, to mention one). Would be nicer if
the AUR didn't have the crap behavior and could take POST requests in
order to extend the arg limit.

Whatever, this is a weird edge case.

Fixes #82.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants