Add support for using a local repository clone for our external dependencies #279

Closed
s3rvac opened this issue Apr 18, 2018 · 9 comments

@s3rvac
Member

s3rvac commented Apr 18, 2018

Current state

Currently, when you want to simultaneously develop e.g. fileinfo and pelib, you need to use the following workflow:

  • Create a new branch in the RetDec repo.
  • Create a new branch in the pelib repo.
  • Make some changes in the pelib repo, commit them, and push them.
  • In the RetDec repo, open deps/pelib/CMakeLists.txt and change the following lines to reference the new commit and its archive hash:
    URL https://github.com/avast-tl/pelib/archive/e93eaa7c150f4608a5d02a67f5edc9e54456fe24.zip
    URL_HASH SHA256=2ffd7e89451c980a1af6d24d4f6dfbb69a660b06ad5de44c481f6431e21de394
    
  • Rebuild RetDec.

This workflow has the following disadvantages:

  • In order to test the changes from pelib in RetDec, you need to commit them, push them, change a CMakeLists.txt file in RetDec, and perform a rebuild.
  • After every change to pelib, you need a completely new build of pelib, as this is how CMake works when you change the URL for an external project.

Both of these disadvantages slow the development considerably.

Proposal

A better way would be to add a CMake option for each of our dependencies (e.g. pelib, llvm, elfio) that would allow you to set a path to a local repository that should be used when building the external project. This would allow you to simultaneously develop e.g. both RetDec and pelib, without the need for constant committing, pushing, and changing of commit hashes in a CMakeLists.txt file. Also, it would be faster as you would only need an incremental build of the external project.
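
A minimal sketch of what such an option could look like for pelib, assuming the dependency is built with CMake's ExternalProject module. The option name PELIB_LOCAL_DIR, the target name pelib-project, and the empty install step are illustrative assumptions, not RetDec's actual code; the URL and hash are the ones quoted above.

```cmake
# deps/pelib/CMakeLists.txt (sketch only)
include(ExternalProject)

# Hypothetical option: path to a local pelib clone.
# Empty (the default) means "download the pinned archive as before".
set(PELIB_LOCAL_DIR "" CACHE PATH "Path to a local pelib repository clone")

if(PELIB_LOCAL_DIR)
    # Build straight from the local working tree; no download step is run.
    set(PELIB_SOURCE_ARGS
        SOURCE_DIR "${PELIB_LOCAL_DIR}"
        # Re-run the build step on every build so that local edits are
        # picked up incrementally by the external project's own build.
        BUILD_ALWAYS ON
    )
else()
    # Current behavior: fetch a specific commit and verify its hash.
    set(PELIB_SOURCE_ARGS
        URL https://github.com/avast-tl/pelib/archive/e93eaa7c150f4608a5d02a67f5edc9e54456fe24.zip
        URL_HASH SHA256=2ffd7e89451c980a1af6d24d4f6dfbb69a660b06ad5de44c481f6431e21de394
    )
endif()

ExternalProject_Add(pelib-project
    ${PELIB_SOURCE_ARGS}
    # Assumption: the main build consumes pelib from its build tree,
    # so no install step is needed.
    INSTALL_COMMAND ""
)
```

With something like this in place, a developer would configure with -DPELIB_LOCAL_DIR=/path/to/pelib, while leaving the option empty keeps the current download-by-commit behavior.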

When you are done with the changes, you would push them into the master branch of the external project's repo and update the corresponding CMakeLists.txt file in RetDec to use the new commit, just like we do now.

@timokau

timokau commented Sep 15, 2018

This would also be beneficial for packaging, where offline building may be a requirement. Why doesn't RetDec just use git submodules?

@s3rvac
Member Author

s3rvac commented Sep 20, 2018

@timokau We were actually using git submodules when we open-sourced RetDec in December 2017. Back then, we created a separate repository for each logical component and linked them via git submodules (dependency tree). However, this approach proved to be confusing (#72), hard to maintain, slow to develop, and exhibited several other issues like #14 and #48. Instead of trying to reduce the complexity of the dependency tree or complicating the matter by using C++ dependency managers, we decided to put most of the libraries and tools together into a single repository: avast-tl/retdec. And, judging by the last half year, this was the right choice as it solved the problems we were having with git submodules.

For more details, see this wiki page.

@timokau

timokau commented Sep 20, 2018

I think we're talking about different things: I don't want the RetDec project itself to be split up. I want to be able to fetch, before the build, all the external dependencies that CMake tries to fetch at build time, and then build offline. Git submodules seem perfect for that.

@s3rvac
Member Author

s3rvac commented Sep 20, 2018

When using submodules:

  • You and your users have to remember to initialize them after cloning the repo (or remember to call git clone with --recursive or --recurse-submodules), which people forget; see #72 (cmake fails because all "deps" subfolders are empty).
  • You and your users have to remember to update them after pulling changes or switching branches.
  • And there are many other issues, like increased mental load when working with them or conflict resolution.

Thank you for the idea, but for us, git submodules are a no-go. They only seem perfect, while in reality, they are far from perfect.

@timokau

timokau commented Sep 20, 2018

Couldn't you just initialize the submodules with cmake if they're not initialized?
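
A rough sketch of what that could look like, assuming the dependencies lived in submodules. The path deps/pelib is only an illustrative submodule location, and nothing like this exists in RetDec today.

```cmake
# Top-level CMakeLists.txt (sketch only)
find_package(Git QUIET)

# Only attempt this inside a git checkout, and only if the (hypothetical)
# submodule directory has not been populated yet.
if(GIT_FOUND AND EXISTS "${PROJECT_SOURCE_DIR}/.git"
   AND NOT EXISTS "${PROJECT_SOURCE_DIR}/deps/pelib/CMakeLists.txt")
    execute_process(
        COMMAND "${GIT_EXECUTABLE}" submodule update --init --recursive
        WORKING_DIRECTORY "${PROJECT_SOURCE_DIR}"
        RESULT_VARIABLE GIT_SUBMODULE_RESULT
    )
    if(NOT GIT_SUBMODULE_RESULT EQUAL 0)
        message(FATAL_ERROR
            "git submodule update --init failed (exit code ${GIT_SUBMODULE_RESULT})")
    endif()
endif()
```

This only covers the first initialization at configure time; it does not update submodules after a pull or a branch switch.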

@s3rvac
Member Author

s3rvac commented Sep 20, 2018

We could, but it may still cause confusion (the sources would not be there after a regular clone), and all the other issues with submodules would still remain, anyway.

@timokau

timokau commented Sep 28, 2018

But the sources are not there right now either; they need to be fetched by CMake first. What problem does CMake solve better than submodules?

@metthal
Member

metthal commented Sep 29, 2018

The sources of third-party libraries are now at least hidden in your working directory (the one from which you run CMake; I suppose no one runs CMake in-source). Since they are third-party, you shouldn't need to care about them as a user, only as a developer, and that's what this ticket is about.

Having them as submodules wouldn't solve anything regarding development. It's much more painful to work with a submodule and keep it updated while also modifying it. We tried it, and it didn't work as we expected.

I suppose that you mostly care about the user point of view, so I wonder why you would care about third-party libraries. CMake takes care of those dependencies, and it would still need to even if we used submodules. The only difference is in how those dependencies are downloaded to your PC. I personally see no benefit in transitioning to submodules, especially after we've abandoned them. What do you need to build just our dependencies for?

@timokau

timokau commented Sep 29, 2018

I still don't really see the problems with submodules, but I don't have any experience with them myself so I'm probably just being naive.

> I suppose that you mostly care about the user point of view, so I wonder why you would care about third-party libraries. CMake takes care of those dependencies, and it would still need to even if we used submodules. The only difference is in how those dependencies are downloaded to your PC. I personally see no benefit in transitioning to submodules, especially after we've abandoned them. What do you need to build just our dependencies for?

My use-case is an offline build. I want to be able to fetch everything needed for the build first, then build without an internet connection. That is important for reproducibility (in packaging). We verify everything that is fetched from the internet against a hash and then build in a sandboxed environment without internet. CMake makes this hard. Currently we manually fetch all the dependencies: https://github.com/NixOS/nixpkgs/blob/92a047a6c4d46a222e9c323ea85882d0a7a13af8/pkgs/development/tools/analysis/retdec/default.nix#L25

That makes updates a pain, which is why the retdec package is outdated. Submodules would make that simple.
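
For illustration, ExternalProject's URL argument also accepts a path to a local file, so an offline build could in principle be supported by letting the packager supply a pre-fetched archive. A hedged sketch, where PELIB_ARCHIVE and pelib-project are hypothetical names and the URL and hash are the ones from the issue description:

```cmake
# deps/pelib/CMakeLists.txt (sketch only)
include(ExternalProject)

# Hypothetical option: a pelib archive that was downloaded ahead of time.
set(PELIB_ARCHIVE "" CACHE FILEPATH "Path to a pre-fetched pelib archive")

if(PELIB_ARCHIVE)
    # Offline: unpack the local file instead of downloading it.
    set(PELIB_URL "${PELIB_ARCHIVE}")
else()
    # Online: fetch the pinned commit as before.
    set(PELIB_URL https://github.com/avast-tl/pelib/archive/e93eaa7c150f4608a5d02a67f5edc9e54456fe24.zip)
endif()

ExternalProject_Add(pelib-project
    URL "${PELIB_URL}"
    # The hash is checked in both cases, so a local archive must match
    # the pinned commit exactly.
    URL_HASH SHA256=2ffd7e89451c980a1af6d24d4f6dfbb69a660b06ad5de44c481f6431e21de394
)
```

A sandboxed build would then pass -DPELIB_ARCHIVE=/path/to/archive.zip at configure time, with one such option per external dependency.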
