-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove git submodules #10743
remove git submodules #10743
Conversation
Could the |
@@ -871,6 +871,8 @@ ifeq ($(VERBOSE), 0) | |||
|
|||
QUIET_MAKE = -s | |||
|
|||
WARNCOLOR="\033[33;1m" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be defined when VERBOSE = 1
too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. thanks
Aside from the comments above, I think I like it. Would have to try it out for a while to decide for sure, but looks like a clever solution. |
no. i specifically want (need) these files to be versioned separately. that allows the rules to force a rebuild when the version spec changes. |
Technically they could depend on |
I think this'll fail to build with |
I was going to propose it, but thought it would be too complex. Thanks @vtjnash for just going ahead and doing it. |
So essentially this is That's really great!! |
I would be wary of relying on too much cleverness from gmake here, except that CMake has its own ExternalProject mechanism which should work robustly for this if we ever get further in that direction. At the moment the fact that this is using |
I think we should always be using tag names or direct SHA references. That On Mon, Apr 6, 2015, 09:50 Tony Kelman notifications@github.com wrote:
|
gmake provides the framework, but very little cleverness of it's own. it's all actually just using pretty simple rules, encoded in the basic tools provided by gmake. I'm personally no longer interested in switching to CMake, once we merged libgit2 and I discovered how inadequate it's support of cross-compiling actually is. That just seems like such a shockingly poor design decision.
tags can move. this moves away from depending on SHA1 hashes quite as directly, specifically because it seems to cause users such pain. It looks like git returns an ETag that is actually just the SHA1 hash of that commit, so you can validate the cache just by checking the header of the reply (after all of the redirects):
|
Not really the topic of this issue, but do you ever want first-class support for MSVC? Maintaining 2 independent build systems is painful and not a good idea long-term. CMake is perfectly capable of cross-compiling - with a toolchain file. We just haven't gone and written the toolchain files. edit: also note that LLVM is moving very strongly towards removing support for autotools altogether, so CMake will be absolutely unavoidable before all that much longer. |
that would in some ways be a huge pain for us. even the llvm page on "how to cross compile llvm" notes that cmake isn't able to correctly detect the dependencies of a cross-compile and needs to be manually tricked: http://llvm.org/docs/HowToCrossCompileLLVM.html can you imagine how long it will take to create each cross-build configuration if you have to write all of the rules manually? it took me barely an afternoon to turn on cross-compiling to any arbitrary target for 8 autotools dependencies. it's unclear to me how to replicate that result for just one cmake target (libgit2). what's the point of having a supposed automated configuration system if I have to write all of the rules manually? |
They're missing I believe this will still be broken on Windows with |
|
Now that staticfloat/cache.julialang.org#3 is resolved this should probably switch to using |
I was attempting to follow the github API doc: https://developer.github.com/v3/repos/contents/ |
Ah, well if that's current, strange that the download links through the web ui aren't consistent with their api. We could alternately add |
2f79903
to
f247054
Compare
turns on that they aren't quite the same. internally, the API link encodes the SHA1 hash and owner into the filepath, while the archive link instead encodes the branch name:
vs.
any thoughts or preference here? |
git submodules cause developers a lot of heartache, but are very useful when used correctly. this combines the best features of git submodules, with the best features of static tarball files (and even automatically switches between the two), in a new feature that I'm dubbing "git external". the main idea here is to act a bit more like svn external (hence the name), in that this tracks a branch (or tag) rather than a SHA1 hash. In a nod to git submodules, however, this explicitly also remembers the SHA1 hash of the original reference and gives a warning if they diverge. This helps make two problem situations less likely: 1) Pointing to a SHA1 hash that is not on an upstream branch, and thus later gets deleted by the git gc 2) Forgetting to push in the correct order (submodule and then toplevel), and instantly breaking the build for everyone else because the branch information is now stored in a file, instead of some hidden git metadata, this also allows building from a tarball without any extra effort. in fact, when building from a tarball (with no .git folder), this automatically downloads the dependency tarball as well, so that having git installed is not necessary. all this seamless magic is implemented in the git-external deps/Makefile rule. see the description above that rule, and other usages, for more details.
… rules into git-external define
edit: in case this isn't clear, this is broken with |
@@ -0,0 +1 @@ | |||
Subproject commit 86447ad060d6f4edf01f2a64b9598dfeeb6e6f7d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
git submodules coming back for revenge?
buildbots are apparently broken by this, cc @staticfloat |
ci seems OK with this PR. the i686 build failed after merging because the download of virtualenv failed |
Not CI. Binaries. http://buildbot.e.ip.saba.us:8010/builders/ |
As far as I'm a bit concerned having the checksums only verified when building with |
I've nuked the buildbots, we'll see if things get better now.
-E
|
Looks mostly back to normal, apparently the mis-rebased submodule remnant was confusing git on the buildbots. |
Looking at this again, tracking a sha directly is better than tracking a branch, since the latter leads to non-reproducible builds that depend on when you try to run them, and can cause indirect breakage here if a broken commit gets pushed to a dependency. The sha mismatch should be an error, not a warning. Or it should do a checkout, which would error when dirty. |
What you are describing is a git submodule (minus tracking version information in a separate file, plus actual integration with git). But to your other point, whenever possible, these should probably typically be tracking tags not branches for exactly the reason you mention. |
Note that of the many complaints I've made about git submodules, none of them were that they use a sha instead of a branch. With recent versions of git, submodules can track branches. But they shouldn't due to the non-reproducibility here. In my opinion we can't tag a release with this feature in its current form. Otherwise 6 months later if someone tries to build a release tag, they most likely won't be able to if any dependencies have changed. Also a note to self that we should add comments to these |
I was going to going to suggest the reverse and say that we should not be adding the checksum files. The SHA1 hash is already a checksum for git. Otherwise, "make source-dist" will create them. |
The only reason I liked this version more than submodules is because it allows proper operation from tarballs with edit:
Currently advisory-only. That's bad. Warnings get ignored. I won't, but to prove a point I (or anyone else) could go push a broken commit to openspecfun master right now and it would break the Julia build for everyone doing a fresh clone. |
Also SHA1 is no longer considered secure. Tarball checking uses SHA256 which is secure (also MD5, which is totally insecure at this point, but just as a fallback). |
I'm less concerned about cryptographic security than I am about actually getting the code we say we want to get. Asking for a branch is not the right choice here. A tag is better, but not perfect, and we'd need to be rigorous about enforcing tags rather than branches. I think the sha1 needs to be mandatory here. The minimal way of doing that might be with just a checkout on the git side, but the tarball side needs to adjust the url it's downloading from otherwise it'll fail the checksum verification if the referenced branch changes. |
i forgot about the checksum verification case of the moving head case. it's an easy change to just put the SHA1 hash twice (in the |
Can you explain what you mean by this, in more words or a diff? |
like so: diff --git a/deps/openblas.version b/deps/openblas.version
index 5f67252..e893343 100644
--- a/deps/openblas.version
+++ b/deps/openblas.version
@@ -1,2 +1,2 @@
-OPENBLAS_BRANCH=v0.2.14
+OPENBLAS_BRANCH=d0c51c4de91b2356b770f92862c6b0e1d54953cd
OPENBLAS_SHA1=d0c51c4de91b2356b770f92862c6b0e1d54953cd diff --git a/deps/Makefile b/deps/Makefile
index 5a8862b..aef765c 100644
--- a/deps/Makefile
+++ b/deps/Makefile
@@ -261,9 +261,9 @@ ifeq (exists, $$(shell [ -d $$(JULIAHOME)/.git/modules/deps/$1 ] && echo exists
$1/$4: $$(JULIAHOME)/.git/modules/deps/$1/HEAD
endif
else # NO_GIT
-$2_SRC_DIR = $1-$$($2_BRANCH)
+$2_SRC_DIR = $1-$$($2_SHA1)
$$($2_SRC_DIR).tar.gz:
- $$(JLDOWNLOAD) $$@ $$(call $2_TAR_URL,$$($2_BRANCH))
+ $$(JLDOWNLOAD) $$@ $$(call $2_TAR_URL,$$($2_SHA1))
$$($2_SRC_DIR)/$3: $$($2_SRC_DIR).tar.gz
$$(JLCHECKSUM) $$($2_SRC_DIR).tar.gz
mkdir -p $$($2_SRC_DIR) && \ |
I like the latter change, but if we're going to do the former, why have a separate |
just avoiding deleting code (minimizing the size of the diff). |
This makes it less likely that we would clone a branch that has diverged from the commit sha1 specified in the _SHA1 variables in the deps/*.version files ref #10743 (comment)
This makes it less likely that we would clone a branch that has diverged from the commit sha1 specified in the _SHA1 variables in the deps/*.version files ref #10743 (comment)
This makes it less likely that we would clone a branch that has diverged from the commit sha1 specified in the _SHA1 variables in the deps/*.version files ref #10743 (comment)
I know ya'll probably thought I would be the last person to be submitting a PR to kill of the usage of git submodules. but with this little bit of sleight of hand, I'm trying to have my cake and eat it too.
If you run
make NO_GIT=1
, the build doesn't need git at all, doesn't need it installed, nothing. So if you download a zip or tarball from GitHub, it'll even set NO_GIT automatically and keep on chugging.But if you run
make
like usual, this does use git to manage certain dependencies. Additionally, it uses shallow checkouts to keep checkout time and space down, but those are easy enough to convert to full checkouts if required. This lets us have the benefits of git submodules, without the downsides of git submodules. I call this new feature "git-externals".Both usages (NO_GIT=1 and NO_GIT=0) will even live happily side-by-side in the same build tree (so that
make source-dist
is simpler). But wait, there's more: git-externals modules are also fully compatible (forwards and backwards) with their pre-existing git-submodules (although the forward move will force a rebuild).ref #10730