forked from go-gitea/gitea
-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
git-annex
#1
Draft
kousu
wants to merge
8
commits into
main
Choose a base branch
from
git-annex
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
kousu
force-pushed
the
git-annex
branch
2 times, most recently
from
February 1, 2022 08:02
214f09d
to
d463427
Compare
github-actions
bot
force-pushed
the
main
branch
from
March 26, 2022 20:37
ded2781
to
b24a462
Compare
github-actions
bot
force-pushed
the
main
branch
from
March 27, 2022 09:11
b24a462
to
65e42f8
Compare
Draft
github-actions
bot
force-pushed
the
git-annex
branch
from
March 28, 2022 19:22
4273c59
to
a511a8c
Compare
github-actions
bot
force-pushed
the
git-annex
branch
from
March 28, 2022 19:50
a511a8c
to
3a12f44
Compare
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 5, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync
bot
force-pushed
the
git-annex
branch
from
November 6, 2023 13:13
b336d40
to
e977ccd
Compare
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 6, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 7, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync
bot
force-pushed
the
git-annex
branch
2 times, most recently
from
November 8, 2023 13:11
0df2117
to
f78d0d8
Compare
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 8, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync
bot
force-pushed
the
git-annex
branch
from
November 9, 2023 13:11
f78d0d8
to
d6f2dbf
Compare
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 9, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync
bot
force-pushed
the
git-annex
branch
from
November 10, 2023 13:11
d6f2dbf
to
81f56b1
Compare
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 10, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync
bot
force-pushed
the
git-annex
branch
from
November 11, 2023 13:09
81f56b1
to
d75a62f
Compare
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 11, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 12, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync
bot
force-pushed
the
git-annex
branch
from
November 12, 2023 13:09
d75a62f
to
5a4179a
Compare
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 13, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync
bot
force-pushed
the
git-annex
branch
from
November 13, 2023 13:12
5a4179a
to
d952a4b
Compare
gitea-sync bot
pushed a commit
that referenced
this pull request
Nov 14, 2023
We plan to keep this fork alive for a significant amount of time, because it will hold an extra feature (#1) that upstream doesn't have. Automate keeping up to date with upstream to reduce our overhead. The scheme I chose for this is a little bit convoluted. There are three branches involved, but this means that the patch that we might potentially upstream is clean. Also, this scheme can't fix merge conflicts automatically, but it can find and email about them automatically, which is about as good as we can hope for.
gitea-sync
bot
force-pushed
the
git-annex
branch
from
November 14, 2023 13:12
d952a4b
to
1790aeb
Compare
[git-annex](https://git-annex.branchable.com/) is a more complicated cousin to git-lfs, storing large files in an optional-download side content. Unlike lfs, it allows mixing and matching storage remotes, so the content remote(s) doesn't need to be on the same server as the git remote, making it feasible to scatter a collection across cloud storage, old harddrives, or anywhere else storage can be scavenged. Since this can get complicated, fast, it has a content-tracking database (`git annex whereis`) to help find everything later. The use-case we imagine for including it in Gitea is just the simple case, where we're primarily emulating git-lfs: each repo has its large content at the same URL. Our motivation is so we can self-host https://www.datalad.org/ datasets, which currently are only hostable by fragilely scrounging together cloud storage -- and having to manage all the credentials associated with all the pieces -- or at https://openneuro.org which is fragile in its own ways. Supporting git-annex also allows multiple Gitea instance to be annex remotes for each other, mirroring the content or otherwise collaborating the split up the hosting costs. Enabling -------- TODO HTTP ---- TODO Permission Checking ------------------- This tweaks the API in routers/private/serv.go to expose the calling user's computed permission, instead of just returning HTTP 403. This doesn't fit in super well. It's the opposite from how the git-lfs support is done, where there's a complete list of possible subcommands and their matching permission levels, and then the API compares the requested with the actual level and returns HTTP 403 if the check fails. But it's necessary. The main git-annex verbs, 'git-annex-shell configlist' and 'git-annex-shell p2pstdio' are both either read-only or read-write operations, depending on the state on disk on either end of the connection and what the user asked it to ask for, with no way to know before git-annex examines the situation. So tell the level via GIT_ANNEX_READONLY and trust it to handle itself. In the older Gogs version, the permission was directly read in cmd/serv.go: ``` mode, err = db.UserAccessMode(user.ID, repo) ``` - https://github.com/G-Node/gogs/blob/966e925cf320beff768b192276774d9265706df5/internal/cmd/serv.go#L334 but in Gitea permission enforcement has been centralized in the API layer. (perhaps so the cmd layer can avoid making direct DB connections?) Deletion -------- git-annex has this "lockdown" feature where it tries really quite very hard to prevent you deleting its data, to the point that even an rm -rf won't do it: each file in annex/objects/ is nested inside a folder with read-only permissions. The recommended workaround is to run chmod -R +w when you're sure you actually want to delete a repo. See https://git-annex.branchable.com/internals/lockdown So we edit util.RemoveAll() to do just that, so now it's `chmod -R +w && rm -rf` instead of just `rm -rf`.
Fixes #11 Tests: * `git annex init` * `git annex copy --from origin` * `git annex copy --to origin` over: * ssh for: * the owner * a collaborator * a read-only collaborator * a stranger in a * public repo * private repo And then confirms: * Deletion of the remote repo (to ensure lockdown isn't messing with us: https://git-annex.branchable.com/internals/lockdown/#comment-0cc5225dc5abe8eddeb843bfd2fdc382) ------ To support all this: * Add util.FileCmp() * Patch withKeyFile() so it can be nested in other copies of itself ------- Many thanks to Mathieu for giving style tips and catching several bugs, including a subtle one in util.filecmp() which neutered it. Co-authored-by: Mathieu Guay-Paquet <mathieu.guay-paquet@polymtl.ca>
Fixes #8 Co-authored-by: Mathieu Guay-Paquet <mathieu.guaypaquet@gmail.com>
This makes HTTP symmetric with SSH clone URLs. This gives us the fancy feature of _anonymous_ downloads, so people can access datasets without having to set up an account or manage ssh keys. Previously, to access "open access" data shared this way, users would need to: 1. Create an account on gitea.example.com 2. Create ssh keys 3. Upload ssh keys (and make sure to find and upload the correct file) 4. `git clone git@gitea.example.com:user/dataset.git` 5. `cd dataset` 6. `git annex get` This cuts that down to just the last three steps: 1. `git clone https://gitea.example.com/user/dataset.git` 2. `cd dataset` 3. `git annex get` This is significantly simpler for downstream users, especially for those unfamiliar with the command line. Unfortunately there's no uploading. While git-annex supports uploading over HTTP to S3 and some other special remotes, it seems to fail on a _plain_ HTTP remote. See #7 and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92. This is not a major loss since no one wants uploading to be anonymous anyway. To support private repos, I had to hunt down and patch a secret extra security corner that Gitea only applies to HTTP for some reason (services/auth/basic.go). This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/ Fixes #3 Co-authored-by: Mathieu Guay-Paquet <mathieu.guaypaquet@polymtl.ca>
This moves the `annexObjectPath()` helper out of the tests and into a dedicated sub-package as `annex.ContentLocation()`, and expands it with `.Pointer()` (which validates using `git annex examinekey`), `.IsAnnexed()` and `.Content()` to make it a more useful module. The tests retain their own wrapper version of `ContentLocation()` because I tried to follow close to the API modules/lfs uses, which in terms of abstract `git.Blob` and `git.TreeEntry` objects, not in terms of `repoPath string`s which are more convenient for the tests.
Previously, Gitea's LFS support allowed direct-downloads of LFS content, via http://$HOSTNAME:$PORT/$USER/$REPO/media/branch/$BRANCH/$FILE Expand that grace to git-annex too. Now /media should provide the relevant *content* from the .git/annex/objects/ folder. This adds tests too. And expands the tests to try symlink-based annexing, since /media implicitly supports both that and pointer-file-based annexing.
This updates the repo index/file view endpoints so annex files match the way LFS files are rendered, making annexed files accessible via the web instead of being black boxes only accessible by git clone. This mostly just duplicates the existing LFS logic. It doesn't try to combine itself with the existing logic, to make merging with upstream easier. If upstream ever decides to accept, I would like to try to merge the redundant logic. The one bit that doesn't directly copy LFS is my choice to hide annex-symlinks. LFS files are always _pointer files_ and therefore always render with the "file" icon and no special label, but annex files come in two flavours: symlinks or pointer files. I've conflated both kinds to try to give a consistent experience. The tests in here ensure the correct download link (/media, from the last PR) renders in both the toolbar and, if a binary file (like most annexed files will be), in the main pane, but it also adds quite a bit of code to make sure text files that happen to be annexed are dug out and rendered inline like LFS files are.
Upstream can handle the full test suite; to avoid tedious waiting, we only test the code added in this fork.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Support
git-annex
, an alternative togit-lfs
which is already supported ingitea
.This is like GIN (https://github.com/G-Node/gogs/), but based off of Gitea instead of the older, less maintained, fork named gogs. Unlike GIN, this is just git-annex support; there's no theme changes or DOI support, because those are complicated and irrelevant for my uses.
I don't intend to ever merge this PR; instead, it is packaged directly out of this branch (#2). But I want to have the PR to make it easy to demonstrate the patch.