-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Git LFS #80
Comments
Filters are supported in libgit2, but not in Also, the list of files that use lfs is stored in |
I'm worried about clean/smudge in general because it seems expensive to make it behave consistently. https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes says that smudging happens just before checkout and cleaning happens just before staging. If that's correct, then that seems to mean that |
(Update: On second thought, this paragraph might not really be addressing your point) Yes, I remember setting up the LFS repo being a pain. I don't remember how git reacted to changing For reference, the setup looks like this:
(I use Github's LFS support to sync a few binaries across my machines. I use My sense is that only the
I think that's what the
This works pretty well for LFS specifically. I'm not sure what else, if anything, clean/smudge filters are used for. This does make LFS a little different from the way |
This might all be quite awkward with |
Note that LFS is widely disliked due to half-baked support from GitHub (small quotas, made worse by being consumed by both third-party forks and CI activity) and even less support elsewhere. I hope jj can offer a first-class solution to large binary file handling eventually, and that any LFS compat is forwards-compatible with that.
What else might they display? A diff of a large binary file will rarely be intelligible, even with a suitable diff algorithm (e.g. based on rolling hashes). I suppose it would be cool for a Sufficiently Smart diff viewer to be able to e.g. display two versions of a .png, but in general large binary files are opaque. |
Fair enough :) But the point still stands for clean/smudge in general. |
As I mentioned, there seemed to be some attempt to support intelligent diffing with LFS. I would guess that the original idea was for the user to configure custom diff tools (e.g. for pngs) that would be called by |
Regardless of LFS's merits, I imagine there's plenty of other users like me that would love to use jj in an existing repo that uses git lfs, but currently can't. Whether I use jj should be opaque to others in the repo, so this wouldn't be a compelling reason to migrate the codebase away from lfs (to whatever solution may be better) in a many-user repo. |
I agree, support for LFS would be mostly (maybe only) to make it easier to use jj with existing git repos. |
Is there a way to ignore LFS files so that LFS-enabled repos can still use jj, even if it means we can't interact with LFS itself while using jj? Just wondering if there's a way to use jj without needing full support for LFS. |
If you don't need the LFS files, then it probably already works - you'd just see the placeholder files (pointers to the real content) in the working copy, I think. But I suspect that you need the actual files, and for that I can't think of a good solution. Oh, using sparse checkouts in a colocated repo might work. However, sparse checkouts don't currently support negative patterns, so it could be very annoying to maintain the sparse patterns depending on your repo. Hmm, it also looks like we don't have any documentation about sparse checkouts, other than |
So This solution would have worked great especially if it supported negative globs so that I could just add my existing LFS globs in my |
We do want to add support for arbitrary patterns in Then we'd also need to figure out the UX for adding and removing patterns. Git seems to use the same format as for |
Instead of patterns being "prefixes" as they are now, can't we allow arbitrary globs in Reading comments in this issue, it seems like Edit: adding/removing globs in the command line is (as the git docs mention) error-prone. I personally think this is fine as long as we add warnings about it. An alternative would be a Edit 2: the git docs mention that having "non-cone" globs can slow down commands, but (naively) this seems like it could be solved by compiling patterns similarly to what |
If we accept both things like
That would be appreciated, thanks! Just to be clear, FYI, the typical use case for sparse checkouts is when you're working on only a small part of a large repo, like working only on a particular file system in the Linux repo (I have never worked in the Linux repo, so I have no idea if that's a realistic example - maybe you need most of the repo in order to do a build anyway, for example).
Yes, I think that would be useful. I was looking for
There are still cases that can't be made fast, like |
For posterity, there are other legit uses of Git's clean/smudge filters beyond LFS: https://github.com/elasticdog/transcrypt |
Oh, I didn't realize that With that said, I actually implemented |
This confused me. My impression was the opposite: jj is only allowed to touch (or erase) files in |
I think @71 meant when you go from having some part of the workspace populated to having that part not populated, then the @71, if you make git populate those paths after setting the sparse patterns with |
In my case, the problem was that I had files that I did not want in the Git repo.
|
Do you mean that you want it as an untracked file? You can do |
I will add this again later, [once it is supported][issue]. In the meantime, though, I will ignore it from Jujutsu's POV, but force add it from Git's POV so that it ends up in the repo from a Git(Hub) perspective without confusing Jujustu. [issue]: jj-vcs/jj#80
I will add this again later, [once it is supported][issue]. In the meantime, though, I will ignore it from Jujutsu's POV, but force add it from Git's POV so that it ends up in the repo from a Git(Hub) perspective without confusing Jujustu. [issue]: jj-vcs/jj#80
I will add this again later, [once it is supported][issue]. In the meantime, though, I will ignore it from Jujutsu's POV, but force add it from Git's POV so that it ends up in the repo from a Git(Hub) perspective without confusing Jujustu. [issue]: jj-vcs/jj#80
I would like to echo this sentiment: Beloved or not, LFS is fairly widely used to track binary files in repos for various reasons, and missing support for it excludes a large amount of repositories from use with jj. It's a very promising sign that "mundane" user compatibility concerns like this are taken seriously 👍 thanks for that, and thanks for jujutsu! |
To add some colour to this, I think it's not necessary that jj support the full set of Git LFS features (like smudge, clean, etc), only that jj be able to interop gracefully in a colocated Git LFS repo. IMO it's okay to say this out of scope for If so, then perhaps all that's needed is for
There will be some unhappy cases when the Apologies if this was suggested earlier, I tried to digest the discussion so far as best as I could. |
This is the one thing stopping me from using |
Thanks! That should only require changes to https://github.com/jj-vcs/jj/blob/main/lib/src/local_working_copy.rs, specifically the Note that this approach means that things like |
@bcspragu Depending on your appetite for implementation, you could also try to pick up the work I linked in this comment: #2920 (comment). It should also handle arbitrary smudge/clean filters, submodules, etc., but needs more investment. |
I have a functional (but extremely janky) patch here main...bcspragu:jj:main. Some notes:
Apologies, but I'm not quite that hungry yet 😅. Though I think that approach is a much more reasonable one in terms of what could actually be merged in |
@bcspragu I took some time to add support for multiple gitattributes files on top of your change in weiznich@a125051. That also fixes the problem with the relative paths. Maybe one of the jj maintainers can give some hints how to move this forward from here on? |
I'm not sure the |
@martinvonz https://github.com/weiznich/jj/blob/ignore_lfs/lib/src/matchers.rs#L592-L616 implements exactly that. I've run it locally on a repository that useses several gitattributes files in different subfolders and it seems to work fine. |
Please send a PR when it's ready. That should include tests. Please put those tests in the library crate (not the CLI crate). |
Thanks for the updates, looks excellent!
I can pull in @weiznich's changes (if that works for them) and add some library tests. Just to confirm, the idea is to ship the LFS-ignoring functionality as a part of |
@bcspragu If you have the capacity to put up a PR with tests that would be great. Just pull in my changes if that helps you. |
Yes, I'm fine with that. I assume others will object if they disagree.
That's probably best. |
Git LFS seems to be used frequently enough that it may be worth adding support for it. I don't think it'll be a priority for me very soon, but I guess that depends on how many people want it.
The specification says that it uses clean/smudge filters. We don't have anything like that yet. So the first step is probably to add support for that. We could make the filters only available internally (i.e. in Rust code) to start with to keep it simple. On the other hand, it might not be hard to make them user-configurable.
Another option is to add a separate file type in the data model for LFS entries. For reference, we currently have files, symlinks, trees, conflicts, and gitmodules. I haven't thought through the consequences yet. There should be no difference to the user and no difference in the representation when using the Git backend. However, clean/smudge filters are probably useful to have anyway. Oh, one possible advantage of representing LFS entries in the model is that we can decide to always leave merged LFS files as conflicts, without downloading the files until the user checks them out or looks at the diff etc.
I don't yet know what other aspects of Git LFS we need to consider.
Originally requested in #77.
The text was updated successfully, but these errors were encountered: