Consider moving extra metadata from git notes to other storage #7
Comments
I think this is just cleaner, and it gives us room to put other store-related data in the `.jj/store/` directory. I may want to use that place for writing the metadata we currently write in Git notes (#7).
Fun fact: I replaced the Git notes storage by a simple custom format and |
Do you have a solution for associating metadata with nodes/edges in the commit graph? If so, is it something that can be shared between repositories?
I currently use Git notes, which can technically be shared, but it's not obvious how to do that for a regular user. I could of course add some command for making it easier, but there's not much in the extra metadata that is important for sharing, and now that evolution is pretty much gone, there's even less. The only pieces of information are the change id and the open/closed flag. I hadn't planned to exchange it until there's exchange between native jj repos (i.e. years from now :)). Allowing exchange of the information is thus not something I aimed for with the format I'm replacing Git notes with. The format can be thought of as naive Git packfiles without delta encoding or compression. Does git-branchless also need to associate metadata with nodes and/or edges? What metadata?
I'm trying to replace the Git backend's use of Git notes for storing metadata (#7). This patch adds a file format that I hope can be used for that. It's a simple generic format for storing fixed-size keys and associated variable-size values. The keys are stored in sorted order. Each key is followed by an offset to the value. The offset is relative to the first value. All values are concatenated after each other. I suppose it's a bit like Git's pack files but lacking both delta-encoding and compression. Each file can also have a parent pointer (just like the index files have), so we don't have to rewrite the whole file each time. As with the index files, the new format squashes a file into its parent if it contains more than half the number of entries of the parent. The code is also based on `index.rs`. Perhaps we can also replace the default operation storage with this format. Maybe also the native local backend's storage. We'll need delta-encoding and compression soon then.
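To make the layout described above concrete, here is a minimal sketch of a single table file without a parent pointer. It is an illustration under stated assumptions, not the code from the patch: the 4-byte key width, the leading entry count, the little-endian `u32` offsets, and the names `serialize`, `lookup`, and `should_squash` are all hypothetical.

```rust
// Hypothetical sketch of the described layout; not jj's actual implementation.
// Assumed layout: [entry count: u32 LE][(key, value offset: u32 LE)*][values*]
const KEY_SIZE: usize = 4; // assumed fixed key width for this example
const ENTRY_SIZE: usize = KEY_SIZE + 4; // key followed by its value offset

/// Sorts the entries by key and writes them out. Each offset is relative
/// to the start of the value section (that is, to the first value).
fn serialize(entries: &mut [([u8; KEY_SIZE], Vec<u8>)]) -> Vec<u8> {
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    let mut index = Vec::new();
    let mut values = Vec::new();
    for (key, value) in entries.iter() {
        index.extend_from_slice(key);
        index.extend_from_slice(&(values.len() as u32).to_le_bytes());
        values.extend_from_slice(value);
    }
    let mut out = (entries.len() as u32).to_le_bytes().to_vec();
    out.extend_from_slice(&index);
    out.extend_from_slice(&values);
    out
}

/// Reads a little-endian u32 at `pos`.
fn read_u32(data: &[u8], pos: usize) -> usize {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&data[pos..pos + 4]);
    u32::from_le_bytes(buf) as usize
}

/// Binary-searches the sorted keys and returns the matching value, if any.
fn lookup<'a>(data: &'a [u8], key: &[u8; KEY_SIZE]) -> Option<&'a [u8]> {
    let count = read_u32(data, 0);
    let values_start = 4 + count * ENTRY_SIZE;
    let (mut lo, mut hi) = (0, count);
    while lo < hi {
        let mid = (lo + hi) / 2;
        let pos = 4 + mid * ENTRY_SIZE;
        match data[pos..pos + KEY_SIZE].cmp(&key[..]) {
            std::cmp::Ordering::Less => lo = mid + 1,
            std::cmp::Ordering::Greater => hi = mid,
            std::cmp::Ordering::Equal => {
                let start = values_start + read_u32(data, pos + KEY_SIZE);
                // A value ends where the next one starts, or at end of file.
                let end = if mid + 1 < count {
                    values_start + read_u32(data, pos + ENTRY_SIZE + KEY_SIZE)
                } else {
                    data.len()
                };
                return Some(&data[start..end]);
            }
        }
    }
    None
}

/// The squash rule as stated: merge a file into its parent once it holds
/// more than half as many entries as the parent.
fn should_squash(child_entries: usize, parent_entries: usize) -> bool {
    2 * child_entries > parent_entries
}
```

With a parent pointer, a lookup that misses in one file would presumably continue in the parent file; that chaining is left out here to keep the sketch short.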
The new store works the same way as the `OpHeadsStore`. It keeps track of the current head file(s) by recording their names in a directory. When a write happens, it adds the new head and then removes the old head. There will generally be a single head at a time. The only exception is when there have been concurrent operations (locally, or remotely, in the case of a distributed file system). When there are multiple head files, they are automatically merged. No guarantee is given about which value wins if the key exists in several heads; the store is meant to be used for data that's immutable once written. As long as different keys are written, this is a CRDT. That makes it fit for solving both #3 and #7.
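The merge behaviour described above can be sketched as a plain union of key/value maps. This is only an illustration: the in-memory `Table` alias and the `merge_heads` function are hypothetical names, and the real store works on table files rather than `BTreeMap`s.

```rust
use std::collections::BTreeMap;

// Hypothetical in-memory stand-in for one head's contents.
type Table = BTreeMap<Vec<u8>, Vec<u8>>;

/// Merges several heads by taking the union of their entries.
/// If a key appears in more than one head, which value wins is
/// unspecified by design; this sketch happens to keep the first one seen.
fn merge_heads(heads: &[Table]) -> Table {
    let mut merged = Table::new();
    for head in heads {
        for (key, value) in head {
            merged.entry(key.clone()).or_insert_with(|| value.clone());
        }
    }
    merged
}
```

When the heads contain different keys, the result does not depend on the order in which they are merged, which is the property that makes the store behave like a CRDT for write-once data.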
We don't want to re-read the whole table(s) every time we read extra metadata for a commit (which is the immediate use-case I'm aiming for in #7).
The git backend gets really slow after many commits have been created. Profiling has shown that the problem is git notes. One problem is that libgit2 doesn't do sharding. Manually editing a note from the command line using `git notes` helps, but it's still very slow. We should consider moving the extra metadata to some other storage, perhaps a custom format.
Another option might be for the git backend to simply cache the notes tree. I don't know how much that would help.
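For readers unfamiliar with the term, "sharding" here presumably refers to the fanout layout that Git's own notes code can use, where the hex id of the annotated object is split into two-character directory levels so that no single tree grows too large. A minimal sketch of that path computation follows; the function name `fanout_path` is hypothetical, and the input is assumed to be an ASCII hex id longer than `2 * levels` characters.

```rust
/// Splits a hex object id into `levels` two-character directory components
/// followed by the remainder, joined with '/'.
/// Hypothetical helper for illustration; assumes an ASCII hex id.
fn fanout_path(hex_id: &str, levels: usize) -> String {
    let split = (2 * levels).min(hex_id.len());
    let (dirs, rest) = hex_id.split_at(split);
    let mut parts: Vec<&str> = dirs
        .as_bytes()
        .chunks(2)
        .map(|chunk| std::str::from_utf8(chunk).unwrap())
        .collect();
    parts.push(rest);
    parts.join("/")
}
```

With zero levels every note sits in one flat tree, so each added note means rewriting one very large tree object; with one or two levels only the small subtrees along the path change.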
One advantage of the current storage in git notes is that it lets us exchange the data using regular git commands.