This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Medley Interlisp project use of git #495

Closed
masinter opened this issue Sep 22, 2021 Discussed in #102 · 24 comments

@masinter
Member

Discussed in #102

Converting back to an Issue so we can close it (spawning new issues if there are things undone).

Originally posted by masinter December 21, 2020
There are several problems with the way the project has misused, and continues to misuse, GitHub.
Fixing these requires more Git expertise than I have, and concerted effort. Let me assure you that each of these has a longer explanation than given here.

  1. Storing old versions as separate files in the repo. This refers to seeing FOO and FOO.~1~ in the repo. Doing this enables some workflows that are important now, such as being able to pluck out a previous definition with a simple `GETDEF(FOO FILE;3)`. There might be some way of doing the same with Git, but that doesn't matter unless we hook into the Git API.
  2. This is a minor artifact of 1, but it has an easier fix. The problem is that the VM doesn't exactly follow Emacs' way of numbering versions. In Emacs, if you see foo and foo.~3~, then foo with no version is really version 4. Lisp instead makes an explicit hard link between foo.~4~ and foo, which is fine, except that Git knows nothing about hard links and treats them as two separate files, doubling the space (and Git hashes file names with contents so they don't help). A simple fix would be to write a quick script that scans through looking for foo and foo.~nn~ with the same size and content, then replaces one with a hard link to the other. (This would also remove the annoyance of having an extra version pop up.)
  3. We're storing derived files that should be rebuilt: compiled files (John had code to batch compile everything), sysouts, whereis.hash, exports.all. There are good reasons for each of these, but it should be a goal that someone with no experience could rebuild from source. It's the only way. That's separate from releases (which we're not using now).
  4. I didn't understand the Git model (I still don't, but I've learned a few things): Move a bunch of files in. Add and commit. Then move them out. `git rm` and then commit that. Then move them back in, etc. Multiple copies.
  5. A diff that worked for lisp. And things like that. shellcommand git commands that work with lisp file names
  6. LFS of sysouts might help a little, but 1 and 2 and 3 and 4
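
The quick script suggested in item 2 might look something like the following. This is only a sketch: the function name `relink_versions` is hypothetical, it assumes versioned files are named `NAME.~N~` and sit next to `NAME`, and it relies on `cmp`, `ln`, and the shell's `-ef` test being available.

```shell
# Hypothetical helper: relink_versions DIR
# For every DIR/.../NAME.~N~ whose contents equal DIR/.../NAME,
# replace the versioned file with a hard link to NAME.
relink_versions() {
    dir=${1:-.}
    find "$dir" -type f -name '*.~[0-9]*~' | while IFS= read -r versioned; do
        # Strip the trailing ".~N~" to get the unversioned name.
        plain=${versioned%.~*~}
        [ -f "$plain" ] || continue
        # Skip pairs that already share an inode.
        [ "$versioned" -ef "$plain" ] && continue
        # cmp -s is silent and succeeds only when the contents are identical.
        if cmp -s "$versioned" "$plain"; then
            ln -f "$plain" "$versioned"
            printf 'linked %s -> %s\n' "$versioned" "$plain"
        fi
    done
}
```

Running `relink_versions sources` would print one line per pair it relinks. Versions whose contents differ from the unversioned file are left alone.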
@lassik

lassik commented Sep 23, 2021

On point 1:

Git can check out multiple worktrees from the same clone, but people rarely do this. There probably isn't a standard feature to check out multiple copies of the same file into the same worktree using a running number in the filename.

I would recommend:

  • Storing only the latest version of the file in git at any given time, i.e. normal usage of git.
  • Relying on git history to keep the old versions.
  • Writing a separate shell script that examines the Git history and copies old versions of files from there into the worktree. (EDIT: You already have a very nice script to do just that, added in commit 966b837.)
  • Writing a .gitignore file which tells git to ignore filenames with the pattern *.~[0-9]*~ instead of suggesting to commit them. (Seems this has been done in commit 2f41e9e.)
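
Pulling old versions back out of history, as the bullets above suggest, could be sketched roughly as follows. This is not the project's actual script; `restore_versions` is a hypothetical name, the numbering simply counts the commits that touched the file, and it assumes it is run from the repository root with a path relative to that root.

```shell
# Hypothetical helper: write each committed revision of FILE to FILE.~N~,
# numbering from the oldest commit that touched it.
restore_versions() {
    file=$1
    n=0
    # --reverse lists the oldest commit first; %H prints the full hash.
    for rev in $(git log --reverse --format=%H -- "$file"); do
        n=$((n + 1))
        # "<rev>:<path>" names the file's contents as of that commit.
        git show "$rev:$file" > "$file.~$n~"
    done
}
```

For example, `restore_versions sources/FOO` would leave `sources/FOO.~1~`, `sources/FOO.~2~`, and so on next to the current file. The sketch does not handle commits that deleted the file.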

@masinter
Member Author

masinter commented Sep 23, 2021

Sorry for not being clearer. These are mainly issues that have been addressed. I wanted to not lose the bits that were still open.

See scripts/restore-versions.sh

@masinter
Member Author

@rmkaplan @nbriggs @ecraven

@rmkaplan

I’ve been playing around with a medley interface to the worktree
commands, systematically making worktrees for specified branches in
named subtrees in the file system.
One issue that comes up is that the default seems to be that you can’t
make a worktree for the currently checked-out branch. And if you have
made a worktree for a non-checked-out branch, then you can’t switch to
that branch in the git desktop.
There are some options listed in the documentation that might override
these defaults, but I’m not sure what the consequences are. --force
seems to allow the worktree even if it is the current branch, and
--no-checkout seems to do the add without actually checking it out.
Is it dangerous to specify git worktree add --force --no-checkout <path> <branch>?

As long as you do not commit anything, nothing bad can happen. So just don't run git add .. && git commit ..
;)

It seems that this should be safe if all Medley wants to do is to
compare files across branches and with the separate Medley working
directory.
But would things get out of step if (by mistake) Medley copied a file
into one of these unchecked-out working trees or deleted a file there?
Best would be if these trees were conceptually read-only.

No, as long as you don't commit anything. Ideally, after you are "done", just delete all the directories. I'm not sure why git is designed the way it is, it would seem much better to me if the main repository directory didn't keep track of the worktrees, but only the worktrees pointed back to the main directory, but alas, that's just the way it is...

I'd suggest creating some sort of temp directory (I'll call it ./tmp), then just check out the "main" repo into ./tmp/medley, and create the worktrees in ./tmp/lmm14, ./tmp/ron25, etc. When all the comparing (and adding things to "main" and committing and pushing) is done, just delete all of ./tmp entirely. That way, nothing is left over that could create any problems.
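
A variation on the throwaway-directory idea can be sketched in shell. The helper names `make_scratch_worktrees` and `remove_scratch_worktrees` are hypothetical. Instead of a fresh clone, this version adds worktrees to an existing clone with `--detach`, which checks out the commit without claiming the branch, so it also works for the branch that is currently checked out.

```shell
# Hypothetical helper: make_scratch_worktrees REPO REF...
# Creates one detached worktree per REF under a fresh temp directory
# and prints that directory's path on stdout.
make_scratch_worktrees() {
    repo=$1
    shift
    scratch=$(mktemp -d) || return 1
    for ref in "$@"; do
        # Directory name: the ref with "/" replaced, e.g. origin_lmm14.
        dir=$scratch/$(printf '%s' "$ref" | tr '/' '_')
        # Send git's feedback to stderr so stdout carries only the path.
        git -C "$repo" worktree add --detach "$dir" "$ref" >&2 || return 1
    done
    printf '%s\n' "$scratch"
}

# Hypothetical helper: remove_scratch_worktrees REPO SCRATCH_DIR
remove_scratch_worktrees() {
    repo=$1
    scratch=$2
    rm -rf "$scratch"
    # Drop the main repository's records of the deleted worktrees.
    git -C "$repo" worktree prune
}
```

A session might run `scratch=$(make_scratch_worktrees ~/medley master origin/lmm14)`, compare the trees under `$scratch`, and finish with `remove_scratch_worktrees ~/medley "$scratch"`. The path `~/medley` is only a placeholder.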

@nbriggs nbriggs changed the title from "Medley Interlisp project misuse of GitHub" to "Medley Interlisp project use of git" Nov 17, 2021
@masinter
Member Author

@rmkaplan
Contributor

Is there a command line way of finding out what branch corresponds to a particular PR? Or do you have to go to the website and click in the PR itself to see the branch name?

@rmkaplan
Contributor

The git desktop lists a whole bunch of branches that have not been pulled down to my repo (origin/lmm9, origin/lmm10, etc.), which presumably have PRs associated with them.

Can I build a worktree for the remote branch without first checking it out and maybe getting confused about what's actually checked out? All I want it for is to do comparisons with other branches.

@nbriggs
Contributor

nbriggs commented Nov 18, 2021

That's a case where you don't have a tracking branch. There's no requirement that there be a PR associated with a branch (PRs are github, not git). git can deal with remote branches that it knows about, which are the remote branches that existed at the time you last did a "git fetch".

You can "git worktree add /tmp/xxx origin/lmm14" and it will check out origin/lmm14 into /tmp/xxx just as it would with a local branch -- but it's not necessarily what's on the server, it's what's known in your local repository from the last time you did a "git fetch".

@nbriggs
Contributor

nbriggs commented Nov 18, 2021

In maiko, for example, I have multiple branches that are nominally "master" --

  • gitlab/master
  • origin/master
  • master

At least two of them have quite different things in them.

@nbriggs
Contributor

nbriggs commented Nov 18, 2021

Actually, if you want to get all your remotes updated so that your local copy of the repo matches what's in the remotes, you'd do something like:
for r in $(git remote); do git remote update "$r"; done

@rmkaplan
Contributor

rmkaplan commented Nov 18, 2021 via email

@nbriggs
Contributor

nbriggs commented Nov 18, 2021

They're the branch named "master" in some repository, maybe they contain the same stuff, or related stuff, or maybe they don't. In my case the remote named "gitlab" (which happens to be stored on gitlab.com) has a branch named "master" which represents the state of maiko at some time in the past:

% git log gitlab/master
commit cd4c59dbb3114e4d26260705cb3f10e7b4b67d31
Author: Nick Briggs <nicholas.h.briggs@gmail.com>
Date:   Fri Aug 21 12:18:03 2020 -0700
[...]
% git log origin/master
commit 987cf4c7c6305c07f12e95b80349a2ac45c185a5
Author: Nick Briggs <nicholas.h.briggs@gmail.com>
Date:   Sat Nov 6 18:36:25 2021 -0700
[...]
% git log master
commit 987cf4c7c6305c07f12e95b80349a2ac45c185a5
Author: Nick Briggs <nicholas.h.briggs@gmail.com>
Date:   Sat Nov 6 18:36:25 2021 -0700
[...]

Here origin/master and master happen to be the same because I've not committed (or merged anything into) master before pushing it to origin/master.
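
Whether two names such as origin/master and master point at the same commit can also be checked without reading the logs. A small sketch, with `same_commit` as a hypothetical helper name:

```shell
# Hypothetical helper: same_commit REF_A REF_B
# Succeeds when both names resolve to the same commit; returns 2 if
# either name cannot be resolved in the current repository.
same_commit() {
    a=$(git rev-parse --verify --quiet "$1^{commit}") || return 2
    b=$(git rev-parse --verify --quiet "$2^{commit}") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_commit master origin/master && echo same` prints `same` only when the two refs agree. As with the logs above, this reflects what the local repository knew at the last fetch.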

@rmkaplan
Contributor

From a previous message: "(PRs are github, not git)"

That begs the question, are there github commandline commands that will fetch PR and PR branches?

(Also, this github interface has its own problems, since it doesn't seem to let you go down into threads on previous subdiscussions, just keep adding on. Or am I missing something?)

It appears that --guess-remote does pull an origin/xxx down to a worktree, whether or not it has been pulled down before. And git branch -r returns a list of remote branches.

@nbriggs
Contributor

nbriggs commented Nov 18, 2021

That begs the question, are there github commandline commands that will fetch PR and PR branches?

There may be, but I wouldn't know since github commandline doesn't run on my Mac, and won't run on all the machines that people might be trying to use.
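
With plain git, one possibility relies on GitHub publishing each pull request's head commit under `refs/pull/<number>/head` on the remote. The sketch below, with the hypothetical helper `pr_branch`, looks up that ref and prints any branch on the same remote whose tip is the same commit. It assumes the PR branch lives in the same repository rather than a fork; otherwise it prints nothing.

```shell
# Hypothetical helper: pr_branch REMOTE NUMBER
# Prints the branch name(s) on REMOTE whose tip matches the head of
# pull request NUMBER. Returns 1 if the remote has no such pull ref.
pr_branch() {
    remote=$1
    number=$2
    # ls-remote prints "<sha><TAB><ref>" for each matching ref.
    sha=$(git ls-remote "$remote" "refs/pull/$number/head" | cut -f1)
    [ -n "$sha" ] || return 1
    git ls-remote --heads "$remote" |
        awk -v sha="$sha" '$1 == sha { sub("^refs/heads/", "", $2); print $2 }'
}
```

Something like `pr_branch origin 123` would then print the branch name, and `git fetch origin pull/123/head:pr-123` would fetch the PR's commits into a local branch. The number 123 and the local branch name `pr-123` are placeholders.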

I didn't need to use "--guess-remote" to do the git worktree add, but then I'd fetched things from the remote before. You still should do the git remote update to ensure that things are updated.
"git branch -r" and "git branch -l" will give you remote and local branches, "git branch -a" will give you everything but names the remote branches in a different format.

@orcmid

orcmid commented Nov 18, 2021

@rmkaplan That begs the question, are there github commandline commands that will fetch PR and PR branches?

@nbriggs There may be, but I wouldn't know since github commandline doesn't run on my Mac, and won't run on all the machines that people might be trying to use.

Until this thread, I had no idea "Github command line" was a thing. They claim that it is available for MacOS, Windows, and Linux so I am not clear what the limitation is (being a happy GitHub Desktop + TortoiseGit user). What platform is neglected?

https://github.com/cli/cli#installation

@orcmid

orcmid commented Nov 18, 2021

@rmkaplan (Also, this github interface has its own problems, since it doesn't seem to let you go down into threads on previous subdiscussions, just keep adding on. Or am I missing something?)

Hmm, I suppose because this is all based on a simple issue tracker, with Discussions and Issues being not that distinct in terms of reply non-threading. You can make your own bread crumbs / mouse tracks, although it strains all these being simple Markdown entries.

I was obsessive enough to exploit the permalinks that GitHub creates for Markdown headers in recovering some old material to a docs/ yesterday. It's a brittle process. If I change a heading, I break all the internal and external links. And I have to publish a page to see what the HTML anchors are. I miss the way FrontPage handled all this. :(

@nbriggs
Contributor

nbriggs commented Nov 18, 2021

They claim that it is available for MacOS, Windows, and Linux so I am not clear what the limitation is (being a happy GitHub Desktop + TortoiseGit user).

@orcmid -- It's supported on some versions of MacOS. On OSX El Capitan (10.11.6) it fails with

% bin/gh --help
dyld: Symbol not found: _clock_gettime
  Referenced from: /private/tmp/gh_2.2.0_macOS_amd64/bin/gh
  Expected in: flat namespace

... and it's not actually possible to compile that (correctly) from source because of the missing _clock_gettime implementation.
Anyone who wanted to use it on any *BSD or Solaris system would have to build their own from source, assuming that the code is sufficiently portable there.

@orcmid

orcmid commented Nov 18, 2021

@nbriggs Anyone who wanted to use it on any *BSD or Solaris system would have to build their own from source, assuming that the code is sufficiently portable there.

Oh my, "These functions are part of the Timers option and need not be available on all implementations." although allegedly POSIX. OpenBSD has a nice account of its support.

@nbriggs
Contributor

nbriggs commented Nov 18, 2021

@orcmid -- depending on exactly how they built the executable, I might be able to edit in a dependency on an additional shared library wherein I can implement clock_gettime(...) since I think there are Mach equivalents, just not with that name. However, it's not clear that there's any real return for that investment.

@rmkaplan
Contributor

I now understand that there are 2 completely different work-practice issues wrapped up in this; they should be split out into separate discussions.

  1. The process of reviewing changes in response to a PR request, prior to merging an origin branch into the origin/master branch. How can this be done in Medley, so that the differences are meaningful and easy to understand? See "Medley interface for PR branch comparisons and approvals" #575.

  2. The process of comparing a separate Medley working environment (with Medley version numbers etc.) with what is in one or more different branches, either already pulled or perhaps still out there in the repository. And then also committing to a local branch that can eventually be published with an associated PR for merging into origin/master. See "Interface between normal Medley working directory and local git clone branches" #576.

Both of these have the goal of using meaningful Medley comparison tools (better tuned versions of COMPAREDIRECTORIES and COMPARESOURCES). But one is a comparison between 2 remote git branches, with no Medley-based action other than compare. The other is a comparison between a local git branch and a non-git Medley directory, and maybe some Medley based file-transfers to and from branches in the local repository.

@masinter
Member Author

The nest of interlinked issues is confusing.
The problem is the Porcelain level git and GitHub commands don't correspond to what's going on, and include some DWIM-ish coercions that interfere with further scripting.
Yes, the workflow includes give and take, a communication with two flows, two perspectives, but not two protocols or two issues with separate resolution.
You need to review others' changes and make reviewable changes.
They need to review your changes and make reviewable changes.

@rmkaplan
Contributor

rmkaplan commented Nov 20, 2021 via email

@masinter
Member Author

There's a third workflow I don't think you covered: whether or not you've reviewed the changes, accepting the changes and integrating into your workspace. So if Bill makes a change and I approve and merge it, it's now there in Master.
If you've made some changes in your private copy of the repo and HAVEN'T changed the file, what do you do?
You don't want to do any updates to GitHub but then?
you may want to run in a merged environment or not.

@rmkaplan
Contributor

rmkaplan commented Nov 22, 2021 via email

@masinter
Member Author

Instead of using 'gh' as a 'shell command' from Medley, what about calling the github API from inside lisp, with some kind of HTTP API subr? A "REST"-style API? There's some ongoing standards work in IETF to standardize API calls over HTTP.
We could find some CL or emacs-lisp implementation of JSON.
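
Whatever language ends up making the call, the request itself is small. As a rough shell sketch of the same idea, the GitHub REST API returns a pull request as JSON from `/repos/OWNER/REPO/pulls/NUMBER`, and the branch name is in the `head.ref` field. The helper names below are hypothetical, and `pr_head_ref` assumes `curl` and `jq` are installed and that the repository is public, since no authentication is sent.

```shell
# Hypothetical helper: pr_api_url OWNER REPO NUMBER
# Prints the REST API URL for one pull request.
pr_api_url() {
    printf 'https://api.github.com/repos/%s/%s/pulls/%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: pr_head_ref OWNER REPO NUMBER
# Fetches the pull request as JSON and prints its source branch name.
pr_head_ref() {
    curl -fsSL -H 'Accept: application/vnd.github+json' \
        "$(pr_api_url "$1" "$2" "$3")" |
        jq -r '.head.ref'
}
```

A Lisp-side implementation would need the same two pieces: an HTTP GET against that URL and a JSON reader to pick out `head.ref`.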

@Interlisp Interlisp locked and limited conversation to collaborators Dec 24, 2021
@masinter masinter converted this issue into discussion #637 Dec 24, 2021

6 participants