Version Control with SMC #370

@mforbes

For research collaboration, it is often important to be able to use a version control system (VCS) to manage one's assets. While properly version controlling notebooks still has some issues (for example, how to deal with output), there are workarounds.

This issue outlines some of the problems I have encountered, and possible suggestions. It might turn out that the solution is simply a workflow procedure, not something that requires a change in SMC, but I thought this would be a good place to record these thoughts.

Problem

This issue addresses some of the problems I have encountered with trying to use version control on SMC and some suggestions for possible resolution.

  1. Checkout and updates can conflict with the web UI. This has been noted in issue #96 ("Jupyter/IPython notebooks -- file uploaded but not updated in SMC"), where it seems that the issue is having a checkpoint with a newer timestamp than the file being checked out from the VCS. The result is that a recently checked out notebook, when viewed, will almost immediately revert to the checkpoint.
  2. Collaboration: Another issue is that when different collaborators log into a project, they all log in as the same user. Thus, when a user tries to commit a change, things will first fail (since no user has been set up), and then, once a ~/.hgrc file or similar is defined for the project, all users will behave as the single user specified in this file. Instead, it would be desirable for the system to somehow identify who is logged in so that different users can be properly distinguished.

Possible Resolution

  1. The first issue will be resolved when issue #96 ("Jupyter/IPython notebooks -- file uploaded but not updated in SMC") is fixed.
  2. The second issue is a bit more challenging and might simply involve workflow rather than changes in SMC itself. The challenge is that one typically runs VCS commands after connecting to the machine with ssh. Since the SMC process for this is keyless authentication, each user connects as the same USER, hence it seems there is no immediate way to differentiate collaborators.

Is there some possibility of rolling the user id into the project hash, so that the SMC system can assign a meaningful username differentiating who is logged in? This would probably be the ideal solution from a user's perspective, but might be challenging to implement. If this is possible, it might be nice to have a configuration section in the SMC UI where users could input their VCS username so that it can be accessed in the project. (For example, a VCS_USERNAME or HG_USERNAME, etc. environment variable could be populated so that these could be used in ~/.hgrc.)
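As a rough sketch of how a shared ~/.hgrc could pick up such a variable: Mercurial expands environment variables in the `ui.username` setting, so a single project-wide file can defer to whatever HG_USERNAME is set in each collaborator's session. The variable name HG_USERNAME is the one proposed above; the scratch path HGRC_PATH is an assumption made here so that the example does not overwrite a real ~/.hgrc.

```shell
# Sketch only: write an example hgrc to a scratch file rather than ~/.hgrc.
# HGRC_PATH is a hypothetical variable introduced for this example.
HGRC_PATH="${HGRC_PATH:-$(mktemp)}"

# The quoted 'EOF' keeps $HG_USERNAME literal in the file, so Mercurial
# (not this shell) expands it when a commit is made.
cat > "$HGRC_PATH" <<'EOF'
[ui]
# Expanded from the environment of whoever runs "hg commit"
username = $HG_USERNAME
EOF

cat "$HGRC_PATH"
```

Mercurial also reads the HGUSER environment variable directly, so populating HGUSER instead of HG_USERNAME might avoid the ~/.hgrc indirection altogether.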

The alternative is to somehow have each user's local information sent when they ssh into the project. In principle this could be done with something like SendEnv HG_USERNAME in their ~/.ssh/config file, but this would require AcceptEnv HG_USERNAME on the SMC machine (which might be considered a security risk).
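The two halves of this SendEnv/AcceptEnv approach might look roughly as follows. The helper function names and the host alias `smc-project` are hypothetical, introduced only for illustration; SendEnv and AcceptEnv are the standard OpenSSH client and server options.

```shell
# Print the client-side stanza a collaborator would add to ~/.ssh/config.
# $1 is the host alias used for the project (hypothetical here).
ssh_client_stanza() {
    printf 'Host %s\n    SendEnv HG_USERNAME\n' "$1"
}

# Print the server-side line an administrator would add to
# /etc/ssh/sshd_config; without it, sshd ignores the variable sent above.
sshd_accept_line() {
    printf 'AcceptEnv HG_USERNAME\n'
}

ssh_client_stanza smc-project
sshd_accept_line
```

Because AcceptEnv names a single variable here, only HG_USERNAME would be let through, which may limit the security exposure. For a one-off connection the client option can also be given on the command line as `ssh -o SendEnv=HG_USERNAME ...`.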

Another related alternative is to have the user specify these variables in the authorized_keys file:

environment="HG_USERNAME=name@gmail.com" ssh-rsa ...

This still requires the server to set the PermitUserEnvironment option in /etc/ssh/sshd_config though. It seems like this might be the best option, as then there is already a "UI" for setting these variables when the user specifies their keys.
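A small sketch of how such an authorized_keys entry could be assembled from a username and a public key. The function name is hypothetical, and the key is left as the same elided `ssh-rsa ...` placeholder used above.

```shell
# Build one authorized_keys line that pins HG_USERNAME to a given key.
# $1 = VCS username, $2 = the public key text exactly as the user supplied it.
authorized_keys_line() {
    printf 'environment="HG_USERNAME=%s" %s\n' "$1" "$2"
}

# Placeholder key text; a real entry would carry the full public key.
authorized_keys_line 'name@gmail.com' 'ssh-rsa ...'
```

On the server side, as far as I know, recent OpenSSH releases let PermitUserEnvironment take a pattern list of variable names instead of a plain `yes`, so something like `PermitUserEnvironment HG_USERNAME` might narrow what users are allowed to set; this should be checked against the sshd_config manual for the installed version.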

There might be some sort of solution similar to using a HEREDOC:

ssh bec <<EOF
export HG_USERNAME="${HG_USERNAME}"
command
EOF

but this does not permit logging into the server (at least I don't see how to tell ssh to execute the command and then let me have control... even if command is bash it just logs out.)
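One possible way around this limitation, offered as an untested sketch: the heredoc becomes the remote shell's standard input, so the shell exits as soon as that input runs out. Passing the command as an argument instead, and asking ssh for a terminal with `-t`, leaves the session attached to the keyboard. The helper name `remote_login_command` is hypothetical, and the sketch assumes bash is available on the remote side.

```shell
# Build the command string to run on the remote side: export the variable,
# then replace the remote shell with an interactive login shell.
remote_login_command() {
    # Single-quote the value, escaping any embedded single quotes,
    # so that the remote shell sees it as one literal word.
    quoted=$(printf '%s' "$1" | sed "s/'/'\\\\''/g")
    printf "export HG_USERNAME='%s'; exec bash -l" "$quoted"
}

# Show what would be sent for an example username.
remote_login_command 'name@gmail.com'
echo

# Intended use (commented out because it needs a reachable host):
# ssh -t bec "$(remote_login_command "$HG_USERNAME")"
```

This needs no server-side configuration, but each collaborator has to remember to connect this way (or wrap it in a local alias), so it is more a workflow convention than a fix.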

  3. Although I do not personally have much use for this, it might be worth considering whether version control could be enabled through the SMC UI for users who are not shell-savvy. This might be a big chunk of work though...