Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Toolbox packaging #285

Merged
merged 18 commits into from
Oct 16, 2020
Merged

Add Toolbox packaging #285

merged 18 commits into from
Oct 16, 2020

Conversation

guzman-raphael
Copy link
Collaborator

@guzman-raphael guzman-raphael commented Sep 22, 2020

Fix #228. Supersedes #258.

Add Toolbox packaging such that DataJoint may be installed via: FileExchange, matlab.addons.install, matlab.addons.toolbox.installToolbox, and using GHToolbox. Currently, my fork contains a release 3.3.2 3.3.3 that was used for testing and can be used to validate. All that is needed is to download/install using the DataJoint.mltbx file. Once we publish a new release, we can associate it with FileExchange such that manual download is not necessary.

This adds the following toolbox dependencies:

  • GHToolbox (which has a single dependency: compareVersions)
  • mym

Copy link
Contributor

@ixcat ixcat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'change requested' due to setup.m rename question, mltbx question (more of a 'needs discussion' than fully 'change requested')

+dj/setup.m Show resolved Hide resolved
if strcmp(ME.identifier, 'MATLAB:undefinedVarOrClass') && (~prompt || strcmpi('yes',...
dj.internal.ask(sprintf(sprintf('%s\n', installPromptMsg{:}), 'GHToolbox'))))
% fetch
tmp_toolbox = [tempname '.mltbx'];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notice we have the .mltbx in the commit - is this intended?
if so, how would release building work w/r/t commit sequence/mltbx, etc?
(apologies if discussed already)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was intentional to commit the compiled DataJoint.mltbx file. Have a look at here. Once we modify our FileExchange DataJoint entry it will automatically release from our linked releases. All that we will need to do is 'manually' (though this can be automated) attach the DataJoint.mltbx to the release.

Copy link
Contributor

@ixcat ixcat Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that GH 'release' artefacts can be tracked separately from code itself, the GH doc link from the matlab page you referenced ( https://docs.github.com/en/free-pro-team@latest/github/administering-a-repository/about-releases ) describes a variety of ways to use GH releases. Committing binaries to the main repo does not seem to be the only way to do this, perhaps I'm mistaken. Probably better for verbal discussion at this point.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right it is a fair point. The challenge I've been trying to solve here is 'how do we guarantee that the attached binary to the release represents the one built from the source?'. Did not want maintainers to be compiling, uploading this themselves to prevent human error from releasing an incorrect version. My simple solution was to commit it as well therefore anyone needing to make a release would simply use the committed binary for this (As it is strictly used for testing). Obviously a better approach would be with some automation that compiles it for us when triggering a release but well.... we just don't have that yet. 😅

Copy link
Contributor

@ixcat ixcat Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did not want maintainers to be compiling, uploading this themselves to prevent human error from releasing an incorrect version - is exactly what is happening here, is it not?

As a stopgap makes sense (not entirely convinced, esp. since it will bloat repo size footprint), but also wasn't sure if that was intended as 'stopgap' or 'long term solution', so have less reservations if it's intended as a stopgap.

anyway, point raised, think conversation is probably best in person/dev meeting from here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be blunt: it is a source code repository, not a 'bag of stuff' repository; I can see the point in blurring this line slightly for the release/ease of fetching case you mention (why i'm proposing git-lfs as an alternative instead of outright removal+ a separate less-strictly-linked tracking process), but long run having every git checkout copy contain a full download of every binary copy ever released is a waste of time/space.

git-lfs allows precise, intentional version tracking as you outline (which I agree is a good thing) without bloating checkout footprint nearly as much - you get the source code history (which is on the order of kb/low mb) and the related copy of the binary (so checkout footprint of release artefacts is is on the order of 1*{release} or worst case 1*{number of active local branches having different release archives} footprint, instead of {all releases})

r.e. 100MB - this is 'separation by file size'; going back to 'bag of stuff' point, i propose 'separation by file-role'. policy decision is up to users.

will admit for dj matlab specifically, since it is only a source zip, the .mltbx is fairly small so it might not be worth the extremely limited extra effort of using lfs; but if there is a better way anyway, i think we should use this, and am putting forward the notion that git-lfs is a better way.

the "new release not requiring building/compiling but simply attach" point (which I also agree is a good/simple approach, with the additional caveat of 'so long as the storage/bloat concern is addressed') is also supported by using a git-lfs workflow.

r.e. mym: i view this as bad practice overall, so using it as a precedent to justify in my opinion is flawed; if lfs (or something else) works better for our use case, i propose we should transition mym as well

ultimately I'm not taking a hard line here, but if we are going to discuss this, the time is now, so wanting to make sure full discussion is had.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should also note for full disclosure that binaries-in-archives is kind of a pet-peeve of mine, so that may impact feedback

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ixcat Those are fair points really and I do appreciate you taking the time to articulate this. In case I had not made it clear, I had not imaged the approach outlined here would be permanent in any way. Merely meant to adopt a 'short-term' approach for this while we take some time to come up with a better approach (so that users don't have to experience delays on account to this). Ideally, we should have proper CI/CD with micro-releases such that new PR's compile binaries and test based on them. Then upon merge, binaries should be recompiled from source, automatically cut to a release, and compiled binaries attached to said release. I believe that is the typical pattern other open-source projects utilize. I'd file this under 'dev process improvements' that we need to prioritize for (including migrating to GitHub Actions and utilizing the 'secret encryption' method described previously that would do away with the stage hassles).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually now that I have visited this through a bit more, I do also favor not including mltbx directly within the commit, but rather attach it to the releases. I must agree that it is not perfect as it does leave a room for some human error as pointed about by @guzman-raphael, but ultimately this is of same procedure as found for Python version releases. So I'd think best would be to gear towards finding an automation for this step in the future, but proceed on with an explicit and separate release of mltbx.

Copy link
Collaborator Author

@guzman-raphael guzman-raphael Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eywalker @ixcat Thanks for the feedback, guys. Based on the discussion we had earlier on this, I have gone ahead and removed the DataJoint.mltbx binary deferring proper handling of this until we have some CD mechanism in place. Have also updated the tests to trigger a toolbox compilation.

local-docker-compose.yml Show resolved Hide resolved
@shenshan
Copy link

shenshan commented Oct 9, 2020

The installation works great for me. I'm on MacOS 10.15.5, matlab 2020a.

@ttngu207
Copy link

ttngu207 commented Oct 13, 2020

Full installation flow works great for me too: MATLAB R2018b (Windows 10). Tested:

  • prompt and auto installation for: GHToolbox, compareVersion, mym
  • install DataJoint using Add-on Manager
  • uninstall DataJoint will NOT uninstall GHToobox, compareVersion or mym

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DJ Matlab / mYm improved version/compatibility support
5 participants