Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

R** move cmake installation into env_create.sh #2393

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

guru-desh
Copy link
Contributor

I had a previous PR for this, however, I had made some edits to my fork that closed the pull request and added commits from another feature I was working on. I guess that's a reminder to me to make new branches in git 😄 . Apologies for a duplicate PR. Here's the original description:

This PR simplifies the build setup process by installing cmake via conda.

Current Behavior:

In the build.sh script, this if statement exists:

if [ -z "`which cmake`" ] || [ "`which cmake`" = "cmake not found" ]; then
  conda install cmake -y
fi

If cmake does not exist on the machine used to build coremltools, then cmake will be installed via conda. BUILDING.md states that the user needs to install cmake. However, if the build.sh script already installs cmake then the following refactor can be done:

  1. Install cmake by default into the conda environment
  2. Remove the requirement for the user to install cmake outside of the coremltools build

New Behavior:

  1. This env_create.sh script installs cmake via the anaconda channel.
  2. The CMake requirement is removed from the requirements section (the user who builds the project no longer needs to install cmake themselves)

This refactor makes it so that the user no longer needs to install cmake before building coremltools (as the build itself will install cmake via conda)

Checks

Running ./scripts/build.sh --python=3.8 created a working build for me locally, but I'm not sure how to run the GitLab CI job. Could I get some help on that? Thanks in advance 😄

@guru-desh guru-desh changed the title Move cmake to env create R** move cmake installation into env_create.sh Nov 13, 2024
@TobyRoseman
Copy link
Collaborator

The change looks good. I've kicked off a CI run: https://gitlab.com/coremltools1/coremltools/-/pipelines/1544298967

@guru-desh
Copy link
Contributor Author

Thanks for the CI run! Here's a link to the previous PR that I forgot to link: #2379

In that previous PR, there were some questions that had come up. I will link my response below:

could you please elaborate on the motivation for "move conda install to env_create.sh" change?

This was a pure refactoring change. In build.sh, I saw that the majority of the steps build.sh are build-specific steps like setting up the cmake commands and running make, and the conda install -y cmake command doesn't seem like a build-related step but more of an environment config step. This is why I moved it to env_create.sh.

In addition, if conda install -y cmake was in build.sh, then dependency management becomes harder as the dependency of cmake is specified not in env_create.sh but in build.sh, which should not add environment dependencies.

Imho I would prefer to keep it in build.sh, because "install after env got created" will work for env-tweaking cases, e.g. Rosetta

Could you explain what "env-tweaking cases" are? Why can't the env-tweaking cases be done in the env_create.sh script?

@TobyRoseman
Copy link
Collaborator

@YifanShenSZ - can respond to the above questions? I think since this change only installs cmake if it isn't already installed, it shouldn't matter which script installs it.

@guru-desh
Copy link
Contributor Author

I think since this change only installs cmake if it isn't already installed, it shouldn't matter which script installs it.

I would like to clarify this. The change installs cmake into the conda environment regardless of whether cmake already exists on the host machine or not. This means that when the environment is activated and a build is run, then the cmake inside the conda environment will be used.

@guru-desh
Copy link
Contributor Author

@TobyRoseman any update on this?

@TobyRoseman
Copy link
Collaborator

I think the documentation changes for this PR are good. I would like to merge those changes.

I don't think we should be installing cmake into the conda build environment if it's already installed in the system.

I still don't understand why you want to move the cmake installation out of build.sh and into env_create.sh.

@guru-desh
Copy link
Contributor Author

I think the documentation changes for this PR are good. I would like to merge those changes.

Awesome!

I don't think we should be installing cmake into the conda build environment if it's already installed in the system.

Here's my reasoning:

  1. Installing cmake in the conda environment provides consistency on all platforms and users that decide to build coremltools. By installing cmake in the anaconda environment, it gives us the option to set the version (we currently don't do that, but we should) of cmake that is used
  2. Currently, the CMakeLists.txt file uses cmake version 3.10.2 and uses the deprecated find_package(PythonInterp) and find_package(PythonLibs) functions that are soon to be removed in newer CMake versions. Let's say that we decide not to install cmake via conda and allow users to bring their own CMake version. What can happen is that users use a higher CMake version that does not support find_package(PythonInterp) or find_package(PythonLibs) and run into issues when building. We would then need to spend time on these issues and worses the user experience.
  3. By installing cmake in the environment, we create a standard for whichcmake is used and what versions we allow, which reduces the chance that users run into future issues and reduces developer toil without adding significant risk to how coremltools is being maintained now. This change allows unfamiliar users that want to build coremltools not have to worry about the CMake version that used, installing CMake, etc. This change would streamline that process.

Would love to hear your reasoning about why to not install cmake in the conda build environment.

I still don't understand why you want to move the cmake installation out of build.sh and into env_create.sh.

Here's my reasoning:

It's mainly due to the Single Responsibility Principle. build.sh is responsible for building the project. What that means to me is the following:

  1. Setting build flags (this could also be in a seperate file if wanted, but this is fine right now)
  2. Running the build

With the conda install cmake command being in the build.sh, it expands the scope of what build.sh is doing by also having to manage the build dependencies, which is unneeded as env_create.sh already does that. If issues arise in environment setup, that means that since we expanded build.sh scope, developers would need to check both env_create.sh and build.sh to see where the environment setup error occurs instead of developers just looking and debugging env_create.sh

@TobyRoseman
Copy link
Collaborator

Thanks @guru-desh for the detailed explanations.

I agree with your second point. If we're going to install cmake, it makes more sense to do it in env_create.sh.

However, I still don't think it makes sense to install cmake, in the environment, if the user already has it installed in their system. This waste hard drive space. Also having duplicate installs of camke could be problematic.

Please address this issue and rebase on top of latest main. Then I'll kick off another CI run. Assuming that succeeds, we can merge this pull request.

@@ -110,6 +110,7 @@ if [[ $DEV == 1 ]]; then
python -m pip install -e "$COREMLTOOLS_HOME/../coremltools" --upgrade
fi

conda install -y cmake
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the deleted if-statement from build.sh here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants