Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update python installation method #255

Closed
manopapad opened this issue Jun 3, 2022 · 11 comments
Closed

Update python installation method #255

manopapad opened this issue Jun 3, 2022 · 11 comments

Comments

@manopapad
Copy link
Contributor

Legion and legate are using easy_install to install the python packages to an arbitrary location. This mode is deprecated and will soon be removed. We should switch to following standard conventions (e.g. pip-based), which install directly into the currently active virtual environment (e.g. the one managed by conda).

Here are some relevant messages we get during installation:

DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
DeprecationWarning: The distutils.sysconfig module is deprecated, use sysconfig instead
SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
SetuptoolsDeprecationWarning: Custom 'build_py' does not implement 'get_data_files_without_manifest'.
Please extend command classes from setuptools instead of distutils.

One thing I would like to know is how easy it would be to "uninstall" our packages in this mode, in case we want to do a clean build.

Copying from my discussion with @bryevdv on alternatives:

The issue probably comes down to whether "build cmd + package/install cmd" is acceptable or whether you want some one-shot to do everything. One option might be to have cmake build the binary artifacts and install them in the source tree (or somewhere), and then a pip install . just does a fairly normal "python package install". Maybe the cmake process even does the pip install . for you as the last step. If you want things "driven" from python then you still have a setup.py that invokes the C++ build. Since we can't call setup.py as a script anymore (that is deprecated) we'd have to rely on only env vars to control any conditional logic inside it when you call pip install . or build. There are evidently other tools that support custom steps, like the hatch tool mentioned in use by Jupyter.

CC @trxcllnt for input

@bryevdv
Copy link
Contributor

bryevdv commented Jun 3, 2022

So I had forgotten about all the other potential stuff in the install dir (blas, tblis, thrust...) I am really not sure what to do about those. Presumably they need to be in a fixed location where when all the CLLR .so files are linked?

I started a topic on the Python packaging discourse regarding Bokeh, but a lot of the discussion seems relevant here as well:

https://discuss.python.org/t/custom-build-steps-moving-bokeh-off-setup-py/16128/3

One immediate thing we can do just to get out from that deprecation warning, is simply to get rid of this:

https://github.com/nv-legate/legate.core/blob/9e6bcd44e3262f16e69b4ed0e78da5792ef5a6fe/install.py#L541-L554

And per some of that discussion, an easy way to do that is probably just to rely on environment variables, rather than custom command line options, to condition the code in setup.py. Then presumably <env> pip install . can be made to function. In the new parlance that is using pip install as a "frontend" and leaving setup.py as the "backend" (i.e. not run as a script, which is want the packaging folks want). I will experiment a bit with this.

But that small step does not get legate installed into a "regular" environment. For that, I think a first question is to figure out what what is reasonable/possible for all the other non-python dependencies.

@bryevdv
Copy link
Contributor

bryevdv commented Jun 3, 2022

Yeah, that seems to work fine, here is a quick and dirty POC diff

diff --git a/install.py b/install.py
index aa0c405..d4210cc 100755
--- a/install.py
+++ b/install.py
@@ -537,21 +537,20 @@ def build_legate_core(
         f.write(content.encode("utf-8"))
     cmd = ["cp", "config.mk", os.path.join(install_dir, "share", "legate")]
     verbose_check_call(cmd, cwd=src_dir)
-    # Then run setup.py
-    cmd = [
-        sys.executable,
-        "setup.py",
-        "install",
-        "--recurse",
-    ] + setup_py_flags
+    prefix = str(install_dir)
     if unknown is not None:
         try:
             prefix_loc = unknown.index("--prefix")
-            cmd.extend(unknown[prefix_loc : prefix_loc + 2])
+            prefix = unknown[prefix_loc + 1]
         except ValueError:
-            cmd += ["--prefix", str(install_dir)]
-    else:
-        cmd += ["--prefix", str(install_dir)]
+            pass
+    cmd = [ sys.executable, "-m", "pip", "install", ".", "--prefix", prefix]
+    os.environ["LEGATE_INSTALL_PREFIX"] = prefix
+    print(f"""
+
+    ****************************** {prefix}
+
+    """)
     verbose_check_call(cmd, cwd=legate_core_dir)
 
 
diff --git a/setup.py b/setup.py
index 5759076..27c6da5 100755
--- a/setup.py
+++ b/setup.py
@@ -15,24 +15,14 @@
 # limitations under the License.
 #
 
-import argparse
 import os
 import subprocess
-import sys
 from distutils.command.build_py import build_py
 from distutils.core import setup
 
 from setuptools import find_packages
 
-# We need to know the prefix for the installation
-# so we can know where to get the library
-parser = argparse.ArgumentParser()
-parser.add_argument("--prefix", required=False)
-parser.add_argument(
-    "--recurse", required=False, default=False, action="store_true"
-)
-args, _ = parser.parse_known_args()
-
+prefix = os.environ.get("LEGATE_INSTALL_PREFIX", None)
 
 class my_build_py(build_py):
     def run(self):
@@ -44,7 +34,7 @@ class my_build_py(build_py):
             root_dir = os.path.dirname(os.path.realpath(__file__))
             header_src = os.path.join(root_dir, "src", "core", "legate_c.h")
             output_dir = os.path.join(root_dir, "legate", "core")
-            include_dir = os.path.join(args.prefix, "include")
+            include_dir = os.path.join(prefix, "include")
             header = subprocess.check_output(
                 [
                     os.getenv("CC", "gcc"),
@@ -55,7 +45,7 @@ class my_build_py(build_py):
                     header_src,
                 ]
             ).decode("utf-8")
-            libpath = os.path.join(args.prefix, "lib")
+            libpath = os.path.join(prefix, "lib")
             with open(os.path.join(output_dir, "install_info.py.in")) as f:
                 content = f.read()
             content = content.format(
@@ -66,20 +56,9 @@ class my_build_py(build_py):
         build_py.run(self)
 
 
-# If we haven't been called from install.py then do that first
-if args.recurse:
-    # Remove the recurse argument from the list
-    sys.argv.remove("--recurse")
-    setup(
-        name="legate.core",
-        version="22.03",
-        packages=find_packages(
-            where=".",
-            include=["legate*"],
-        ),
-        cmdclass={"build_py": my_build_py},
-    )
-else:
-    with open("install.py") as f:
-        code = compile(f.read(), "install.py", "exec")
-        exec(code)
+setup(
+    name="legate.core",
+    version="22.03",
+    packages=find_packages(where=".", include=["legate*"]),
+    cmdclass={"build_py": my_build_py},
+)
dev38 ❯ ./install.py --install-dir=install38 --cuda  --arch turing  --no-clean

.....

    ****************************** /home/bryan/work/legate.core/install38

    
Processing /home/bryan/work/legate.core
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: legate.core
  Building wheel for legate.core (pyproject.toml) ... done
  Created wheel for legate.core: filename=legate.core-22.3-py3-none-any.whl size=120975 sha256=2b5fed0006f4c3c32fde40d6cd6db041e7d614137efd7166c201f2afbc775584
  Stored in directory: /tmp/pip-ephem-wheel-cache-2080v6z7/wheels/f2/f5/4a/b4047b85c6ae3ef6905c72a4c8042c40ec76dc737ebc3f598d
Successfully built legate.core
Installing collected packages: legate.core
Successfully installed legate.core-22.3

As an added bonus, since we would never be calling setup.py as a script all that goofy "recurse" logic can just be deleted.

@bryevdv
Copy link
Contributor

bryevdv commented Jun 3, 2022

I guess another incremental step might be to continue installing all the other shared libraries, etc. under the install dir in the repo, but let the pure python package part install in the "standard" python env.

dev38python
Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import legate
>>> legate
<module 'legate' from '/home/bryan/anaconda3/envs/dev38/lib/python3.8/site-packages/legate/__init__.py'>

But we would also need to get the legate driver installing correctly on its own. i.e. define a real entry point rather than do

verbose_check_call(
        ["cp", "legate.py", os.path.join(install_dir, "bin", "legate")],
        cwd=legate_core_dir,
    )

@trxcllnt
Copy link

trxcllnt commented Jun 3, 2022

#198 passes --single-version-externally-managed --root=/ to Legion's Python bindings' install command (enabled in Legion/502), which eliminates the easy_install warnings when the Python bindings are installed to CMAKE_INSTALL_PREFIX.

@bryevdv
Copy link
Contributor

bryevdv commented Jun 3, 2022

@trxcllnt that still appears to be calling setup.py install and that is what is deprecated I would say it would be good to change that at some point as well (e.g. by calling python -m pip install . instead, and using an env var in setup.py module rather than a cmd line arg --cmake-build-dir)

Edit: or, just let the python bits get installed into the standard python env, which is I guess the ultimate goal.

@manopapad
Copy link
Contributor Author

But that small step does not get legate installed into a "regular" environment. For that, I think a first question is to figure out what what is reasonable/possible for all the other non-python dependencies.

Here is a list of dependencies that we normally don't get from conda:

  • OpenBLAS: We can safely reuse the conda package today. If Allow multiple OpenBLAS OpenMP calls to run in parallel conda-forge/openblas-feedstock#138 gets merged there will be no reason to do a custom build anymore.
  • TBLIS: This library is in the process of transitioning over to be built over BLIS, which is available on conda-forge https://anaconda.org/conda-forge/blis. Therefore, I hope TBLIS will also be soon available through conda.
  • Thrust: We have had issues with bugs in old versions of Thrust (e.g. those packaged with CUDA distributions). We should check whether the version available on conda will work for us.
  • Legion/Realm: We need to update these pieces often during development, so for this usecase we should build them from source.
  • Gasnet: This is currently not available on conda, so it needs to be built from source, and configured appropriately for the networking stack on the developer's machine.

Overall, we will likely shift to getting most of our dependencies from conda in the medium term, but certain libraries we will still want to build from source (legion, realm and gasnet).

I think we could handle these as separate packages, and have a separate install process for each. Legion has a minimal python component (called pygion), that should easily fit within a standard pip install process; most of the installation will consist of copying over headers and dynamic libraries. Gasnet only has C++ components.

@bryevdv
Copy link
Contributor

bryevdv commented Jun 14, 2022

OK, back from being out last week. I would propose this as a very rough work plan:

  • Wait for @trxcllnt's various cmake-related PRs to be merged.
  • Update setup.py to use env vars instead of command line args
    • update install.py and cmake install to use pip install . with env var
    • socialize not to invoke setup.py — use install.py or cmake
  • change to install the "python parts" (including real entry point for legate) to standard site-packages
  • switch to using conda packages for dependencies wherever that is possible
  • assess whether there anything different or better to do with remaining packages that have to be built from source (leaving this vague because I don't have a lot to suggest about this)

I could imagine a similar path for cunumeric. There's lots of small details that are left out, I just expect we can sort those out as we go along the path.

@jjwilke
Copy link

jjwilke commented Sep 22, 2022

@manopapad This can probably be closed now with the CMake changes?

@manopapad
Copy link
Contributor Author

I think so. There are still some deprecation warnings printed out, but these might be benign? What do you think @bryevdv ?

    /Users/mpapadakis/opt/miniconda3/envs/legate/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
    /Users/mpapadakis/opt/miniconda3/envs/legate/lib/python3.8/site-packages/setuptools/command/egg_info.py:643: SetuptoolsDeprecationWarning: Custom 'build_py' does not implement 'get_data_files_without_manifest'.
    Please extend command classes from setuptools instead of distutils.
    /Users/mpapadakis/opt/miniconda3/envs/legate/lib/python3.8/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.

@bryevdv
Copy link
Contributor

bryevdv commented Sep 22, 2022

AFAIK our install.py just calls pip install now, so offhand I'd guess those might be coming from something inside scikit-build. If so, we'd just to wait for them to update things on their end to see those warnings go again. I think this can be closed now.

@manopapad
Copy link
Contributor Author

Sounds good, closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants