Feature request: package custom Python libraries and its PyPI dependencies #99

Closed
n-batalha opened this issue Mar 14, 2023 · 3 comments

n-batalha commented Mar 14, 2023

Context

We are using dbt Python "models": dbt creates the Python stored procedures that build the tables and hides that detail from users.

However, dbt leaves it to users (at least for now) to push any code that lives outside the model to a stage.

That said, I think the feature requested here applies more broadly, to reusing logic between stored procedures/UDFs.

Request

To be able to package a user-defined Python library that depends on a number of external libs (from Anaconda or, failing that, from PyPI), with the CLI bundling the library together with its PyPI dependencies and uploading the result to a Snowflake stage.

E.g. snow package lib <lib location>, which would package the local library defined in <lib location> alongside its dependencies.

Example

I managed to do one using your wrapper for packaging PyPI dependencies:

image

Then:

snow package create pyjokes
snow package upload -f pyjokes.zip -s packages --overwrite

And in dbt, a model is written as:

import pandas as pd

def model(dbt, session):
    dbt.config(
        packages = ["pandas"],
        imports = [
            "@packages/pyjokes.zip",
        ]
    )

    from mylib.mylib import get_some_joke

    message = get_some_joke()

    return pd.DataFrame({"joke": [message]})

But I misused snow package create <package> here: I depended on only a single external package (pyjokes) and relied on "knowing" that your implementation compresses the local folder (which includes my lib) into a zip that can then conveniently be submitted to a stage. Ideally the user would list a number of dependencies (e.g. in pyproject.toml); the Anaconda ones would be ignored, and the PyPI-only ones would be pulled, compressed into a zip alongside our lib, and pushed to a stage.
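To make the "Anaconda ones are ignored, PyPI-only ones are pulled" idea concrete, here is a rough sketch of the split. It is not how the CLI works; split_dependencies is a hypothetical helper, and the set of names available in Anaconda is assumed to be supplied by the caller, already lowercased and dash-separated.

```python
import re


def split_dependencies(requirements, anaconda_available):
    """Partition requirement strings into (anaconda, pypi_only) lists.

    anaconda_available is an assumed, caller-supplied set of normalized
    names (lowercase, runs of '-', '_' and '.' collapsed to '-').
    """
    anaconda, pypi_only = [], []
    for req in requirements:
        # Leading distribution name, without extras, version specifiers or markers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req.strip())
        if not match:
            continue  # skip blank or unparsable lines
        normalized = re.sub(r"[-_.]+", "-", match.group(0)).lower()
        if normalized in anaconda_available:
            anaconda.append(req)   # would go in dbt.config(packages=[...])
        else:
            pypi_only.append(req)  # would be bundled into the zip
    return anaconda, pypi_only
```

With the example above, split_dependencies(["pandas", "pyjokes"], {"pandas"}) would leave only pyjokes to be bundled next to the local lib.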


n-batalha commented Mar 15, 2023

Actually, inspecting the code, I found that there is a route: run the package command on a wheel of the user-created library, which has 3rd-party dependencies. The wheel file knows the dependencies of the lib, you seem to invoke pip to pull them, and your CLI takes care of the rest. E.g.:

snow package create example-1.0.0-py3-none-any.whl
mv example-1.0.0-py3-none-any.whl.zip example.zip
snow package upload -f example.zip -s packages --overwrite

Not sure if this was considered or is supported, so keeping it open.
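For anyone wondering what "the wheel file knows the dependencies" means in practice: a wheel is a zip archive whose .dist-info/METADATA file lists them as Requires-Dist headers. A small sketch using only the standard library (wheel_requirements is a name made up for illustration, not something the CLI exposes):

```python
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_path):
    """Return the Requires-Dist entries declared in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # Look for the <name>-<version>.dist-info/METADATA entry.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        metadata_text = wheel.read(metadata_name).decode("utf-8")
    # METADATA uses RFC 822 style headers, so the email parser can read it.
    headers = Parser().parsestr(metadata_text)
    return headers.get_all("Requires-Dist") or []
```

Running it on example-1.0.0-py3-none-any.whl would list whatever the lib declares as dependencies, which is presumably the list pip resolves.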

recursive-automata commented Dec 5, 2023

Starting from your solution, @n-batalha, ... package create ... can be replaced with pip install -t and zip -r. This puts the top-level module (named example below) in the right format for ... package upload ....

pip install -U -t $TARGET_DIR $PACKAGE_DIR
cd $TARGET_DIR
zip -r example.zip example/
snow snowpark package upload -f example.zip -s packages --overwrite -c $CONNECTION_NAME
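If a shell with zip is not available, the first three lines can be approximated in Python. This is only a sketch under the same assumptions as the commands above; install_into and zip_module are hypothetical helper names, and the upload step is still left to the CLI.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def install_into(target_dir, package_dir):
    """Rough equivalent of: pip install -U -t $TARGET_DIR $PACKAGE_DIR"""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "-U",
         "-t", str(target_dir), str(package_dir)],
        check=True,
    )


def zip_module(target_dir, module_name, zip_path):
    """Rough equivalent of: cd $TARGET_DIR && zip -r <zip_path> <module_name>/"""
    target_dir = Path(target_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted((target_dir / module_name).rglob("*")):
            if path.is_file():
                # Paths are stored relative to target_dir so the module
                # sits at the root of the zip.
                archive.write(path, path.relative_to(target_dir).as_posix())
    return zip_path
```

Like the zip -r line, this archives only the named top-level module; anything else pip placed in the target directory is left out.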

sfc-gh-turbaszek (Contributor) commented

@n-batalha I'm closing this one, as there were numerous fixes to the package commands in version 2.0. If the problem persists in 2.0, please reopen this issue.
