Make it possible to use schema-awareness outside of operators and transform dictionary-like objects #139
Click to view CI ResultsGitHub pull request #139 of commit 8600b240837f6be6b2cd57b2769b77ba9f2c3a58, no merge conflicts. Running as SYSTEM Setting status of 8600b240837f6be6b2cd57b2769b77ba9f2c3a58 to PENDING with url https://10.20.13.93:8080/job/merlin_core/199/console and message: 'Pending' Using context: Jenkins Building on master in workspace /var/jenkins_home/workspace/merlin_core using credential ce87ff3c-94f0-400a-8303-cb4acb4918b5 > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/NVIDIA-Merlin/core # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/core > git --version # timeout=10 using GIT_ASKPASS to set credentials login for merlin-systems username and pass > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/core +refs/pull/139/*:refs/remotes/origin/pr/139/* # timeout=10 > git rev-parse 8600b240837f6be6b2cd57b2769b77ba9f2c3a58^{commit} # timeout=10 Checking out Revision 8600b240837f6be6b2cd57b2769b77ba9f2c3a58 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 8600b240837f6be6b2cd57b2769b77ba9f2c3a58 # timeout=10 Commit message: "Use protocols in dispatch" > git rev-list --no-walk ec99496883329a91de59fed300c77da0c1bc030c # timeout=10 [merlin_core] $ /bin/bash /tmp/jenkins3928575102079743059.sh GLOB sdist-make: /var/jenkins_home/workspace/merlin_core/core/setup.py test-gpu inst-nodeps: /var/jenkins_home/workspace/merlin_core/core/.tox/.tmp/package/1/merlin-core-0.6.0+11.g8600b24.zip WARNING: Discarding $PYTHONPATH from environment, to override specify PYTHONPATH in 'passenv' in your configuration. 
test-gpu installed: absl-py==1.2.0,alabaster==0.7.12,anyio==3.6.1,argon2-cffi==21.3.0,argon2-cffi-bindings==21.2.0,astroid==2.5.6,asttokens==2.0.7,astunparse==1.6.3,asv==0.5.1,asvdb==0.4.2,attrs==22.1.0,awscli==1.25.72,Babel==2.10.3,backcall==0.2.0,beautifulsoup4==4.11.1,betterproto==1.2.5,black==22.6.0,bleach==5.0.1,boto3==1.24.51,botocore==1.27.71,Brotli==1.0.9,cachetools==5.2.0,certifi==2019.11.28,cffi==1.15.1,chardet==3.0.4,clang==5.0,click==8.1.3,cloudpickle==2.1.0,colorama==0.4.4,coverage==6.4.4,cuda-python==11.7.1,cudf==22.4.0,cupy-cuda116==10.6.0,cycler==0.11.0,Cython==0.29.32,dask==2022.1.1,dask-cuda==22.4.0,dask-cudf==22.4.0,dbus-python==1.2.16,debugpy==1.6.2,decorator==5.1.1,defusedxml==0.7.1,dill==0.3.5.1,distlib==0.3.6,distributed==2022.3.0,distro==1.7.0,dm-tree==0.1.7,docker-pycreds==0.4.0,docutils==0.16,emoji==1.7.0,entrypoints==0.4,execnet==1.9.0,executing==0.10.0,faiss-gpu==1.7.2,fastai==2.7.9,fastapi==0.82.0,fastavro==1.6.0,fastcore==1.5.24,fastdownload==0.0.7,fastjsonschema==2.16.1,fastprogress==1.0.3,fastrlock==0.8,feast==0.19.4,fiddle==0.2.0,filelock==3.8.0,flatbuffers==1.12,fonttools==4.37.1,fsspec==2022.5.0,gast==0.4.0,gevent==21.12.0,geventhttpclient==2.0,gitdb==4.0.9,GitPython==3.1.27,google==3.0.0,google-api-core==2.10.0,google-auth==2.11.0,google-auth-oauthlib==0.4.6,google-pasta==0.2.0,googleapis-common-protos==1.52.0,graphviz==0.20.1,greenlet==1.1.2,grpcio==1.41.0,grpcio-channelz==1.47.0,grpcio-reflection==1.48.1,grpclib==0.4.3,h11==0.13.0,h2==4.1.0,h5py==3.7.0,HeapDict==1.0.1,hpack==4.0.0,httptools==0.4.0,hugectr2onnx==0.0.0,huggingface-hub==0.8.1,hyperframe==6.0.1,idna==2.8,imagesize==1.4.1,implicit==0.6.0,importlib-metadata==4.12.0,importlib-resources==5.9.0,iniconfig==1.1.1,ipykernel==6.15.1,ipython==8.4.0,ipython-genutils==0.2.0,ipywidgets==7.7.0,jedi==0.18.1,Jinja2==3.1.2,jmespath==1.0.1,joblib==1.1.0,json5==0.9.9,jsonschema==4.9.1,jupyter-cache==0.4.3,jupyter-client==7.3.4,jupyter-core==4.11.1,jupyter-server==1.18.1,jupyter-server
-mathjax==0.2.5,jupyter-sphinx==0.3.2,jupyterlab==3.4.5,jupyterlab-pygments==0.2.2,jupyterlab-server==2.15.0,jupyterlab-widgets==1.1.0,keras==2.9.0,Keras-Preprocessing==1.1.2,kiwisolver==1.4.4,lazy-object-proxy==1.7.1,libclang==14.0.6,lightfm==1.16,lightgbm==3.3.2,linkify-it-py==1.0.3,llvmlite==0.39.0,locket==1.0.0,lxml==4.9.1,Markdown==3.4.1,markdown-it-py==1.1.0,MarkupSafe==2.1.1,matplotlib==3.5.3,matplotlib-inline==0.1.3,mdit-py-plugins==0.2.8,merlin-core==0.6.0+11.g8600b24,merlin-models==0.6.0+45.g5a345d9c1,merlin-systems==0+untagged.105.gf89cc51,mistune==0.8.4,mmh3==3.0.0,mpi4py==3.1.3,msgpack==1.0.4,multidict==6.0.2,myst-nb==0.13.2,myst-parser==0.15.2,natsort==8.1.0,nbclassic==0.4.3,nbclient==0.6.6,nbconvert==6.5.3,nbdime==3.1.1,nbformat==5.4.0,nest-asyncio==1.5.5,notebook==6.4.12,notebook-shim==0.1.0,numba==0.56.0,numpy==1.21.5,nvidia-pyindex==1.0.9,# Editable install with no version control (nvtabular==1.3.3+15.g16e4e34e9),-e /usr/local/lib/python3.8/dist-packages,nvtx==0.2.5,oauthlib==3.2.0,onnx==1.12.0,onnxruntime==1.11.1,opt-einsum==3.3.0,packaging==21.3,pandas==1.3.5,pandavro==1.5.2,pandocfilters==1.5.0,parso==0.8.3,partd==1.3.0,pathtools==0.1.2,pexpect==4.8.0,pickleshare==0.7.5,Pillow==9.2.0,pkgutil_resolve_name==1.3.10,platformdirs==2.5.2,pluggy==1.0.0,prometheus-client==0.14.1,promise==2.3,prompt-toolkit==3.0.30,proto-plus==1.19.6,protobuf==3.19.4,psutil==5.9.1,ptyprocess==0.7.0,pure-eval==0.2.2,py==1.11.0,pyarrow==6.0.0,pyasn1==0.4.8,pyasn1-modules==0.2.8,pybind11==2.10.0,pycparser==2.21,pydantic==1.10.2,pydot==1.4.2,Pygments==2.12.0,PyGObject==3.36.0,pynvml==11.4.1,pyparsing==3.0.9,pyrsistent==0.18.1,pytest==7.1.2,pytest-cov==3.0.0,pytest-forked==1.4.0,pytest-xdist==2.5.0,python-apt==2.0.0+ubuntu0.20.4.7,python-dateutil==2.8.2,python-dotenv==0.21.0,python-rapidjson==1.8,pytz==2022.2.1,PyYAML==5.4.1,pyzmq==23.2.1,regex==2022.7.25,requests==2.22.0,requests-oauthlib==1.3.1,requests-unixsocket==0.2.0,rmm==21.12.0,rsa==4.7.2,s3fs==2022.2.0,s3transfer==0.
6.0,sacremoses==0.0.53,scikit-build==0.15.0,scikit-learn==1.1.2,scipy==1.9.0,seedir==0.3.0,Send2Trash==1.8.0,sentry-sdk==1.9.4,setproctitle==1.3.2,setuptools-scm==7.0.5,shortuuid==1.0.9,six==1.15.0,sklearn==0.0,smmap==5.0.0,sniffio==1.2.0,snowballstemmer==2.2.0,sortedcontainers==2.4.0,soupsieve==2.3.2.post1,Sphinx==5.1.1,sphinx-multiversion==0.2.4,sphinx-togglebutton==0.3.1,sphinx_external_toc==0.3.0,sphinxcontrib-applehelp==1.0.2,sphinxcontrib-copydirs @ git+https://github.com/mikemckiernan/sphinxcontrib-copydirs.git@bd8c5d79b3f91cf5f1bb0d6995aeca3fe84b670e,sphinxcontrib-devhelp==1.0.2,sphinxcontrib-htmlhelp==2.0.0,sphinxcontrib-jsmath==1.0.1,sphinxcontrib-qthelp==1.0.3,sphinxcontrib-serializinghtml==1.1.5,SQLAlchemy==1.4.36,stack-data==0.4.0,starlette==0.19.1,stringcase==1.2.0,supervisor==4.1.0,tabulate==0.8.10,tblib==1.7.0,tdqm==0.0.1,tenacity==8.0.1,tensorboard==2.9.1,tensorboard-data-server==0.6.1,tensorboard-plugin-wit==1.8.1,tensorflow==2.6.2,tensorflow-estimator==2.9.0,tensorflow-gpu==2.9.2,tensorflow-io-gcs-filesystem==0.26.0,tensorflow-metadata==1.9.0,termcolor==1.1.0,terminado==0.15.0,testbook==0.4.2,threadpoolctl==3.1.0,tinycss2==1.1.1,tokenizers==0.10.3,toml==0.10.2,tomli==2.0.1,toolz==0.12.0,torch==1.12.1+cu113,torchmetrics==0.3.2,tornado==6.2,tox==3.25.1,tqdm==4.64.0,traitlets==5.3.0,transformers==4.12.0,transformers4rec==0.1.11+10.g21a2a836a,treelite==2.3.0,treelite-runtime==2.3.0,tritonclient==2.22.0,typing_extensions==4.3.0,uc-micro-py==1.0.1,urllib3==1.26.11,uvicorn==0.18.3,uvloop==0.16.0,versioneer==0.20,virtualenv==20.16.4,wandb==0.13.1,watchfiles==0.16.1,wcwidth==0.2.5,webencodings==0.5.1,websocket-client==1.3.3,websockets==10.3,Werkzeug==2.2.2,widgetsnbextension==3.6.0,wrapt==1.12.1,xgboost==1.6.1,zict==2.2.0,zipp==3.8.1,zope.event==4.5.0,zope.interface==5.4.0 test-gpu run-test-pre: PYTHONHASHSEED='2715396807' test-gpu run-test: commands[0] | python -m pytest --cov-report term --cov merlin -rxs tests/unit ============================= test 
session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 cachedir: .tox/test-gpu/.pytest_cache rootdir: /var/jenkins_home/workspace/merlin_core/core, configfile: pyproject.toml plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 364 items / 1 skipped |
Property setters don't work well with Protocol method definitions
Click to view CI ResultsGitHub pull request #139 of commit cb7d3f8298817d9d94d6c6fe23e3e44b8980345a, no merge conflicts. Running as SYSTEM Setting status of cb7d3f8298817d9d94d6c6fe23e3e44b8980345a to PENDING with url https://10.20.13.93:8080/job/merlin_core/206/console and message: 'Pending' Using context: Jenkins Building on master in workspace /var/jenkins_home/workspace/merlin_core using credential ce87ff3c-94f0-400a-8303-cb4acb4918b5 > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/NVIDIA-Merlin/core # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/core > git --version # timeout=10 using GIT_ASKPASS to set credentials login for merlin-systems username and pass > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/core +refs/pull/139/*:refs/remotes/origin/pr/139/* # timeout=10 > git rev-parse cb7d3f8298817d9d94d6c6fe23e3e44b8980345a^{commit} # timeout=10 Checking out Revision cb7d3f8298817d9d94d6c6fe23e3e44b8980345a (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f cb7d3f8298817d9d94d6c6fe23e3e44b8980345a # timeout=10 Commit message: "Adjust method signatures in DAG operators" > git rev-list --no-walk 9ebe07cfac0b9837c9168efde2548a7186469e8f # timeout=10 [merlin_core] $ /bin/bash /tmp/jenkins1786803776840566357.sh GLOB sdist-make: /var/jenkins_home/workspace/merlin_core/core/setup.py test-gpu inst-nodeps: /var/jenkins_home/workspace/merlin_core/core/.tox/.tmp/package/1/merlin-core-0.6.0+35.gcb7d3f8.zip WARNING: Discarding $PYTHONPATH from environment, to override specify PYTHONPATH in 'passenv' in your configuration. 
test-gpu installed: absl-py==1.2.0,alabaster==0.7.12,anyio==3.6.1,argon2-cffi==21.3.0,argon2-cffi-bindings==21.2.0,astroid==2.5.6,asttokens==2.0.7,astunparse==1.6.3,asv==0.5.1,asvdb==0.4.2,attrs==22.1.0,awscli==1.25.73,Babel==2.10.3,backcall==0.2.0,beautifulsoup4==4.11.1,betterproto==1.2.5,black==22.6.0,bleach==5.0.1,boto3==1.24.51,botocore==1.27.72,Brotli==1.0.9,cachetools==5.2.0,certifi==2019.11.28,cffi==1.15.1,chardet==3.0.4,clang==5.0,click==8.1.3,cloudpickle==2.1.0,colorama==0.4.4,coverage==6.4.4,cuda-python==11.7.1,cudf==22.4.0,cupy-cuda116==10.6.0,cycler==0.11.0,Cython==0.29.32,dask==2022.1.1,dask-cuda==22.4.0,dask-cudf==22.4.0,dbus-python==1.2.16,debugpy==1.6.2,decorator==5.1.1,defusedxml==0.7.1,dill==0.3.5.1,distlib==0.3.6,distributed==2022.3.0,distro==1.7.0,dm-tree==0.1.7,docker-pycreds==0.4.0,docutils==0.16,emoji==1.7.0,entrypoints==0.4,execnet==1.9.0,executing==0.10.0,faiss-gpu==1.7.2,fastai==2.7.9,fastapi==0.82.0,fastavro==1.6.0,fastcore==1.5.24,fastdownload==0.0.7,fastjsonschema==2.16.1,fastprogress==1.0.3,fastrlock==0.8,feast==0.19.4,fiddle==0.2.0,filelock==3.8.0,flatbuffers==1.12,fonttools==4.37.1,fsspec==2022.5.0,gast==0.4.0,gevent==21.12.0,geventhttpclient==2.0,gitdb==4.0.9,GitPython==3.1.27,google==3.0.0,google-api-core==2.10.0,google-auth==2.11.0,google-auth-oauthlib==0.4.6,google-pasta==0.2.0,googleapis-common-protos==1.52.0,graphviz==0.20.1,greenlet==1.1.2,grpcio==1.41.0,grpcio-channelz==1.47.0,grpcio-reflection==1.48.1,grpclib==0.4.3,h11==0.13.0,h2==4.1.0,h5py==3.7.0,HeapDict==1.0.1,hpack==4.0.0,httptools==0.4.0,hugectr2onnx==0.0.0,huggingface-hub==0.8.1,hyperframe==6.0.1,idna==2.8,imagesize==1.4.1,implicit==0.6.0,importlib-metadata==4.12.0,importlib-resources==5.9.0,iniconfig==1.1.1,ipykernel==6.15.1,ipython==8.4.0,ipython-genutils==0.2.0,ipywidgets==7.7.0,jedi==0.18.1,Jinja2==3.1.2,jmespath==1.0.1,joblib==1.1.0,json5==0.9.9,jsonschema==4.9.1,jupyter-cache==0.4.3,jupyter-client==7.3.4,jupyter-core==4.11.1,jupyter-server==1.18.1,jupyter-server
-mathjax==0.2.5,jupyter-sphinx==0.3.2,jupyterlab==3.4.5,jupyterlab-pygments==0.2.2,jupyterlab-server==2.15.0,jupyterlab-widgets==1.1.0,keras==2.9.0,Keras-Preprocessing==1.1.2,kiwisolver==1.4.4,lazy-object-proxy==1.7.1,libclang==14.0.6,lightfm==1.16,lightgbm==3.3.2,linkify-it-py==1.0.3,llvmlite==0.39.0,locket==1.0.0,lxml==4.9.1,Markdown==3.4.1,markdown-it-py==1.1.0,MarkupSafe==2.1.1,matplotlib==3.5.3,matplotlib-inline==0.1.3,mdit-py-plugins==0.2.8,merlin-core==0.6.0+35.gcb7d3f8,merlin-models==0.6.0+45.g5a345d9c1,merlin-systems==0+untagged.105.gf89cc51,mistune==0.8.4,mmh3==3.0.0,mpi4py==3.1.3,msgpack==1.0.4,multidict==6.0.2,myst-nb==0.13.2,myst-parser==0.15.2,natsort==8.1.0,nbclassic==0.4.3,nbclient==0.6.6,nbconvert==6.5.3,nbdime==3.1.1,nbformat==5.4.0,nest-asyncio==1.5.5,notebook==6.4.12,notebook-shim==0.1.0,numba==0.56.0,numpy==1.21.5,nvidia-pyindex==1.0.9,# Editable install with no version control (nvtabular==1.3.3+15.g16e4e34e9),-e /usr/local/lib/python3.8/dist-packages,nvtx==0.2.5,oauthlib==3.2.0,onnx==1.12.0,onnxruntime==1.11.1,opt-einsum==3.3.0,packaging==21.3,pandas==1.3.5,pandavro==1.5.2,pandocfilters==1.5.0,parso==0.8.3,partd==1.3.0,pathtools==0.1.2,pexpect==4.8.0,pickleshare==0.7.5,Pillow==9.2.0,pkgutil_resolve_name==1.3.10,platformdirs==2.5.2,pluggy==1.0.0,prometheus-client==0.14.1,promise==2.3,prompt-toolkit==3.0.30,proto-plus==1.19.6,protobuf==3.19.4,psutil==5.9.1,ptyprocess==0.7.0,pure-eval==0.2.2,py==1.11.0,pyarrow==6.0.0,pyasn1==0.4.8,pyasn1-modules==0.2.8,pybind11==2.10.0,pycparser==2.21,pydantic==1.10.2,pydot==1.4.2,Pygments==2.12.0,PyGObject==3.36.0,pynvml==11.4.1,pyparsing==3.0.9,pyrsistent==0.18.1,pytest==7.1.2,pytest-cov==3.0.0,pytest-forked==1.4.0,pytest-xdist==2.5.0,python-apt==2.0.0+ubuntu0.20.4.7,python-dateutil==2.8.2,python-dotenv==0.21.0,python-rapidjson==1.8,pytz==2022.2.1,PyYAML==5.4.1,pyzmq==23.2.1,regex==2022.7.25,requests==2.22.0,requests-oauthlib==1.3.1,requests-unixsocket==0.2.0,rmm==21.12.0,rsa==4.7.2,s3fs==2022.2.0,s3transfer==0.
6.0,sacremoses==0.0.53,scikit-build==0.15.0,scikit-learn==1.1.2,scipy==1.9.0,seedir==0.3.0,Send2Trash==1.8.0,sentry-sdk==1.9.4,setproctitle==1.3.2,setuptools-scm==7.0.5,shortuuid==1.0.9,six==1.15.0,sklearn==0.0,smmap==5.0.0,sniffio==1.2.0,snowballstemmer==2.2.0,sortedcontainers==2.4.0,soupsieve==2.3.2.post1,Sphinx==5.1.1,sphinx-multiversion==0.2.4,sphinx-togglebutton==0.3.1,sphinx_external_toc==0.3.0,sphinxcontrib-applehelp==1.0.2,sphinxcontrib-copydirs @ git+https://github.com/mikemckiernan/sphinxcontrib-copydirs.git@bd8c5d79b3f91cf5f1bb0d6995aeca3fe84b670e,sphinxcontrib-devhelp==1.0.2,sphinxcontrib-htmlhelp==2.0.0,sphinxcontrib-jsmath==1.0.1,sphinxcontrib-qthelp==1.0.3,sphinxcontrib-serializinghtml==1.1.5,SQLAlchemy==1.4.36,stack-data==0.4.0,starlette==0.19.1,stringcase==1.2.0,supervisor==4.1.0,tabulate==0.8.10,tblib==1.7.0,tdqm==0.0.1,tenacity==8.0.1,tensorboard==2.9.1,tensorboard-data-server==0.6.1,tensorboard-plugin-wit==1.8.1,tensorflow==2.6.2,tensorflow-estimator==2.9.0,tensorflow-gpu==2.9.2,tensorflow-io-gcs-filesystem==0.26.0,tensorflow-metadata==1.9.0,termcolor==1.1.0,terminado==0.15.0,testbook==0.4.2,threadpoolctl==3.1.0,tinycss2==1.1.1,tokenizers==0.10.3,toml==0.10.2,tomli==2.0.1,toolz==0.12.0,torch==1.12.1+cu113,torchmetrics==0.3.2,tornado==6.2,tox==3.25.1,tqdm==4.64.0,traitlets==5.3.0,transformers==4.12.0,transformers4rec==0.1.11+10.g21a2a836a,treelite==2.3.0,treelite-runtime==2.3.0,tritonclient==2.22.0,typing_extensions==4.3.0,uc-micro-py==1.0.1,urllib3==1.26.11,uvicorn==0.18.3,uvloop==0.16.0,versioneer==0.20,virtualenv==20.16.4,wandb==0.13.1,watchfiles==0.16.1,wcwidth==0.2.5,webencodings==0.5.1,websocket-client==1.3.3,websockets==10.3,Werkzeug==2.2.2,widgetsnbextension==3.6.0,wrapt==1.12.1,xgboost==1.6.1,zict==2.2.0,zipp==3.8.1,zope.event==4.5.0,zope.interface==5.4.0 test-gpu run-test-pre: PYTHONHASHSEED='287177704' test-gpu run-test: commands[0] | python -m pytest --cov-report term --cov merlin -rxs tests/unit ============================= test 
session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 cachedir: .tox/test-gpu/.pytest_cache rootdir: /var/jenkins_home/workspace/merlin_core/core, configfile: pyproject.toml plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 367 items / 1 skipped |
Click to view CI ResultsGitHub pull request #139 of commit b3c910211a7ef8562be9443f05a17321bb4c49f8, no merge conflicts. Running as SYSTEM Setting status of b3c910211a7ef8562be9443f05a17321bb4c49f8 to PENDING with url https://10.20.13.93:8080/job/merlin_core/207/console and message: 'Pending' Using context: Jenkins Building on master in workspace /var/jenkins_home/workspace/merlin_core using credential ce87ff3c-94f0-400a-8303-cb4acb4918b5 > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/NVIDIA-Merlin/core # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/core > git --version # timeout=10 using GIT_ASKPASS to set credentials login for merlin-systems username and pass > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/core +refs/pull/139/*:refs/remotes/origin/pr/139/* # timeout=10 > git rev-parse b3c910211a7ef8562be9443f05a17321bb4c49f8^{commit} # timeout=10 Checking out Revision b3c910211a7ef8562be9443f05a17321bb4c49f8 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f b3c910211a7ef8562be9443f05a17321bb4c49f8 # timeout=10 Commit message: "Add `merlin/core` to interrogate ignores" > git rev-list --no-walk cb7d3f8298817d9d94d6c6fe23e3e44b8980345a # timeout=10 [merlin_core] $ /bin/bash /tmp/jenkins12183421768119601733.sh GLOB sdist-make: /var/jenkins_home/workspace/merlin_core/core/setup.py test-gpu inst-nodeps: /var/jenkins_home/workspace/merlin_core/core/.tox/.tmp/package/1/merlin-core-0.6.0+37.gb3c9102.zip WARNING: Discarding $PYTHONPATH from environment, to override specify PYTHONPATH in 'passenv' in your configuration. 
test-gpu installed: absl-py==1.2.0,alabaster==0.7.12,anyio==3.6.1,argon2-cffi==21.3.0,argon2-cffi-bindings==21.2.0,astroid==2.5.6,asttokens==2.0.7,astunparse==1.6.3,asv==0.5.1,asvdb==0.4.2,attrs==22.1.0,awscli==1.25.73,Babel==2.10.3,backcall==0.2.0,beautifulsoup4==4.11.1,betterproto==1.2.5,black==22.6.0,bleach==5.0.1,boto3==1.24.51,botocore==1.27.72,Brotli==1.0.9,cachetools==5.2.0,certifi==2019.11.28,cffi==1.15.1,chardet==3.0.4,clang==5.0,click==8.1.3,cloudpickle==2.1.0,colorama==0.4.4,coverage==6.4.4,cuda-python==11.7.1,cudf==22.4.0,cupy-cuda116==10.6.0,cycler==0.11.0,Cython==0.29.32,dask==2022.1.1,dask-cuda==22.4.0,dask-cudf==22.4.0,dbus-python==1.2.16,debugpy==1.6.2,decorator==5.1.1,defusedxml==0.7.1,dill==0.3.5.1,distlib==0.3.6,distributed==2022.3.0,distro==1.7.0,dm-tree==0.1.7,docker-pycreds==0.4.0,docutils==0.16,emoji==1.7.0,entrypoints==0.4,execnet==1.9.0,executing==0.10.0,faiss-gpu==1.7.2,fastai==2.7.9,fastapi==0.82.0,fastavro==1.6.0,fastcore==1.5.24,fastdownload==0.0.7,fastjsonschema==2.16.1,fastprogress==1.0.3,fastrlock==0.8,feast==0.19.4,fiddle==0.2.0,filelock==3.8.0,flatbuffers==1.12,fonttools==4.37.1,fsspec==2022.5.0,gast==0.4.0,gevent==21.12.0,geventhttpclient==2.0,gitdb==4.0.9,GitPython==3.1.27,google==3.0.0,google-api-core==2.10.0,google-auth==2.11.0,google-auth-oauthlib==0.4.6,google-pasta==0.2.0,googleapis-common-protos==1.52.0,graphviz==0.20.1,greenlet==1.1.2,grpcio==1.41.0,grpcio-channelz==1.47.0,grpcio-reflection==1.48.1,grpclib==0.4.3,h11==0.13.0,h2==4.1.0,h5py==3.7.0,HeapDict==1.0.1,hpack==4.0.0,httptools==0.4.0,hugectr2onnx==0.0.0,huggingface-hub==0.8.1,hyperframe==6.0.1,idna==2.8,imagesize==1.4.1,implicit==0.6.0,importlib-metadata==4.12.0,importlib-resources==5.9.0,iniconfig==1.1.1,ipykernel==6.15.1,ipython==8.4.0,ipython-genutils==0.2.0,ipywidgets==7.7.0,jedi==0.18.1,Jinja2==3.1.2,jmespath==1.0.1,joblib==1.1.0,json5==0.9.9,jsonschema==4.9.1,jupyter-cache==0.4.3,jupyter-client==7.3.4,jupyter-core==4.11.1,jupyter-server==1.18.1,jupyter-server
-mathjax==0.2.5,jupyter-sphinx==0.3.2,jupyterlab==3.4.5,jupyterlab-pygments==0.2.2,jupyterlab-server==2.15.0,jupyterlab-widgets==1.1.0,keras==2.9.0,Keras-Preprocessing==1.1.2,kiwisolver==1.4.4,lazy-object-proxy==1.7.1,libclang==14.0.6,lightfm==1.16,lightgbm==3.3.2,linkify-it-py==1.0.3,llvmlite==0.39.0,locket==1.0.0,lxml==4.9.1,Markdown==3.4.1,markdown-it-py==1.1.0,MarkupSafe==2.1.1,matplotlib==3.5.3,matplotlib-inline==0.1.3,mdit-py-plugins==0.2.8,merlin-core==0.6.0+37.gb3c9102,merlin-models==0.6.0+45.g5a345d9c1,merlin-systems==0+untagged.105.gf89cc51,mistune==0.8.4,mmh3==3.0.0,mpi4py==3.1.3,msgpack==1.0.4,multidict==6.0.2,myst-nb==0.13.2,myst-parser==0.15.2,natsort==8.1.0,nbclassic==0.4.3,nbclient==0.6.6,nbconvert==6.5.3,nbdime==3.1.1,nbformat==5.4.0,nest-asyncio==1.5.5,notebook==6.4.12,notebook-shim==0.1.0,numba==0.56.0,numpy==1.21.5,nvidia-pyindex==1.0.9,# Editable install with no version control (nvtabular==1.3.3+15.g16e4e34e9),-e /usr/local/lib/python3.8/dist-packages,nvtx==0.2.5,oauthlib==3.2.0,onnx==1.12.0,onnxruntime==1.11.1,opt-einsum==3.3.0,packaging==21.3,pandas==1.3.5,pandavro==1.5.2,pandocfilters==1.5.0,parso==0.8.3,partd==1.3.0,pathtools==0.1.2,pexpect==4.8.0,pickleshare==0.7.5,Pillow==9.2.0,pkgutil_resolve_name==1.3.10,platformdirs==2.5.2,pluggy==1.0.0,prometheus-client==0.14.1,promise==2.3,prompt-toolkit==3.0.30,proto-plus==1.19.6,protobuf==3.19.4,psutil==5.9.1,ptyprocess==0.7.0,pure-eval==0.2.2,py==1.11.0,pyarrow==6.0.0,pyasn1==0.4.8,pyasn1-modules==0.2.8,pybind11==2.10.0,pycparser==2.21,pydantic==1.10.2,pydot==1.4.2,Pygments==2.12.0,PyGObject==3.36.0,pynvml==11.4.1,pyparsing==3.0.9,pyrsistent==0.18.1,pytest==7.1.2,pytest-cov==3.0.0,pytest-forked==1.4.0,pytest-xdist==2.5.0,python-apt==2.0.0+ubuntu0.20.4.7,python-dateutil==2.8.2,python-dotenv==0.21.0,python-rapidjson==1.8,pytz==2022.2.1,PyYAML==5.4.1,pyzmq==23.2.1,regex==2022.7.25,requests==2.22.0,requests-oauthlib==1.3.1,requests-unixsocket==0.2.0,rmm==21.12.0,rsa==4.7.2,s3fs==2022.2.0,s3transfer==0.
6.0,sacremoses==0.0.53,scikit-build==0.15.0,scikit-learn==1.1.2,scipy==1.9.0,seedir==0.3.0,Send2Trash==1.8.0,sentry-sdk==1.9.4,setproctitle==1.3.2,setuptools-scm==7.0.5,shortuuid==1.0.9,six==1.15.0,sklearn==0.0,smmap==5.0.0,sniffio==1.2.0,snowballstemmer==2.2.0,sortedcontainers==2.4.0,soupsieve==2.3.2.post1,Sphinx==5.1.1,sphinx-multiversion==0.2.4,sphinx-togglebutton==0.3.1,sphinx_external_toc==0.3.0,sphinxcontrib-applehelp==1.0.2,sphinxcontrib-copydirs @ git+https://github.com/mikemckiernan/sphinxcontrib-copydirs.git@bd8c5d79b3f91cf5f1bb0d6995aeca3fe84b670e,sphinxcontrib-devhelp==1.0.2,sphinxcontrib-htmlhelp==2.0.0,sphinxcontrib-jsmath==1.0.1,sphinxcontrib-qthelp==1.0.3,sphinxcontrib-serializinghtml==1.1.5,SQLAlchemy==1.4.36,stack-data==0.4.0,starlette==0.19.1,stringcase==1.2.0,supervisor==4.1.0,tabulate==0.8.10,tblib==1.7.0,tdqm==0.0.1,tenacity==8.0.1,tensorboard==2.9.1,tensorboard-data-server==0.6.1,tensorboard-plugin-wit==1.8.1,tensorflow==2.6.2,tensorflow-estimator==2.9.0,tensorflow-gpu==2.9.2,tensorflow-io-gcs-filesystem==0.26.0,tensorflow-metadata==1.9.0,termcolor==1.1.0,terminado==0.15.0,testbook==0.4.2,threadpoolctl==3.1.0,tinycss2==1.1.1,tokenizers==0.10.3,toml==0.10.2,tomli==2.0.1,toolz==0.12.0,torch==1.12.1+cu113,torchmetrics==0.3.2,tornado==6.2,tox==3.25.1,tqdm==4.64.0,traitlets==5.3.0,transformers==4.12.0,transformers4rec==0.1.11+10.g21a2a836a,treelite==2.3.0,treelite-runtime==2.3.0,tritonclient==2.22.0,typing_extensions==4.3.0,uc-micro-py==1.0.1,urllib3==1.26.11,uvicorn==0.18.3,uvloop==0.16.0,versioneer==0.20,virtualenv==20.16.4,wandb==0.13.1,watchfiles==0.16.1,wcwidth==0.2.5,webencodings==0.5.1,websocket-client==1.3.3,websockets==10.3,Werkzeug==2.2.2,widgetsnbextension==3.6.0,wrapt==1.12.1,xgboost==1.6.1,zict==2.2.0,zipp==3.8.1,zope.event==4.5.0,zope.interface==5.4.0 test-gpu run-test-pre: PYTHONHASHSEED='518487231' test-gpu run-test: commands[0] | python -m pytest --cov-report term --cov merlin -rxs tests/unit ============================= test 
session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 cachedir: .tox/test-gpu/.pytest_cache rootdir: /var/jenkins_home/workspace/merlin_core/core, configfile: pyproject.toml plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 367 items / 1 skipped |
Click to view CI ResultsGitHub pull request #139 of commit 865b4d3f4de02ac39ff7adab6ee823e45a077743, no merge conflicts. Running as SYSTEM Setting status of 865b4d3f4de02ac39ff7adab6ee823e45a077743 to PENDING with url https://10.20.13.93:8080/job/merlin_core/208/console and message: 'Pending' Using context: Jenkins Building on master in workspace /var/jenkins_home/workspace/merlin_core using credential ce87ff3c-94f0-400a-8303-cb4acb4918b5 > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/NVIDIA-Merlin/core # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/core > git --version # timeout=10 using GIT_ASKPASS to set credentials login for merlin-systems username and pass > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/core +refs/pull/139/*:refs/remotes/origin/pr/139/* # timeout=10 > git rev-parse 865b4d3f4de02ac39ff7adab6ee823e45a077743^{commit} # timeout=10 Checking out Revision 865b4d3f4de02ac39ff7adab6ee823e45a077743 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 865b4d3f4de02ac39ff7adab6ee823e45a077743 # timeout=10 Commit message: "Make the executor tests CPU/pandas compatible" > git rev-list --no-walk b3c910211a7ef8562be9443f05a17321bb4c49f8 # timeout=10 [merlin_core] $ /bin/bash /tmp/jenkins6955275658770030891.sh GLOB sdist-make: /var/jenkins_home/workspace/merlin_core/core/setup.py test-gpu inst-nodeps: /var/jenkins_home/workspace/merlin_core/core/.tox/.tmp/package/1/merlin-core-0.6.0+38.g865b4d3.zip WARNING: Discarding $PYTHONPATH from environment, to override specify PYTHONPATH in 'passenv' in your configuration. 
test-gpu installed: absl-py==1.2.0,alabaster==0.7.12,anyio==3.6.1,argon2-cffi==21.3.0,argon2-cffi-bindings==21.2.0,astroid==2.5.6,asttokens==2.0.7,astunparse==1.6.3,asv==0.5.1,asvdb==0.4.2,attrs==22.1.0,awscli==1.25.73,Babel==2.10.3,backcall==0.2.0,beautifulsoup4==4.11.1,betterproto==1.2.5,black==22.6.0,bleach==5.0.1,boto3==1.24.51,botocore==1.27.72,Brotli==1.0.9,cachetools==5.2.0,certifi==2019.11.28,cffi==1.15.1,chardet==3.0.4,clang==5.0,click==8.1.3,cloudpickle==2.1.0,colorama==0.4.4,coverage==6.4.4,cuda-python==11.7.1,cudf==22.4.0,cupy-cuda116==10.6.0,cycler==0.11.0,Cython==0.29.32,dask==2022.1.1,dask-cuda==22.4.0,dask-cudf==22.4.0,dbus-python==1.2.16,debugpy==1.6.2,decorator==5.1.1,defusedxml==0.7.1,dill==0.3.5.1,distlib==0.3.6,distributed==2022.3.0,distro==1.7.0,dm-tree==0.1.7,docker-pycreds==0.4.0,docutils==0.16,emoji==1.7.0,entrypoints==0.4,execnet==1.9.0,executing==0.10.0,faiss-gpu==1.7.2,fastai==2.7.9,fastapi==0.82.0,fastavro==1.6.0,fastcore==1.5.24,fastdownload==0.0.7,fastjsonschema==2.16.1,fastprogress==1.0.3,fastrlock==0.8,feast==0.19.4,fiddle==0.2.0,filelock==3.8.0,flatbuffers==1.12,fonttools==4.37.1,fsspec==2022.5.0,gast==0.4.0,gevent==21.12.0,geventhttpclient==2.0,gitdb==4.0.9,GitPython==3.1.27,google==3.0.0,google-api-core==2.10.0,google-auth==2.11.0,google-auth-oauthlib==0.4.6,google-pasta==0.2.0,googleapis-common-protos==1.52.0,graphviz==0.20.1,greenlet==1.1.2,grpcio==1.41.0,grpcio-channelz==1.47.0,grpcio-reflection==1.48.1,grpclib==0.4.3,h11==0.13.0,h2==4.1.0,h5py==3.7.0,HeapDict==1.0.1,hpack==4.0.0,httptools==0.4.0,hugectr2onnx==0.0.0,huggingface-hub==0.8.1,hyperframe==6.0.1,idna==2.8,imagesize==1.4.1,implicit==0.6.0,importlib-metadata==4.12.0,importlib-resources==5.9.0,iniconfig==1.1.1,ipykernel==6.15.1,ipython==8.4.0,ipython-genutils==0.2.0,ipywidgets==7.7.0,jedi==0.18.1,Jinja2==3.1.2,jmespath==1.0.1,joblib==1.1.0,json5==0.9.9,jsonschema==4.9.1,jupyter-cache==0.4.3,jupyter-client==7.3.4,jupyter-core==4.11.1,jupyter-server==1.18.1,jupyter-server
-mathjax==0.2.5,jupyter-sphinx==0.3.2,jupyterlab==3.4.5,jupyterlab-pygments==0.2.2,jupyterlab-server==2.15.0,jupyterlab-widgets==1.1.0,keras==2.9.0,Keras-Preprocessing==1.1.2,kiwisolver==1.4.4,lazy-object-proxy==1.7.1,libclang==14.0.6,lightfm==1.16,lightgbm==3.3.2,linkify-it-py==1.0.3,llvmlite==0.39.0,locket==1.0.0,lxml==4.9.1,Markdown==3.4.1,markdown-it-py==1.1.0,MarkupSafe==2.1.1,matplotlib==3.5.3,matplotlib-inline==0.1.3,mdit-py-plugins==0.2.8,merlin-core==0.6.0+38.g865b4d3,merlin-models==0.6.0+45.g5a345d9c1,merlin-systems==0+untagged.105.gf89cc51,mistune==0.8.4,mmh3==3.0.0,mpi4py==3.1.3,msgpack==1.0.4,multidict==6.0.2,myst-nb==0.13.2,myst-parser==0.15.2,natsort==8.1.0,nbclassic==0.4.3,nbclient==0.6.6,nbconvert==6.5.3,nbdime==3.1.1,nbformat==5.4.0,nest-asyncio==1.5.5,notebook==6.4.12,notebook-shim==0.1.0,numba==0.56.0,numpy==1.21.5,nvidia-pyindex==1.0.9,# Editable install with no version control (nvtabular==1.3.3+15.g16e4e34e9),-e /usr/local/lib/python3.8/dist-packages,nvtx==0.2.5,oauthlib==3.2.0,onnx==1.12.0,onnxruntime==1.11.1,opt-einsum==3.3.0,packaging==21.3,pandas==1.3.5,pandavro==1.5.2,pandocfilters==1.5.0,parso==0.8.3,partd==1.3.0,pathtools==0.1.2,pexpect==4.8.0,pickleshare==0.7.5,Pillow==9.2.0,pkgutil_resolve_name==1.3.10,platformdirs==2.5.2,pluggy==1.0.0,prometheus-client==0.14.1,promise==2.3,prompt-toolkit==3.0.30,proto-plus==1.19.6,protobuf==3.19.4,psutil==5.9.1,ptyprocess==0.7.0,pure-eval==0.2.2,py==1.11.0,pyarrow==6.0.0,pyasn1==0.4.8,pyasn1-modules==0.2.8,pybind11==2.10.0,pycparser==2.21,pydantic==1.10.2,pydot==1.4.2,Pygments==2.12.0,PyGObject==3.36.0,pynvml==11.4.1,pyparsing==3.0.9,pyrsistent==0.18.1,pytest==7.1.2,pytest-cov==3.0.0,pytest-forked==1.4.0,pytest-xdist==2.5.0,python-apt==2.0.0+ubuntu0.20.4.7,python-dateutil==2.8.2,python-dotenv==0.21.0,python-rapidjson==1.8,pytz==2022.2.1,PyYAML==5.4.1,pyzmq==23.2.1,regex==2022.7.25,requests==2.22.0,requests-oauthlib==1.3.1,requests-unixsocket==0.2.0,rmm==21.12.0,rsa==4.7.2,s3fs==2022.2.0,s3transfer==0.
6.0,sacremoses==0.0.53,scikit-build==0.15.0,scikit-learn==1.1.2,scipy==1.9.0,seedir==0.3.0,Send2Trash==1.8.0,sentry-sdk==1.9.4,setproctitle==1.3.2,setuptools-scm==7.0.5,shortuuid==1.0.9,six==1.15.0,sklearn==0.0,smmap==5.0.0,sniffio==1.2.0,snowballstemmer==2.2.0,sortedcontainers==2.4.0,soupsieve==2.3.2.post1,Sphinx==5.1.1,sphinx-multiversion==0.2.4,sphinx-togglebutton==0.3.1,sphinx_external_toc==0.3.0,sphinxcontrib-applehelp==1.0.2,sphinxcontrib-copydirs @ git+https://github.com/mikemckiernan/sphinxcontrib-copydirs.git@bd8c5d79b3f91cf5f1bb0d6995aeca3fe84b670e,sphinxcontrib-devhelp==1.0.2,sphinxcontrib-htmlhelp==2.0.0,sphinxcontrib-jsmath==1.0.1,sphinxcontrib-qthelp==1.0.3,sphinxcontrib-serializinghtml==1.1.5,SQLAlchemy==1.4.36,stack-data==0.4.0,starlette==0.19.1,stringcase==1.2.0,supervisor==4.1.0,tabulate==0.8.10,tblib==1.7.0,tdqm==0.0.1,tenacity==8.0.1,tensorboard==2.9.1,tensorboard-data-server==0.6.1,tensorboard-plugin-wit==1.8.1,tensorflow==2.6.2,tensorflow-estimator==2.9.0,tensorflow-gpu==2.9.2,tensorflow-io-gcs-filesystem==0.26.0,tensorflow-metadata==1.9.0,termcolor==1.1.0,terminado==0.15.0,testbook==0.4.2,threadpoolctl==3.1.0,tinycss2==1.1.1,tokenizers==0.10.3,toml==0.10.2,tomli==2.0.1,toolz==0.12.0,torch==1.12.1+cu113,torchmetrics==0.3.2,tornado==6.2,tox==3.25.1,tqdm==4.64.0,traitlets==5.3.0,transformers==4.12.0,transformers4rec==0.1.11+10.g21a2a836a,treelite==2.3.0,treelite-runtime==2.3.0,tritonclient==2.22.0,typing_extensions==4.3.0,uc-micro-py==1.0.1,urllib3==1.26.11,uvicorn==0.18.3,uvloop==0.16.0,versioneer==0.20,virtualenv==20.16.4,wandb==0.13.1,watchfiles==0.16.1,wcwidth==0.2.5,webencodings==0.5.1,websocket-client==1.3.3,websockets==10.3,Werkzeug==2.2.2,widgetsnbextension==3.6.0,wrapt==1.12.1,xgboost==1.6.1,zict==2.2.0,zipp==3.8.1,zope.event==4.5.0,zope.interface==5.4.0 test-gpu run-test-pre: PYTHONHASHSEED='55312805' test-gpu run-test: commands[0] | python -m pytest --cov-report term --cov merlin -rxs tests/unit ============================= test 
session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 cachedir: .tox/test-gpu/.pytest_cache rootdir: /var/jenkins_home/workspace/merlin_core/core, configfile: pyproject.toml plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 367 items / 1 skipped |
merlin/core/dispatch.py
Outdated
else: | ||
DataFrameType = pd.DataFrame # type: ignore | ||
SeriesType = pd.Series # type: ignore | ||
|
These types have been a pain since they don't play nicely with Mypy, which doesn't like dynamically defined types like this. This PR replaces them with Python protocols that can be used as static types and also checked at runtime (with `isinstance`).
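As a rough illustration of the point above, here is a minimal sketch of a protocol that works both as a static annotation and in a runtime `isinstance` check. The name `SeriesLikeSketch` and the helper `first_value` are hypothetical and are not the definitions from this PR.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SeriesLikeSketch(Protocol):
    # Hypothetical, trimmed-down protocol; the real one in this PR may differ.
    def __len__(self) -> int:
        ...

    def __getitem__(self, index):
        ...


def first_value(values: SeriesLikeSketch):
    # Statically, Mypy accepts any object whose methods match the protocol.
    # At runtime, isinstance() checks that the named methods are present.
    if not isinstance(values, SeriesLikeSketch):
        raise TypeError("expected a series-like object")
    return values[0]
```

A plain list passes both checks because it defines `__len__` and `__getitem__`, while an integer is rejected at runtime.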
We could keep this code around for a follow-up PR to remove once the references to these in the other repos have been removed. That way we might be able to merge this without breaking tests in the other repos as a result, making the transition smoother?
Yeah, that's a good idea 👍🏻
@runtime_checkable | ||
class DictLike(Protocol): | ||
def __iter__(self): | ||
return iter([]) |
The actual implementations of these methods don't matter, since they'll be overridden by anything that explicitly implements the protocol and are otherwise only used to check that matching methods are available on anything compared with `isinstance()`. These fake implementations make the linters happy though, since they insist that `__iter__()` must return an iterator and `__len__()` must return a non-negative integer.
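To make that concrete, here is a small sketch (assuming a two-method version of the protocol, named `DictLikeSketch` to mark it as hypothetical) showing that the placeholder bodies never run for objects that merely match the protocol structurally, and are overridden by explicit subclasses.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class DictLikeSketch(Protocol):
    # Placeholder bodies: they only exist to keep linters quiet.
    def __iter__(self):
        return iter([])

    def __len__(self):
        return 0


class ExplicitImpl(DictLikeSketch):
    # Explicitly implements the protocol and overrides one placeholder.
    def __len__(self):
        return 3


plain = {"a": 1, "b": 2}

# Structural match: dict defines its own __iter__ and __len__,
# so the protocol's placeholder bodies are never called for it.
matches_dict = isinstance(plain, DictLikeSketch)
matches_int = isinstance(42, DictLikeSketch)
```

Note that with only these two methods a list would also match, so a real dictionary-like protocol presumably requires more methods than this sketch does.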
|
||
return results | ||
|
||
# TODO: Replace `nodes` with `graph` here? | ||
def transform( |
This method has been long and hard to read for as long as it has existed (well before `Executor`s were a thing). Here it's just broken down into some smaller methods whose names help explain what's happening, to make it easier to read and think about.
@@ -91,7 +91,7 @@ def compute_input_schema( | |||
""" | |||
return parents_schema + deps_schema | |||
|
|||
def transform(self, col_selector: ColumnSelector, df: DataFrameType) -> DataFrameType: | |||
def transform(self, col_selector: ColumnSelector, data: Transformable) -> Transformable: |
In order to make the Merlin DAG friendly to transforming dictionary-like objects (as we'd like to do in Systems, Models, and the dataloaders), the typing here needs to be flexible enough to accommodate non-dataframe objects. The `Transformable` protocol expects anything passed here to both be dictionary-like, in the sense that you can fetch columns with `transformable[col_name]` (which dataframes support), and have a few handy dataframe-like methods (e.g. `.columns`). The `DictArray` class defined above satisfies both requirements.
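The two requirements described above can be sketched roughly as follows. Everything here is an assumption for illustration: `TransformableSketch`, `DictOfLists`, and `select` are hypothetical names, and the real `Transformable` and `DictArray` in this PR likely have more members.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class TransformableSketch(Protocol):
    # Dataframe-like part: a `columns` attribute.
    @property
    def columns(self):
        ...

    # Dictionary-like part: fetch a column by name.
    def __getitem__(self, key):
        ...


class DictOfLists:
    """Hypothetical stand-in for a dictionary-of-arrays container."""

    def __init__(self, values):
        self._values = dict(values)

    @property
    def columns(self):
        return list(self._values.keys())

    def __getitem__(self, key):
        return self._values[key]


def select(data: TransformableSketch, names):
    # Works for anything that satisfies both requirements.
    if not isinstance(data, TransformableSketch):
        raise TypeError("data must provide `columns` and `__getitem__`")
    return {name: data[name] for name in names if name in data.columns}
```

A plain `dict` fails the check in this sketch because it has no `columns` attribute, which is why a thin wrapper class is needed to treat dictionaries and dataframes uniformly.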
GPU_DICT_ARRAY = auto() | ||
|
||
|
||
class ComputeSchemaMixin: |
This mixin allows non-operator classes to use the same schema computation machinery as operators without having to become operators and implement the full operator interface.
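A minimal sketch of the mixin idea, under heavy simplification: schemas are modeled as plain lists of column names rather than Merlin `Schema` objects, and the class and method names other than `compute_input_schema` are hypothetical.

```python
class SchemaMixinSketch:
    """Hypothetical mixin holding only schema computation logic."""

    def compute_input_schema(self, parents_schema, deps_schema):
        # Mirrors the `parents_schema + deps_schema` combination in the diff.
        return parents_schema + deps_schema

    def compute_output_schema(self, input_schema):
        # Default assumption for this sketch: columns pass through unchanged.
        return list(input_schema)


class SuffixAdder(SchemaMixinSketch):
    """Not an operator: it has no transform method, only schema logic."""

    def __init__(self, suffix):
        self.suffix = suffix

    def compute_output_schema(self, input_schema):
        return [name + self.suffix for name in input_schema]


adder = SuffixAdder("_scaled")
input_schema = adder.compute_input_schema(["price"], ["quantity"])
output_schema = adder.compute_output_schema(input_schema)
```

The design point is that `SuffixAdder` gets schema propagation without having to implement anything else an operator would need.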
@@ -46,6 +46,12 @@ def __init__( | |||
self._tags = tags if tags is not None else [] | |||
self.subgroups = subgroups if subgroups is not None else [] | |||
|
|||
self.all = names == "*" |
In order to make it possible to select all columns by default, which keeps the number of required parameters on the `ComputeSchemaMixin` methods as small as possible, we introduce the wildcard selector `"*"`, which is used in one of the following ways:
* `ColumnSelector("*")`
* `"*" >> SomeOperator()`
This is also useful for defining operator graphs that can be run on different input data by the same executor, since it allows late-binding to all provided columns in a similar way to how tags allow late-binding to some provided columns (but not others.)
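The late-binding behavior of the wildcard can be sketched like this. Only the `self.all = names == "*"` line comes from the diff; the class name `WildcardSelectorSketch` and the `resolve` method are hypothetical and do not describe the actual `ColumnSelector` API.

```python
class WildcardSelectorSketch:
    """Hypothetical selector that defers column choice until data arrives."""

    def __init__(self, names=None):
        # Same test as in the diff: "*" means "everything provided".
        self.all = names == "*"
        if self.all or names is None:
            self.names = []
        elif isinstance(names, str):
            self.names = [names]
        else:
            self.names = list(names)

    def resolve(self, available_columns):
        # Late binding: the wildcard expands to whatever columns are present.
        if self.all:
            return list(available_columns)
        return [name for name in self.names if name in available_columns]


everything = WildcardSelectorSketch("*")
some = WildcardSelectorSketch(["user_id", "item_id"])

first_run = everything.resolve(["user_id", "item_id", "rating"])
second_run = everything.resolve(["a", "b"])
explicit = some.resolve(["user_id", "rating"])
```

The same wildcard selector yields different column lists for different inputs, which is what allows one graph definition to be reused across datasets.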
@@ -32,7 +32,7 @@ ignore-nested-functions = true | |||
ignore-semiprivate = true | |||
ignore-setters = true | |||
fail-under = 70 | |||
exclude = ["build", "docs", "merlin/io", "tests", "setup.py", "versioneer.py"] | |||
exclude = ["build", "docs", "merlin/core", "merlin/io", "tests", "setup.py", "versioneer.py"] |
There's not really much reason to add docstrings to the newly defined protocols here, since classes that implement the protocols will have their own docstrings.
from merlin.dag.executors import LocalExecutor | ||
from merlin.schema.schema import ColumnSchema, Schema | ||
|
||
|
The tests in this file demonstrate the capabilities we discovered were needed in order to process the kinds of data we deal with after loading data from disk and before it gets passed to models, like operating on dictionary-like objects and transforming tuples of data (e.g. `(X, y)`) with the same transformation graph.
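One way to picture transforming a tuple such as `(X, y)` with a single transformation is the sketch below. It is an assumption about the general shape of the idea, not the executor logic from this PR; `transform_each` and `double_values` are hypothetical helpers operating on plain dictionaries of lists.

```python
def double_values(columns):
    # Stand-in transformation over a dictionary of columns.
    return {name: [value * 2 for value in values] for name, values in columns.items()}


def transform_each(transform, data):
    # Apply the same transformation to every element of a tuple,
    # or directly to the data when it is not a tuple.
    if isinstance(data, tuple):
        return tuple(transform(part) for part in data)
    return transform(data)


features = {"clicks": [1, 2], "views": [10, 20]}
targets = {"label": [0, 1]}

new_features, new_targets = transform_each(double_values, (features, targets))
single = transform_each(double_values, features)
```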
The downstream tests are broken by the removal of |
Click to view CI ResultsGitHub pull request #139 of commit 4cb6455824f5a56f16ccfa2e424248e668bcb260, no merge conflicts. Running as SYSTEM Setting status of 4cb6455824f5a56f16ccfa2e424248e668bcb260 to PENDING with url https://10.20.13.93:8080/job/merlin_core/209/console and message: 'Pending' Using context: Jenkins Building on master in workspace /var/jenkins_home/workspace/merlin_core using credential ce87ff3c-94f0-400a-8303-cb4acb4918b5 > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/NVIDIA-Merlin/core # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/core > git --version # timeout=10 using GIT_ASKPASS to set credentials login for merlin-systems username and pass > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/core +refs/pull/139/*:refs/remotes/origin/pr/139/* # timeout=10 > git rev-parse 4cb6455824f5a56f16ccfa2e424248e668bcb260^{commit} # timeout=10 Checking out Revision 4cb6455824f5a56f16ccfa2e424248e668bcb260 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 4cb6455824f5a56f16ccfa2e424248e668bcb260 # timeout=10 Commit message: "Add additional tests for wildcard selectors" > git rev-list --no-walk 865b4d3f4de02ac39ff7adab6ee823e45a077743 # timeout=10 [merlin_core] $ /bin/bash /tmp/jenkins9828698646634708686.sh GLOB sdist-make: /var/jenkins_home/workspace/merlin_core/core/setup.py test-gpu inst-nodeps: /var/jenkins_home/workspace/merlin_core/core/.tox/.tmp/package/1/merlin-core-0.6.0+39.g4cb6455.zip WARNING: Discarding $PYTHONPATH from environment, to override specify PYTHONPATH in 'passenv' in your configuration. 
test-gpu installed: absl-py==1.2.0,alabaster==0.7.12,anyio==3.6.1,argon2-cffi==21.3.0,argon2-cffi-bindings==21.2.0,astroid==2.5.6,asttokens==2.0.7,astunparse==1.6.3,asv==0.5.1,asvdb==0.4.2,attrs==22.1.0,awscli==1.25.73,Babel==2.10.3,backcall==0.2.0,beautifulsoup4==4.11.1,betterproto==1.2.5,black==22.6.0,bleach==5.0.1,boto3==1.24.51,botocore==1.27.72,Brotli==1.0.9,cachetools==5.2.0,certifi==2019.11.28,cffi==1.15.1,chardet==3.0.4,clang==5.0,click==8.1.3,cloudpickle==2.1.0,colorama==0.4.4,coverage==6.4.4,cuda-python==11.7.1,cudf==22.4.0,cupy-cuda116==10.6.0,cycler==0.11.0,Cython==0.29.32,dask==2022.1.1,dask-cuda==22.4.0,dask-cudf==22.4.0,dbus-python==1.2.16,debugpy==1.6.2,decorator==5.1.1,defusedxml==0.7.1,dill==0.3.5.1,distlib==0.3.6,distributed==2022.3.0,distro==1.7.0,dm-tree==0.1.7,docker-pycreds==0.4.0,docutils==0.16,emoji==1.7.0,entrypoints==0.4,execnet==1.9.0,executing==0.10.0,faiss-gpu==1.7.2,fastai==2.7.9,fastapi==0.82.0,fastavro==1.6.0,fastcore==1.5.24,fastdownload==0.0.7,fastjsonschema==2.16.1,fastprogress==1.0.3,fastrlock==0.8,feast==0.19.4,fiddle==0.2.0,filelock==3.8.0,flatbuffers==1.12,fonttools==4.37.1,fsspec==2022.5.0,gast==0.4.0,gevent==21.12.0,geventhttpclient==2.0,gitdb==4.0.9,GitPython==3.1.27,google==3.0.0,google-api-core==2.10.0,google-auth==2.11.0,google-auth-oauthlib==0.4.6,google-pasta==0.2.0,googleapis-common-protos==1.52.0,graphviz==0.20.1,greenlet==1.1.2,grpcio==1.41.0,grpcio-channelz==1.47.0,grpcio-reflection==1.48.1,grpclib==0.4.3,h11==0.13.0,h2==4.1.0,h5py==3.7.0,HeapDict==1.0.1,hpack==4.0.0,httptools==0.4.0,hugectr2onnx==0.0.0,huggingface-hub==0.8.1,hyperframe==6.0.1,idna==2.8,imagesize==1.4.1,implicit==0.6.0,importlib-metadata==4.12.0,importlib-resources==5.9.0,iniconfig==1.1.1,ipykernel==6.15.1,ipython==8.4.0,ipython-genutils==0.2.0,ipywidgets==7.7.0,jedi==0.18.1,Jinja2==3.1.2,jmespath==1.0.1,joblib==1.1.0,json5==0.9.9,jsonschema==4.9.1,jupyter-cache==0.4.3,jupyter-client==7.3.4,jupyter-core==4.11.1,jupyter-server==1.18.1,jupyter-server
-mathjax==0.2.5,jupyter-sphinx==0.3.2,jupyterlab==3.4.5,jupyterlab-pygments==0.2.2,jupyterlab-server==2.15.0,jupyterlab-widgets==1.1.0,keras==2.9.0,Keras-Preprocessing==1.1.2,kiwisolver==1.4.4,lazy-object-proxy==1.7.1,libclang==14.0.6,lightfm==1.16,lightgbm==3.3.2,linkify-it-py==1.0.3,llvmlite==0.39.0,locket==1.0.0,lxml==4.9.1,Markdown==3.4.1,markdown-it-py==1.1.0,MarkupSafe==2.1.1,matplotlib==3.5.3,matplotlib-inline==0.1.3,mdit-py-plugins==0.2.8,merlin-core==0.6.0+39.g4cb6455,merlin-models==0.6.0+45.g5a345d9c1,merlin-systems==0+untagged.105.gf89cc51,mistune==0.8.4,mmh3==3.0.0,mpi4py==3.1.3,msgpack==1.0.4,multidict==6.0.2,myst-nb==0.13.2,myst-parser==0.15.2,natsort==8.1.0,nbclassic==0.4.3,nbclient==0.6.6,nbconvert==6.5.3,nbdime==3.1.1,nbformat==5.4.0,nest-asyncio==1.5.5,notebook==6.4.12,notebook-shim==0.1.0,numba==0.56.0,numpy==1.21.5,nvidia-pyindex==1.0.9,# Editable install with no version control (nvtabular==1.3.3+15.g16e4e34e9),-e /usr/local/lib/python3.8/dist-packages,nvtx==0.2.5,oauthlib==3.2.0,onnx==1.12.0,onnxruntime==1.11.1,opt-einsum==3.3.0,packaging==21.3,pandas==1.3.5,pandavro==1.5.2,pandocfilters==1.5.0,parso==0.8.3,partd==1.3.0,pathtools==0.1.2,pexpect==4.8.0,pickleshare==0.7.5,Pillow==9.2.0,pkgutil_resolve_name==1.3.10,platformdirs==2.5.2,pluggy==1.0.0,prometheus-client==0.14.1,promise==2.3,prompt-toolkit==3.0.30,proto-plus==1.19.6,protobuf==3.19.4,psutil==5.9.1,ptyprocess==0.7.0,pure-eval==0.2.2,py==1.11.0,pyarrow==6.0.0,pyasn1==0.4.8,pyasn1-modules==0.2.8,pybind11==2.10.0,pycparser==2.21,pydantic==1.10.2,pydot==1.4.2,Pygments==2.12.0,PyGObject==3.36.0,pynvml==11.4.1,pyparsing==3.0.9,pyrsistent==0.18.1,pytest==7.1.2,pytest-cov==3.0.0,pytest-forked==1.4.0,pytest-xdist==2.5.0,python-apt==2.0.0+ubuntu0.20.4.7,python-dateutil==2.8.2,python-dotenv==0.21.0,python-rapidjson==1.8,pytz==2022.2.1,PyYAML==5.4.1,pyzmq==23.2.1,regex==2022.7.25,requests==2.22.0,requests-oauthlib==1.3.1,requests-unixsocket==0.2.0,rmm==21.12.0,rsa==4.7.2,s3fs==2022.2.0,s3transfer==0.
6.0,sacremoses==0.0.53,scikit-build==0.15.0,scikit-learn==1.1.2,scipy==1.9.0,seedir==0.3.0,Send2Trash==1.8.0,sentry-sdk==1.9.4,setproctitle==1.3.2,setuptools-scm==7.0.5,shortuuid==1.0.9,six==1.15.0,sklearn==0.0,smmap==5.0.0,sniffio==1.2.0,snowballstemmer==2.2.0,sortedcontainers==2.4.0,soupsieve==2.3.2.post1,Sphinx==5.1.1,sphinx-multiversion==0.2.4,sphinx-togglebutton==0.3.1,sphinx_external_toc==0.3.0,sphinxcontrib-applehelp==1.0.2,sphinxcontrib-copydirs @ git+https://github.com/mikemckiernan/sphinxcontrib-copydirs.git@bd8c5d79b3f91cf5f1bb0d6995aeca3fe84b670e,sphinxcontrib-devhelp==1.0.2,sphinxcontrib-htmlhelp==2.0.0,sphinxcontrib-jsmath==1.0.1,sphinxcontrib-qthelp==1.0.3,sphinxcontrib-serializinghtml==1.1.5,SQLAlchemy==1.4.36,stack-data==0.4.0,starlette==0.19.1,stringcase==1.2.0,supervisor==4.1.0,tabulate==0.8.10,tblib==1.7.0,tdqm==0.0.1,tenacity==8.0.1,tensorboard==2.9.1,tensorboard-data-server==0.6.1,tensorboard-plugin-wit==1.8.1,tensorflow==2.6.2,tensorflow-estimator==2.9.0,tensorflow-gpu==2.9.2,tensorflow-io-gcs-filesystem==0.26.0,tensorflow-metadata==1.9.0,termcolor==1.1.0,terminado==0.15.0,testbook==0.4.2,threadpoolctl==3.1.0,tinycss2==1.1.1,tokenizers==0.10.3,toml==0.10.2,tomli==2.0.1,toolz==0.12.0,torch==1.12.1+cu113,torchmetrics==0.3.2,tornado==6.2,tox==3.25.1,tqdm==4.64.0,traitlets==5.3.0,transformers==4.12.0,transformers4rec==0.1.11+10.g21a2a836a,treelite==2.3.0,treelite-runtime==2.3.0,tritonclient==2.22.0,typing_extensions==4.3.0,uc-micro-py==1.0.1,urllib3==1.26.11,uvicorn==0.18.3,uvloop==0.16.0,versioneer==0.20,virtualenv==20.16.4,wandb==0.13.1,watchfiles==0.16.1,wcwidth==0.2.5,webencodings==0.5.1,websocket-client==1.3.3,websockets==10.3,Werkzeug==2.2.2,widgetsnbextension==3.6.0,wrapt==1.12.1,xgboost==1.6.1,zict==2.2.0,zipp==3.8.1,zope.event==4.5.0,zope.interface==5.4.0 test-gpu run-test-pre: PYTHONHASHSEED='3709598691' test-gpu run-test: commands[0] | python -m pytest --cov-report term --cov merlin -rxs tests/unit ============================= test 
session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 cachedir: .tox/test-gpu/.pytest_cache rootdir: /var/jenkins_home/workspace/merlin_core/core, configfile: pyproject.toml plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 370 items / 1 skipped |
@oliverholworthy @nv-alaiacano Thoughts on this? Downstream tests don't pass due to a breaking change related to |
---------- | ||
node : Node | ||
Output node of the graph to execute | ||
data : DataFrameType |
Is the input type here now `Transformable`? And is the return type the same, or something else?
|
||
|
||
@runtime_checkable | ||
class Transformable(DictLike, Protocol): |
If we only use the three methods defined here in practice, would it continue to be valid if we dropped `DictLike` here?
I think we already use more than these three methods in practice (e.g. `__setitem__`), and part of what we're trying to do is make it possible to use dictionary-like objects, so even if we're not using the methods yet, this is laying the groundwork for being able to treat dictionaries and dataframes interchangeably.
Click to view CI ResultsGitHub pull request #139 of commit 41826018322df2a0a1247a7ef84fbe0dab9958bb, no merge conflicts. Running as SYSTEM Setting status of 41826018322df2a0a1247a7ef84fbe0dab9958bb to PENDING with url https://10.20.13.93:8080/job/merlin_core/210/console and message: 'Pending' Using context: Jenkins Building on master in workspace /var/jenkins_home/workspace/merlin_core using credential ce87ff3c-94f0-400a-8303-cb4acb4918b5 > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/NVIDIA-Merlin/core # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/core > git --version # timeout=10 using GIT_ASKPASS to set credentials login for merlin-systems username and pass > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/core +refs/pull/139/*:refs/remotes/origin/pr/139/* # timeout=10 > git rev-parse 41826018322df2a0a1247a7ef84fbe0dab9958bb^{commit} # timeout=10 Checking out Revision 41826018322df2a0a1247a7ef84fbe0dab9958bb (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 41826018322df2a0a1247a7ef84fbe0dab9958bb # timeout=10 Commit message: "Keep `DataFrameType` and `SeriesType` for now to smooth the transition" > git rev-list --no-walk 4cb6455824f5a56f16ccfa2e424248e668bcb260 # timeout=10 [merlin_core] $ /bin/bash /tmp/jenkins14909425134326493424.sh GLOB sdist-make: /var/jenkins_home/workspace/merlin_core/core/setup.py test-gpu recreate: /var/jenkins_home/workspace/merlin_core/core/.tox/test-gpu test-gpu installdeps: pytest, pytest-cov WARNING: Discarding $PYTHONPATH from environment, to override specify PYTHONPATH in 'passenv' in your configuration. test-gpu inst: /var/jenkins_home/workspace/merlin_core/core/.tox/.tmp/package/1/merlin-core-0.7.0+39.g4182601.zip WARNING: Discarding $PYTHONPATH from environment, to override specify PYTHONPATH in 'passenv' in your configuration. 
test-gpu installed: absl-py==1.2.0,aiohttp==3.8.1,aiosignal==1.2.0,alabaster==0.7.12,anyio==3.6.1,argon2-cffi==21.3.0,argon2-cffi-bindings==21.2.0,astroid==2.5.6,asttokens==2.0.8,astunparse==1.6.3,asv==0.5.1,asvdb==0.4.2,async-timeout==4.0.2,attrs==22.1.0,awscli==1.25.82,Babel==2.10.3,backcall==0.2.0,beautifulsoup4==4.11.1,betterproto==1.2.5,black==22.6.0,bleach==5.0.1,boto3==1.24.75,botocore==1.27.81,Brotli==1.0.9,cachetools==5.2.0,certifi==2019.11.28,cffi==1.15.1,chardet==3.0.4,charset-normalizer==2.1.1,clang==5.0,click==8.1.3,cloudpickle==2.2.0,cmake==3.24.1.1,colorama==0.4.4,contourpy==1.0.5,coverage==6.4.4,cuda-python==11.7.1,cupy-cuda117==10.6.0,cycler==0.11.0,Cython==0.29.32,dask==2022.1.1,dbus-python==1.2.16,debugpy==1.6.3,decorator==5.1.1,defusedxml==0.7.1,dill==0.3.5.1,distlib==0.3.6,distributed==2022.5.1,distro==1.7.0,dm-tree==0.1.6,docker-pycreds==0.4.0,docutils==0.16,emoji==1.7.0,entrypoints==0.4,execnet==1.9.0,executing==1.0.0,faiss==1.7.2,faiss-gpu==1.7.2,fastai==2.7.9,fastapi==0.85.0,fastavro==1.6.1,fastcore==1.5.27,fastdownload==0.0.7,fastjsonschema==2.16.1,fastprogress==1.0.3,fastrlock==0.8,feast==0.19.4,fiddle==0.2.2,filelock==3.8.0,flatbuffers==1.12,fonttools==4.37.3,frozenlist==1.3.1,fsspec==2022.5.0,gast==0.4.0,gevent==21.12.0,geventhttpclient==2.0.2,gitdb==4.0.9,GitPython==3.1.27,google==3.0.0,google-api-core==2.10.1,google-auth==2.11.1,google-auth-oauthlib==0.4.6,google-pasta==0.2.0,googleapis-common-protos==1.52.0,graphviz==0.20.1,greenlet==1.1.3,grpcio==1.41.0,grpcio-channelz==1.49.0,grpcio-reflection==1.48.1,grpclib==0.4.3,h11==0.13.0,h2==4.1.0,h5py==3.7.0,HeapDict==1.0.1,hpack==4.0.0,httptools==0.5.0,hugectr2onnx==0.0.0,huggingface-hub==0.9.1,hyperframe==6.0.1,idna==2.8,imagesize==1.4.1,implicit==0.6.1,importlib-metadata==4.12.0,importlib-resources==5.9.0,iniconfig==1.1.1,ipykernel==6.15.3,ipython==8.5.0,ipython-genutils==0.2.0,ipywidgets==7.7.0,jedi==0.18.1,Jinja2==3.1.2,jmespath==1.0.1,joblib==1.2.0,json5==0.9.10,jsonschema==4.16.0,jupy
ter-cache==0.4.3,jupyter-core==4.11.1,jupyter-server==1.18.1,jupyter-server-mathjax==0.2.5,jupyter-sphinx==0.3.2,jupyter_client==7.3.5,jupyterlab==3.4.7,jupyterlab-pygments==0.2.2,jupyterlab-widgets==1.1.0,jupyterlab_server==2.15.1,keras==2.9.0,Keras-Preprocessing==1.1.2,kiwisolver==1.4.4,lazy-object-proxy==1.7.1,libclang==14.0.6,libcst==0.4.7,lightfm==1.16,lightgbm==3.3.2,linkify-it-py==1.0.3,llvmlite==0.39.1,locket==1.0.0,lxml==4.9.1,Markdown==3.4.1,markdown-it-py==1.1.0,MarkupSafe==2.1.1,matplotlib==3.6.0,matplotlib-inline==0.1.6,mdit-py-plugins==0.2.8,merlin-core==0.7.0+39.g4182601,merlin-models==0.7.0+11.g280956aa4,merlin-systems==0.5.0+4.g15074ad,mistune==2.0.4,mmh3==3.0.0,mpi4py==3.1.3,msgpack==1.0.4,multidict==6.0.2,mypy-extensions==0.4.3,myst-nb==0.13.2,myst-parser==0.15.2,natsort==8.1.0,nbclassic==0.4.3,nbclient==0.6.8,nbconvert==7.0.0,nbdime==3.1.1,nbformat==5.5.0,nest-asyncio==1.5.5,ninja==1.10.2.3,notebook==6.4.12,notebook-shim==0.1.0,numba==0.56.2,numpy==1.22.4,nvidia-pyindex==1.0.9,# Editable install with no version control (nvtabular==1.4.0+8.g95e12d347),-e 
/usr/local/lib/python3.8/dist-packages,nvtx==0.2.5,oauthlib==3.2.1,oldest-supported-numpy==2022.8.16,onnx==1.12.0,onnxruntime==1.11.1,opt-einsum==3.3.0,packaging==21.3,pandas==1.3.5,pandavro==1.5.2,pandocfilters==1.5.0,parso==0.8.3,partd==1.3.0,pathtools==0.1.2,pexpect==4.8.0,pickleshare==0.7.5,Pillow==9.2.0,pkgutil_resolve_name==1.3.10,platformdirs==2.5.2,pluggy==1.0.0,prometheus-client==0.14.1,promise==2.3,prompt-toolkit==3.0.31,proto-plus==1.19.6,protobuf==3.19.5,psutil==5.9.2,ptyprocess==0.7.0,pure-eval==0.2.2,py==1.11.0,pyarrow==7.0.0,pyasn1==0.4.8,pyasn1-modules==0.2.8,pybind11==2.10.0,pycparser==2.21,pydantic==1.10.2,pydot==1.4.2,Pygments==2.13.0,PyGObject==3.36.0,pynvml==11.4.1,pyparsing==3.0.9,pyrsistent==0.18.1,pytest==7.1.3,pytest-cov==3.0.0,pytest-forked==1.4.0,pytest-xdist==2.5.0,python-apt==2.0.0+ubuntu0.20.4.8,python-dateutil==2.8.2,python-dotenv==0.21.0,python-rapidjson==1.8,pytz==2022.2.1,PyYAML==5.4.1,pyzmq==24.0.0,regex==2022.9.13,requests==2.22.0,requests-oauthlib==1.3.1,requests-unixsocket==0.2.0,rsa==4.7.2,s3fs==2022.2.0,s3transfer==0.6.0,sacremoses==0.0.53,scikit-build==0.15.0,scikit-learn==1.1.2,scipy==1.9.1,seedir==0.3.0,Send2Trash==1.8.0,sentry-sdk==1.9.8,setproctitle==1.3.2,setuptools-scm==7.0.5,shortuuid==1.0.9,six==1.15.0,sklearn==0.0,smmap==5.0.0,sniffio==1.3.0,snowballstemmer==2.2.0,sortedcontainers==2.4.0,soupsieve==2.3.2.post1,Sphinx==5.2.1,sphinx-multiversion==0.2.4,sphinx-togglebutton==0.3.1,sphinx_external_toc==0.3.0,sphinxcontrib-applehelp==1.0.2,sphinxcontrib-copydirs @ 
git+https://github.com/mikemckiernan/sphinxcontrib-copydirs.git@bd8c5d79b3f91cf5f1bb0d6995aeca3fe84b670e,sphinxcontrib-devhelp==1.0.2,sphinxcontrib-htmlhelp==2.0.0,sphinxcontrib-jsmath==1.0.1,sphinxcontrib-qthelp==1.0.3,sphinxcontrib-serializinghtml==1.1.5,SQLAlchemy==1.4.36,stack-data==0.5.0,starlette==0.20.4,stringcase==1.2.0,supervisor==4.1.0,tabulate==0.8.10,tblib==1.7.0,tdqm==0.0.1,tenacity==8.0.1,tensorboard==2.9.1,tensorboard-data-server==0.6.1,tensorboard-plugin-wit==1.8.1,tensorflow==2.6.2,tensorflow-estimator==2.9.0,tensorflow-gpu==2.9.2,tensorflow-io-gcs-filesystem==0.27.0,tensorflow-metadata==1.10.0,termcolor==2.0.1,terminado==0.15.0,testbook==0.4.2,threadpoolctl==3.1.0,tinycss2==1.1.1,tokenizers==0.10.3,toml==0.10.2,tomli==2.0.1,toolz==0.12.0,torch==1.12.1+cu113,torchmetrics==0.3.2,tornado==6.2,tox==3.26.0,tqdm==4.64.1,traitlets==5.4.0,transformers==4.12.0,transformers4rec==0.1.12+2.gbcc939255,treelite==2.3.0,treelite-runtime==2.3.0,tritonclient==2.25.0,typing-inspect==0.8.0,typing_extensions==4.3.0,uc-micro-py==1.0.1,urllib3==1.26.12,uvicorn==0.18.3,uvloop==0.17.0,versioneer==0.20,virtualenv==20.16.5,wandb==0.13.3,watchfiles==0.17.0,wcwidth==0.2.5,webencodings==0.5.1,websocket-client==1.4.1,websockets==10.3,Werkzeug==2.2.2,widgetsnbextension==3.6.0,wrapt==1.12.1,xgboost==1.6.2,yarl==1.8.1,zict==2.2.0,zipp==3.8.1,zope.event==4.5.0,zope.interface==5.4.0 test-gpu run-test-pre: PYTHONHASHSEED='1558946721' test-gpu run-test: commands[0] | python -m pytest --cov-report term --cov merlin -rxs tests/unit ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0 cachedir: .tox/test-gpu/.pytest_cache rootdir: /var/jenkins_home/workspace/merlin_core/core, configfile: pyproject.toml plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 370 items / 1 skipped |
Breaking this down into smaller PRs, starting with: |
Superseded by the PRs listed above |
…Merlin Core (#204) * Rework operators to use `DictArray` and `LocalExecutor` from Merlin Core Depends on NVIDIA-Merlin/core#139 and NVIDIA-Merlin/core#146 We've had the concept of an `InferenceDataFrame` for a while, but it's really just a wrapper around a dictionary of arrays. That data structure is useful in a bunch of places, so this swaps out `InferenceDataFrame` for a similar class from Merlin Core called `DictArray`. * Use `np.array([1])` to make `pandas` happy * Mark FAISS tests with Triton `importorskip`s * Install current version-under-test in Tox environments When APIs/internals change, it's important that Triton is running the same version of the code that we're testing, since our tests run Triton. * Skip FAISS executor test if Triton executable is not found * Skip the Triton executor model test when Triton isn't available * Use the `_parse_model_repository` fn in executor model * Make some fixes to the Implicit op and tests * Update the FIL op's export method for `executor` mode * Fix the artifact path in the executor model * Sort out dtypes in Implicit op and tests Co-authored-by: Julio Perez <37191411+jperez999@users.noreply.github.com>
This should make it easier to use Merlin schemas outside the context of operators and DAGs without dataframes.
Included changes:
* `ComputeSchemaMixin` that contains only the schema computation parts of `BaseOperator`
* `DictionaryLike`, `SeriesLike`, `DataframeLike`, and `Transformable` Python protocols
* `merlin.dag` and `merlin.dispatch` packages (replacing `DataFrameType` and `SeriesType`)
* `DictArray` and `Column` classes that conform to the `DataFrameLike` and `SeriesLike` protocols (respectively)