[HOPSWORKS-2206] HSFS profile to install with and without Hive dependencies #200

Merged 2 commits on Dec 18, 2020
6 changes: 5 additions & 1 deletion docs/integrations/python.md
@@ -30,9 +30,13 @@ Create a file called `featurestore.key` in your designated Python environment an
 To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed in the environment from which you want to connect to the Feature Store. You can install the library through pip. We recommend using a Python environment manager such as *virtualenv* or *conda*.
 
 ```
-pip install hsfs~=[HOPSWORKS_VERSION]
+pip install hsfs[hive]~=[HOPSWORKS_VERSION]
 ```
 
+!!! attention "Hive Dependencies"
+
+    By default, `HSFS` assumes Spark/EMR is used as the execution engine and therefore Hive dependencies are not installed. Hence, in a local Python environment, if you are planning to use a regular Python kernel **without Spark/EMR**, make sure to install the **"hive"** extra dependencies (`hsfs[hive]`).
+
 !!! attention "Matching Hopsworks version"
     The **major version of `HSFS`** needs to match the **major version of Hopsworks**.
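As a quick sanity check for the guidance above, a user can probe whether the optional Hive extras are importable before picking an engine. A minimal sketch, not part of HSFS itself; it assumes `pyhopshive` (one of the packages the `hive` extra pulls in, per `setup.py` below) as the probe target:

```python
import importlib.util


def hive_extras_available() -> bool:
    """Return True if the optional Hive dependencies appear to be installed.

    `pyhopshive` is one of the packages installed by `pip install hsfs[hive]`;
    checking for its spec avoids importing any heavy modules just to test
    availability.
    """
    return importlib.util.find_spec("pyhopshive") is not None
```

In an environment installed with plain `pip install hsfs`, this returns `False`; after `pip install hsfs[hive]`, it returns `True`.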
6 changes: 5 additions & 1 deletion docs/integrations/sagemaker.md
@@ -141,9 +141,13 @@ You have two options to make your API key accessible from SageMaker:
 To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed. One way of achieving this is by opening a Python notebook in SageMaker and installing `HSFS` with a magic command and pip:
 
 ```
-!pip install hsfs~=[HOPSWORKS_VERSION]
+!pip install hsfs[hive]~=[HOPSWORKS_VERSION]
 ```
 
+!!! attention "Hive Dependencies"
+
+    By default, `HSFS` assumes Spark/EMR is used as the execution engine and therefore Hive dependencies are not installed. Hence, on AWS SageMaker, if you are planning to use a regular Python kernel **without Spark/EMR**, make sure to install the **"hive"** extra dependencies (`hsfs[hive]`).
+
 !!! attention "Matching Hopsworks version"
     The **major version of `HSFS`** needs to match the **major version of Hopsworks**.
11 changes: 10 additions & 1 deletion python/hsfs/engine/__init__.py
@@ -14,7 +14,8 @@
 # limitations under the License.
 #
 
-from hsfs.engine import spark, hive
+from hsfs.engine import spark
+from hsfs.client import exceptions
 
 _engine = None
 
@@ -25,6 +26,14 @@ def init(engine_type, host=None, cert_folder=None, project=None, cert_key=None):
     if engine_type == "spark":
         _engine = spark.Engine()
     elif engine_type == "hive":
+        try:
+            from hsfs.engine import hive
+        except ImportError:
+            raise exceptions.FeatureStoreException(
+                "Trying to instantiate Hive as engine, but 'hive' extras are "
+                "missing in HSFS installation. Install with `pip install "
+                "hsfs[hive]`."
+            )
         _engine = hive.Engine(host, cert_folder, project, cert_key)
 
 
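The change above is an instance of the lazy optional-import pattern: the optional engine module is imported only when that engine is actually requested, and a missing extra fails with an actionable message instead of an `ImportError` at package import time. A standalone sketch of the same pattern, with hypothetical names (`select_engine`, the returned strings) rather than the real HSFS internals:

```python
def select_engine(engine_type: str) -> str:
    """Return an engine name, importing optional dependencies lazily."""
    if engine_type == "spark":
        return "spark-engine"
    elif engine_type == "hive":
        try:
            # Only attempt the optional import when Hive is actually requested;
            # pyhopshive is present only when the hive extras were installed.
            import pyhopshive  # noqa: F401
        except ImportError as err:
            raise RuntimeError(
                "'hive' extras are missing. Install with `pip install hsfs[hive]`."
            ) from err
        return "hive-engine"
    raise ValueError(f"Unknown engine type: {engine_type!r}")
```

The key design point is that `import pyhopshive` sits inside the `elif` branch, so Spark-only installations never pay for (or fail on) the Hive dependencies.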
6 changes: 2 additions & 4 deletions python/setup.py
@@ -22,10 +22,7 @@ def read(fname):
         "boto3",
         "pandas",
         "numpy",
-        "pyhopshive[thrift]",
-        "PyMySQL",
         "pyjks",
-        "sqlalchemy",
         "mock",
     ],
     extras_require={
@@ -37,7 +34,8 @@ def read(fname):
             "mkdocs",
             "mkdocs-material",
             "keras-autodoc",
-            "markdown-include"]
+            "markdown-include"],
+        "hive": ["pyhopshive[thrift]", "sqlalchemy", "PyMySQL"],
     },
     author="Logical Clocks AB",
     author_email="moritz@logicalclocks.com",
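The `extras_require` mapping above is what makes `pip install hsfs[hive]` pull in the Hive-only dependencies while a plain `pip install hsfs` skips them. As a sketch of how such declared extras can be inspected at runtime (assumes Python 3.8+ for `importlib.metadata`; `"hsfs"` is just an example distribution name, any installed distribution works):

```python
from importlib import metadata


def declared_extras(dist_name: str) -> list:
    """Return the extras a distribution declares, or [] if it is not installed."""
    try:
        dist = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # Each extra (e.g. "hive", "dev", "docs") shows up as a Provides-Extra
    # field in the distribution metadata.
    return dist.get_all("Provides-Extra") or []
```

With the `setup.py` change in this PR installed, `declared_extras("hsfs")` would include `"hive"` alongside the existing extras.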