Add REST API support for feature registry #99

Merged on Apr 16, 2022 (20 commits)
108 changes: 108 additions & 0 deletions docs/how-to-guides/deploy-feathr-api-as-webapp.md
@@ -0,0 +1,108 @@
# Feathr API
The API currently supports the following functionality:

1. Get Feature by Qualified Name
2. Get Feature by GUID
3. Get List of Features
4. Get Lineage for a Feature
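
To make the shape of these four operations concrete, here is a toy in-memory model of them. It is only a sketch: `FakeRegistry`, its method names, and the sample entities are hypothetical and are not part of Feathr or of its REST API. The one idea it borrows from this change is that a lookup by qualified name first resolves the name to a GUID and then fetches by GUID.

```python
from typing import Dict, List, Optional


class FakeRegistry:
    """Hypothetical in-memory stand-in for a feature registry (illustration only)."""

    def __init__(self) -> None:
        self._entities: Dict[str, dict] = {}       # GUID -> entity
        self._upstream: Dict[str, List[str]] = {}  # GUID -> GUIDs it is derived from

    def add(self, guid: str, qualified_name: str, upstream: Optional[List[str]] = None) -> None:
        self._entities[guid] = {"guid": guid, "qualifiedName": qualified_name}
        self._upstream[guid] = list(upstream or [])

    def get_feature_by_guid(self, guid: str) -> dict:
        return self._entities[guid]

    def get_feature_by_qualified_name(self, qualified_name: str) -> dict:
        # Resolve the qualified name to a GUID first, then look up by GUID.
        for guid, entity in self._entities.items():
            if entity["qualifiedName"] == qualified_name:
                return self.get_feature_by_guid(guid)
        raise KeyError(qualified_name)

    def list_features(self) -> List[str]:
        return sorted(entity["qualifiedName"] for entity in self._entities.values())

    def get_lineage(self, guid: str) -> List[str]:
        # Simplified lineage: only the direct upstream GUIDs.
        return list(self._upstream[guid])


# Made-up sample data: one source and one feature derived from it.
registry = FakeRegistry()
registry.add("guid-1", "demo__trip_source")
registry.add("guid-2", "demo__f_trip_distance", upstream=["guid-1"])

print(registry.get_feature_by_qualified_name("demo__f_trip_distance"))
print(registry.get_feature_by_guid("guid-1"))
print(registry.list_features())
print(registry.get_lineage("guid-2"))
```

The real registry stores entities in Azure Purview and returns much richer lineage than a flat list of upstream GUIDs.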


## Build and run locally
### Install
__NOTE:__ You can run the following commands in your local Python environment or on an Azure virtual machine.
Install the dependencies from the requirements file:
```bash
$ pip install -r requirements.txt
```

### Run
This command starts the uvicorn server locally on port 8080 and, because of `--reload`, restarts it automatically when you change the code.
```bash
uvicorn api:app --port 8080 --reload
```
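Once the server is running, a quick smoke check can confirm that it answers. The sketch below uses only the Python standard library and assumes that the local server listens on uvicorn's default host `127.0.0.1` and serves the same `/docs` page that the deployment section describes. The helper names `docs_url` and `is_serving` are made up for this example.

```python
import urllib.error
import urllib.request


def docs_url(host: str = "127.0.0.1", port: int = 8080) -> str:
    """Build the URL of the docs page for a locally running server."""
    return f"http://{host}:{port}/docs"


def is_serving(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    url = docs_url()
    print(url, "is up" if is_serving(url) else "is not reachable")
```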

## Build and deploy on Azure
The steps below build the API as a Docker container, push it to Azure Container Registry (ACR), and deploy it as a web app. The instructions are written for Mac/Linux but should also work on Windows. If you lack the required privileges, you may need to prefix the commands with `sudo`, or run Docker as administrator on Windows.

1. Install Azure CLI by following instructions [here](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest)

1. Create an Azure Container Registry. First create the resource group.
```bash
az group create --name <your_rg_name> --location <location example:westus>
```

Then create the container registry
```bash
az acr create --resource-group <your_rg_name> --name <registry-name> --sku Basic
```

1. Log in to your Azure Container Registry (ACR) account.
```bash
$ az acr login --name <registry-name>
```

1. Clone the repository and navigate to the `api` folder
```bash
$ git clone git@github.com:linkedin/feathr.git

$ cd feathr/feathr_project/feathr/api

```

1. Build the Docker image locally. Docker must be installed and running on your machine; to set it up, follow the instructions [here](https://docs.docker.com/get-started/).
__Note: `<your_username>/image_name` is not a mandatory format for the image name. It is just a useful convention that avoids re-tagging the image when you need to push it to a registry. The name can be anything you want, in the format shown below.__

```bash
$ docker build -t feathr/api .
```

1. Run the `docker images` command to see your newly created image
```bash
$ docker images

REPOSITORY TAG IMAGE ID CREATED SIZE
feathr/api latest a647ea749b9b 5 minutes ago 529MB
```

1. Before you can push an image to your registry, you must tag it with the fully qualified name of your ACR login server. The login server name has the format `<registry-name>.azurecr.io` (all lowercase), for example `mycontainerregistry007.azurecr.io`. Tag the image (the commands below use a registry named `feathracr`; substitute your own registry name)
```bash
$ docker tag feathr/api:latest feathracr.azurecr.io/feathr/api:latest
```
1. Push the image to the registry
```bash
$ docker push feathracr.azurecr.io/feathr/api:latest
```
1. List the images from your registry to see your recently pushed image
```
az acr repository list --name feathracr --output table
```
Output:
```
Result
----------
feathr/api
```
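The log in, build, tag, push, and list steps above can also be scripted. The following is a minimal sketch, not part of Feathr: the helper names `acr_login_server` and `build_and_push_commands` are made up for illustration, and the script only assembles and prints the same `az` and `docker` commands shown above (a dry run) rather than executing them.

```python
import shlex
from typing import List


def acr_login_server(registry_name: str) -> str:
    """ACR login server name: <registry-name>.azurecr.io, all lowercase."""
    return f"{registry_name.lower()}.azurecr.io"


def build_and_push_commands(
    registry_name: str, image: str = "feathr/api", tag: str = "latest"
) -> List[List[str]]:
    """Return the az/docker commands from the steps above as argument lists."""
    local_ref = f"{image}:{tag}"
    remote_ref = f"{acr_login_server(registry_name)}/{local_ref}"
    return [
        ["az", "acr", "login", "--name", registry_name],
        ["docker", "build", "-t", local_ref, "."],
        ["docker", "tag", local_ref, remote_ref],
        ["docker", "push", remote_ref],
        ["az", "acr", "repository", "list", "--name", registry_name, "--output", "table"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_and_push_commands("feathracr"):
        print(shlex.join(command))
```

To actually run the steps, each argument list could be passed to `subprocess.run(command, check=True)` from inside the `api` folder, so that `docker build` finds the Dockerfile.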

## Deploy image to Azure WebApp for Containers

1. Go to [Azure portal](https://portal.azure.com) and search for your container registry
1. Select __Repositories__ from the left pane and click the `latest` tag. Click the three dots to the right of the tag and select the __Deploy to WebApp__ option. If __Deploy to WebApp__ is greyed out, enable the Admin user on the registry by updating it.

![Container Image 1](../images/feathr_api_image_latest.png)

![Container Image 2](../images/feathr_api_image_latest_options.png)


1. Provide a name for the deployed web app, along with the subscription to deploy it into, the resource group, and the App Service plan.

![Container Image](../images/feathr_api_image_latest_deployment.png)

1. You will get a notification once your app has been successfully deployed. Click the __Go to Resource__ button.


1. On the app overview page, take the URL of the deployed app (listed under __URL__) and open it with `/docs` appended, i.e. `https://<app_name>.azurewebsites.net/docs`. You should see the API documentation.

![API docs](../images/api-docs.png)

Congratulations, you have successfully deployed the Feathr API.

Binary file added docs/images/api-docs.png
Binary file added docs/images/feathr_api_image_latest.png
Binary file added docs/images/feathr_api_image_latest_options.png
67 changes: 61 additions & 6 deletions feathr_project/feathr/_feature_registry.py
@@ -23,7 +23,6 @@
from pyapacheatlas.core.util import GuidTracker
from pyhocon import ConfigFactory

from feathr._envvariableutil import _EnvVaraibleUtil
from feathr._file_utils import write_to_file
from feathr.anchor import FeatureAnchor
from feathr.constants import *
@@ -52,9 +51,9 @@ def __init__(self, project_name: str, azure_purview_name: str, registry_delimite
self.credential = DefaultAzureCredential(exclude_interactive_browser_credential=False) if credential is None else credential
self.oauth = AzCredentialWrapper(credential=self.credential)
self.purview_client = PurviewClient(
account_name=self.azure_purview_name,
authentication=self.oauth
)
account_name=self.azure_purview_name,
authentication=self.oauth
)
self.guid = GuidTracker(starting=-1000)
self.entity_batch_queue = []

@@ -657,6 +656,7 @@ def register_features(self, workspace_path: Optional[Path] = None, from_context:
logger.info(
"Finished registering features. See {} to access the Purview web interface", webinterface_path)


def _purge_feathr_registry(self):
"""
Delete all the feathr related entities and type definitions in feathr registry. For internal use only
@@ -704,7 +704,8 @@ def _delete_all_feathr_entitties(self):
self.purview_client.delete_entity(
guid=guid_list[i:i+batch_delte_size])
logger.info("{} feathr entities deleted", batch_delte_size)


@classmethod
def _get_registry_client(self):
"""
Return a client object and users can operate more on it (like doing search)
@@ -733,12 +734,66 @@ def list_registered_features(self, project_name: str = None, limit=50, starting_
feature_list.append(entity["name"])

return feature_list

    def get_feature_by_fqdn_type(self, qualifiedName, typeName):
        """
        Get a single feature by its QualifiedName and Type
        Returns the feature else throws an AtlasException with 400 error code
        """
        response = self.purview_client.get_entity(qualifiedName=qualifiedName, typeName=typeName)
        entities = response.get('entities')
        # Return only the entity whose type and qualified name match exactly
        for entity in entities:
            if entity.get('typeName') == typeName and entity.get('attributes').get('qualifiedName') == qualifiedName:
                return entity

    def get_feature_by_fqdn(self, qualifiedName):
        """
        Get feature by qualifiedName
        Returns the feature else throws an AtlasException with 400 error code
        """
        # Resolve the qualified name to a GUID, then fetch the entity by GUID
        guid = self.get_feature_guid(qualifiedName)
        return self.get_feature_by_guid(guid)

    def get_feature_by_guid(self, guid):
        """
        Get a single feature by its GUID
        Returns the feature else throws an AtlasException with 400 error code
        """
        response = self.purview_client.get_single_entity(guid=guid)
        return response

    def get_feature_lineage(self, guid):
        """
        Get feature's lineage by its GUID
        Returns the lineage else throws an AtlasException with 400 error code
        """
        return self.purview_client.get_entity_lineage(guid=guid)

    def get_feature_guid(self, qualifiedName):
        """
        Get guid of a feature given its qualifiedName
        """
        search_term = "qualifiedName:{0}".format(qualifiedName)
        entities = self.purview_client.discovery.search_entities(search_term)
        # Keep only the exact match among the search results
        for entity in entities:
            if entity.get('qualifiedName') == qualifiedName:
                return entity.get('id')

    def search_features(self, searchTerm):
        """
        Search the registry for the given query term
        For a ride-hailing company, a few example terms could be "taxi", "passenger", "fare", etc.
        It's a keyword search on the registry metadata
        """
        search_term = "qualifiedName:{0}".format(searchTerm)
        entities = self.purview_client.discovery.search_entities(search_term)
        return entities

def _list_registered_entities_with_details(self, project_name: str = None, entity_type: Union[str, List[str]] = None, limit=50, starting_offset=0,) -> List[Dict]:
"""
List all the already registered entities. entity_type should be one of: SOURCE, DERIVED_FEATURE, ANCHOR, ANCHOR_FEATURE, FEATHR_PROJECT, or a list of those values
limit: a maximum 1000 will be enforced at the underlying API

returns a list of the result entities.
"""
entity_type_list = [entity_type] if isinstance(