Add telemetry to Python SDK (#1289)
* add basic anonymous telemetry

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* add telemetry docs

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* add testing

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* fix test

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* lint

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* fix prow job

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* bugfix

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* fix env var setting

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* lint

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* address comments and fix unit test

Signed-off-by: Jacob Klegar <jacob@tecton.ai>

* lint

Signed-off-by: Jacob Klegar <jacob@tecton.ai>
jklegar committed Jan 29, 2021
1 parent 4059a21 commit a8c4bbf
Showing 15 changed files with 232 additions and 5 deletions.
20 changes: 20 additions & 0 deletions .prow/config.yaml
@@ -144,6 +144,26 @@ presubmits:
      - image: python:3.7
        command: ["infra/scripts/test-python-sdk.sh"]

  - name: test-telemetry
    decorate: true
    run_if_changed: "sdk/python/.*"
    spec:
      containers:
      - image: python:3.7
        command: ["infra/scripts/test-telemetry.sh"]
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /etc/gcloud/service-account.json
        volumeMounts:
        - mountPath: /etc/gcloud/service-account.json
          name: service-account
          readOnly: true
          subPath: service-account.json
      volumes:
      - name: service-account
        secret:
          secretName: feast-service-account

  - name: test-golang-sdk
    decorate: true
    spec:
2 changes: 1 addition & 1 deletion Makefile
@@ -74,7 +74,7 @@ install-python: compile-protos-python
	python -m pip install -e sdk/python

test-python:
-	pytest --verbose --color=yes sdk/python/tests
+	FEAST_TELEMETRY=False pytest --verbose --color=yes sdk/python/tests

format-python:
	# Sort
1 change: 1 addition & 0 deletions docs/SUMMARY.md
@@ -43,6 +43,7 @@
* [Security](advanced/security.md)
* [Audit Logging](advanced/audit-logging.md)
* [Metrics](advanced/metrics.md)
* [Telemetry](advanced/telemetry.md)
* [Troubleshooting](advanced/troubleshooting.md)

## Reference
10 changes: 10 additions & 0 deletions docs/advanced/telemetry.md
@@ -0,0 +1,10 @@
# Telemetry

### How telemetry is used

The Feast maintainers use anonymous usage statistics to help shape the Feast roadmap. Several client methods are tracked, beginning in Feast 0.9. Each user is assigned a random UUID, which is sent along with the name of the method called, the Feast version, the OS (as reported by `sys.platform`), and the current time. For more details, see [the source code](https://github.com/feast-dev/feast/blob/master/sdk/python/feast/telemetry.py).

### How to disable telemetry

To opt out of telemetry, set the environment variable `FEAST_TELEMETRY` to `False` in the environment where the Feast client runs.
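
As a rough illustration of the opt-out check, the sketch below mirrors the string comparison used by `Client._configure_telemetry` in this commit. The helper name `telemetry_enabled` is hypothetical, and the sketch assumes the config layer passes the `FEAST_TELEMETRY` environment variable through unchanged.

```python
import os


def telemetry_enabled(environ=os.environ) -> bool:
    """Hypothetical helper mirroring the comparison in `_configure_telemetry`.

    Assumption: the Feast config layer surfaces FEAST_TELEMETRY as a plain
    string, with "True" as the default when the variable is unset.
    """
    return environ.get("FEAST_TELEMETRY", "True") == "True"


# Opt out before constructing a Feast client in the same process.
os.environ["FEAST_TELEMETRY"] = "False"
print(telemetry_enabled())  # False
```

Because the check is an exact match against the string `True`, any other value (including lowercase `true`) would also turn telemetry off under this reading of the code.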

3 changes: 2 additions & 1 deletion infra/scripts/test-end-to-end-aws.sh
@@ -6,6 +6,7 @@ pip install "s3fs" "boto3" "urllib3>=1.25.4"

export DISABLE_FEAST_SERVICE_FIXTURES=1
export DISABLE_SERVICE_FIXTURES=1
export FEAST_TELEMETRY="False"

PYTHONPATH=sdk/python pytest tests/e2e/ \
--feast-version develop \
@@ -17,4 +18,4 @@ PYTHONPATH=sdk/python pytest tests/e2e/ \
--redis-url $NODE_IP:32379 \
--emr-region us-west-2 \
--kafka-brokers $NODE_IP:30092 \
-m "not bq"
1 change: 1 addition & 0 deletions infra/scripts/test-end-to-end-gcp.sh
@@ -12,6 +12,7 @@ make build-java-no-tests REVISION=develop
python -m pip install --upgrade pip setuptools wheel
make install-python
python -m pip install -qr tests/requirements.txt
export FEAST_TELEMETRY="False"

su -p postgres -c "PATH=$PATH HOME=/tmp pytest -v tests/e2e/ \
--feast-version develop --env=gcloud --dataproc-cluster-name feast-e2e \
3 changes: 2 additions & 1 deletion infra/scripts/test-end-to-end-sparkop.sh
@@ -6,6 +6,7 @@ pip install "s3fs" "boto3" "urllib3>=1.25.4"

export DISABLE_FEAST_SERVICE_FIXTURES=1
export DISABLE_SERVICE_FIXTURES=1
export FEAST_TELEMETRY="False"

export FEAST_SPARK_K8S_NAMESPACE=sparkop

@@ -17,4 +18,4 @@ PYTHONPATH=sdk/python pytest tests/e2e/ \
--staging-path $STAGING_PATH \
--redis-url sparkop-redis-master.sparkop.svc.cluster.local:6379 \
--kafka-brokers sparkop-kafka.sparkop.svc.cluster.local:9092 \
-m "not bq"
3 changes: 2 additions & 1 deletion infra/scripts/test-end-to-end.sh
@@ -10,5 +10,6 @@ make build-java-no-tests REVISION=develop
python -m pip install --upgrade pip setuptools wheel
make install-python
python -m pip install -qr tests/requirements.txt
export FEAST_TELEMETRY="False"

su -p postgres -c "PATH=$PATH HOME=/tmp pytest -v tests/e2e/ --feast-version develop"
3 changes: 2 additions & 1 deletion infra/scripts/test-integration.sh
@@ -4,4 +4,5 @@ python -m pip install --upgrade pip setuptools wheel
make install-python
python -m pip install -qr tests/requirements.txt

export FEAST_TELEMETRY="False"
pytest tests/integration --dataproc-cluster-name feast-e2e --dataproc-project kf-feast --dataproc-region us-central1 --dataproc-staging-location gs://feast-templocation-kf-feast
2 changes: 2 additions & 0 deletions infra/scripts/test-python-sdk.sh
@@ -11,4 +11,6 @@ make lint-python

cd sdk/python/
pip install -e .
cd tests/
export FEAST_TELEMETRY="False"
pytest --junitxml=${LOGS_ARTIFACT_PATH}/python-sdk-test-report.xml
15 changes: 15 additions & 0 deletions infra/scripts/test-telemetry.sh
@@ -0,0 +1,15 @@
#!/usr/bin/env bash

set -e

# Default artifact location setting in Prow jobs
LOGS_ARTIFACT_PATH=/logs/artifacts

pip install -r sdk/python/requirements-ci.txt
make compile-protos-python
make lint-python

cd sdk/python/
pip install -e .
cd telemetry_tests/
pytest --junitxml=${LOGS_ARTIFACT_PATH}/python-sdk-test-report.xml
56 changes: 56 additions & 0 deletions sdk/python/feast/client.py
@@ -19,6 +19,7 @@
import warnings
from datetime import datetime
from itertools import groupby
from os.path import expanduser, join
from typing import Any, Dict, List, Optional, Union

import grpc
@@ -101,6 +102,7 @@
    stage_entities_to_fs,
    table_reference_from_string,
)
from feast.telemetry import log_usage

_logger = logging.getLogger(__name__)

@@ -148,6 +150,8 @@ def __init__(self, options: Optional[Dict[str, str]] = None, **kwargs):
        if self._config.getboolean(opt.ENABLE_AUTH):
            self._auth_metadata = feast_auth.get_auth_metadata_plugin(self._config)

        self._configure_telemetry()

    @property
    def _core_service(self):
        """
@@ -382,6 +386,27 @@ def version(self):

        return result

    def _configure_telemetry(self):
        telemetry_filepath = join(expanduser("~"), ".feast", "telemetry")
        self._telemetry_enabled = (
            self._config.get(opt.TELEMETRY, "True") == "True"
        )  # written this way to turn the env var string into a boolean
        if self._telemetry_enabled:
            self._telemetry_counter = {"get_online_features": 0}
            if os.path.exists(telemetry_filepath):
                with open(telemetry_filepath, "r") as f:
                    self._telemetry_id = f.read()
            else:
                self._telemetry_id = str(uuid.uuid4())
                print(
                    "Feast is an open source project that collects anonymized usage statistics. To opt out or learn more see https://docs.feast.dev/v/master/advanced/telemetry"
                )
                with open(telemetry_filepath, "w") as f:
                    f.write(self._telemetry_id)
        else:
            if os.path.exists(telemetry_filepath):
                os.remove(telemetry_filepath)

    @property
    def project(self) -> str:
        """
@@ -487,6 +512,8 @@ def apply(
            >>> feast_client.apply(entity)
        """

        if self._telemetry_enabled:
            log_usage("apply", self._telemetry_id, datetime.utcnow(), self.version())
        if project is None:
            project = self.project

@@ -594,6 +621,10 @@ def get_entity(self, name: str, project: str = None) -> Entity:
            none is found
        """

        if self._telemetry_enabled:
            log_usage(
                "get_entity", self._telemetry_id, datetime.utcnow(), self.version()
            )
        if project is None:
            project = self.project

@@ -707,6 +738,13 @@ def get_feature_table(self, name: str, project: str = None) -> FeatureTable:
            none is found
        """

        if self._telemetry_enabled:
            log_usage(
                "get_feature_table",
                self._telemetry_id,
                datetime.utcnow(),
                self.version(),
            )
        if project is None:
            project = self.project

@@ -835,6 +873,8 @@ def ingest(
            >>> client.ingest(driver_ft, ft_df)
        """

        if self._telemetry_enabled:
            log_usage("ingest", self._telemetry_id, datetime.utcnow(), self.version())
        if project is None:
            project = self.project
        if isinstance(feature_table, str):
@@ -959,6 +999,15 @@ def get_online_features(
            {'sales:daily_transactions': [1.1,1.2], 'sales:customer_id': [0,1]}
        """

        if self._telemetry_enabled:
            if self._telemetry_counter["get_online_features"] % 1000 == 0:
                log_usage(
                    "get_online_features",
                    self._telemetry_id,
                    datetime.utcnow(),
                    self.version(),
                )
            self._telemetry_counter["get_online_features"] += 1
        try:
            response = self._serving_service.GetOnlineFeaturesV2(
                GetOnlineFeaturesRequestV2(
@@ -1019,6 +1068,13 @@ def get_historical_features(
            >>> output_file_uri = feature_retrieval_job.get_output_file_uri()
            "gs://some-bucket/output/
        """
        if self._telemetry_enabled:
            log_usage(
                "get_historical_features",
                self._telemetry_id,
                datetime.utcnow(),
                self.version(),
            )
        feature_tables = self._get_feature_tables_from_feature_refs(
            feature_refs, self.project
        )
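
Review sketch: the two pieces of client-side state added in `client.py`, the persisted telemetry ID and the per-method call counter, can be exercised in isolation roughly as below. The names `load_or_create_telemetry_id` and `SampledCounter` are hypothetical and not part of the Feast SDK. The `os.makedirs` call and the `.strip()` on read are assumptions added for the sketch; the diff above does neither.

```python
import os
import tempfile
import uuid


def load_or_create_telemetry_id(path: str) -> str:
    """Return the ID stored at `path`, creating the file on first use."""
    if os.path.exists(path):
        with open(path, "r") as f:
            # Assumption: tolerate a trailing newline from manual edits.
            return f.read().strip()
    parent = os.path.dirname(path)
    if parent:
        # Assumption: the parent directory may not exist on a fresh machine.
        os.makedirs(parent, exist_ok=True)
    telemetry_id = str(uuid.uuid4())
    with open(path, "w") as f:
        f.write(telemetry_id)
    return telemetry_id


class SampledCounter:
    """Report the 1st, 1001st, 2001st, ... call, as get_online_features does."""

    def __init__(self, every: int = 1000):
        self.every = every
        self.calls = 0

    def should_report(self) -> bool:
        # Check before incrementing, so the very first call is reported.
        report = self.calls % self.every == 0
        self.calls += 1
        return report


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as home:
        path = os.path.join(home, ".feast", "telemetry")
        first = load_or_create_telemetry_id(path)
        # A second call reads the same ID back from disk.
        assert load_or_create_telemetry_id(path) == first

    counter = SampledCounter(every=1000)
    reported = sum(counter.should_report() for _ in range(2500))
    print(reported)  # 3 (calls 1, 1001 and 2001)
```

Note that the counter lives on the client instance, so under this scheme each new `Client` reports its first `get_online_features` call again.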
3 changes: 3 additions & 0 deletions sdk/python/feast/constants.py
@@ -259,6 +259,9 @@ class ConfigOptions(metaclass=ConfigMeta):
    #: Oauth token request url
    OAUTH_TOKEN_REQUEST_URL: Optional[str] = None

    #: Telemetry enabled
    TELEMETRY = "True"

    def defaults(self):
        return {
            k: getattr(self, k)
45 changes: 45 additions & 0 deletions sdk/python/feast/telemetry.py
@@ -0,0 +1,45 @@
# Copyright 2019 The Feast Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import sys
from datetime import datetime
from typing import Dict

import requests

TELEMETRY_ENDPOINT = (
    "https://us-central1-kf-feast.cloudfunctions.net/bq_telemetry_logger"
)


def log_usage(
    function_name: str,
    telemetry_id: str,
    timestamp: datetime,
    version: Dict[str, Dict[str, str]],
):
    json = {
        "function_name": function_name,
        "telemetry_id": telemetry_id,
        "timestamp": timestamp.isoformat(),
        "version": version,
        "os": sys.platform,
        "is_test": os.getenv("FEAST_IS_TELEMETRY_TEST", "False"),
    }
    try:
        requests.post(TELEMETRY_ENDPOINT, json=json)
    except Exception:
        pass
    return
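
Review sketch: `log_usage` swallows every exception but passes no timeout, and `requests` does not apply one by default, so a stalled connection could block the calling client method. A possible variant, under stated assumptions, separates payload construction from sending and takes the HTTP function as a parameter so it can be tested without network access. `build_payload`, `send`, and `ENDPOINT` are hypothetical names, and the 2-second timeout is an arbitrary choice.

```python
import sys
from datetime import datetime
from typing import Any, Callable, Dict

# Hypothetical placeholder; the commit uses TELEMETRY_ENDPOINT instead.
ENDPOINT = "https://example.invalid/telemetry"


def build_payload(
    function_name: str,
    telemetry_id: str,
    timestamp: datetime,
    version: Dict[str, Any],
    is_test: str = "False",
) -> Dict[str, Any]:
    """Assemble the same fields that log_usage sends."""
    return {
        "function_name": function_name,
        "telemetry_id": telemetry_id,
        "timestamp": timestamp.isoformat(),
        "version": version,
        "os": sys.platform,
        "is_test": is_test,
    }


def send(payload: Dict[str, Any], post: Callable[..., Any], timeout: float = 2.0) -> bool:
    """Best-effort send: never raise, and bound the wait via `timeout`."""
    try:
        post(ENDPOINT, json=payload, timeout=timeout)
        return True
    except Exception:
        return False


if __name__ == "__main__":
    # Stand-in for requests.post so the sketch runs offline.
    def fake_post(url, json=None, timeout=None):
        print(url, sorted(json), timeout)

    payload = build_payload("apply", "some-id", datetime(2021, 1, 29), {"sdk": {}})
    send(payload, fake_post)
```

In real use the caller would pass `requests.post` as `post`; it accepts the same `json` and `timeout` keyword arguments.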
70 changes: 70 additions & 0 deletions sdk/python/telemetry_tests/test_telemetry.py
@@ -0,0 +1,70 @@
# Copyright 2020 The Feast Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from datetime import datetime
from feast.client import Client
from feast.entity import Entity
from feast.value_type import ValueType
from google.cloud import bigquery
import os
import pytest
from time import sleep

def test_telemetry_on():
    os.environ["FEAST_IS_TELEMETRY_TEST"] = 'True'
    test_client = Client(serving_url=None, core_url=None)
    test_client.set_project("project1")
    entity = Entity(
        name="driver_car_id",
        description="Car driver id",
        value_type=ValueType.STRING,
        labels={"team": "matchmaking"},
    )

    timestamp = datetime.utcnow()
    try:
        test_client.apply(entity)
    except Exception:
        pass

    sleep(30)
    bq_client = bigquery.Client()
    query = f"select * from `kf-feast.feast_telemetry.cloudfunctions_googleapis_com_cloud_functions` where timestamp >= TIMESTAMP(\"{timestamp.date().isoformat()}\") and JSON_EXTRACT(textPayload, '$.is_test')='\"True\"' and JSON_EXTRACT(textPayload, '$.timestamp')>'\"{timestamp.isoformat()}\"'"
    query_job = bq_client.query(query)
    rows = query_job.result()
    assert(rows.total_rows == 1)

def test_telemetry_off():
    os.environ["FEAST_IS_TELEMETRY_TEST"] = 'True'
    test_client = Client(serving_url=None, core_url=None, telemetry=False)
    test_client.set_project("project1")
    entity = Entity(
        name="driver_car_id",
        description="Car driver id",
        value_type=ValueType.STRING,
        labels={"team": "matchmaking"},
    )

    timestamp = datetime.utcnow()
    try:
        test_client.apply(entity)
    except Exception:
        pass

    sleep(30)
    bq_client = bigquery.Client()
    query = f"select * from `kf-feast.feast_telemetry.cloudfunctions_googleapis_com_cloud_functions` where timestamp >= TIMESTAMP(\"{timestamp.date().isoformat()}\") and JSON_EXTRACT(textPayload, '$.is_test')='\"True\"' and JSON_EXTRACT(textPayload, '$.timestamp')>'\"{timestamp.isoformat()}\"'"
    query_job = bq_client.query(query)
    rows = query_job.result()
    assert(rows.total_rows == 0)
