This repository has been archived by the owner on Jan 31, 2022. It is now read-only.

Use StackDriver + BigQuery to log predictions #79

Closed
jlewi opened this issue Dec 26, 2019 · 6 comments

Comments


jlewi commented Dec 26, 2019

We should use BigQuery and Stackdriver to log predictions.

This should work as follows:

  1. Emit JSON log entries containing the predictions.
  2. Store the logs in Stackdriver.
  3. Set up a BigQuery sink for the Stackdriver logs.
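To illustrate step 1, here is a minimal sketch of what one JSON log entry for a prediction might look like. The field names other than `repo_owner` (which appears in the BigQuery query later in this thread) are illustrative assumptions, not the actual schema:

```python
import json

# Hypothetical prediction result; field names other than repo_owner
# are illustrative assumptions, not the repo's actual schema.
entry = {
    "repo_owner": "kubeflow",
    "repo_name": "code-intelligence",
    "issue_number": 79,
    "label": "kind/feature",
    "probability": 0.98,
}

# One JSON object per line is what the Stackdriver agent parses
# into the structured jsonPayload field.
print(json.dumps(entry))
```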
@issue-label-bot

Issue-Label Bot is automatically applying the label kind/feature to this issue, with a confidence of 0.98. Please mark this comment with 👍 or 👎 to give our bot feedback!

Links: app homepage, dashboard and code for this bot.


jlewi commented Dec 29, 2019

I set up a BigQuery sink in the project issue-label-bot-dev to begin experimenting with.


jlewi commented Dec 31, 2019

There's some information here about doing structured logging with the logging module:
https://docs.python.org/3/howto/logging-cookbook.html#implementing-structured-logging

It looks like this relies on the caller of logging.info() to pass a string representing the JSON dictionary.

I think we instead want something like a JSON formatter that automatically formats all entries as JSON:
https://pypi.org/project/JSON-log-formatter/
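A stdlib-only sketch of the idea behind such a formatter (the class and field names here are assumptions for illustration, not JSON-log-formatter's actual API): every record is rendered as one JSON object per line, and extra fields passed via `extra=` become top-level keys in the payload.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        entry = {
            "message": record.getMessage(),
            "severity": record.levelname,
            "logger": record.name,
        }
        # Copy any non-standard fields (passed via extra=) into the payload.
        standard = logging.makeLogRecord({}).__dict__
        for key, value in record.__dict__.items():
            if key not in standard:
                entry[key] = value
        return json.dumps(entry)


logger = logging.getLogger("worker")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Structured fields pass through `extra` and become JSON keys.
logger.info("prediction issued", extra={"repo_owner": "kubeflow"})
```

With this in place, callers keep using plain logging.info() calls and the formatter takes care of producing JSON, instead of every call site having to build its own JSON string.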


jlewi commented Jan 3, 2020

I created a new sink. It looks like the filter expression I used for the previous sink wouldn't include the new prod deployment.

I created the new sink with the filter

resource.type="k8s_container" resource.labels.cluster_name="issue-label-bot" resource.labels.container_name="app"
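For reference, a sink with that filter can be created from the CLI along these lines. This is a sketch only: the sink name is an assumption, and the project and dataset names are taken from elsewhere in this thread rather than from the actual sink configuration.

```shell
# Sketch: sink name is hypothetical; project/dataset come from other
# comments in this issue, not from the real sink's config.
gcloud logging sinks create issue-label-bot-logs \
  bigquery.googleapis.com/projects/issue-label-bot-dev/datasets/issue_label_bot_logs_dev \
  --log-filter='resource.type="k8s_container" resource.labels.cluster_name="issue-label-bot" resource.labels.container_name="app"'
```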

jlewi pushed a commit to jlewi/code-intelligence that referenced this issue Jan 4, 2020
* worker.py should format logs as json entries. This will make it easier
  to query the data in BigQuery and stackdriver to measure performance.

  * Related to kubeflow#79

* To deal with workload identity flakiness (kubeflow#88) test that we can get
  application default credentials on startup and if not exit.

* As a hack to deal with multi-threading issues with Keras models (kubeflow#89)
  have the predict function load a new model on each call

  * It looks like the way pubsub works there is actually a thread pool
    so predict calls won't be handled in the same thread even though
    we throttle it to handle one item at a time.
k8s-ci-robot pushed a commit that referenced this issue Jan 4, 2020
(same commit message as above)

jlewi commented Jan 19, 2020

Logs are now flowing into BigQuery. Here's a sample query:

SELECT jsonPayload  FROM `issue-label-bot-dev.issue_label_bot_logs_dev.stderr_20200117` where jsonPayload.repo_owner="kubeflow" LIMIT 1000
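The sink writes date-sharded tables (stderr_20200117 above), so querying a given day means substituting the date into the table name. A small hypothetical helper for building the per-day query string (the function name and parameters are illustrative, not part of the repo):

```python
import datetime


def stderr_table_query(project, dataset, day, repo_owner, limit=1000):
    """Build a BigQuery query against one day's date-sharded stderr table.

    Hypothetical helper for illustration; not part of the repo.
    """
    table = f"`{project}.{dataset}.stderr_{day:%Y%m%d}`"
    return (
        f"SELECT jsonPayload FROM {table} "
        f'WHERE jsonPayload.repo_owner="{repo_owner}" LIMIT {limit}'
    )


query = stderr_table_query(
    "issue-label-bot-dev",
    "issue_label_bot_logs_dev",
    datetime.date(2020, 1, 17),
    "kubeflow",
)
print(query)
```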

jlewi closed this as completed Jan 19, 2020

jlewi commented Jan 19, 2020

It looks like logs are streamed to BigQuery in near real time. I observed log entries showing up almost immediately, so the sink appears to sync much more frequently than once a day.
