feat: implements datacollection function
+ Adds proto definition for ConversationBit.
+ Changes ConversationBit.Created field to int64.
+ Adds error handling in client-side JS.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
# Services | ||
|
||
The MyHerodotus app has several microservices running behind the scenes to assist | ||
with data collection, evaluation, and model tuning. | ||
|
||
## Data collection | ||
|
||
The [Firestore-to-BigQuery](../services/data-collection/) service updates a | ||
BigQuery table with user data and responses from the MyHerodotus app. The service | ||
is triggered by a specific event: when a document is updated in the Firestore database. | ||
This event is used for data collection because it only occurs when a user has rated | ||
a response provided by the app. | ||
|
||
PII is removed from all collected data, specifically first names, last names, | ||
ages, and email addresses. This list of de-identified info types is configurable in | ||
the app. | ||
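
As a rough illustration of a configurable list of info types, the following Go sketch redacts text with regular expressions. It is a stand-in technique, not the app's actual de-identification code: the `infoTypePatterns` table and the `Redact` function are hypothetical, and only email addresses are covered because names and ages cannot be found reliably with a pattern alone.

```go
package main

import (
	"fmt"
	"regexp"
)

// infoTypePatterns maps a hypothetical info type name to a regular expression.
// Only pattern-shaped types are listed; names and ages need context that a
// regular expression cannot supply.
var infoTypePatterns = map[string]*regexp.Regexp{
	"EMAIL_ADDRESS": regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`),
}

// Redact replaces every match of each enabled info type with a "[TYPE]"
// placeholder. Unknown info type names are ignored.
func Redact(text string, enabled []string) string {
	for _, name := range enabled {
		re, ok := infoTypePatterns[name]
		if !ok {
			continue
		}
		text = re.ReplaceAllLiteralString(text, "["+name+"]")
	}
	return text
}

func main() {
	// The enabled list plays the role of the configurable info types.
	enabled := []string{"EMAIL_ADDRESS"}
	fmt.Println(Redact("Write to herodotus@example.com for details.", enabled))
}
```

Passing an empty list leaves the text unchanged, which is how a configurable list would switch a given info type off in this sketch.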
|
||
The following code shows the equivalent gcloud command for exporting from Firestore. | ||
|
||
```sh | ||
$ gcloud firestore export gs://myherodotus --database=l200 --collection-ids=HerodotusDev,Conversations | ||
``` | ||
|
||
### Deploy the service to Cloud Functions | ||
|
||
To deploy the `data-collection` function as a 2nd gen Cloud Function (which runs on Cloud Run), | ||
run the following command from the `data-collection/` directory. Set the project ID first | ||
using `gcloud config set project`. | ||
|
||
**IMPORTANT**: Make sure that the `$PROJECT_ID` and `$DATASET_NAME` environment variables are set | ||
before deploying the function! | ||
|
||
```sh | ||
$ gcloud functions deploy data-collection \ | ||
--gen2 \ | ||
--runtime=go121 \ | ||
--region="us-west1" \ | ||
--trigger-location="us-west1" \ | ||
--source=. \ | ||
--entry-point=CollectData \ | ||
--set-env-vars PROJECT_ID=${PROJECT_ID},DATASET_NAME=${DATASET_NAME},BUILD_VER=Herodotus \ | ||
--trigger-event-filters="type=google.cloud.firestore.document.v1.updated" \ | ||
--trigger-event-filters="database=l200" \ | ||
--trigger-event-filters-path-pattern=document='Herodotus/{userId}/Conversations/{conversationId}' | ||
``` | ||
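
To make the document path pattern in the deploy command concrete, here is a minimal Go sketch of how a pattern with `{name}` captures can be matched against a document path. It assumes each `{name}` segment captures exactly one path segment and that all other segments must match literally. `MatchPath` is a hypothetical helper written for illustration, not the matcher that Eventarc uses, and it does not handle `*` or `**` wildcards.

```go
package main

import (
	"fmt"
	"strings"
)

// MatchPath matches a document path against a pattern whose "{name}" segments
// each capture exactly one non-empty path segment. Literal segments must be
// equal. It returns the captured values keyed by name.
func MatchPath(pattern, path string) (map[string]string, bool) {
	pats := strings.Split(pattern, "/")
	segs := strings.Split(path, "/")
	if len(pats) != len(segs) {
		return nil, false
	}
	captures := make(map[string]string)
	for i, p := range pats {
		if len(p) > 2 && strings.HasPrefix(p, "{") && strings.HasSuffix(p, "}") {
			if segs[i] == "" {
				return nil, false
			}
			captures[p[1:len(p)-1]] = segs[i]
			continue
		}
		if p != segs[i] {
			return nil, false
		}
	}
	return captures, true
}

func main() {
	const pattern = "Herodotus/{userId}/Conversations/{conversationId}"
	// The user and conversation IDs below are made up for the example.
	caps, ok := MatchPath(pattern, "Herodotus/user-1/Conversations/conv-9")
	fmt.Println(ok, caps["userId"], caps["conversationId"])
}
```

Under these assumptions, a path that starts with a different top-level collection, such as `HerodotusDev/user-1/Conversations/conv-9`, does not match the pattern, so the collection name in the trigger has to agree with the collection the app writes to.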
|
||
### Sources | ||
|
||
+ https://cloud.google.com/functions/docs/calling/cloud-firestore | ||
+ https://cloud.google.com/functions/docs/tutorials/storage | ||
+ https://cloud.google.com/functions/docs/calling/eventarc | ||
+ https://cloud.google.com/eventarc/docs/reference/supported-events#cloud-firestore | ||
+ https://cloud.google.com/bigquery/docs/loading-data-cloud-firestore#python | ||
+ https://cloud.google.com/firestore/docs/manage-data/export-import#gcloud |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
version: v2 | ||
managed: | ||
enabled: false | ||
#override: | ||
# - file_option: go_package_prefix | ||
# path: conversation.proto | ||
# value: myherodotus.com/datacollection | ||
plugins: | ||
- remote: buf.build/protocolbuffers/go | ||
out: ../server | ||
opt: | ||
- module=myherodotus.com/main | ||
- remote: buf.build/protocolbuffers/go | ||
out: ../services/data-collection | ||
opt: | ||
- module=myherodotus.com/datacollection | ||
inputs: | ||
- directory: . |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# For details on buf.yaml configuration, visit https://buf.build/docs/configuration/v2/buf-yaml | ||
version: v2 | ||
modules: | ||
- path: . | ||
lint: | ||
use: | ||
- STANDARD | ||
breaking: | ||
use: | ||
- FILE |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
syntax = "proto3"; | ||
package myherodotus; | ||
|
||
//option go_package = "myherodotus.com/main"; | ||
|
||
message ConversationBit { | ||
string bot_response = 1; | ||
string user_query = 2; | ||
string model = 3; | ||
string prompt = 4; | ||
int64 created = 5; | ||
int32 token_count = 6; | ||
string rating = 7; | ||
} |