Commit a9c1097

Remove data collector service
1 parent 0976655 commit a9c1097

19 files changed, +22 -1226 lines changed

.github/workflows/e2e_tests.yaml

Lines changed: 1 addition & 8 deletions
@@ -87,14 +87,7 @@ jobs:
   feedback_storage: "/tmp/data/feedback"
   transcripts_disabled: false
   transcripts_storage: "/tmp/data/transcripts"
-  data_collector:
-    enabled: false
-    ingress_server_url: null
-    ingress_server_auth_token: null
-    ingress_content_service_name: null
-    collection_interval: 7200 # 2 hours in seconds
-    cleanup_after_send: true
-    connection_timeout_seconds: 30
+
 authentication:
   module: "noop"

Makefile

Lines changed: 1 addition & 2 deletions
@@ -8,8 +8,7 @@ PYTHON_REGISTRY = pypi
 run: ## Run the service locally
 	uv run src/lightspeed_stack.py

-run-data-collector: ## Run the data collector service locally
-	uv run src/lightspeed_stack.py --data-collector
+

 test-unit: ## Run the unit tests
 	@echo "Running unit tests..."

README.md

Lines changed: 1 addition & 49 deletions
@@ -42,10 +42,7 @@ Lightspeed Core Stack (LCS) is an AI-powered assistant that provides answers to
 * [Utility to generate OpenAPI schema](#utility-to-generate-openapi-schema)
     * [Path](#path)
     * [Usage](#usage-1)
-* [Data Collector Service](#data-collector-service)
-    * [Features](#features)
-    * [Configuration](#configuration-1)
-    * [Running the Service](#running-the-service)
+

 <!-- vim-markdown-toc -->

@@ -253,7 +250,6 @@ Usage: make <OPTIONS> ... <TARGETS>
 Available targets are:

 run                  Run the service locally
-run-data-collector   Run the data collector service
 test-unit            Run the unit tests
 test-integration     Run integration tests tests
 test-e2e             Run BDD tests for the service
@@ -421,50 +417,6 @@ This script re-generated OpenAPI schema for the Lightspeed Service REST API.
 make schema
 ```

-## Data Collector Service
-
-The data collector service is a standalone service that runs separately from the main web service. It is responsible for collecting and sending user data including feedback and transcripts to an ingress server for analysis and archival.
-
-### Features
-
-- **Periodic Collection**: Runs at configurable intervals
-- **Data Packaging**: Packages feedback and transcript files into compressed tar.gz archives
-- **Secure Transmission**: Sends data to a configured ingress server with optional authentication
-- **File Cleanup**: Optionally removes local files after successful transmission
-- **Error Handling**: Includes retry logic and comprehensive error handling
-
-### Configuration
-
-The data collector service is configured through the `user_data_collection.data_collector` section in your configuration file:
-
-```yaml
-user_data_collection:
-  feedback_enabled: true
-  feedback_storage: "/tmp/data/feedback"
-  transcripts_enabled: true
-  transcripts_storage: "/tmp/data/transcripts"
-  data_collector:
-    enabled: true
-    ingress_server_url: "https://your-ingress-server.com"
-    ingress_server_auth_token: "your-auth-token"
-    ingress_content_service_name: "lightspeed-team"
-    collection_interval: 7200 # 2 hours in seconds
-    cleanup_after_send: true
-    connection_timeout: 30
-```
-
-### Running the Service
-
-To run the data collector service:
-
-```bash
-# Using Python directly
-uv run src/lightspeed_stack.py --data-collector
-
-# Using Make target
-make run-data-collector
-```
-


 # Project structure

docs/config.puml

Lines changed: 2 additions & 12 deletions
@@ -26,16 +26,7 @@ class "Customization" as src.models.config.Customization {
   system_prompt_path : Optional[FilePath]
   check_customization_model() -> Self
 }
-class "DataCollectorConfiguration" as src.models.config.DataCollectorConfiguration {
-  cleanup_after_send : bool
-  collection_interval : Annotated
-  connection_timeout : Annotated
-  enabled : bool
-  ingress_content_service_name : Optional[str]
-  ingress_server_auth_token : Optional[str]
-  ingress_server_url : Optional[str]
-  check_data_collector_configuration() -> Self
-}
+
 class "InferenceConfiguration" as src.models.config.InferenceConfiguration {
   default_model : Optional[str]
   default_provider : Optional[str]
@@ -78,14 +69,13 @@ class "TLSConfiguration" as src.models.config.TLSConfiguration {
   check_tls_configuration() -> Self
 }
 class "UserDataCollection" as src.models.config.UserDataCollection {
-  data_collector
   feedback_enabled : bool
   feedback_storage : Optional[str]
   transcripts_enabled : bool
   transcripts_storage : Optional[str]
   check_storage_location_is_set_when_needed() -> Self
 }
-src.models.config.DataCollectorConfiguration --* src.models.config.UserDataCollection : data_collector
+
 src.models.config.InferenceConfiguration --* src.models.config.Configuration : inference
 src.models.config.JwtConfiguration --* src.models.config.JwkConfiguration : jwt_configuration
 src.models.config.LlamaStackConfiguration --* src.models.config.Configuration : llama_stack

docs/deployment_guide.md

Lines changed: 2 additions & 16 deletions
@@ -1099,14 +1099,7 @@ user_data_collection:
   feedback_storage: "/tmp/data/feedback"
   transcripts_enabled: true
   transcripts_storage: "/tmp/data/transcripts"
-  data_collector:
-    enabled: false
-    ingress_server_url: null
-    ingress_server_auth_token: null
-    ingress_content_service_name: null
-    collection_interval: 7200 # 2 hours in seconds
-    cleanup_after_send: true
-    connection_timeout_seconds: 30
+
 authentication:
   module: "noop"
 ```
@@ -1261,14 +1254,7 @@ user_data_collection:
   feedback_storage: "/tmp/data/feedback"
   transcripts_enabled: true
   transcripts_storage: "/tmp/data/transcripts"
-  data_collector:
-    enabled: false
-    ingress_server_url: null
-    ingress_server_auth_token: null
-    ingress_content_service_name: null
-    collection_interval: 7200 # 2 hours in seconds
-    cleanup_after_send: true
-    connection_timeout_seconds: 30
+
 authentication:
   module: "noop"
 ```

docs/getting_started.md

Lines changed: 1 addition & 8 deletions
@@ -264,14 +264,7 @@ user_data_collection:
   feedback_storage: "/tmp/data/feedback"
   transcripts_enabled: true
   transcripts_storage: "/tmp/data/transcripts"
-  data_collector:
-    enabled: false
-    ingress_server_url: null
-    ingress_server_auth_token: null
-    ingress_content_service_name: null
-    collection_interval: 7200 # 2 hours in seconds
-    cleanup_after_send: true
-    connection_timeout_seconds: 30
+
 authentication:
   module: "noop"
 ```

docs/openapi.json

Lines changed: 9 additions & 53 deletions
@@ -1101,67 +1101,32 @@
   "title": "Customization",
   "description": "Service customization."
 },
-"DataCollectorConfiguration": {
+"DatabaseConfiguration": {
   "properties": {
-    "enabled": {
-      "type": "boolean",
-      "title": "Enabled",
-      "default": false
-    },
-    "ingress_server_url": {
-      "anyOf": [
-        {
-          "type": "string"
-        },
-        {
-          "type": "null"
-        }
-      ],
-      "title": "Ingress Server Url"
-    },
-    "ingress_server_auth_token": {
+    "sqlite": {
       "anyOf": [
         {
-          "type": "string"
+          "$ref": "#/components/schemas/SQLiteDatabaseConfiguration"
         },
         {
           "type": "null"
         }
-      ],
-      "title": "Ingress Server Auth Token"
+      ]
     },
-    "ingress_content_service_name": {
+    "postgres": {
       "anyOf": [
         {
-          "type": "string"
+          "$ref": "#/components/schemas/PostgreSQLDatabaseConfiguration"
         },
         {
           "type": "null"
         }
-      ],
-      "title": "Ingress Content Service Name"
-    },
-    "collection_interval": {
-      "type": "integer",
-      "exclusiveMinimum": 0.0,
-      "title": "Collection Interval",
-      "default": 7200
-    },
-    "cleanup_after_send": {
-      "type": "boolean",
-      "title": "Cleanup After Send",
-      "default": true
-    },
-    "connection_timeout": {
-      "type": "integer",
-      "exclusiveMinimum": 0.0,
-      "title": "Connection Timeout",
-      "default": 30
+      ]
     }
   },
   "type": "object",
-  "title": "DataCollectorConfiguration",
-  "description": "Data collector configuration for sending data to ingress server."
+  "title": "DatabaseConfiguration",
+  "description": "Database configuration."
 },
 "DatabaseConfiguration": {
   "properties": {
@@ -2122,15 +2087,6 @@
         }
       ],
       "title": "Transcripts Storage"
-    },
-    "data_collector": {
-      "$ref": "#/components/schemas/DataCollectorConfiguration",
-      "default": {
-        "enabled": false,
-        "collection_interval": 7200,
-        "cleanup_after_send": true,
-        "connection_timeout": 30
-      }
     }
   },
   "type": "object",

docs/openapi.md

Lines changed: 0 additions & 16 deletions
@@ -577,21 +577,6 @@ Service customization.
 | system_prompt | | |


-## DataCollectorConfiguration
-
-
-Data collector configuration for sending data to ingress server.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| enabled | boolean | |
-| ingress_server_url | | |
-| ingress_server_auth_token | | |
-| ingress_content_service_name | | |
-| collection_interval | integer | |
-| cleanup_after_send | boolean | |
-| connection_timeout | integer | |


 ## DatabaseConfiguration
@@ -1026,7 +1011,6 @@ User data collection configuration.
 | feedback_storage | | |
 | transcripts_enabled | boolean | |
 | transcripts_storage | | |
-| data_collector | | |


 ## ValidationError

docs/output.md

Lines changed: 0 additions & 17 deletions
@@ -577,22 +577,6 @@ Service customization.
 | system_prompt | | |


-## DataCollectorConfiguration
-
-
-Data collector configuration for sending data to ingress server.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| enabled | boolean | |
-| ingress_server_url | | |
-| ingress_server_auth_token | | |
-| ingress_content_service_name | | |
-| collection_interval | integer | |
-| cleanup_after_send | boolean | |
-| connection_timeout | integer | |
-

 ## DatabaseConfiguration

@@ -1016,7 +1000,6 @@ User data collection configuration.
 | feedback_storage | | |
 | transcripts_enabled | boolean | |
 | transcripts_storage | | |
-| data_collector | | |


 ## ValidationError

docs/testing.md

Lines changed: 2 additions & 4 deletions
@@ -105,11 +105,9 @@ As specified in Definition of Done, new changes need to be covered by tests.
 │   ├── test_requests.py
 │   └── test_responses.py
 ├── runners
-│   ├── __init__.py
-│   ├── test_data_collector_runner.py
-│   └── test_uvicorn_runner.py
+│ ├── __init__.py
+│ └── test_uvicorn_runner.py
 ├── services
-│   └── test_data_collector.py
 ├── test_client.py
 ├── test_configuration.py
 ├── test_lightspeed_stack.py
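
Because this commit drops `data_collector` from `UserDataCollection`, configuration files written for the parent revision may still carry that block. Below is a minimal sketch of stripping it from an already-parsed configuration. The helper name `strip_data_collector` is hypothetical and not part of the repository, and whether the service rejects or silently ignores a leftover block depends on the model's validation settings, which this diff does not show.

```python
from copy import deepcopy
from typing import Any


def strip_data_collector(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a parsed config without user_data_collection.data_collector.

    Hypothetical helper for illustration only; not part of the repository.
    """
    cleaned = deepcopy(config)  # leave the caller's dict untouched
    section = cleaned.get("user_data_collection")
    if isinstance(section, dict):
        # pop() with a default is a no-op when the key is already absent
        section.pop("data_collector", None)
    return cleaned


# Example input mirroring the YAML shown in the removed documentation.
old_config = {
    "user_data_collection": {
        "feedback_enabled": True,
        "feedback_storage": "/tmp/data/feedback",
        "transcripts_enabled": True,
        "transcripts_storage": "/tmp/data/transcripts",
        "data_collector": {"enabled": False, "collection_interval": 7200},
    },
    "authentication": {"module": "noop"},
}

new_config = strip_data_collector(old_config)
```

The function works on a plain dictionary, so it can be paired with whichever YAML loader a deployment already uses; the remaining keys are passed through unchanged.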
