Commit f7f85c1

feat(influxdb3): Runtime architecture and thread allocation reference:
- IO runtime thread pool (line protocol parsing and validation)
- DataFusion runtime thread pool (WAL persistence and queries)
- Recommendations and best practices for monitoring and configuring Core and Enterprise
1 parent ed87cf5 commit f7f85c1

3 files changed

+244
-0
lines changed

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
1+
---
2+
title: Runtime thread architecture
3+
seotitle: InfluxDB 3 Core runtime thread architecture
4+
description: >
5+
  Learn how InfluxDB 3 Core allocates and manages runtime threads between IO and DataFusion thread pools
6+
  for optimal performance.
7+
weight: 200
8+
menu:
9+
  influxdb3_core:
10+
    parent: Core internals
11+
    name: Runtime architecture
12+
influxdb3/core/tags: [architecture, threads, runtime, performance]
13+
related:
14+
- /influxdb3/core/admin/performance-tuning/
15+
- /influxdb3/core/reference/config-options/
16+
- /influxdb3/core/admin/monitor-metrics/
17+
source: /shared/influxdb3-internals-reference/runtime-architecture.md
18+
---
19+
20+
<!--
21+
The content of this file is located at
22+
//SOURCE - content/shared/influxdb3-internals-reference/runtime-architecture.md
23+
-->
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
---
2+
title: Runtime thread architecture
3+
seotitle: InfluxDB 3 Enterprise runtime thread architecture
4+
description: >
5+
  Learn how InfluxDB 3 Enterprise allocates and manages runtime threads between IO and DataFusion thread pools
6+
  for optimal cluster performance.
7+
weight: 200
8+
menu:
9+
  influxdb3_enterprise:
10+
    parent: Enterprise internals
11+
    name: Runtime architecture
12+
influxdb3/enterprise/tags: [architecture, threads, runtime, performance, clustering]
13+
related:
14+
- /influxdb3/enterprise/admin/clustering/
15+
- /influxdb3/enterprise/admin/performance-tuning/
16+
- /influxdb3/enterprise/reference/config-options/
17+
- /influxdb3/enterprise/admin/monitor-metrics/
18+
source: /shared/influxdb3-internals-reference/runtime-architecture.md
19+
---
20+
21+
<!--
22+
The content of this file is located at
23+
//SOURCE - content/shared/influxdb3-internals-reference/runtime-architecture.md
24+
-->
Lines changed: 197 additions & 0 deletions
@@ -0,0 +1,197 @@
1+
<!-- Allow leading shortcode -->
2+
{{% product-name %}} uses a dual thread pool architecture to efficiently handle different types of workloads.
3+
Understanding how threads are allocated and used helps you optimize performance for your specific use case.
4+
5+
## Thread pool architecture
6+
7+
{{% product-name %}} divides available CPU resources between two specialized thread pools:
8+
9+
- [IO runtime thread pool](#io-runtime-thread-pool)
10+
- [DataFusion runtime thread pool](#datafusion-runtime-thread-pool)
11+
12+
### IO runtime thread pool
13+
14+
The IO runtime handles all input/output operations and initial data processing:
15+
16+
- **HTTP request handling**: Receives and responds to HTTP API requests
17+
- **Line protocol parsing**: Parses and validates incoming line protocol data
18+
- **Network communication**: Manages all network IO operations
19+
- **File system operations**: Reads from and writes to local file systems
20+
- **Object store operations**: Interacts with object storage (S3, Azure Blob, GCS)
21+
- **Initial request routing**: Routes requests to appropriate handlers
22+
23+
> [!Important]
24+
> Line protocol parsing is CPU-intensive and happens on IO threads. Each concurrent writer
25+
> can utilize one IO thread for parsing, making IO thread count critical for write throughput.
26+
27+
### DataFusion runtime thread pool
28+
29+
The DataFusion runtime handles query processing and data management:
30+
31+
- **Query execution**: Processes SQL and InfluxQL queries
32+
- **Data aggregation**: Performs aggregations and transformations
33+
- **Snapshot creation**: Executes sort and dedupe operations during WAL snapshots
34+
- **Parquet file generation**: Creates and optimizes Parquet files
35+
- **Compaction operations**: Merges and optimizes stored data
36+
- **Cache operations**: Manages query result caching
37+
38+
> [!Note]
39+
> Even nodes dedicated to ingest (Enterprise `--mode=ingest`) require DataFusion threads
40+
> for snapshot operations that create Parquet files from WAL data.
41+
42+
## Default thread allocation
43+
44+
When you start {{% product-name %}} without specifying thread counts, the system uses these defaults:
45+
46+
### Without explicit configuration
47+
48+
```
49+
Total system cores: N
50+
IO threads: 2 (or 1 if N < 4)
51+
DataFusion threads: N - IO threads
52+
```
53+
54+
**Examples:**
55+
- 4-core system: 2 IO threads, 2 DataFusion threads
56+
- 32-core system: 2 IO threads, 30 DataFusion threads
57+
- 96-core system: 2 IO threads, 94 DataFusion threads
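
The default rule above can be sketched as a small shell function. This is an illustration only: `default_threads` is a hypothetical helper, not an `influxdb3` command, and keeping at least one DataFusion thread on a single-core host is an assumption rather than documented behavior.

```bash
# Sketch of the documented default split between the two thread pools.
# default_threads is a hypothetical helper, not part of the influxdb3 CLI.
default_threads() {
  local cores=$1 io df
  # Documented rule: 2 IO threads, or 1 when fewer than 4 cores are available.
  if [ "$cores" -lt 4 ]; then io=1; else io=2; fi
  # Documented rule: DataFusion gets the remaining cores.
  df=$((cores - io))
  # Assumption: never report fewer than one DataFusion thread.
  if [ "$df" -lt 1 ]; then df=1; fi
  echo "io=$io datafusion=$df"
}

default_threads 4    # io=2 datafusion=2
default_threads 96   # io=2 datafusion=94
```

On Linux, `default_threads "$(nproc)"` would apply the same rule to the local core count.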
58+
59+
> [!Warning]
60+
> The default 2 IO threads can severely limit performance with multiple concurrent writers.
61+
> A 96-core system using only 2 cores for ingest is significantly underutilized.
62+
63+
{{% show-in "enterprise" %}}
64+
### With --num-cores set
65+
66+
When you limit total cores with `--num-cores`, {{% product-name %}} automatically adjusts thread allocation:
67+
68+
```
69+
num-cores value: N
70+
1-2 cores: 1 IO thread, 1 DataFusion thread
71+
3 cores: 1 IO thread, 2 DataFusion threads
72+
4+ cores: 2 IO threads, (N-2) DataFusion threads
73+
```
74+
{{% /show-in %}}
75+
76+
## Manual thread configuration
77+
78+
Override the default thread allocation to tune performance for your workload. For example:
79+
80+
### Configure IO threads
81+
82+
```bash
83+
# Increase IO threads for write-heavy workloads
84+
influxdb3 serve \
85+
  --node-id=node0 \
86+
  --object-store=file \
87+
  --data-dir="$HOME/.influxdb3" \
88+
  --num-io-threads=8
89+
```
90+
91+
For detailed configuration examples with memory tuning, caching, and other performance
92+
optimizations, see [Performance tuning](/influxdb3/version/admin/performance-tuning/).
93+
94+
## Thread utilization patterns
95+
96+
- [Write operations](#write-operations)
97+
- [Query operations](#query-operations)
98+
- [Snapshot operations](#snapshot-operations)
99+
100+
### Write operations
101+
102+
1. HTTP request arrives on IO thread
103+
2. IO thread parses line protocol (CPU-intensive)
104+
3. IO thread validates data against schema
105+
4. Data queued for WAL write
106+
5. IO thread sends response
107+
108+
**Bottleneck indicators:**
109+
- IO threads at 100% CPU utilization
110+
- Write latency increases with concurrent writers
111+
- Throughput plateaus despite available CPU
112+
113+
### Query operations
114+
115+
1. HTTP request arrives on IO thread
116+
2. IO thread routes to query handler
117+
3. DataFusion thread plans query execution
118+
4. DataFusion threads execute query in parallel
119+
5. Results assembled and returned via IO thread
120+
121+
**Bottleneck indicators:**
122+
- DataFusion threads at 100% CPU utilization
123+
- Query latency increases with complexity
124+
- Concurrent query throughput limited
125+
126+
### Snapshot operations
127+
128+
1. Triggered by time or WAL size threshold
129+
2. DataFusion threads sort and deduplicate data
130+
3. DataFusion threads create Parquet files
131+
4. IO threads write files to object storage
132+
133+
> [!Important]
134+
> Snapshots use DataFusion threads even on ingest-only nodes.
135+
136+
## Performance implications
137+
138+
Thread allocation impacts performance based on workload characteristics.
139+
140+
- [Concurrent writer scaling](#concurrent-writer-scaling)
141+
- [Memory considerations](#memory-considerations)
142+
- [CPU efficiency](#cpu-efficiency)
143+
144+
### Concurrent writer scaling
145+
146+
Each concurrent writer (for example, Telegraf agent or API client) can utilize approximately one IO thread for line protocol parsing:
147+
148+
| Concurrent Writers | Recommended IO Threads | Rationale |
149+
|-------------------|------------------------|-----------|
150+
| 1-2 | 2-4 | Some headroom for system operations |
151+
| 5 | 5-8 | One thread per writer plus overhead |
152+
| 10 | 10-14 | Linear scaling with writers |
153+
154+
### Memory considerations
155+
156+
Thread pools have associated memory overhead:
157+
158+
- **IO threads**: Generally lower memory usage, mainly buffers for parsing
159+
- **DataFusion threads**: Higher memory usage for query execution and sorting
160+
  - Default execution memory pool: 70% of available RAM
161+
  - Divided among DataFusion threads
162+
  - More threads = less memory per thread
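
To make the trade-off concrete, the following sketch estimates the share of the execution memory pool available to each DataFusion thread. It assumes the 70% default and a perfectly even split, which is a simplification; `mem_per_df_thread_mb` is a hypothetical helper and the RAM figures are example inputs, not measurements.

```bash
# Hypothetical helper: rough execution memory per DataFusion thread, in MiB.
# Assumes the default 70% pool and an even split across threads.
mem_per_df_thread_mb() {
  local ram_mb=$1 df_threads=$2
  local pool_mb=$((ram_mb * 70 / 100))
  echo $((pool_mb / df_threads))
}

mem_per_df_thread_mb 65536 30   # 64 GiB host, 30 threads: 1529
mem_per_df_thread_mb 65536 62   # 64 GiB host, 62 threads: 739
```

Doubling the DataFusion thread count on the same host roughly halves the memory each thread can use for sorting and query execution.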
163+
164+
### CPU efficiency
165+
166+
- **IO threads**: Typically CPU-bound during parsing
167+
- **DataFusion threads**: Mix of CPU and memory-bound operations
168+
- **Context switching**: Too many threads relative to cores causes overhead
169+
170+
## Monitoring thread utilization
171+
172+
Monitor thread pool utilization to identify bottlenecks. For example, on Linux:
173+
174+
```bash
175+
# View thread utilization (Linux)
176+
top -H -p "$(pgrep -d, -x influxdb3)"
177+
178+
# Monitor IO wait
179+
iostat -x 1
180+
181+
# Check CPU utilization by core
182+
mpstat -P ALL 1
183+
```
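
To spot saturated threads without watching `top` interactively, per-thread CPU figures can be filtered with a short script. This is a sketch under stated assumptions: `busy_threads` is a hypothetical helper, the sample input is made up, and the thread names that `influxdb3` actually reports are not assumed here.

```bash
# Hypothetical helper: print threads at or above a CPU threshold (default 90).
# Reads lines of the form "TID NAME... %CPU" on stdin; the last field is %CPU.
busy_threads() {
  awk -v min="${1:-90}" '$NF + 0 >= min + 0 { print }'
}

# Made-up sample input, for illustration only:
printf '101 worker-a 97.5\n102 worker-b 12.0\n' | busy_threads 90
# prints: 101 worker-a 97.5

# Against a running server (assumes procps ps and one influxdb3 process):
# ps -L -o tid=,comm=,pcpu= -p "$(pgrep -o -x influxdb3)" | busy_threads 90
```

Note that `ps` reports CPU usage averaged over each thread's lifetime, so `top -H` remains the better view of instantaneous load.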
184+
185+
## Recommendations by workload
186+
187+
Thread allocation should match your workload characteristics:
188+
189+
- **Write-heavy workloads**: Allocate more IO threads (10-40% of cores)
190+
- **Query-heavy workloads**: Maximize DataFusion threads (85-95% of cores)
191+
- **Balanced workloads**: Start from an even split, then adjust based on observed usage patterns
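
The percentage guidance above can be turned into concrete thread counts for a given host. The sketch below is illustrative: `pct_range` is a hypothetical helper, it rounds down, and the floor of one thread is an assumption.

```bash
# Hypothetical helper: convert a percentage range of cores into thread counts.
# Usage: pct_range <cores> <low_pct> <high_pct>
pct_range() {
  local cores=$1 low_pct=$2 high_pct=$3
  local lo=$((cores * low_pct / 100)) hi=$((cores * high_pct / 100))
  # Assumption: always allow at least one thread.
  if [ "$lo" -lt 1 ]; then lo=1; fi
  if [ "$hi" -lt "$lo" ]; then hi=$lo; fi
  echo "$lo-$hi"
}

pct_range 32 10 40   # write-heavy, IO threads on 32 cores: 3-12
pct_range 32 85 95   # query-heavy, DataFusion threads on 32 cores: 27-30
```

Treat the result as a starting range to validate against observed CPU utilization, not as a fixed setting.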
192+
193+
{{% show-in "enterprise" %}}
194+
> [!Note]
195+
> Even nodes dedicated to ingest (`--mode=ingest`) require DataFusion threads
196+
> for snapshot operations that create Parquet files from WAL data.
197+
{{% /show-in %}}
