<!-- Allow leading shortcode -->
{{% product-name %}} uses a dual thread pool architecture to efficiently handle different types of workloads.
Understanding how threads are allocated and used helps you optimize performance for your specific use case.

## Thread pool architecture

{{% product-name %}} divides available CPU resources between two specialized thread pools:

- [IO runtime thread pool](#io-runtime-thread-pool)
- [DataFusion runtime thread pool](#datafusion-runtime-thread-pool)

### IO runtime thread pool

The IO runtime handles all input/output operations and initial data processing:

- **HTTP request handling**: Receives and responds to HTTP API requests
- **Line protocol parsing**: Parses and validates incoming line protocol data
- **Network communication**: Manages all network IO operations
- **File system operations**: Reads from and writes to local file systems
- **Object store operations**: Interacts with object storage (S3, Azure Blob, GCS)
- **Initial request routing**: Routes requests to appropriate handlers

> [!Important]
> Line protocol parsing is CPU-intensive and happens on IO threads. Each concurrent writer
> can utilize one IO thread for parsing, making IO thread count critical for write throughput.

### DataFusion runtime thread pool

The DataFusion runtime handles query processing and data management:

- **Query execution**: Processes SQL and InfluxQL queries
- **Data aggregation**: Performs aggregations and transformations
- **Snapshot creation**: Executes sort and dedupe operations during WAL snapshots
- **Parquet file generation**: Creates and optimizes Parquet files
- **Compaction operations**: Merges and optimizes stored data
- **Cache operations**: Manages query result caching

> [!Note]
> Even nodes dedicated to ingest (Enterprise `--mode=ingest`) require DataFusion threads
> for snapshot operations that create Parquet files from WAL data.

## Default thread allocation

When you start {{% product-name %}} without specifying thread counts, the system uses these defaults:

### Without explicit configuration

```
Total system cores: N
IO threads: 2 (or 1 if N < 4)
DataFusion threads: N - IO threads
```

**Examples:**

- 4-core system: 2 IO threads, 2 DataFusion threads
- 32-core system: 2 IO threads, 30 DataFusion threads
- 96-core system: 2 IO threads, 94 DataFusion threads

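To see what this means for a specific machine, the following shell sketch (an illustration only, assuming a Linux host where `nproc` is available) prints the split that the defaults above would produce:

```bash
# Print the default IO/DataFusion split for this machine (illustration only)
N=$(nproc)
if [ "$N" -lt 4 ]; then IO=1; else IO=2; fi
echo "Cores: $N, IO threads: $IO, DataFusion threads: $((N - IO))"
```
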
> [!Warning]
> The default 2 IO threads can severely limit performance with multiple concurrent writers.
> A 96-core system using only 2 cores for ingest is significantly underutilized.

{{% show-in "enterprise" %}}
### With `--num-cores` set

When you limit total cores with `--num-cores`, {{% product-name %}} automatically adjusts thread allocation:

```
num-cores value: N
1-2 cores: 1 IO thread, 1 DataFusion thread
3 cores: 1 IO thread, 2 DataFusion threads
4+ cores: 2 IO threads, (N-2) DataFusion threads
```

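For example, to cap the server at 8 cores on a shared host (a minimal sketch reusing the startup flags shown elsewhere on this page; adjust `--node-id`, `--object-store`, and `--data-dir` for your deployment):

```bash
# Limit influxdb3 to 8 cores; with the rules above this yields 2 IO + 6 DataFusion threads
influxdb3 serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3 \
  --num-cores=8
```
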
{{% /show-in %}}

## Manual thread configuration

Override the default thread allocation to match your workload. For example:

### Configure IO threads

```bash
# Increase IO threads for write-heavy workloads
influxdb3 serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3 \
  --num-io-threads=8
```

For detailed configuration examples with memory tuning, caching, and other performance
optimizations, see [Performance tuning](/influxdb3/version/admin/performance-tuning/).

## Thread utilization patterns

- [Write operations](#write-operations)
- [Query operations](#query-operations)
- [Snapshot operations](#snapshot-operations)

### Write operations

1. HTTP request arrives on IO thread
2. IO thread parses line protocol (CPU-intensive)
3. IO thread validates data against schema
4. Data queued for WAL write
5. IO thread sends response

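For example, each client request like the following occupies one IO thread while its payload is parsed (a sketch that assumes the v3 line protocol endpoint `/api/v3/write_lp`, the default HTTP port 8181, a database named `mydb`, and a token in `$INFLUXDB3_AUTH_TOKEN`; adjust these for your setup):

```bash
# One concurrent writer: the parse step (2) runs on a single IO thread
curl --silent "http://localhost:8181/api/v3/write_lp?db=mydb" \
  --header "Authorization: Bearer $INFLUXDB3_AUTH_TOKEN" \
  --data-binary 'cpu,host=server01 usage=42.5'
```
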
**Bottleneck indicators:**

- IO threads at 100% CPU utilization
- Write latency increases with concurrent writers
- Throughput plateaus despite available CPU

### Query operations

1. HTTP request arrives on IO thread
2. IO thread routes to query handler
3. DataFusion thread plans query execution
4. DataFusion threads execute query in parallel
5. Results assembled and returned via IO thread

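For example, an aggregation like the following is planned and executed on DataFusion threads, while the IO thread only carries the request and response (a sketch that assumes the `influxdb3 query` CLI command and a database named `mydb`):

```bash
# Steps 3-4 fan out across DataFusion threads; wider aggregations use more of them
influxdb3 query --database mydb \
  "SELECT host, avg(usage) FROM cpu WHERE time > now() - INTERVAL '1 hour' GROUP BY host"
```
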
**Bottleneck indicators:**

- DataFusion threads at 100% CPU utilization
- Query latency increases with complexity
- Concurrent query throughput limited

### Snapshot operations

1. Triggered by time or WAL size threshold
2. DataFusion threads sort and deduplicate data
3. DataFusion threads create Parquet files
4. IO threads write files to object storage

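If you want to confirm that snapshots are completing, one rough check (assuming the `file` object store and the `~/.influxdb3` data directory from the earlier example) is to watch Parquet files accumulate on disk:

```bash
# Count Parquet files produced by snapshots (refreshes every 10 seconds)
watch -n 10 'find ~/.influxdb3 -name "*.parquet" | wc -l'
```
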
> [!Important]
> Snapshots use DataFusion threads even on ingest-only nodes.

## Performance implications

Thread allocation impacts performance based on workload characteristics.

- [Concurrent writer scaling](#concurrent-writer-scaling)
- [Memory considerations](#memory-considerations)
- [CPU efficiency](#cpu-efficiency)

### Concurrent writer scaling

Each concurrent writer (for example, a Telegraf agent or API client) can utilize approximately one IO thread for line protocol parsing:

| Concurrent writers | Recommended IO threads | Rationale |
|--------------------|------------------------|-----------|
| 1-2                | 2-4                    | Some headroom for system operations |
| 5                  | 5-8                    | One thread per writer plus overhead |
| 10                 | 10-14                  | Linear scaling with writers |

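The table translates into a simple rule of thumb. The sketch below (an illustration only; the 40% cap comes from the write-heavy guideline later on this page) estimates an IO thread count from your writer count and core count:

```bash
# Rough --num-io-threads estimate: one thread per writer plus overhead, capped at ~40% of cores
WRITERS=10
CORES=$(nproc)
IO=$(( WRITERS + 2 ))
CAP=$(( CORES * 40 / 100 ))
if [ "$IO" -gt "$CAP" ]; then IO=$CAP; fi
echo "Suggested: --num-io-threads=$IO (leaves $(( CORES - IO )) cores for DataFusion)"
```
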
### Memory considerations

Thread pools have associated memory overhead:

- **IO threads**: Generally lower memory usage, mainly buffers for parsing
- **DataFusion threads**: Higher memory usage for query execution and sorting
  - Default execution memory pool: 70% of available RAM
  - Divided among DataFusion threads
  - More threads = less memory per thread

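As a back-of-the-envelope example (assuming a 64 GB host and the 30 DataFusion threads from the 32-core default above):

```bash
# 70% execution memory pool divided among DataFusion threads
RAM_GB=64
DF_THREADS=30
POOL_GB=$(( RAM_GB * 70 / 100 ))
echo "Execution pool: ${POOL_GB} GB, roughly $(( POOL_GB * 1024 / DF_THREADS )) MB per DataFusion thread"
```
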
### CPU efficiency

- **IO threads**: Typically CPU-bound during parsing
- **DataFusion threads**: Mix of CPU- and memory-bound operations
- **Context switching**: Too many threads relative to cores causes overhead

## Monitoring thread utilization

Monitor thread pool utilization to identify bottlenecks. For example:

```bash
# View thread utilization (Linux)
top -H -p $(pgrep influxdb3)

# Monitor IO wait
iostat -x 1

# Check CPU utilization by core
mpstat -P ALL 1
```

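If the `sysstat` package is installed, per-thread CPU usage can also be sampled directly to see how evenly load is spread across the process's threads:

```bash
# Per-thread CPU usage for the influxdb3 process, sampled every second
pidstat -t -p $(pgrep influxdb3) 1
```
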
## Recommendations by workload

Thread allocation should match your workload characteristics:

- **Write-heavy workloads**: Allocate more IO threads (10-40% of cores)
- **Query-heavy workloads**: Maximize DataFusion threads (85-95% of cores)
- **Balanced workloads**: Split evenly based on actual usage patterns

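For example, on a 32-core host serving a write-heavy workload, allocating about a third of the cores to IO stays within the 10-40% guideline (a sketch; adjust the startup flags for your deployment):

```bash
# 12 of 32 cores for IO leaves 20 cores' worth of capacity for DataFusion work
influxdb3 serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3 \
  --num-io-threads=12
```
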
{{% show-in "enterprise" %}}
> [!Note]
> Even nodes dedicated to ingest (`--mode=ingest`) require DataFusion threads
> for snapshot operations that create Parquet files from WAL data.
{{% /show-in %}}