Commit

Merge pull request #2913 from vespa-engine/bjorncs/container-threadpo…
…ol-configuration

Bjorncs/container threadpool configuration
baldersheim authored Sep 29, 2023
2 parents d59f03f + 043a966 commit a313d25
Showing 3 changed files with 45 additions and 89 deletions.
35 changes: 17 additions & 18 deletions en/performance/container-tuning.html
@@ -22,7 +22,7 @@ <h2 id="container-worker-threads">Container worker threads</h2>
Most components including request handlers use the container's <em>default thread pool</em>,
which is controlled by a shared executor instance.
Any component can utilize the default pool by injecting a <code>java.util.concurrent.Executor</code> instance.
Some built-in components have dedicated thread pools - such as the Jetty server, the search handler and the feed handler.
Some built-in components have dedicated thread pools - such as the Jetty server and the search handler.
These thread pools are injected through special wiring in the config model and
are not easily accessible from other components.
</p>
@@ -38,46 +38,45 @@ <h2 id="container-worker-threads">Container worker threads</h2>
<p>
The container will pre-start the minimum number of worker threads,
so even an idle container may report running several hundred threads.
For thread pools with fixed size (<em>min == max</em>), all threads are pre-started.
The thread pool is pre-started with the number of threads specified in the <code>threads</code> parameter.
Note that tuning the capacity upwards increases the risk of high GC pressure
as concurrency becomes higher with more in-flight requests.
The GC pressure is a function of number of in-flight requests, the time it takes to complete the request
and the amount of garbage produced per request.
Increasing the queue size will allow the application to handle shorter traffic bursts without rejecting requests,
although increasing the average latency for those requests that are queued up.
Large queues will also increase heap consumption in overload situations.
Extra threads will be created once the queue is full (if <em>max &gt; min</em>), and are destroyed after an idle timeout.
Extra threads will be created once the queue is full (when <code>boost</code> is specified), and are destroyed after an idle timeout.
If all threads are occupied, requests are rejected with a 503 response.
</p>
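The admission behaviour described in this paragraph can be sketched as a small decision function. This is only an illustration of the policy as the text states it, not the container's implementation: the function name `admit`, its parameters and the returned labels are all hypothetical, and it assumes that an idle permanent thread is preferred over queuing.

```python
def admit(busy_threads: int, queued: int, threads: int, boost: int, queue_capacity: int) -> str:
    """Decide what happens to one incoming request (hypothetical helper).

    threads:        number of permanent, pre-started threads
    boost:          number of extra threads allowed once the queue is full
    queue_capacity: maximum number of queued requests (0 disables queuing)
    """
    if busy_threads < threads:
        return "run"                 # an idle permanent thread picks up the request
    if queued < queue_capacity:
        return "queue"               # request waits for a thread, which adds latency
    if busy_threads < threads + boost:
        return "start-extra-thread"  # queue is full and boost allows the pool to grow
    return "reject-503"              # all threads are busy and the queue is full
```

With `threads=4`, `boost=2` and `queue_capacity=10`, a request arriving while four threads are busy and three requests are queued is queued; once ten are queued, an extra thread is started, and once six threads are busy the request is rejected.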
<p>
The effective thread pool configuration and utilization statistics can be observed through the
<a href="../operations/container.html#container-metrics">Container Metrics</a>.
See <a href="../operations/container.html#thread-pool-metrics">Thread Pool Metrics</a> for a list of metrics exported.
</p>
<p>Example configuration override:</p>
{% include note.html content=' If the queue size is set to 0 the metric measuring the queue size -
<code>jdisc.thread_pool.work_queue.size</code> - will instead switch to measure how many threads are active.'%}

<h3 id="container-worker-threads-min">Lower limit</h3>
The container will override any configuration if the effective value is below a fixed minimum. This is to
reduce the risk of certain deadlock scenarios and improve concurrency for low-resource environments.
<ul>
<li>Minimum 8 threads.</li>
<li>Minimum 650 queue capacity (if queue is not disabled).</li>
</ul>
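A minimal sketch of how these lower limits could be applied to an effective configuration. The helper name `apply_lower_limits` is hypothetical, and the sketch assumes the override is a simple clamp to the minimum and that a queue capacity of 0 means the queue is disabled.

```python
from typing import Tuple

MIN_THREADS = 8            # minimum thread count from the list above
MIN_QUEUE_CAPACITY = 650   # minimum queue capacity, not applied when the queue is disabled


def apply_lower_limits(threads: int, queue_capacity: int) -> Tuple[int, int]:
    """Return (threads, queue_capacity) after clamping to the fixed minimums."""
    effective_threads = max(threads, MIN_THREADS)
    if queue_capacity == 0:
        # Assumption: 0 means queuing is disabled, so the minimum does not apply.
        effective_queue = 0
    else:
        effective_queue = max(queue_capacity, MIN_QUEUE_CAPACITY)
    return effective_threads, effective_queue
```

Under these assumptions, an effective configuration of 4 threads and 100 queue slots becomes 8 threads and 650 queue slots, while 32 threads and 800 queue slots are left unchanged.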

<h3 id="container-worker-threads-example">Example</h3>
<pre>{% highlight xml %}
<container id="container" version="1.0">

<search>
<!-- Search handler thread pool -->
<threadpool>
<max-threads>500</max-threads>
<min-threads>500</min-threads>
<queue-size>0</queue-size>
<threads boost="12">4</threads>
<queue>100</queue>
</threadpool>
</search>

<document-api>
<!-- Feed handler thread pool -->
<http-client-api>
<threadpool>
<max-threads>50</max-threads>
<min-threads>10</min-threads>
<queue-size>1000</queue-size>
</threadpool>
</http-client-api>
</document-api>

<!-- Default thread pool -->
<config name="container.handler.threadpool">
<maxthreads>200</maxthreads>
49 changes: 0 additions & 49 deletions en/reference/services-container.html
@@ -50,11 +50,6 @@
<a href="#tracelevel">tracelevel</a>
<a href="#mbusport">mbusport</a>
<a href="#ignore-undefined-fields">ignore-undefined-fields</a>
<a href="#http-client-api">http-client-api</a>
<a href="#http-client-api-threadpool">threadpool</a>
<a href="#http-client-api-threadpool">max-threads</a>
<a href="#http-client-api-threadpool">min-threads</a>
<a href="#http-client-api-threadpool">queue-size</a>
<a href="../stateless-model-evaluation.html">model-evaluation</a>
<a href="../stateless-model-evaluation.html#onnx-inference-options">onnx</a>
<a href="#document">document [type, class, bundle]</a>
@@ -397,13 +392,6 @@ <h2 id="document-api">document-api</h2>
</p>
</td>
</tr>
<tr>
<th>http-client-api</th>
<td>optional</td>
<td></td>
<td></td>
<td><p id="document-api.http-client-api">Configuration for the Vespa HTTP client API</p></td>
</tr>
</tbody>
</table>
<p>Example:</p>
@@ -420,48 +408,11 @@ <h2 id="document-api">document-api</h2>
&lt;route&gt;default&lt;/route&gt;
&lt;timeout&gt;250.5&lt;/timeout&gt;
&lt;tracelevel&gt;3&lt;/tracelevel&gt;
&lt;http-client-api&gt;
&lt;threadpool&gt;
&lt;max-threads&gt;50&lt;/max-threads&gt;
&lt;min-threads&gt;10&lt;/min-threads&gt;
&lt;queue-size&gt;1000&lt;/queue-size&gt;
&lt;/threadpool&gt;
&lt;/http-client-api&gt;
&lt;/document-api&gt;
</pre>



<h2 id="http-client-api">http-client-api</h2>
<pre>
&lt;http-client-api&gt;
&lt;threadpool&gt;
&lt;max-threads&gt;50&lt;/max-threads&gt;
&lt;min-threads&gt;10&lt;/min-threads&gt;
&lt;queue-size&gt;1000&lt;/queue-size&gt;
&lt;/threadpool&gt;
&lt;/http-client-api&gt;
</pre>
<p>Children elements:</p>
<table class="table">
<thead>
<tr><th>Name</th><th>Required</th><th>Value</th><th>Default</th><th>Description</th></tr>
</thead><tbody>
<tr><th>threadpool</th>
<td>optional</td>
<td></td>
<td></td>
<td></td>
<td>
<p id="http-client-api-threadpool">
Contains configuration of the threadpool for the http client api handler.
The pool is initialized with minimum number of threads during startup.
Additional threads will be created on demand once the request queue is full.
Requests are rejected once maximum threads are reached, all threads are busy and the request queue is full.
</p>
</td>
</tr></tbody>
</table>



50 changes: 28 additions & 22 deletions en/reference/services-search.html
@@ -37,9 +37,8 @@
<a href="#node">node</a>
<a href="#renderer">renderer [id, class, bundle]</a>
<a href="#threadpool">threadpool</a>
<a href="#threadpool-max-threads">max-threads</a>
<a href="#threadpool-min-threads">min-threads</a>
<a href="#threadpool-queue-size">queue-size</a>
<a href="#threadpool-threads">threads [ boost ]</a>
<a href="#threadpool-queue">queue</a>
</pre>
<p>
<a href="config-files.html#generic-configuration-in-services-xml">config</a>
@@ -597,29 +596,36 @@ <h2 id="node">node</h2>

<h2 id="threadpool">threadpool</h2>
<p>
Contains configuration of the threadpool for the jdisc search handler.
The pool is initialized with minimum number of threads during startup.
Additional threads will be created on demand once the request queue is full.
Requests are rejected once maximum threads are reached, all threads are busy and the request queue is full.
Specifies configuration for the thread pool for the jdisc search handler. All parameters are relative to the number of CPU cores -
if a node has 8 vCPU with <code>threads=4</code>, the thread pool will have 32 threads.
Same for queue size - if <code>queue=10</code>, the queue will have capacity for 80 entries.
If the <code>boost</code> attribute is specified, additional threads will be created on demand once the request queue is full.
These threads are then destroyed after idling for a fixed amount of time.
Requests are rejected once the maximum number of allowed threads is reached, all threads are busy and the request queue is full.
See <a href="container-tuning.html">Container Tuning</a> for more details.
</p>
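The per-core arithmetic in this paragraph can be worked through with a short sketch. The function `effective_pool` and its dictionary keys are hypothetical names used only for illustration; the sketch assumes integer parameters and that `boost` adds to `threads` before scaling by the core count.

```python
def effective_pool(vcpu: int, threads: int, queue: int, boost: int = 0) -> dict:
    """Scale per-core thread pool parameters by the number of CPU cores."""
    return {
        "permanent_threads": threads * vcpu,      # always running
        "max_threads": (threads + boost) * vcpu,  # permanent plus on-demand threads
        "queue_capacity": queue * vcpu,           # 0 disables queuing
    }


# The example from the text: 8 vCPU, threads=4, queue=10
pool = effective_pool(vcpu=8, threads=4, queue=10)
print(pool["permanent_threads"], pool["queue_capacity"])  # 32 80
```

With `boost=12` on the same node, the sketch yields 32 permanent threads and an upper bound of 128 threads.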



<h2 id="threadpool-max-threads">max-threads</h2>
<p>
Maximum number of threads in pool.
</p>



<h2 id="threadpool-min-threads">min-threads</h2>
<h2 id="threadpool-threads">threads</h2>
<p>
Minimum number of threads in pool.
The number of permanent threads relative to CPU cores. Default value is <code>2</code>.
</p>
<table class="table">
<thead>
<tr><th>Attribute</th><th>Required</th><th>Value</th><th>Default</th><th>Description</th></tr>
</thead><tbody>
<tr><th>boost</th>
<td>optional</td>
<td>number</td>
<td></td>
<td>
<p id="threads.boost">
The number of additional threads relative to CPU cores. Default value is <code>2</code>.
</p>
</td></tr>
</tbody>
</table>



<h2 id="threadpool-queue-size">queue-size</h2>
<h2 id="threadpool-queue">queue</h2>
<p>
Request queue size.
The size of the request queue relative to CPU cores. Specify <code>0</code> to disable queuing. Default value is <code>40</code>.
</p>
