Added draft proposal for WFLY-15659: Transaction SlotStore config #446

Open · wants to merge 30 commits into main

Conversation


@jhalliday commented Nov 16, 2021


=== Dev Contacts

* mailto:{email}[{author}]


You may add me or yourself, or both, here.


=== Nice-to-Have Requirements

Extend the server dependency model to allow use of the Persistent Memory library, mashona, to support SlotStore use on pmem hardware. Optionally, the components consuming it could simply bundle their own copies, trading version flexibility against footprint.


If it is only for transactions and Infinispan, could we initially just add the library as a resource in the module.xml file?

Contributor


It would be really problematic for the WildFly build to deal with two different versions of the same library being provided from the same feature pack, which is what we'd be talking about with WildFly's own use of mashona.

Is it expected that different consumers of mashona, e.g. Narayana and Infinispan, aren't going to be able to align on a consistent mashona version?

If not, the simplest approach is to provide a separate module, consistent with how most artifacts in WildFly are provided.
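
Purely for illustration, a standalone module could look roughly like the sketch below; the module name, namespace version, jar version and the logging dependency are assumptions rather than a settled layout.

```xml
<!-- Hypothetical module.xml for a standalone mashona module. The module name,
     namespace version, jar version and the jboss-logging dependency are
     assumptions, not a settled layout. -->
<module xmlns="urn:jboss:module:1.9" name="io.mashona.logwriting">
    <resources>
        <resource-root path="mashona-logwriting-1.0.0.Final.jar"/>
    </resources>
    <dependencies>
        <module name="org.jboss.logging"/>
    </dependencies>
</module>
```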

Author


The upstreams for Narayana and Infinispan will inevitably go through periods when they diverge somewhat in the version of the mashona library they use, since they don't release, or subsequently get updated in WF, in lockstep. On the other hand, they should generally agree on which version of the mashona library API they use, as with e.g. jboss-logging, so it mostly shouldn't matter if the WildFly pom overrides the minor/patch version that the upstream prefers for the sake of unity.
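
To make that override concrete, here's a rough sketch of the kind of dependencyManagement entry the WildFly pom could use to pin a single mashona version; the coordinates and version shown are assumptions, not the real entries.

```xml
<!-- Hypothetical dependencyManagement entry pinning one mashona version for all
     consumers; groupId, artifactId and version are illustrative only. -->
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>io.mashona</groupId>
            <artifactId>mashona-logwriting</artifactId>
            <version>1.0.0.Final</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```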

* Testing of the pmem options will require appropriate hardware, though this can be simulated by system configuration (similar to a RAM disk)

== Community Documentation
////


We need something in the jbosstm docs and in the WildFly transaction model.

Contributor


@bstansberry
Contributor

@jmesnil FYI


The current options include filesystem-based stores using either a file per transaction or an append-only log (reusing code from HornetQ / ActiveMQ Artemis), or a JDBC database.

Narayana upstream now also offers the SlotStore, a filesystem-based store that employs an efficient memory-mapping approach. Additionally and uniquely, this store can utilise Persistent Memory (pmem) hardware, where available, for very fast transaction logging.
Contributor


Is this expected to be more efficient than the journal store even without pmem hardware?

Author


YMMV.
The journal model works by gathering a number of tx log records into a single disk flush, which is great if you happen to have a lot of concurrent tx going on. With the trend to smaller deployments, containers with only one microservice and such, that's less of an advantage. It comes at the cost of higher-latency transactions, as each must wait to join the next batch. It also forces global ordering on the tx, which is needed if you're a resource manager (databases, message systems) because data updates have to respect causal order, but it's unnecessary overhead for the tx manager, for which the tx aren't ordered.
The SlotStore does one disk flush per tx, which at first glance makes it really inefficient. However, modern SSDs can sustain a much higher flush rate than HDDs could, and indeed benefit from the added concurrency, as they can internally stripe the writes better than an HDD with few heads can. It also means each tx can flush immediately instead of waiting for a batch fill/timer, which can reduce latency. At some point you hit a scale ceiling where batching is still beneficial. On pmem that's crazy high, since a flush is in the same cost ballpark as the thread coordination. On an enterprise SSD not quite so much, but it's at a higher point than many smaller deployments with low tx concurrency ever reach.
To be fair, if its batch interval is tuned to the SSD it's on, the journal can be almost as good even at lower concurrency, since you'll essentially be running batches of size 1, though it still has more thread coordination overhead than the SlotStore. Not that anyone ever tunes it, and the defaults we ship are... somewhat dated compared to modern hardware capabilities. But that's a whole other discussion.
So, not guaranteed to be a win for everyone, but helpful in some use cases.


Extending the server's transaction management model to allow configuration of these options would allow users to access the new functionality of this component.

Although the SlotStore code is part of Narayana, the mashona library used to support use on pmem is independent and may also be utilised by other components requiring similar hardware support, e.g. Infinispan and messaging. For this reason, it may be suited to packaging as a separate module.
Contributor


I talked elsewhere about ways to use the subsystem code to allow such a module to be optionally provisioned. But it seems mashona-logwriting is a 32kb jar so that seems like extreme overkill. :)

Author


Indeed. I think the decision probably revolves more around the build and version flexibility than the footprint.


=== Hard Requirements

Extend the server management model to facilitate configuration of the new SlotStore transaction log type.
Contributor


Some details on what config options will be available would be good.

Author


Config is at two levels: telling the tx engine to use the SlotStore and where to put it on the filesystem, then sizing it. Because it's memory mapped and Java doesn't like unmapping things, you pretty much have to pick the sizing ahead of time. So: number of slots (roughly the number of concurrent tx you expect) and size of each slot (how much information each tx record contains). Both of those are relatively small, such that the best bet may be to just overprovision significantly by default. I'm almost tempted not to expose the sizing params at all (as with many of the 100+ tx config options, they could still be tweaked by system properties, just not through the model), but maybe that's just inviting trouble.
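
Purely as an illustration of those two levels, and not a proposed schema, a subsystem fragment might end up looking something like this; the element name, attribute names, namespace version and the default values are all hypothetical.

```xml
<!-- Hypothetical transactions subsystem fragment; the slot-store element, its
     attributes and the namespace version are illustrative only. -->
<subsystem xmlns="urn:jboss:domain:transactions:6.0">
    <!-- Level 1: select the SlotStore and say where it lives on the filesystem. -->
    <!-- Level 2: sizing, fixed ahead of time because the file is memory mapped:
         number of slots ~ expected concurrent tx, bytes per slot ~ size of a tx record. -->
    <slot-store relative-to="jboss.server.data.dir" path="tx-slot-store"
                number-of-slots="1024" bytes-per-slot="4096"/>
</subsystem>
```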


* Testing of the SlotStore itself can be accomplished by using the same transaction tests that exercise existing store types, but changing the server config to use the new store type.

* Testing of the pmem options will require appropriate hardware, though this can be simulated by system configuration (similar to a RAM disk)
Contributor


How much of this is covered within Narayana testing?

Author


The ObjectStore interface: all of it. The SlotStore implementation of it: no idea. Theoretically all of it, but that would require running all the store tests against each store implementation, which makes for quite a large matrix, and I don't know what the current CI setup does. pmem: none, as the tx CI doesn't have any. I run the mashona tests on real pmem hardware for each release, but don't run the tx test suite, though that should be possible. If the CI for that has to live somewhere, it feels like it's better on the Narayana side, using fake pmem or just the SlotStore on SSD, rather than on the mashona side. The advantage of having real pmem hardware is in accurate perf numbers for e.g. regression testing, not in functional testing.


* Testing of the new server configuration options will require new tests, patterned on those for existing store configurations.

* Testing of the SlotStore itself can be accomplished by using the same transaction tests that exercise existing store types, but changing the server config to use the new store type.
Contributor


Are the relevant tests (somewhat) known? Are they fairly concentrated or are we talking about running significant chunks of the testsuite with an adjusted config?

Author


The WF tests? Anything that hits the tx store, which by default is anything running a tx across two resources, e.g. an MDB writing to a database. Historically I'd have been more worried that the set wasn't big enough, rather than that it was too large to run efficiently, but my knowledge of the app server test coverage is out of date, to put it mildly. Coverage can be supplemented somewhat by running with the 1PC optimization disabled, such that any tx with even a single resource gets logged to the store, but I'd guess that's still not huge. What's the approach for non-default store configs today? Is the full WF suite run for each of the fs store, journal and JDBC store, or does the bulk of that get exercised only upstream in Narayana testing?

=== Testing By
// Put an x in the relevant field to indicate if testing will be done by Engineering or QE.
// Discuss with QE during the Kickoff state to decide this
* [ ] Engineering


I think it will be the case that Engineering tests this, so I guess this should be checked?


/cc @mmusgrov

@jhalliday
Author

Time to pick this up again after my PTO. If there are no more questions, I guess the next step is to redraft the PR with all the additional material from the Q&A here so we can move on to implementation?
