Merge pull request openjdk#14 from bernd-aws/master

AzeemJiva · web-flow · commit a353db0fdf8b · 2020-04-27T16:12:52.000-07:00
More README rephrasing.
diff --git a/README.md b/README.md
@@ -4,13 +4,17 @@
 
 Heapothesys (/hɪˈpɒθɪsɪs/) is a JVM garbage collector benchmark developed by the Amazon Corretto team. 
 
-Announcing the open-source Heapothesys benchmark, a synthetic workload which simulates fundamental application characteristics that affect garbage collector latency. The benchmark creates and tests GC load scenarios defined by object allocation rates, heap occupancy, and JVM flags, then reports the resulting JVM pauses. OpenJDK developers can thus produce reference points to investigate capability boundaries of the technologies they are implementing. Heapothesys does not provide accurate predictions of real application behavior, as there are many unaddressed factors that also affect garbage collection performance and other JVM latencies (e.g., sharing available CPU power with the application, fragmentation effects, more dynamic and varied object demographics). Heapothesys can be regarded as a micro-benchmark that is intentionally not trying to model any application behavior besides creating time pressure for the collector. Otherwise we would measure effects of specific ways in which certain applications behave and others don’t. This constrained behavior is special in its own way, too, but it is near one end of the spectrum and thus it provides a boundary for what to expect. In particular, we can gain a rough idea of how different collector implementations perform when certain basic stress factors are dialed up and their leeway to act shrinks. This can shed some light on garbage collector choices and tuning options for applications with various load projections and latency expectations.
+The Amazon Corretto team introduces the open-source Heapothesys benchmark, a synthetic workload that simulates fundamental application characteristics that affect garbage collector latency. The benchmark creates and tests GC load scenarios defined by object allocation rates, heap occupancy, and JVM flags, then reports the resulting JVM pauses. OpenJDK developers can thus produce reference points to investigate capability boundaries of the technologies they are implementing. Heapothesys is intended to be further enhanced to better model and predict additional application behaviors (e.g., sharing available CPU power with the application, fragmentation effects, more dynamic and varied object demographics, OS scheduling symptoms). The application behavior that it currently simulates is narrowly specialized in its own way, but it is also intentionally minimalistic to provide boundary cases for what to expect. We aim to gain a rough idea of how different collector implementations perform when certain basic stress factors are dialed up and the collector’s leeway to act shrinks. With some cautious optimism, this setup can shed light on garbage collector choices and tuning options for application load projections and latency expectations.
 
-Heapothesys is a starting point to explore GC performance in multiple dimensions. It however, does not attempt to fully model a real application. It focuses on predictively arranging two primary factors that impact collector stress:
+Heapothesys focuses on two primary factors that directly cause collector stress by increasing the urgency with which the collector has to act, and that therefore play an important role when investigating GC behavior:
 
 * The Java heap object allocation rate. 
 * The Java heap occupancy, i.e. the total size of live objects, as determined by complete transitive object graph scanning by the collector. 
 
+To configure Heapothesys to reach a high sustained allocation rate, there are two parameters to vary: the number of worker threads and the object size range. On multi-core hosts, the most important one is the number of worker threads (-t <number of worker threads>, default: 4). The object size range is given by the minimum object size (-n <minimum object size in bytes>, inclusive, default: 128 bytes) and the maximum object size (-x <maximum object size in bytes>, exclusive, default: 1 KB). When creating a new object, Heapothesys picks a random size between these two. The larger the allocated objects, the easier it is to achieve higher allocation rates. With smaller objects, the constructed reference graphs become more complex. This provides limited experimentation with different allocation profiles. By default, Heapothesys makes an educated guess based on the number of available CPU cores and the specified heap size.
+
+As an experimental feature, Heapothesys makes dynamic changes to the created object graph in order to exercise the memory read and write barriers typical of concurrent garbage collectors. Before beginning its main test phase, it stores long-lived objects in a hierarchical list of object groups. In order to exercise garbage collector marking phases, objects in a higher group randomly reference objects in the next lower group to create a somewhat complex and randomized reference graph. This graph does not remain static: Heapothesys constantly replaces a portion of it and reshuffles references between the objects in it. You can control the long-lived object replacement ratio by specifying the -r option (<ratio of objects being replaced per minute>, default: 50). The default value means that 1/50 of objects will be replaced per minute. You can control the reference reshuffle ratio by specifying the -f option (<ratio of objects get reshuffled>, default: 100). The default value means that when replacement happens, 1/100 of inter-object references are reshuffled.
+
 To predict heap occupancy and allocation rates, Heapothesys makes its own calculations based on knowledge of JVM-internal object representations, which depend on the JVM implementation in use. These are currently specific to the HotSpot JVM for JDK 8 or later. The calculations seem to agree with what HotSpot GC logs indicate as long as the following parameter is used correctly. Heapothesys cannot automatically detect when the JVM uses compressed object references, i.e., 32-bit object references in a 64-bit JVM, aka “compressedOops”. You need to set the parameter “-c” to false when running Heapothesys with a 32 GB or larger heap or with a collector that does not support “compressedOops”.
 
 Heapothesys, while written from scratch, inherits its basic ideas from Gil Tene’s [HeapFragger](https://github.com/giltene/HeapFragger) workload. HeapFragger has additional features (e.g., inducing fragmentation and detecting generational promotion), whereas Heapothesys concentrates on accurately predicting the resulting allocation rate. Additionally, we thank Gil for his [jHiccup](https://www.azul.com/jhiccup/) agent, which we utilize to measure JVM pauses.
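
Review note (outside the hunk): the parameter semantics described in the added README paragraphs can be illustrated with a small Java sketch. This is not Heapothesys code. The class name, method names, and the example object and reference counts are hypothetical, and reading the 1 KB default as 1024 bytes is an assumption. The sketch only shows how an inclusive minimum and exclusive maximum size (-n, -x) and the "1/N" ratios (-r, -f) could be interpreted.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch only; none of these names exist in Heapothesys.
class HeapothesysParamSketch {

    // Picks an object size in [minInclusive, maxExclusive), mirroring the
    // README's description of -n (inclusive) and -x (exclusive).
    static int randomObjectSize(int minInclusive, int maxExclusive) {
        return ThreadLocalRandom.current().nextInt(minInclusive, maxExclusive);
    }

    // With -r <ratio>, 1/ratio of the long-lived objects are replaced per minute.
    static long objectsReplacedPerMinute(long longLivedObjects, int replaceRatio) {
        return longLivedObjects / replaceRatio;
    }

    // With -f <ratio>, 1/ratio of inter-object references are reshuffled
    // whenever replacement happens.
    static long referencesReshuffled(long references, int reshuffleRatio) {
        return references / reshuffleRatio;
    }

    public static void main(String[] args) {
        // README defaults: 128 bytes minimum, 1 KB maximum (assumed to be 1024 bytes).
        int size = randomObjectSize(128, 1024);
        System.out.println("sampled object size in bytes: " + size);

        // Hypothetical graph: 1,000,000 long-lived objects, 3,000,000 references.
        System.out.println(objectsReplacedPerMinute(1_000_000L, 50)); // prints 20000
        System.out.println(referencesReshuffled(3_000_000L, 100));    // prints 30000
    }
}
```

Under these assumptions, raising -r or -f makes the graph more static, because a larger ratio value means a smaller fraction of objects or references is touched.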