POC: test non-boxed, primitive collection storage with reflection #1484

l0rinc · 2016-08-12T11:13:15Z

I created a POC (not to be merged), demonstrating non-boxed primitive storage for Vector.

That is, if a Vector is created with a primitive wrapper type (e.g. Vector<Integer>), the internal array will be of the corresponding primitive type (e.g. int[]) instead of the boxed type (e.g. Integer[]).
All operations on the Vector's internal array are implemented using reflection or via specialized checks.
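
A minimal sketch of the reflective approach described above, assuming helper names (`ReflectiveStorage`, `getAt`, `setAt`) that are illustrative and not taken from the POC; only `java.lang.reflect.Array` is a real JDK API here:

```java
import java.lang.reflect.Array;

/** Hypothetical helper, not the POC's code: keeps elements in a primitive array via reflection. */
final class ReflectiveStorage {
    /** Creates e.g. an int[] when componentType is int.class. */
    static Object newInstance(Class<?> componentType, int size) {
        return Array.newInstance(componentType, size);
    }

    /** Reads an element; primitive values are boxed on the way out. */
    static Object getAt(Object array, int index) {
        return Array.get(array, index);
    }

    /** Writes an element; a boxed value is unwrapped into the primitive slot. */
    static void setAt(Object array, int index, Object value) {
        Array.set(array, index, value);
    }

    public static void main(String[] args) {
        Object storage = newInstance(int.class, 4);
        setAt(storage, 2, 42);
        System.out.println(storage.getClass().getSimpleName()); // int[]
        System.out.println(getAt(storage, 2));                  // 42
    }
}
```

Every access goes through `Array.get`/`Array.set`, which is the likely source of the slowdown measured below.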

Memory usage with unboxed primitive storage is very promising, i.e. it uses up to ~4x less memory than ArrayList or the boxed Vector:

for 1024 elements
`java.util.ArrayList` uses `18.6 KB` (`0 bytes` overhead, `0.0` bytes overhead per element)
`javaslang.collection.BoxedVector` uses `19.3 KB` (`664 bytes` overhead, `0.6` bytes overhead per element)
`javaslang.collection.Vector` uses `4.7 KB` (`-14240 bytes` overhead, `-13.9` bytes overhead per element)

Boxed, no reflection (original implementation):

Operation   Impl                Params            Score  ±  Error  Unit
Create      slang_persistent      1024      550,181.653  ±  6.17% ops/s
Head        slang_persistent      1024  273,522,239.620  ± 15.05% ops/s
Get         slang_persistent      1024      253,750.035  ±  1.97% ops/s

Boxed, with reflection - without primitive conversion:

Operation   Impl                Params            Score  ± Error   Unit
Create      slang_persistent      1024      235,575.521  ± 2.90%  ops/s
Head        slang_persistent      1024   21,594,009.129  ± 0.77%  ops/s
Get         slang_persistent      1024       11,254.908  ± 1.70%  ops/s

Primitive, with reflection - note: uses ~4x less memory:

Operation   Impl                Params            Score  ± Error   Unit
Create      slang_persistent      1024       23,179.890  ± 1.36%  ops/s
Head        slang_persistent      1024   14,401,390.967  ± 2.22%  ops/s
Get         slang_persistent      1024        8,707.254  ± 1.84%  ops/s

i.e. reflection makes it roughly 2-30 times slower than the original at 1024 elements.

However, simple specialization (e.g. if Vector were an interface with e.g. IntVector/CharVector children) reveals (note: uses ~4x less memory):

Operation Impl                  Params            Score  ± Error   Unit
Create    slang_persistent        1024      279,322.949  ± 3.44%  ops/s
Create    slang_persistent_int    1024      655,092.717  ± 1.46%  ops/s

Head      slang_persistent        1024  168,508,482.722  ± 0.83%  ops/s
Head      slang_persistent_int    1024  269,753,246.425  ± 0.91%  ops/s

Get       slang_persistent        1024       78,920.869  ± 0.87%  ops/s
Get       slang_persistent_int    1024      252,564.494  ± 2.06%  ops/s

With a boxed end result (i.e. a get that returns an Integer), this reaches only ~30-60% of the original speed.
If we provide specialized methods (which could be accessed via casting to IntVector), it has the same speed as the original while using a lot less memory.

EDIT:
specialized for most primitives and the numbers are pretty good! (26cbb13#diff-c6c1876105d111251f38e0916d4d99a4R75)
Also, I found a way of avoiding boxing at the end (making the whole solution viable), take a look :)

Operation  Impl                 Params            Score  ± Error   Unit
Create     slang_persistent       1024      353,200.887  ± 7.15%  ops/s
Create     slang_persistent_int   1024      711,118.523  ± 1.79%  ops/s

Head       slang_persistent       1024  144,142,491.461  ± 1.36%  ops/s
Head       slang_persistent_int   1024  207,905,574.556  ± 0.95%  ops/s

Get        slang_persistent       1024      106,121.968  ± 1.50%  ops/s
Get        slang_persistent_int   1024      214,445.932  ± 1.49%  ops/s

Iterate    slang_persistent       1024      116,656.843  ± 1.49%  ops/s
Iterate    slang_persistent_int   1024    2,163,161.600  ± 1.42%  ops/s

l0rinc · 2016-08-12T11:16:01Z

related to #1449

l0rinc · 2016-08-12T11:27:58Z

javaslang/src/main/java/javaslang/collection/Arrays2.java

+    private static final Class<?>[] WRAPPERS = {Boolean.class, Byte.class, Character.class, Double.class, Float.class, Integer.class, Long.class, Short.class, Void.class};
+    private static final Class<?>[] PRIMITIVES = {boolean.class, byte.class, char.class, double.class, float.class, int.class, long.class, short.class, void.class};
+
+    public static Class<?> toPrimitive(Class<?> wrapper) {


change this to

public static Class<?> toPrimitive(Class<?> wrapper) { return wrapper; }

to disable primitive conversion
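
For context, a sketch of how the lookup could work with the two arrays from the diff. The method body is an assumption (the diff above only shows the signature), and the class name `WrapperMapping` is hypothetical:

```java
/** Hypothetical sketch: maps a wrapper class to its primitive class by position in two parallel arrays. */
final class WrapperMapping {
    private static final Class<?>[] WRAPPERS = {Boolean.class, Byte.class, Character.class, Double.class, Float.class, Integer.class, Long.class, Short.class, Void.class};
    private static final Class<?>[] PRIMITIVES = {boolean.class, byte.class, char.class, double.class, float.class, int.class, long.class, short.class, void.class};

    /** Returns the primitive counterpart of a wrapper, or the argument itself for any other type. */
    static Class<?> toPrimitive(Class<?> wrapper) {
        for (int i = 0; i < WRAPPERS.length; i++) {
            if (WRAPPERS[i] == wrapper) {
                return PRIMITIVES[i];
            }
        }
        return wrapper; // not a wrapper type: storage stays an object array
    }

    public static void main(String[] args) {
        System.out.println(toPrimitive(Integer.class)); // int
        System.out.println(toPrimitive(String.class));  // class java.lang.String
    }
}
```

Returning the argument unchanged for every type is exactly what the one-liner above does, which is why it switches primitive conversion off.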

l0rinc · 2016-08-12T14:11:03Z

@danieldietrich, @zsolt-donca, @eduardmanas, @ruslansennov, @gpanther, opinions?

l0rinc · 2016-08-12T15:18:07Z

javaslang/src/main/java/javaslang/collection/Arrays2.java

+        return newTrailing;
+    }
+
+    static Object newInstance(Class<?> type, int size) {


separated all the reflective calls

danieldietrich · 2016-08-13T12:42:50Z

Hi Lorinc,

thank you for the investigation and the resulting insights. Javaslang will not provide specializations for generic types (at least in the core module). My vision is to leverage http://openjdk.java.net/jeps/218 to get around the current restrictions of generics.

We could think about an additional module (javaslang-specialized?) that contains special versions of our collections but I think it is not worth the effort - it will lead to a maintenance hell because all changes to the core have to be duped to that module. Who will maintain it over time?

From the very beginning of Javaslang I decided to live with the shortcomings of the language (null, Objects vs. primitives, ...). That's the reason there are no specializations.

DD

l0rinc · 2016-08-13T15:14:30Z

javaslang-benchmark/src/test/java/javaslang/collection/VectorBenchmark.java

@@ -319,363 +111,34 @@ public int slang_persistent() {
        public int slang_persistent_int() {
            int aggregate = 0;
            for (int i : RANDOMIZED_INDICES) {
-                aggregate ^= slangPersistent.getInt(i);
+                int[] leafUnsafe = (int[]) slangPersistent.getLeafUnsafe(i);


getting the array that contains the given index enables the user to avoid boxing :)
package private for now, could be used internally in CharSeq or BitSet
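
A rough sketch of the idea, assuming a flat list of 32-element leaves instead of the real Vector trie; `ChunkedInts` and its fields are hypothetical, only the `getLeafUnsafe` name comes from the diff:

```java
import java.util.Arrays;

/** Hypothetical simplification: one level of int[] leaves, 32 elements each. */
final class ChunkedInts {
    private static final int BITS = 5;               // assumed leaf size of 2^5 = 32
    private static final int MASK = (1 << BITS) - 1; // selects the position inside a leaf
    private final Object[] leaves;                   // every entry is an int[]

    ChunkedInts(int[] values) {
        leaves = new Object[(values.length + MASK) >>> BITS];
        for (int i = 0; i < leaves.length; i++) {
            int from = i << BITS;
            int to = Math.min(from + MASK + 1, values.length);
            leaves[i] = Arrays.copyOfRange(values, from, to);
        }
    }

    /** Generic-style access: the int is boxed on return. */
    Integer get(int index) {
        return ((int[]) leaves[index >>> BITS])[index & MASK];
    }

    /** Returns the leaf holding the index; "unsafe" because the caller could mutate it. */
    Object getLeafUnsafe(int index) {
        return leaves[index >>> BITS];
    }

    public static void main(String[] args) {
        int[] values = new int[100];
        for (int i = 0; i < values.length; i++) values[i] = i * i;
        ChunkedInts vector = new ChunkedInts(values);

        int aggregate = 0;
        for (int i = 0; i < values.length; i++) {
            int[] leaf = (int[]) vector.getLeafUnsafe(i); // one cast, no boxing
            aggregate ^= leaf[i & 31];
        }
        System.out.println(aggregate);
    }
}
```

The caller pays one cast per access and never sees an `Integer`, at the price of being handed the internal array.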

The memory storage for unboxed primitive storage is very promising, i.e.:

for 32768 elements
`java.util.ArrayList` uses `638.4 KB` (`0 bytes` overhead, `0.0` bytes overhead per element)
`javaslang.collection.Vector` uses `148.7 KB` (`-501392 bytes` overhead, `-15.3` bytes overhead per element)

for 1024 elements
`java.util.ArrayList` uses `18.6 KB` (`0 bytes` overhead, `0.0` bytes overhead per element)
`javaslang.collection.Vector` uses `4.7 KB` (`-14240 bytes` overhead, `-13.9` bytes overhead per element)

for 32 elements
`java.util.ArrayList` uses `504 bytes` (`0 bytes` overhead, `0.0` bytes overhead per element)
`javaslang.collection.Vector` uses `240 bytes` (`-264 bytes` overhead, `-8.2` bytes overhead per element)

But the cost of reflection makes it 10-100x slower :(

Boxed, no reflection:

Target           Operation  Impl              Params  Count            Score  ±  Error  Unit   slang_persistent
VectorBenchmark  Create     slang_persistent      32      6   45,151,443.097  ± 16.79%  ops/s
VectorBenchmark  Create     slang_persistent    1024      6      550,181.653  ±  6.17%  ops/s
VectorBenchmark  Create     slang_persistent   32768      6       15,204.714  ± 14.61%  ops/s
VectorBenchmark  Head       slang_persistent      32      6  281,538,398.064  ±  0.90%  ops/s
VectorBenchmark  Head       slang_persistent    1024      6  273,522,239.620  ± 15.05%  ops/s
VectorBenchmark  Head       slang_persistent   32768      6  281,954,132.384  ±  1.19%  ops/s
VectorBenchmark  Get        slang_persistent      32      6   17,018,951.664  ±  8.07%  ops/s
VectorBenchmark  Get        slang_persistent    1024      6      253,750.035  ±  1.97%  ops/s
VectorBenchmark  Get        slang_persistent   32768      6        2,845.576  ±  2.53%  ops/s

Boxed with reflection:

Target           Operation  Impl              Params  Count            Score  ±  Error  Unit   slang_persistent
VectorBenchmark  Create     slang_persistent      32      6   14,356,024.977  ±  1.47%  ops/s
VectorBenchmark  Create     slang_persistent    1024      6      236,487.447  ±  3.82%  ops/s
VectorBenchmark  Create     slang_persistent   32768      6        7,086.487  ±  0.87%  ops/s
VectorBenchmark  Head       slang_persistent      32      6   22,301,704.436  ±  0.85%  ops/s
VectorBenchmark  Head       slang_persistent    1024      6   21,896,434.896  ±  1.25%  ops/s
VectorBenchmark  Head       slang_persistent   32768      6   21,572,719.717  ±  0.29%  ops/s
VectorBenchmark  Get        slang_persistent      32      6      691,636.638  ±  1.36%  ops/s
VectorBenchmark  Get        slang_persistent    1024      6       11,311.820  ±  0.85%  ops/s
VectorBenchmark  Get        slang_persistent   32768      6          225.748  ±  1.31%  ops/s

Primitive with reflection:

Target           Operation  Impl              Params  Count            Score  ±  Error  Unit   slang_persistent
VectorBenchmark  Create     slang_persistent      32      6      690,013.657  ±  1.16%  ops/s
VectorBenchmark  Create     slang_persistent    1024      6       22,438.292  ±  2.39%  ops/s
VectorBenchmark  Create     slang_persistent   32768      6          699.917  ±  1.76%  ops/s
VectorBenchmark  Head       slang_persistent      32      6   14,328,490.550  ±  1.44%  ops/s
VectorBenchmark  Head       slang_persistent    1024      6   14,192,942.897  ±  2.13%  ops/s
VectorBenchmark  Head       slang_persistent   32768      6   12,364,363.839  ±  1.96%  ops/s
VectorBenchmark  Get        slang_persistent      32      6      451,149.351  ±  1.98%  ops/s
VectorBenchmark  Get        slang_persistent    1024      6        8,785.965  ±  0.83%  ops/s
VectorBenchmark  Get        slang_persistent   32768      6          188.556  ±  1.81%  ops/s


l0rinc · 2016-08-13T15:27:28Z

javaslang-benchmark/src/test/java/javaslang/collection/VectorBenchmark.java

@@ -89,7 +89,7 @@ public Object slang_persistent() {

        @Benchmark
        public int slang_persistent_int() {
-            final int head = slangPersistent.intHead();
+            final int head = ((int[]) slangPersistent.getLeafUnsafe(0))[0];


non-boxed head access

l0rinc · 2016-08-13T15:40:15Z

@danieldietrich, it seems I managed to work around the problem that specialization can have, i.e. I don't have to repeat any algorithms: the type-specific parts are contained in general getter/setter/newInstance/getLength methods, as seen in the code (no need for specialized subclasses).

Also, not everybody can wait another 4 years for automatic specialization to appear, especially since this POC proved to be a viable alternative :).
Will try to apply it to the whole Vector impl to see what benefits/drawbacks it has.
Please bear with me :).

l0rinc · 2016-08-14T17:56:53Z

My first full primitivization attempt for Vector: 212242a#diff-5f4a536088f7f19a51ebbef20cf7b3ef

The code looks basically the same, the only significant change in the algorithm is the way we access the array, i.e. instead of a[b] = c we have setAt(a, b, c).
Will investigate the reason why append and prepend are slower now, but otherwise the primitive internals are a lot faster for many methods (e.g. create and iterate are 3-5x faster, using 4x less total memory) :)
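
To illustrate the a[b] = c versus setAt(a, b, c) difference, here is a sketch of type-dispatching accessors that avoid reflection. The class name `ArrayAccess` and the exact set of handled types are assumptions; the linked commit may do this differently:

```java
/** Hypothetical accessors: one instanceof check per supported primitive array type. */
final class ArrayAccess {
    /** Reads a[index]; only int, long and double are specialized here for brevity. */
    static Object getAt(Object array, int index) {
        if (array instanceof int[]) return ((int[]) array)[index];
        if (array instanceof long[]) return ((long[]) array)[index];
        if (array instanceof double[]) return ((double[]) array)[index];
        return ((Object[]) array)[index];
    }

    /** Writes a[index] = value, unboxing when the storage is primitive. */
    static void setAt(Object array, int index, Object value) {
        if (array instanceof int[]) ((int[]) array)[index] = (Integer) value;
        else if (array instanceof long[]) ((long[]) array)[index] = (Long) value;
        else if (array instanceof double[]) ((double[]) array)[index] = (Double) value;
        else ((Object[]) array)[index] = value;
    }

    /** Returns the array length without going through reflection. */
    static int getLength(Object array) {
        if (array instanceof int[]) return ((int[]) array).length;
        if (array instanceof long[]) return ((long[]) array).length;
        if (array instanceof double[]) return ((double[]) array).length;
        return ((Object[]) array).length;
    }

    public static void main(String[] args) {
        Object ints = new int[3];
        setAt(ints, 1, 7);                   // instead of ints[1] = 7
        System.out.println(getAt(ints, 1));  // 7
        System.out.println(getLength(ints)); // 3
    }
}
```

The algorithm code only ever calls these three methods, so it stays the same for boxed and primitive storage.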

@danieldietrich, @ruslansennov, opinions :)?

danieldietrich added the in progress label Aug 12, 2016

l0rinc mentioned this pull request Aug 12, 2016

Based CharSeq on Vector instead of String #1480

Closed

l0rinc changed the title ~~POC: test primitive storage with reflection~~ POC: test non-boxed, primitive collection storage with reflection Aug 12, 2016

l0rinc reviewed Aug 12, 2016

l0rinc mentioned this pull request Aug 12, 2016

Investigate BitSet optimizations #1486

Closed

l0rinc reviewed Aug 12, 2016

l0rinc reviewed Aug 13, 2016

Pap Lőrinc added 4 commits August 13, 2016 18:18

Unrelated refactorings to simplify the review

12459e6

Created specialized int alternative instead of reflection

d3e61f4

l0rinc reviewed Aug 13, 2016

l0rinc closed this Aug 14, 2016

danieldietrich removed the in progress label Aug 14, 2016

l0rinc deleted the specialized branch August 15, 2016 08:55

danieldietrich added the feature label Aug 18, 2016

danieldietrich added this to the 2.1.0 milestone Aug 18, 2016

danieldietrich added the reverted/closed/duplicate label Aug 18, 2016

danieldietrich added the «vavr-collection» label Nov 14, 2016

danieldietrich removed this from the vavr-0.9.0 milestone Oct 23, 2017
