Optimize StringBuilder/StringBuffer serialization #908
Hi @pandalee99, thanks for taking the time to contribute to fury. I left some comments, could you please take a look at them?
StringBuilder doesn't provide a zero-copy constructor, so can we optimize only write in this PR to reduce complexity? For example, we can keep the protocol consistent with StringSerializer by:

    static Tuple2<ToByteFunction, Function> builderCache;

    private static synchronized Tuple2<ToByteFunction, Function> getBuilderFunc() {
      if (builderCache == null) {
        Function getValue = (Function) Functions.makeGetterFunction(
            StringBuilder.class.getSuperclass(), "getValue");
        if (Platform.JAVA_VERSION > 8) {
          ToByteFunction getCoder = (ToByteFunction) Functions.makeGetterFunction(
              StringBuilder.class.getSuperclass(), "getCoder");
          builderCache = Tuple2.of(getCoder, getValue);
        } else {
          builderCache = Tuple2.of(null, getValue);
        }
      }
      return builderCache;
    }

    public static abstract class AbstractStringBuilderSerializer<T> extends Serializer<T> {
      protected final ToByteFunction getCoder;
      protected final Function getValue;
      protected final StringSerializer stringSerializer;

      public AbstractStringBuilderSerializer(Fury fury, Class type) {
        super(fury, type);
        Tuple2<ToByteFunction, Function> builderFunc = getBuilderFunc();
        getCoder = builderFunc.f0;
        getValue = builderFunc.f1;
        stringSerializer = fury.getStringSerializer();
      }

      @Override
      public void write(MemoryBuffer buffer, T value) {
        if (Platform.JAVA_VERSION > 8) {
          byte coder = getCoder.applyAsByte(value);
          byte[] v = (byte[]) getValue.apply(value);
          buffer.writeByte(coder);
          buffer.writeBytesWithSizeEmbedded(v);
        } else {
          char[] v = (char[]) getValue.apply(value);
          if (StringSerializer.isAscii(v)) {
            stringSerializer.writeJDK8Ascii(buffer, v, v.length);
          } else {
            stringSerializer.writeJDK8UTF16(buffer, v, v.length);
          }
        }
      }
    }

    public static final class StringBuilderSerializer extends AbstractStringBuilderSerializer<StringBuilder> {
      public StringBuilderSerializer(Fury fury) {
        super(fury, StringBuilder.class);
      }

      @Override
      public StringBuilder read(MemoryBuffer buffer) {
        return new StringBuilder(stringSerializer.readJavaString(buffer));
      }
    }

Same thing can be done for
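To make the "coder byte, then payload bytes" layout of the Java 9+ branch concrete, here is a minimal sketch that uses only public JDK API. It is not Fury's code: `CoderSketch`, `coderOf`, `encode`, and `decode` are hypothetical names, and the one-byte coder followed by a 4-byte payload size is an assumed layout chosen for illustration. The coder values follow the JDK compact-strings convention (0 for Latin-1, 1 for UTF-16).

```java
import java.nio.ByteBuffer;

public class CoderSketch {
  static final byte LATIN1 = 0; // every char fits in one byte
  static final byte UTF16 = 1;  // at least one char needs two bytes

  // Pick the narrowest coder that can represent every char.
  static byte coderOf(CharSequence cs) {
    for (int i = 0; i < cs.length(); i++) {
      if (cs.charAt(i) > 0xFF) {
        return UTF16;
      }
    }
    return LATIN1;
  }

  // Assumed layout: [coder: 1 byte][payload size: 4 bytes][payload].
  static byte[] encode(CharSequence cs) {
    byte coder = coderOf(cs);
    int len = cs.length();
    int payload = coder == LATIN1 ? len : len * 2;
    ByteBuffer out = ByteBuffer.allocate(1 + 4 + payload);
    out.put(coder).putInt(payload);
    for (int i = 0; i < len; i++) {
      char c = cs.charAt(i);
      if (coder == LATIN1) {
        out.put((byte) c);
      } else {
        out.putChar(c);
      }
    }
    return out.array();
  }

  // Rebuild a StringBuilder sized exactly to the decoded length.
  static StringBuilder decode(byte[] data) {
    ByteBuffer in = ByteBuffer.wrap(data);
    byte coder = in.get();
    int payload = in.getInt();
    int len = coder == LATIN1 ? payload : payload / 2;
    StringBuilder sb = new StringBuilder(len);
    for (int i = 0; i < len; i++) {
      sb.append(coder == LATIN1 ? (char) (in.get() & 0xFF) : in.getChar());
    }
    return sb;
  }
}
```

Note that the sketch writes only `length()` characters, never the builder's spare capacity.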
Thank you, but I found a problem: if I get the char[] this way, the final result is wrong. The length of the char[] is 19 and the content is "strnullnullnullnull...". This could be a JDK bug.
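One possible explanation for the 19-element array, offered as a hypothesis rather than a confirmed diagnosis: `new StringBuilder("str")` is documented to start with a capacity of 16 plus the argument's length, so its capacity is 19 while its length is 3. If the reflective getter returns the whole backing array, the 16 unused slots would hold `'\u0000'`. The sketch below only uses public API and simulates the backing array with `Arrays.copyOf`; `BuilderCapacityDemo` and its methods are hypothetical names.

```java
import java.util.Arrays;

public class BuilderCapacityDemo {
  // Capacity of a builder created from a string: documented as 16 + length.
  static int capacityOf(String s) {
    return new StringBuilder(s).capacity();
  }

  // Simulated backing array: the used chars followed by zero-filled spare slots.
  static char[] simulatedBackingArray(String s) {
    return Arrays.copyOf(s.toCharArray(), capacityOf(s));
  }

  public static void main(String[] args) {
    StringBuilder sb = new StringBuilder("str");
    char[] backing = simulatedBackingArray("str");

    System.out.println(sb.length());    // 3
    System.out.println(sb.capacity());  // 19
    System.out.println(backing.length); // 19

    // Using the whole array keeps the 16 trailing '\u0000' chars ...
    System.out.println(new String(backing).length());                 // 19
    // ... while bounding by length() recovers the real content.
    System.out.println(new String(backing, 0, sb.length()));          // str
  }
}
```

Under this reading, a serializer would need to pass the builder's `length()` to the write call instead of the array's length.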
@pandalee99 Can we add a new

    public static Object makeGetterFunction(Method method) {
      return makeGetterFunction(method, method.getReturnType());
    }

    public static Object makeGetterFunction(Method method, Class<?> returnType) {
      MethodHandles.Lookup lookup = _JDKAccess._trustedLookup(method.getDeclaringClass());
      try {
        // Why `lookup.findGetter` doesn't work?
        // MethodHandle handle = lookup.findGetter(field.getDeclaringClass(), field.getName(),
        // field.getType());
        MethodHandle handle = lookup.unreflect(method);
        return _JDKAccess.makeGetterFunction(lookup, handle, returnType);
      } catch (IllegalAccessException ex) {
        throw new RuntimeException(ex);
      }
    }

then creating accessor by:

    Method getCoder = StringBuilder.class.getSuperclass().getDeclaredMethod("getCoder");
    ToIntFunction<CharSequence> o = (ToIntFunction<CharSequence>) makeGetterFunction(lookup, lookup.unreflect(getCoder), int.class);
    System.out.println(o);
    System.out.println(o.applyAsInt(new StringBuilder("abc")));
    System.out.println(o.applyAsInt(new StringBuilder("abc你好")));
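`_JDKAccess.makeGetterFunction` is internal to Fury and its body is not shown in this thread. As a rough illustration of how a `Method` can be turned into a `ToIntFunction` without reflective calls on the hot path, the sketch below uses `LambdaMetafactory` from the JDK. This mechanism is an assumption, not a description of Fury's implementation. It targets the public `CharSequence.length()` instead of `getCoder`, because a non-public method would need a trusted or private lookup that plain `MethodHandles.lookup()` does not provide. `GetterFunctionSketch`, `makeIntGetter`, and `lengthGetter` are hypothetical names.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.util.function.ToIntFunction;

public class GetterFunctionSketch {
  // Wrap an accessible, int-returning instance method as a ToIntFunction.
  @SuppressWarnings("unchecked")
  static ToIntFunction<CharSequence> makeIntGetter(Method method) {
    MethodHandles.Lookup lookup = MethodHandles.lookup();
    try {
      MethodHandle handle = lookup.unreflect(method); // direct handle: (receiver) -> int
      CallSite site = LambdaMetafactory.metafactory(
          lookup,
          "applyAsInt",                                   // method of ToIntFunction
          MethodType.methodType(ToIntFunction.class),     // factory takes no captured args
          MethodType.methodType(int.class, Object.class), // erased applyAsInt signature
          handle,
          handle.type());                                 // specialized signature
      return (ToIntFunction<CharSequence>) site.getTarget().invokeExact();
    } catch (Throwable t) {
      throw new RuntimeException(t);
    }
  }

  // Convenience wrapper around the public CharSequence.length() method.
  static ToIntFunction<CharSequence> lengthGetter() {
    try {
      return makeIntGetter(CharSequence.class.getMethod("length"));
    } catch (NoSuchMethodException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    ToIntFunction<CharSequence> o = lengthGetter();
    System.out.println(o.applyAsInt(new StringBuilder("abc")));     // 3
    System.out.println(o.applyAsInt(new StringBuilder("abc你好"))); // 5
  }
}
```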
Thanks.
commit Co-authored-by: Shawn <shawn.ck.yang@gmail.com>
So good!
LGTM, thank you so much for your contribution. Such an optimization is not easy work.
* Optimize StringBuilder/StringBuffer serialization
* try to optimize StringBuilder
* first to Check code Style
* hidden
* hidden
* bug fix and check code style
* delete excess code and add buffers to try testing
* fix
* try to fix problem
* fix function
* code fix
* code fix again
* Update java/fury-core/src/main/java/io/fury/serializer/Serializers.java commit Co-authored-by: Shawn <shawn.ck.yang@gmail.com>
* Update java/fury-core/src/main/java/io/fury/serializer/Serializers.java commit Co-authored-by: Shawn <shawn.ck.yang@gmail.com>
---------
Co-authored-by: pankoli <pankoli@tencent.com>
Co-authored-by: Shawn <shawn.ck.yang@gmail.com>
…923)
* add codegen invocation annotation
* optimize collection serialization protocol by homogeneous info
* implement interpreter optimized collection read/write
* refine jit if/comparator exprs
* implement jit collection optimization
* add tests
* update depth uo make generics push work
* fix collection opt jit
* add collection nested opt tests
* write decl class for meta share
* use walkpath to reuse classinfo/holder
* fix get classinfo
* inline classinfo to get smaller code size
* split methods into small methods
* add non final object type tests
* misc fix
* add missing header
* fix class resolver test
* fix jit method split
* update classinfo only for not decl type
* Fix method split for collection jit
* add map with set elements test
* Optimize StringBuilder/StringBuffer serialization (#908)
* Optimize StringBuilder/StringBuffer serialization
* try to optimize StringBuilder
* first to Check code Style
* hidden
* hidden
* bug fix and check code style
* delete excess code and add buffers to try testing
* fix
* try to fix problem
* fix function
* code fix
* code fix again
* Update java/fury-core/src/main/java/io/fury/serializer/Serializers.java commit Co-authored-by: Shawn <shawn.ck.yang@gmail.com>
* Update java/fury-core/src/main/java/io/fury/serializer/Serializers.java commit Co-authored-by: Shawn <shawn.ck.yang@gmail.com>
---------
Co-authored-by: pankoli <pankoli@tencent.com>
Co-authored-by: Shawn <shawn.ck.yang@gmail.com>
* Bump release versin to 0.1.2 (#924)
* [Doc] add basic type java format doc (#928) add basic type java format doc
* [Java] speed test codegen speed by avoid duplicate codegen (#929)
* speed test codegen speed by avoid duplicate codegen
* fix cache
* fix cllass gc
* use a standalone lock for every key
* refine gc trigger
* skip cache for furyGC tests
* fix gc tests
* lint code
* add collection serialization java design doc
* update doc
* update doc
* debug ci
* Workaround G1ParScanThreadState::copy_to_survivor_space crash
* add iterate array bench results
* add benchmark suite
* fix jvm g1 workaround
* add CollectionSuite header
* fix crash
* skip unnecessary compress number
---------
Co-authored-by: PAN <46820719+pandalee99@users.noreply.github.com>
Co-authored-by: pankoli <pankoli@tencent.com>
What do these changes do?
To optimize the code, I avoid the unnecessary string conversion and operate directly on the internal character array and length of the StringBuffer. This involves modifying the write() and read() methods to work directly with the StringBuffer's internal data structure.
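The idea of skipping the intermediate `String` can be approximated with public API alone, as in the minimal sketch below. It does not touch the internal array the way the PR does: it bulk-copies with `StringBuffer.getChars`, and the length-prefixed UTF-16 layout is an assumption made for illustration, not Fury's wire format. `StringBufferCopySketch`, `write`, and `read` are hypothetical names.

```java
import java.nio.ByteBuffer;

public class StringBufferCopySketch {
  // Serialize without calling toString(): copy exactly length() chars.
  static byte[] write(StringBuffer value) {
    int len = value.length();
    char[] chars = new char[len];
    value.getChars(0, len, chars, 0);          // bulk copy, no String allocated
    ByteBuffer out = ByteBuffer.allocate(4 + len * 2);
    out.putInt(len);                           // assumed 4-byte length prefix
    out.asCharBuffer().put(chars);             // view starts after the prefix
    return out.array();
  }

  // Deserialize straight into a StringBuffer sized to the decoded length.
  static StringBuffer read(byte[] data) {
    ByteBuffer in = ByteBuffer.wrap(data);
    int len = in.getInt();
    char[] chars = new char[len];
    in.asCharBuffer().get(chars);
    return new StringBuffer(len).append(chars);
  }
}
```

Calling `read(write(buf))` should yield a buffer with the same content as `buf`, and the copy is bounded by `length()` so no spare capacity is written.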
Related issue number
Closes #94
Check code requirements