Generate Hibernate ORM InstantiationOptimizers to avoid reflection #43767
Thanks, I added a comment below.
Though I would still love @franz1981 to explain why bytecode generation is to be preferred to reflection: I understand why java.lang.reflect is slow, but I'd expect java.lang.invoke.MethodHandle, at least, to be just as effective/flexible as custom generated code like this, especially since the generated code will likely lead to megamorphic calls. Otherwise I'm probably missing some constraints that make it impossible for method handles to perform adequately.
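For readers comparing the two approaches mentioned above, here is a minimal sketch of instantiating a class through java.lang.reflect versus through a java.lang.invoke.MethodHandle. `DemoEntity` and `InstantiationStrategies` are hypothetical names used only for illustration; nothing here is taken from the PR or from Hibernate ORM, and it says nothing about which option is faster.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Constructor;

// Hypothetical entity, used only for illustration.
class DemoEntity {
    DemoEntity() {
    }
}

class InstantiationStrategies {

    // Reflection: look up the no-arg constructor and call Constructor#newInstance.
    static Object viaReflection(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type, e);
        }
    }

    // Method handle: resolve the no-arg constructor once; the handle can then be cached and reused.
    static MethodHandle constructorHandle(Class<?> type) {
        try {
            MethodHandle handle = MethodHandles.lookup()
                    .findConstructor(type, MethodType.methodType(void.class));
            // Adapt the return type to Object so callers do not need the concrete type.
            return handle.asType(MethodType.methodType(Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot resolve constructor of " + type, e);
        }
    }

    // Invoke a previously resolved constructor handle.
    static Object viaMethodHandle(MethodHandle handle) {
        try {
            return handle.invoke();
        } catch (Throwable t) {
            throw new IllegalStateException("Constructor invocation failed", t);
        }
    }
}
```

In a real integration the handle would be resolved once per entity class and stored; whether the JIT can then optimize the call as well as generated bytecode is exactly the open question raised in the comment.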
.../java/io/quarkus/hibernate/orm/runtime/service/bytecodeprovider/RuntimeBytecodeProvider.java
1fe3bdc
to
6f715dc
Compare
I have added some comments to the original issue exactly on the generation at startup part (which probably is not what we want).
Given the granularity of the existing optimized accessor API (which works with sets of properties, in batch), the generated code can be branchy, at worst, or very well optimized when the requested fields are a known set, which allows inlining and saves some branches. |
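To make the "branchy versus known set" distinction concrete, here is a rough sketch. `Person`, `BatchAccessor`, `GenericPersonAccessor` and `FixedPersonAccessor` are all hypothetical; `BatchAccessor` is only a simplified stand-in for a batch accessor and is not Hibernate ORM's actual API.

```java
// Hypothetical entity, used only for illustration.
class Person {
    Long id;
    String name;
    int age;
}

// Hypothetical stand-in for a batch accessor: reads or writes a set of properties at once.
interface BatchAccessor {
    Object[] getPropertyValues(Object entity);

    void setPropertyValues(Object entity, Object[] values);
}

// "Branchy" variant: supports any requested list of property names, at the cost of a branch per property.
class GenericPersonAccessor implements BatchAccessor {
    private final String[] names;

    GenericPersonAccessor(String... names) {
        this.names = names;
    }

    public Object[] getPropertyValues(Object entity) {
        Person p = (Person) entity;
        Object[] values = new Object[names.length];
        for (int i = 0; i < names.length; i++) {
            switch (names[i]) {
                case "id": values[i] = p.id; break;
                case "name": values[i] = p.name; break;
                case "age": values[i] = p.age; break;
                default: throw new IllegalArgumentException("Unknown property: " + names[i]);
            }
        }
        return values;
    }

    public void setPropertyValues(Object entity, Object[] values) {
        Person p = (Person) entity;
        for (int i = 0; i < names.length; i++) {
            switch (names[i]) {
                case "id": p.id = (Long) values[i]; break;
                case "name": p.name = (String) values[i]; break;
                case "age": p.age = (Integer) values[i]; break;
                default: throw new IllegalArgumentException("Unknown property: " + names[i]);
            }
        }
    }
}

// Specialized variant for the known, fixed set (id, name, age): straight-line code, no branches.
class FixedPersonAccessor implements BatchAccessor {
    public Object[] getPropertyValues(Object entity) {
        Person p = (Person) entity;
        return new Object[] { p.id, p.name, p.age };
    }

    public void setPropertyValues(Object entity, Object[] values) {
        Person p = (Person) entity;
        p.id = (Long) values[0];
        p.name = (String) values[1];
        p.age = (Integer) values[2];
    }
}
```

The specialized variant is only possible when the property set is known at generation time, which is the constraint discussed in this thread.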
+1 on that. Maybe let's work towards a few very ugly prototypes (e.g. with an entity model hardcoded into the Quarkus Hibernate ORM extension) and a benchmark before we spend significant time on either solution. |
LGTM, thanks. @franz1981 should we merge as-is, or run a benchmark first, since the only goal of this change is to improve performance?
I know it's not fixing the problem you noticed, but it's related.
I suppose the best way to artificially highlight entity instantiation performance in a benchmark is to perform a single query with lots of returned entities, each with only a single attribute (its ID). You can even test it without a network access by calling EntityManager#getReference(Class entityClass, Object id) many times.
Actually,
SessionFactoryImplementor sessionFactory = em.getEntityManagerFactory().unwrap(SessionFactoryImplementor.class);
Object entityInstance = sessionFactory.getMappingMetamodel().findEntityDescriptor(entityClassName)
        .getRepresentationStrategy().getInstantiator().instantiate(sessionFactory); |
Even with bytecode enhancement, which results in the actual entity classes being used as proxies? That's weird, no? Should we fix it? |
Bytecode proxies are not actually the same as the entity class: they are subclasses (named like They get instantiated differently, by the QuarkusProxyFactory#getProxy method, so they do not go through the instantiation optimizers. It's not a bug; I believe it has always worked like this. |
I am at devoxx but:
|
e368395
to
f44a690
Compare
Right, I did not mean to imply it's a bug. "Should we improve it?", if you prefer.
Wild. I'd have expected Hibernate ORM to use the enhanced entity class, just with all fields un-initialized. Since the enhanced class can actually handle lazy initialization -- and, at least since a recent patch from Andrea, can handle arbitrary fields being marked as lazy, to support entity graphs. Does what I'm saying at least make sense? I suppose there are reasons not to do it that way, though. |
I have checked what Quarkus + Hibernate generate at build time using https://github.com/quarkusio/quarkus-quickstarts/tree/3.15/hibernate-orm-quickstart and adding |
Yes, bytecode enhancement happens at build time, both in Quarkus and (in most cases) in vanilla Hibernate ORM. That's possible because bytecode enhancement relies on relatively simple metadata (basically just annotations). Instantiation optimizers require even less metadata (basically just the list of managed classes) so they can be generated at build time too. Access optimizer generation, however, happens at bootstrap in vanilla Hibernate ORM, and cannot happen at build time at the moment, even in Quarkus, because it requires advanced metadata that is only available later, at static init. |
Given that I am ignorant on this topic (but eager to learn): what about specific (hopefully worthwhile?) cases that could make it possible? Can they even exist? |
Immutable classes can still have all sorts of complicated attributes that would require this advanced metadata to handle them properly. I do not think trying to improve performance in specific cases is worth the effort: even just reliably detecting the few "supportable" cases would be significant work, because that amounts to building that advanced metadata. Not to mention that we've been introducing lots of complicated, duplicated code in Quarkus these past few years, and I'm not keen to continue that. If there's a need for improvements, let's implement them upstream... We can start with the easiest solution, which would be method handles, and work our way towards whatever is best. We could for example start adapting Hibernate ORM to build the optimizer based on Metadata, which we could then consider building at build-time just for the purpose of generating optimizers, and as a third step we'd reuse that metadata during static init. Though I'd be tempted to first evaluate the performance of all approaches we discussed (method handles, generated bytecode, ...) using dirty prototypes (e.g. hardcoding the metadata of a particular model at build time), so that we can evaluate what's at stake and how to prioritize the effort. |
Method handles are going to require some exploration on the Hibernate ORM side first; that can of course be pushed if we think it might benefit performance overall, though the inlining requiring I have a "dirty prototype" almost ready that builds access optimizers at static-init time; if we only want to measure runtime performance we might also use that. I would also like to work on a benchmark for just the instantiation optimizers in the context of this PR, but I would need some help from you guys creating a meaningful benchmark in the Quarkus context. |
Let's just start with whatever benchmark @franz1981 was using when he reported the reflection usage as a problem? We can always work our way to more focused benchmarks from there... if necessary. |
This will surely avoid reflective access for mapped object allocation, and all the bytecode generation is done at build-time, so performance impact should be very minor. I'll wait to hear from @franz1981 if we want to go ahead, or if there's anything else I can do for this. |
As it stands right now it's fine by me; I need to hear from @geoand as well on the generation part. I would like to see what is produced, but I can run it myself as soon as I come back from Devoxx |
I've actually added javadoc to the method which does bytecode generation, with an example of what's produced. For example, the generated bytecode for an entity
package com.example;
import org.hibernate.bytecode.spi.ReflectionOptimizer;
public class MyEntity$QuarkusInstantiator implements ReflectionOptimizer.InstantiationOptimizer {
    public Object newInstance() {
        return new MyEntity();
    }
}
I've checked by setting |
Perfect, thanks @mbladel |
LGTM and thanks again, well done 👍
Let's also get an approval from @yrodiere |
I'm busy with (much) higher priority stuff right now unfortunately. If you want to merge this, fine, but again I'd appreciate benchmarks and actual numbers to justify the added complexity, @franz1981 . While @mbladel's work looks very nice, it's still more code and thus more potential for bugs, and this could just perform worse for all I know (hint: I know nothing :) ). |
It could be possible to write a benchmark in https://github.com/hibernate/hibernate-orm-benchmark by using the code produced by the Quarkus build while adding That said, @yrodiere, although I'm aware that being cautious is important (so let's do it), there are some very "difficult to measure" effects in reflection which make avoiding it a good choice regardless, i.e.:
where the non-reflective cost becomes negligible (the nearly invisible tiny bar, leftmost) compared to the checks performed by the reflective ones. These changes are usually not guaranteed to deliver "big" value alone (except in microbenchmarks) but can easily pile up or become more relevant when combined with other factors, which makes it hard to judge the ratio of complexity vs effectiveness. |
Sounds good to me, I'll try to prepare the basic structure and see if I can get some significant results (FYI I'm away for a few days, I'll update you when I'm back on my PC). |
@franz1981 sorry for the delay, I'm back and I managed to prepare a benchmark that measures the difference with the optimized vs non-optimized (i.e. reflection) cases: hibernate/hibernate-orm-benchmark#13. The code itself is very simple, still I'd really appreciate if you could confirm whether my approach is correct. @yrodiere you can find my initial findings in terms of performance numbers and flamegraphs in the PR I linked. |
Hey @mbladel , thanks for the benchmark. Do I understand correctly from the conversation at hibernate/hibernate-orm-benchmark#13 that the performance gains are questionable? If so, do you still want to merge this PR? |
@yrodiere indeed, benchmarks showed very marginal performance gains from this optimization, mainly due to poorly performing megamorphic interface calls when the number of entities increased, which would indicate little to no advantage from this change. Performance technically was never worse (actually it was always very slightly better in the "most common" case, but we're talking 1-2%), and in some cases, specifically when the number of mapped entity classes is very low, we still saw improvements. On the other hand, this behavior would be aligned with both WildFly's and Hibernate's native bytecode enhancement, and the complexity added is minimal IMO. At this point I'd leave the decision up to you and @franz1981, who initially raised the problem with reflection in #43692, since you know a lot more about Quarkus than I do. |
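As a rough illustration of why the call becomes megamorphic, consider the sketch below. All names (`Instantiator`, `InstantiatorRegistry`, the three entity classes) are hypothetical stand-ins and not the classes generated by this PR. Every entity type has its own instantiator class, but they are all invoked from one shared call site; with one or two implementations the JIT can typically inline such a call, while with many it generally falls back to a regular interface dispatch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-entity instantiation optimizer.
interface Instantiator {
    Object newInstance();
}

// Hypothetical entities, used only for illustration.
class Author {
}

class Book {
}

class Publisher {
}

// One small class per entity, in the spirit of the generated instantiators.
class AuthorInstantiator implements Instantiator {
    public Object newInstance() {
        return new Author();
    }
}

class BookInstantiator implements Instantiator {
    public Object newInstance() {
        return new Book();
    }
}

class PublisherInstantiator implements Instantiator {
    public Object newInstance() {
        return new Publisher();
    }
}

class InstantiatorRegistry {
    private final Map<Class<?>, Instantiator> byType = new HashMap<>();

    static InstantiatorRegistry withDemoEntities() {
        InstantiatorRegistry registry = new InstantiatorRegistry();
        registry.byType.put(Author.class, new AuthorInstantiator());
        registry.byType.put(Book.class, new BookInstantiator());
        registry.byType.put(Publisher.class, new PublisherInstantiator());
        return registry;
    }

    Object instantiate(Class<?> type) {
        Instantiator instantiator = byType.get(type);
        if (instantiator == null) {
            throw new IllegalArgumentException("No instantiator registered for " + type);
        }
        // Single call site shared by every entity type: the more implementations
        // flow through here, the less likely the JIT is to inline the call.
        return instantiator.newInstance();
    }
}
```

This would be consistent with the observation above that gains were most visible when the number of mapped entity classes was very low.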
In that case I'll close without merging; as minimal as the added complexity is, we already have more than enough complexity as things are, and debugging generated bytecode is no fun. Thanks a lot for investigating @mbladel! |
I would ask @zakkak if there are other advantages for native image instead, given that some reflective information could now be dropped as no longer necessary
Dropping reflective information is not a goal on its own. If the code being registered is actually used then it's fine. The question here would be whether we eliminate any run-time code (by doing things at build-time) and if so how much of it? Does it affect run-time performance? If not, a potential native executable size decrease would probably not justify the change. It would be interesting to see the corresponding results of hibernate/hibernate-orm-benchmark#13 from a native compilation. @mbladel how hard would that be? |
This would allow to avoid having reflective
It does, but as I said results showed no losses in performance and marginal gains in some cases.
I wouldn't know, haven't really worked with native image much and I'm not sure how jmh behaves with that. Also, the benchmark itself is a simple Hibernate-only application, not a Quarkus one, so not sure how I would go about testing that. |
Yes I saw that. I meant in native mode, should have made it clear. Thanks for the info @mbladel . |
Partially addresses #43692
This change allows Hibernate to avoid using reflection when instantiating managed classes by creating org.hibernate.bytecode.spi.ReflectionOptimizer.InstantiationOptimizers at build-time, and making them available at runtime.
Access optimizers are not part of this change as we don't have enough information to generate them at build-time, see the linked issue for more details.