Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JPA: how to handle Id fields #225

Closed
jqno opened this issue Nov 23, 2018 · 30 comments
Closed

JPA: how to handle Id fields #225

jqno opened this issue Nov 23, 2018 · 30 comments

Comments

@jqno
Copy link
Owner

jqno commented Nov 23, 2018

This is a question that comes up from time to time in the issue tracker and the mailinglist (for example #89 and #221). Since I don't have much experience with JPA / Hibernate, I'd like this issue to be a place where people more knowledgeable than me can help me find the right behaviour for EqualsVerifier.

JPA has the @Id annotation that signals that the field is a primary key in the database. It feels like a missed opportunity that EqualsVerifier doesn't do anything with that. But what should it do?

  • If an @Id annotation is present, should EqualsVerifier then check that only this id field is used in equals, enforcing a primary-key based equals implementation?
  • Should it do just the opposite, enforicing a content-based equals implementation?
  • Should it check that the primary key is used when the id value is not 0 or null, and check that the content is used when the entity is new, basically combining the two previous options? Currently, there's Warning.IDENTICAL_COPY_FOR_VERSIONED_ENTITY that does more or less the same thing, but it might be a bit confusing the way it's implemented now.
  • Maybe it should do something else entirely?

Do id fields always have the @Id annotation in the first place? I don't know.

Are there other relevant annotations that EqualsVerifier should take into account?

If you see this issue and have an opinion on the matter, please write it down in a comment below! If you know a link to a good blog post on the subject, please also share that below.

@xenoterracide
Copy link

xenoterracide commented Nov 24, 2018

when the id value is not 0

I wouldn't check for a magic number, as that may actually vary per implementation, and some @Ids aren't numbers at all (UUID). I believe we have entities in our system with magic numbers, 0 or -1,

@dfa1
Copy link
Contributor

dfa1 commented Nov 24, 2018

Some interesting insights from @vladmihalcea in this blog post: as far as I understand the best approach would be to use the business key, a unique, non-null and non-updatable column. In Hibernate (not JPA) it is possible to map such business key(s) using @NaturalId.

@xenoterracide
Copy link

In JPA it is possible to map such business key(s) using @NaturalId.

This is a hibernate specific annotation, not part of JPA

@vladmihalcea
Copy link

You can also use the entity identifier for equals, but you need to make sure you use a constant hashCode, as explained in this article.

@xenoterracide
Copy link

I'm no expert on this, but I think the correct answer may be a 2 or 3 modal... SURROGATE, NATURAL, SURROGATE_OR_NATURAL. Let the user choose which mode. The reason I say this is that sometimes there may be no natural key, other times there may be no surrogate, this can happen within the same database, and will definitely happen amongst many products. Now my outstanding questison would be.

  • How does SURROGATE_OR_NATURAL work? is it the same as VERSIONED_ENTITY?
  • is there an attempt at autodetection or do you have to specify when creating the test.
    • I think if @Id exists and you've said one with SURROGATE then the surrogate field can be auto identified, the inverse may be true for NATURAL, auto exclude this field.

I'm not sure that asking if "surrogate or natural" is the right for all hibernate entities, because I think the answer is it depends.

@dfa1
Copy link
Contributor

dfa1 commented Nov 24, 2018

Perhaps a policy of "natural then surrogate" would be a less surprising behavior? In that regard, if a class contains both @NaturalId and @Id fields, then @NaturalId has precedence and @Id would be ignored.

@jqno
Copy link
Owner Author

jqno commented Nov 25, 2018

Having read your responses and also both @vladmihalcea's blog posts (they're very enlightening, thanks for sharing!), I feel that EqualsVerifier should indeed default to using a business/natural key.

I don't mind supporting @NaturalId, even if it's Hibernate-specific and not a JPA standard, so if EV detects this annotation, it would take precedence. EqualsVerifier will assume that only this field is used in the equals contract, and all other fields are not used.

If no @NaturalId is found, it would assume all (non-transient) fields are used in the equals contract. I think it should also exclude the field with the @Id annotation in this case, since it's most likely a surrogate key that has no business value. Just like how in @vladmihalcea's blog, he doesn't use the @Id private Long id field in his equals methods. EqualsVerifier would then show an error when it is used in equals. Would you agree with that?

There would be a Warning.SURROGATE_KEY (does that name make sense?), that will tell EqualsVerifier to assume only the @Id field is used in equals. In this case it would also enforce that the hashCode is a constant value, as described in @vladmihalcea's second blog post.

If I'm feeling particularly motivated, I could also add a check that if any field used in equals has a @ManyToOne annotation, the FetchType must be EAGER. Would this be useful? Or is it a rare edge case that was just added to the blog for completeness's sake? If it is useful, are there other mappings where this check should be done apart from @ManyToOne? @OneToOne maybe?

If all this works well, I could investigate whether Warning.IDENTICAL_COPY_FOR_VERSIONED_ENTITY still makes sense. I think it could probably be removed.

What do you guys think?

@dfa1
Copy link
Contributor

dfa1 commented Nov 25, 2018

I would be pretty happy even without @ManyToOne support. Thanks very much for pushing and implementing all this stuff 🥇

@xenoterracide
Copy link

https://stackoverflow.com/q/5031614/206466 found this while trying to find whether the @Version Optimistic Locking annotation should be included/excluded in equals, haven't really found anything on that, though. I think it's probably not supposed to be checked as part of equals.

Another quandry I have, which is more of @OneToMany and natural keys, sometimes checking the natural key has a different issue, where, equals/hashcode should be fast, if you have a lazy object this could be expensive (though I would argue maybe stop having lazy entities and instead have entity projections that are complete). In this case you can have a lazy fetch entity, but if you're checking the Id and you've done the coding right, it won't fetch the entity just to check equality, because it will already has the id (meaning the load won't be done).

UGH!!! lol

@jqno
Copy link
Owner Author

jqno commented Nov 26, 2018

@dfa1 Well, I still have to actually build all of it of course, that might take a little while 😉. I take it you agree with the proposal?

@xenoterracide I agree that @Version fields don't look like they should be used in equals, but I'll try to read a little more about it before I decide anything. I could also always do that in a later release.

Regarding your @OneToMany quandry: that's exactly why I proposed to enforce that FetchType should be EAGER: see my previous post. That should solve the issue, right?

@dfa1
Copy link
Contributor

dfa1 commented Nov 26, 2018

@dfa1 Well, I still have to actually build all of it of course, that might take a little while 😉. I take it you agree with the proposal?

yes, I do!

@xenoterracide
Copy link

Regarding your @onetomany quandry: that's exactly why I proposed to enforce that FetchType should be EAGER: see my previous post. That should solve the issue, right?

by default yes, but you can change what the fetch is on query.

@jqno
Copy link
Owner Author

jqno commented Nov 26, 2018

Yeah, but then EqualsVerifier fails.

(Maybe it should only fail if the field is actually used in equals).

@xenoterracide
Copy link

Yeah, but then EqualsVerifier fails.

(Maybe it should only fail if the field is actually used in equals).

let me rephrase, you can change fetches at runtime in you jpql/criteria queries. The definition on the annotation is simply the default for the class.

@jqno
Copy link
Owner Author

jqno commented Nov 26, 2018

Ohhh, that explains a lot. Yeah, that sucks. Can't do a lot about it, either.

@xenoterracide
Copy link

another consideration is for jpa entities should we force property access through accessors? that way when calling equals you're for sure interacting with the proxy? this is that optimization I found for id

    @Id
    @Access( AccessType.PROPERTY )
    @GeneratedValue( strategy = GenerationType.IDENTITY )
    @Column( name = ID, columnDefinition = "serial", updatable = false, nullable = false, unique = true )
    private Integer id;

so for example if one has a @OneToMany( fetch = LAZY ) on this... I should be able to do

foo.getBars().stream().map( o -> o.getId() ).collect(toList());

without incurring a fetch... but if I called any other method on this class it will cause the collection to be fetched.

Ohhh, that explains a lot. Yeah, that sucks. Can't do a lot about it, either.

yeah just wanted to make sure we were on the same page

@xenoterracide
Copy link

xenoterracide commented Nov 26, 2018

might be an idea to have equalsverifier enforce... "simple properties only", in other words you can depend on Strings and "primitive like" objects, but not complex custom classes in equals (except maybe that id impl shown above)

@jqno
Copy link
Owner Author

jqno commented Nov 26, 2018

Wait, I don't understand what you're saying here. You're asking whether EqualsVerifier should enforce that property access goes through an accessor method. That's a good question, I'm not sure if it should (and I'm also not sure if it can).

I don't see how the optimization for id has to do with that. Or is this a separate subject?

Remember, I don't know the first thing about JPA, I've never used it. You have to take me by the hand here.

@xenoterracide
Copy link

xenoterracide commented Nov 26, 2018

Sorry, they're both different and the same. By default Hibernate (other providers do other things) creates proxies for your objects on load/save from the database, this is to support that lazy loading. In order to trigger a lazy load you have to call a method on the proxy (gotcha's here with detached proxies...). So if you don't put that "property access" on id, calling getId will always cause the proxy to fetch if it hasn't already.

To test this, you'll have to create a proxy, and ensure that equals works on simple properties ( I think for an aggregate class like Bar, you have to hand equalsverifier an instance anyways?)

@MappedSuperclass
public class AbstractEntityBase<SELF extends AbstractEntityBase<?>> implements Persistable<Integer> {
    public static final String ID = "id";

    @Id
    @Access( AccessType.PROPERTY ) // if I don't set this, and annotate fields getId is not safe either
    @GeneratedValue( strategy = GenerationType.IDENTITY )
    @Column( name = ID, columnDefinition = "serial", updatable = false, nullable = false, unique = true )
    private Integer id;

    @Column 
    private String name; // normally I don't put this in the abstract, done for simplicity

    @Override
    public boolean equals( Object o ) {
        if ( this == o ) {
            return true;
        }
        if ( o instanceof AbstractEntityBase ) {
            AbstractEntityBase that = (AbstractEntityBase) o;
            @SuppressWarnings( "unchecked" )
            var equals = that.canEquals( this )
                    && this.canEquals( that )
                    && this.fieldEquals( (SELF) that );
            return equals;
        }
        return false;
    }

    /**
     * subclasses should implement checking for instanceof their class
     *
     * @param that
     * @return
     */
    protected boolean canEquals( AbstractEntityBase that ) {
        return that != null;
    }

    /**
     * subclasses should implement checking super || properties
     *
     * @param that
     * @return
     */
    protected boolean fieldEquals( SELF that ) {
        return !( this.isNew() && that.isNew() ) && Objects.equals( this.getId(), that.getId() );
    }

    protected int hashFields() {
        return Objects.hashCode( this.getId() );
    }

    @Override
    public int hashCode() {
        return this.hashFields();
    }

    @Override
    public String toString() {
        return String.format( "Entity of type %s with id: %s", this.getClass().getSimpleName(), this.getId() );
    }

    @Override
    public Integer getId() {
        return this.id;
    }

    @SuppressWarnings( "unused" ) // jpa
    public void setId( int id ) {
        this.id = id;
    }

    @Override
    public boolean isNew() {
        return getId() == null;
    }

    public String getName() {
        return name;
    }

    public static List<Long> ids( final Stream<? extends AbstractEntityBase> identifiables ) {
        return identifiables
            .map( Identifiable::getId )
            .filter( Objects::nonNull )
            .distinct()
            .collect( Collectors.toList() );
    }

    public static List<Long> ids( final Collection<? extends AbstractEntityBase> identifiables ) {
        return ids( identifiables.stream() );
    }
}
class Foo extends AbstractEntityBase {


      @OneToOne( fetch=LAZY ) // presume even if these are eager, one can change that when querying
      private Bar bar;

      @OneToMany( fetch=LAZY ) 
      private Collection<Bar> bars;

      public Collection<Bar> getBars() {
            return bars;
      }
      
}
class Bar extends AbstractEntityBase {

    protected boolean fieldEquals( SELF that ) {
        return Objects.equals( this.getName(), that.getName() );
    }     
}

on Foo.fieldEquals if implemented as

bar can be lazy, and will trigger a lazyload on getName()
protected boolean fieldEquals( SELF that ) {
    return Objects.equals( this.getName(), that.getName() ) && Objects.equals( this.getBar(), that.getBar() );
}     
safe property access, and only checks ids
protected boolean fieldEquals( SELF that ) {
    return Objects.equals( this.getName(), that.getName() )
          && Objects.equals( this.getBar().getId(), that.getBar().getId() )
          && Objects.equals( ids( this.getBars() ), ids( that.getBars() ) )
}     

@xenoterracide
Copy link

^ I think I'm done writing that for now... though I may update the classes there for sake of example

@xenoterracide
Copy link

basically I'm thinking that we should ensure that an objects equals matches a proxy of itself with lazy loaded objects uninstantiated, or maybe only instantiated via id (I'm not sure on that if we're going NATURAL by default though).

@jqno
Copy link
Owner Author

jqno commented Nov 27, 2018

Still not sure I understand 100% what you're asking.

  1. I can probably make EqualsVerifier check that equals calls getX accessors instead of referencing x fields directly, when it detects a JPA entity. Would it be a good thing if EqualsVerifier does that?

  2. I can check that FetchType is marked as EAGER on every field. I understand that at runtime, this can be overridden. This is completely out-of-scope for EqualsVerifier, though: I can't see what a program does or doesn't do at runtime. So, knowing this, would it still be a good thing if EqualsVerifier checks for FetchType.EAGER?

  3. Is there anything I'm still missing here? And is there something specific I can do about that? Would 1. and 2. be pointless unless we address this as well?

basically I'm thinking that we should ensure that an objects equals matches a proxy of itself with lazy loaded objects uninstantiated, or maybe only instantiated via id (I'm not sure on that if we're going NATURAL by default though).

  1. What does it mean for an object's equals to "match a proxy of itself with lazy loaded objects uninstantiated"? I'm not going to add a dependency to Hibernate, so we'll have to simulate that behaviour. Does it mean that Foo's bar field is null? Because EqualsVerifier already does checks like that.

@xenoterracide
Copy link

xenoterracide commented Dec 5, 2018

sorry for the late reply

  1. maybe? what do other people think
  2. probably?
  3. ?
  4. proxy, https://docs.oracle.com/javase/8/docs/technotes/guides/reflection/proxy.html
    basically this is standard equals instanceof interlocking checks, but just making sure that LAZY fields aren't part of equals, and that the instanceof stuff is being done right when dealing with a proxy rather than just a normal subclass. a lazy field might be null until the getter is called on the proxy (not the field), in reality any field could be null until the getter is called depending on configuration. When dealing with a proxy, the final class is not the same as the concrete class, the methods may be implemented to do more than the concrete class, the fields may not even be there that.id might not even be there

also, still kind of in favor of a modal system where you have to select the mode(s) before testing.

@jqno
Copy link
Owner Author

jqno commented Dec 5, 2018

@dfa1, what do you think?

@dfa1
Copy link
Contributor

dfa1 commented Dec 5, 2018

I see all the points raised by @xenoterracide: they are real problems but usually I address them in other tests specific to Hibernate/JPA (i.e. to enforce certain custom policies about caching, cascading, etc).

I would be very happy even with a simple, explicit support for annotations:

 EqualsVerifier.forClass(Foo.class).fieldsAnnotatedWith(org.hibernate.annotations.NaturalId.class).verify();
 EqualsVerifier.forClass(Foo.class).fieldsAnnotatedWith(javax.persistence.Id.class).verify();

@jqno
Copy link
Owner Author

jqno commented Dec 5, 2018

So if I understand you correctly, these points are all nice-to-haves?

I'll just start with implementing my initial proposal and see if I can pick up any of these points along the way. If not, they could always be added later.

Personally, I'm not really in favour of a modal approach, or a system where the support for annotations is made explicit. EqualsVerifier has always been very opinionated. For instance, if it sees a @NonNull annotation, it just knows what to do. I very much like that behaviour, because I believe this also guides the user towards a better implementation of equals. Choosing a mode, or adding a fieldsAnnotatedWith flag, is something a user can forget, in which case the user will be guided towards a sub-optimal implementation, and that's something I want to avoid by making EqualsVerifer opinionated.

Thanks to both of your input, I have now been able to form that opinion about Hibernate id's for EqualsVerifier. And of course it will always be possible to disagree and make EqualsVerifier behave differently, by adding a flag here or suppressing a warning there.

@dfa1
Copy link
Contributor

dfa1 commented Dec 5, 2018

So if I understand you correctly, these points are all nice-to-haves?

Not really, I see them as general problem with hibernate/JPA but I don't expect to detect this kind of issues by using this library.

Choosing a mode, or adding a fieldsAnnotatedWith flag, is something a user can forget, in which case the user will be guided towards a sub-optimal implementation, and that's something I want to avoid by making EqualsVerifer opinionated.

Wonderful, this is even better :-)

@xenoterracide
Copy link

So if I understand you correctly, these points are all nice-to-haves?

I'll just start with implementing my initial proposal and see if I can pick up any of these points along the way. If not, they could always be added later.

I think that's fine, I've just been trying to make sure I cover as many parts of the problem with jpa/hibernate as possible since you're starting from 0 ;) just so you have as complete of a context as we can give.

@jqno
Copy link
Owner Author

jqno commented Dec 6, 2018

Thanks to you both!
I'll just go ahead and start coding then 😉
(Just as a heads-up, it might take a while. I don't have a lot of spare time for EqualsVerifier at the moment.)

@jqno
Copy link
Owner Author

jqno commented Dec 24, 2018

I've just released a first version of this with EqualsVerifier 3.1.

It supports the @Id and @NaturalId annotations as discussed; defaulting to a business/natural key. You can suppress Warning.SURROGATE_KEY if you prefer to have a surrogate key. I was still a little uneasy about the constant hashCode for surrogate keys, so I decided not to actively enforce it. (You can always suppress Warning.STRICT_HASHCODE.) You can place the annotations both on the fields, or on their corresponding accessor methods.

The other things we discussed, such as attempting to check for FetchType.EAGER and checking that the accessor methods are used in equals instead of the fields themselves, I haven't implemented yet. Same for the constant hashCode. I may implement these later. (If you want, you can open a ticket for one or more of them: that will put them higher on my list of priorities.)

I've also not changed anything about the behaviour of Warning.IDENTICAL_COPY_FOR_VERSIONED_ENTITY, even though I'm not very happy with how it works now I know more about the subject. Maybe another time.

Please let me know what you think of this!

Merry christmas!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants