Skip to content

Refactor JobRepository for greater clarity and consistency. [BATCH-340] #3235

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
spring-projects-issues opened this issue Feb 5, 2008 · 12 comments

Comments

@spring-projects-issues
Copy link
Collaborator

Lucas Ward opened BATCH-340 and commented

The SimpleJobRepository and it's related Dao's plays a huge part in the architecture, since it completely controls how the domain objects are persisted. Over the last few months, with a lot of added features and minor tweaks, the repository and dao's have gotten a bit out of hand. The Dao's have ballooned in size, and should somehow be split up (even if internally) so that they can be easier to understand. There is also issues with how *Execution domain objects are created, since they were originally created by user, then 'saved'. Things have gotten a bit different lately, with more of a move towards the repository creating even executions, but it's still not consistent.


Affects: 1.0.0.m4

Issue Links:

  • BATCH-384 consolidate map dao implementations
@spring-projects-issues
Copy link
Collaborator Author

Lucas Ward commented

Robert,

I'll be tied up with some other things this week(and next) and won't be able to tackle this one, so I'm assigning to you. Also, I think I've stared at the same repository so many times over the last year, a fresh look at it might be helpful.

@spring-projects-issues
Copy link
Collaborator Author

Robert Kasanicky commented

A few notes from reading through the repository code (and it's clients):

  1. there's no Job entity => isn't the name 'JobRepository' confusing? (Also JobDao and StepDao)

  2. single repository for accessing 4 different entities => 'GlobalBatchMetadataRepository' is more revealing? Any class that has reference to it can (by design) change just anything, e.g. step executor can create job instances

  3. why should a step executor have a reference to repository at all (even if it was StepExecutionRepository)? Shouldn't it just call stepExecution.save() (it is not supposed to deal with executions that were not passed to it)?

  4. (2 & 3) => we are far from the Principle of Least Privilege

  5. Or methods (findOrCreate, saveOrUpdate) feel dirty. I don't have the DDD book with me, but I believe it argues against those - creation and update are two significantly different events in entity's lifecycle

These observations make me feel like we should significantly rethink how persistence is handled. I don't want to make any non-trivial changes to current implementation without having a clear idea how things are supposed to work.

To get started I would consider having separate repositories for different entities and making sure every class possessing a repository reference desperately needs it (this is where privilege minimalism guides me).

What do you think?

@spring-projects-issues
Copy link
Collaborator Author

Dave Syer commented

I am all in favour of minimal privileges. So we could analyse the JobRepository (I think the name is fine - but BatchMetaDataRepository would also work) and see if it can be broken down. I bet there isn't much mileage in it though, but I'd be keen to see a break down of who uses which methods.

The *Dao interfaces could certainly be refactored in the interests of modularity, and still hidden behind the JobRepository.

My view of Or methods is that they are a necessary evil - however there have been a couple of cases where we re-named one because the create operation became atomic for business reasons (i.e. there is never a save(), only an update()). There may be more of those cases - someone already pointed out that the Job should not insert StepExecutions for steps it is not going to run, so there is potentially some cleaning up to do in the Job implementation(s) as well.

I also don't particularly like to see domain entities (e.g. StepExecution) with repository references either - it takes us too far from the point where they can be sent over the wire.

@spring-projects-issues
Copy link
Collaborator Author

Robert Kasanicky commented

StepInstance holds a reference to its last execution, both in domain model and database schema - is there a strong reason for this?

My impression is it is an unnecessary convenience that pollutes both design and implementation, so I'd like to remove it (at least from the db schema) unless proved valuable.

@spring-projects-issues
Copy link
Collaborator Author

Dave Syer commented

I think it is a convenience, but more than just convenient. The value is that I will often, if not constantly, want to know form looking at a step instance, "when did it last run". I think the de-normalised data model in this case is completely justifiable. if you can write an SQL statement in one line that gives me the same view on the data, I might change my mind.

@spring-projects-issues
Copy link
Collaborator Author

Lucas Ward commented

Dave,

i think that's about the same thing I said during my last conversation to Robert. I'm still a little reluctant as to whether or not a good sql statement could be created. He did, however, make some good comments about how the lastExecution should perhaps not be part of the StepInstance, and should perhaps be something like:

jobRepository.getLastExecution(stepInstance)

It seems like this would remove some of the 'circular dependency' issues with the repository and daos, but would require before mentioned sql statement. Either that, or splitting the dao's by instance and execution won't work. I'm interested to see how it works out.

One other possibility would be to load all executions as a list that could be easily sorted by date. Assuming there isn't 1000 executions, it wouldn't be a problem.

@spring-projects-issues
Copy link
Collaborator Author

Dave Syer commented

I guess I can live with jobRepository.getLastExecution(stepInstance).

@spring-projects-issues
Copy link
Collaborator Author

Robert Kasanicky commented

The first shot on the discussed sql statement would be following (just to prove feasibility of using sql):

SELECT * FROM step_execution WHERE step_instance_id = 1 and start_time = (SELECT max(start_time) FROM step_execution WHERE step_instance_id = 1)

However, I see nothing wrong with filtering the latest execution in java as

  1. large counts are not expected
  2. executions needn't be held in memory all in the same time - it's sufficient to iterate over the result set and keep reference to the so-far-latest execution
  3. repository can cache the latest execution if desired

@spring-projects-issues
Copy link
Collaborator Author

Wayne Lund commented

I argued on Robert's side in our last meeting. I think the dynamic selection of getLastExecution() should be fine. I also argue that if our data for arriving at the last execution is incorrect we have other problems.

  1. there's no Job entity => isn't the name 'JobRepository' confusing? (Also JobDao and StepDao)

<Agree. I included a picture of the hierarchy in my notes to Lucas with a copy to you. >

  1. single repository for accessing 4 different entities => 'GlobalBatchMetadataRepository' is more revealing? Any class that has reference to it can (by design) change just anything, e.g. step executor can create job instances

<I've never been a big fan of "Repository" as a higer order DAO. I made this comment in a session with Eric Evans one time. My issue with it is that I've worked too long in 4GL type environments where Repository truly meant the meta data for your applications. When I saw it being used more like DAO behavior it goes against the grain. Nevertheless, since we've chosen to be as close to DDD as possible I can live with that. But just to be clear, Evans says, "Repository: a mechanism for encapsultating storage, retrieval, and search behavior which emulates a collection of objects." Sounds like a DAO to me.

I do like the concept of how Evan's repositories work, and we used them a lot in Smalltalk (e.g. "They allow easy substitution of a dummy implementation, for use in testing (typically using an in-memory collection). This was easy to model with Smalltalk where Classes are objects and we could inherit the in-memory repository behavior easily. It's also really easy to implement with Ruby and Groovy. I also actually like the chapter on Repository but have never been convinced that Repository is much more than a synonym for DAOs and Eric makes the point about not fighting your frameworks My point about all that is to say that I feel like the way we use Repository and DAO are overlapping terms. The Repository wraps our DAOs but acts as a DAO.

I agree wtih Dave though that there's not much mileage in changing much at this point, especially given the small size of our Repository needs. To your point on #2, Robert, about GlobalBatchMetaDataRepository, I don't think it says much more than JobRepository. And since our persistence is hierchical anyway I'm not too hung up about Repostiory=Entire Database Persistence, despite my earlier comments. >

  1. why should a step executor have a reference to repository at all (even if it was StepExecutionRepository)? Shouldn't it just call stepExecution.save() (it is not supposed to deal with executions that were not passed to it)?

<stepExecution.save() is more like the ActiveRecord pattern if you mean that underneath it simply calls stepDao.saveOrUpdate(stepExecution). >

  1. (2 & 3) => we are far from the Principle of Least Privilege

<no comment. not sure what it means in this context>
5. Or methods (findOrCreate, saveOrUpdate) feel dirty. I don't have the DDD book with me, but I believe it argues against those - creation and update are two significantly different events in entity's lifecycle

<As far as the or methods, point #5, I was the one who suggested this to Lucas many moons ago. My logic was that this was standard Hibernate programming for us, of which I've done a lot of over the past 4-5 years. We always have tended for the saveOrUpdate behavior, which may be easier to do for ORMs. I don't think they have much to do with DDD. >

@spring-projects-issues
Copy link
Collaborator Author

Lucas Ward commented

Robert,

it looks like you're about done with this issue, and the other remaining tasks appear to be covered by other jira issues. If so, can you resolve this one?

@spring-projects-issues
Copy link
Collaborator Author

Robert Kasanicky commented

Still expecting more changes to be made to the repository, but it is probably better to track them separately.

@spring-projects-issues
Copy link
Collaborator Author

Dave Syer commented

Assume closed as resolved and released

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant