Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement experimental basic job execution progress reporting hook #157

Merged
merged 5 commits into from
Jan 13, 2025

Conversation

nulltoken
Copy link
Collaborator

@nulltoken nulltoken commented Jan 4, 2025

Pull request description

Fix #147

The idea behind this proposal is to provide a non stable reporting mechanism.
While allowing me to scratch my own #147 itch and potentially receiving potential feedback/feature request from other early bleeding edge adopters, this should also allow us to slowly evolve this toward the requirements of #36 of an more feature rich observer.

PR meta checklist

  • Pull request is targeted at main branch for code
  • Pull request is linked to all related issues, if any.

Code PR specific checklist

  • My code follows the code style of this project and AspNetCore coding guidelines.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.
  • I have updated the appropriate sub section in the CHANGELOG.md.
  • I have added, updated or removed tests to according to my changes.
    • All tests passed.

@nulltoken nulltoken changed the title [WIP] Implement basic job execution progress reposrting hooks [WIP] Implement basic job execution progress reporting hooks Jan 4, 2025
@nulltoken nulltoken changed the title [WIP] Implement basic job execution progress reporting hooks [WIP] Implement experimental basic job execution progress reporting hooks Jan 4, 2025
@nulltoken nulltoken requested a review from linkdotnet January 4, 2025 11:57
Directory.Build.props Outdated Show resolved Hide resolved
@nulltoken nulltoken force-pushed the ntk/monitor branch 4 times, most recently from 018d842 to f2a28ca Compare January 5, 2025 10:35
@linkdotnet
Copy link
Member

I took a similar approach some time ago. Also @falvarez1 started working on this.
I can remember that I tried to use Mediator-Pattern for this.

Especially because if a job doesn't get scheduled or executed you also need some hook. (For example: Multiple jobs are scheduled at the same time, but only one can be executed).

As far as I can tell, only somewhat "successful" runs will be registered in the current implementation.

We might also want to remove JobState as we have now two things somewhat reflecting the same state.

Copy link

codecov bot commented Jan 5, 2025

Codecov Report

Attention: Patch coverage is 96.61017% with 4 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/NCronJob/Observer/ExecutionProgress.cs 92.30% 1 Missing and 1 partial ⚠️
.../NCronJob/Observer/JobExecutionProgressObserver.cs 95.74% 1 Missing and 1 partial ⚠️
Files with missing lines Coverage Δ
src/NCronJob/Execution/StartupJobManager.cs 87.50% <100.00%> (+2.88%) ⬆️
src/NCronJob/NCronJobExtensions.cs 97.67% <100.00%> (+0.30%) ⬆️
src/NCronJob/Registry/IInstantJobRegistry.cs 92.45% <100.00%> (+0.96%) ⬆️
src/NCronJob/Registry/JobRun.cs 94.73% <100.00%> (+10.32%) ⬆️
src/NCronJob/Scheduler/JobQueueManager.cs 91.80% <ø> (ø)
src/NCronJob/Scheduler/JobWorker.cs 93.38% <100.00%> (+0.09%) ⬆️
src/NCronJob/Observer/ExecutionProgress.cs 92.30% <92.30%> (ø)
.../NCronJob/Observer/JobExecutionProgressObserver.cs 95.74% <95.74%> (ø)

@nulltoken
Copy link
Collaborator Author

nulltoken commented Jan 5, 2025

Especially because if a job doesn't get scheduled or executed you also need some hook. (For example: Multiple jobs are scheduled at the same time, but only one can be executed).

@linkdotnet The current approach is hooked on the JobRun state machine. However, when no JobRun is created, nothing is indeed published.

As far as I can tell, only somewhat "successful" runs will be registered in the current implementation.

That's not the intent. Every state change should be reported.

In one of the latest pushes, I've added a test that showcases an Expired case (as part of this, I discovered that Expired wasn't considered as a final state and fixed it).

Edit: Just covered the Cancelledand Faulted cases as well

Documentation will be eventually hand rolled before merge.

-->
<GenerateDocumentationFile>false</GenerateDocumentationFile>
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Savagely hacked on purpose. To be reverted before merge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. For what did you need the this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I'm (shamelessly) more interested in getting feedback from the CI server test runs on different platforms than documenting the public types.

All the "Missing XML comment for publicly visible type or member xxxx" error messages were making my "Developer Experience" a lot less rich 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how to properly handle that.

Opting out might lead to missing XML annotations. It is very strict (well at least not if you use DEBUG locally, but that doesn't help in your case)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Savagely hacked on purpose. To be reverted before merge

Once you think the API is ok. I'll write the xml documentation and revert this change. Before the merge.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted as documentation has been added.

@nulltoken
Copy link
Collaborator Author

We might also want to remove JobState as we have now two things somewhat reflecting the same state.

If that ok with you I'd rather tackle that in a follow up PR. I attempted to do that, but in the end this generated a lot of diff noise that made this proposal harder to read.

Besides, there's a entry JobState.Crashed that doesn't seem to be leveraged anywhere, and I wasn't sure what to do with it 🤔

@nulltoken nulltoken force-pushed the ntk/monitor branch 2 times, most recently from bb1e006 to 32df99b Compare January 5, 2025 22:19
@linkdotnet
Copy link
Member

If that ok with you I'd rather tackle that in a follow up PR. I attempted to do that, but in the end this generated a lot of diff noise that made this proposal harder to read.

Yes sure - I was just interested about the second type that (to some extent) reflects the same nature.

Another idea I toyed around (and maybe you had similar thoughts) is to have a mediator in the middle and do all the things just by messaging, removing a bit of the coupling. Ultimately I didn't go through as it seemed a lot of noise when I spiked something together and the overall amount of objects would be higher than for example your current approach.

@nulltoken
Copy link
Collaborator Author

nulltoken commented Jan 6, 2025

Another idea I toyed around (and maybe you had similar thoughts) is to have a mediator in the middle and do all the things just by messaging, removing a bit of the coupling. Ultimately I didn't go through as it seemed a lot of noise when I spiked something together and the overall amount of objects would be higher than for example your current approach.

As a lot of things happens between all the types, I was willing to disrupt the current code in the least possible way in order to prevent any regression. Hence, why I tiptoed and end up plugging in the JobRun state machine.

The trajectory would be to cover the current proposal with as many tests as possible in order to ensure it works as expected, and merge it as is.

The tests would later provide us with a safety net would you want to revamp the internal messaging architecture.

@nulltoken
Copy link
Collaborator Author

Just added some additional test coverage with regards to StartupJobs.

@linkdotnet Do you see any other existing use cases you'd like to see covered?

@nulltoken
Copy link
Collaborator Author

@linkdotnet rebased

@linkdotnet
Copy link
Member

Hey @nulltoken - I am currently a bit busy. I try to have a closer look at the end of the week / weekend.
Sorry for the inconvenience

@nulltoken nulltoken force-pushed the ntk/monitor branch 4 times, most recently from 5b22a27 to ee59993 Compare January 7, 2025 23:12
@linkdotnet
Copy link
Member

One case would be, that jobs are not executed. Right now, we are using just timeouts for that, which isn't ideal to be honest (in many cases). If we can "see" that a job isn't scheduled for a reason, i think that would make some tests more verbose and easier to understand.

@linkdotnet Can you please orient me to some pieces of code relevant to that (in the lib and in the tests)?

For example: https://github.com/NCronJob-Dev/NCronJob/blob/main/src/NCronJob/JobExecutionContext.cs#L44

Skipping dependent Jobs. Right now, we are using measure like this:

(await WaitForJobsOrTimeout(1, TimeSpan.FromMilliseconds(500))).ShouldBe(false);

See: https://github.com/NCronJob-Dev/NCronJob/blob/main/tests/NCronJob.Tests/RunDependentJobTests.cs#L74

@linkdotnet
Copy link
Member

@nulltoken On an unrelated side note, do you have any GitHub Sponsor page, Kofi or so?

@nulltoken
Copy link
Collaborator Author

@nulltoken On an unrelated side note, do you have any GitHub Sponsor page, Kofi or so?

I do not.

@nulltoken nulltoken mentioned this pull request Jan 11, 2025
@nulltoken
Copy link
Collaborator Author

For example: https://github.com/NCronJob-Dev/NCronJob/blob/main/src/NCronJob/JobExecutionContext.cs#L44

Skipping dependent Jobs. Right now, we are using measure like this:

This one is a whole different beast. JobRuns of dependent jobs aren't even created yet when the skipping is done.
Logged as #160

@nulltoken
Copy link
Collaborator Author

@linkdotnet I think I'm done from the code standpoint on this first step and would welcome a review.

It fulfills the need of #147 and reports the start/completion of an orchestration (whether it would be made of simple job or a root job along with its dependencies).

It also contains the following fixes (covered by the tests):

  • Consider an Expired job state as a final state
  • Prevents the further state changes of job already in its final state

Identified additional features have been logged in separate features:

@nulltoken nulltoken requested a review from linkdotnet January 12, 2025 14:27
@linkdotnet
Copy link
Member

@linkdotnet I think I'm done from the code standpoint on this first step and would welcome a review.

It fulfills the need of #147 and reports the start/completion of an orchestration (whether it would be made of simple job or a root job along with its dependencies).

It also contains the following fixes (covered by the tests):

  • Consider an Expired job state as a final state
  • Prevents the further state changes of job already in its final state

Identified additional features have been logged in separate features:

First of all: Thanks very much for the PR! Really nice.

I guess it was the best decision to track those two things independently and not tackle them in this PR.

That said - let's bring this in. You might wanna add this to the CHANGELOG.md and maybe add some sentences and use cases to the documentation.

@nulltoken
Copy link
Collaborator Author

You might wanna add this to the CHANGELOG.md and maybe add some sentences and use cases to the documentation.

👍 Will work on that and report when it's done.

@nulltoken
Copy link
Collaborator Author

You might wanna add this to the CHANGELOG.md and maybe add some sentences and use cases to the documentation.

👍 Will work on that and report when it's done.

@linkdotnet Done

@nulltoken nulltoken changed the title [WIP] Implement experimental basic job execution progress reporting hooks Implement experimental basic job execution progress reporting hook Jan 13, 2025
@nulltoken nulltoken requested a review from linkdotnet January 13, 2025 11:44
var reporter = app.Services.GetRequiredService<IJobExecutionProgressReporter>();

// ...enlist a new subscriber to it...
IDisposable subscriber = reporter.Register(Subscriber);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe using var subscriber would be more idiomatic for .NET devs

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer keeping the IDisposable type under the light. That's a rather important aspect of the design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit on this?

Handling the Dispose explicitly can lead to many pitfalls.

Especially what if,during Register and Dispose there is an exception? The example should be somewhat idiomatic and lead developers to a pit of success.

Copy link
Collaborator Author

@nulltoken nulltoken Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit on this?

This example isn't the final way this should be implemented. It is contrived.

In the real life, there are few chances the subscriber will be a private function.
Chances are it'll rather be a type built and injected in the DI container. In that design, the type itself should be disposable (taking care of disposing the registered subscribers) and the overall disposal would be trigerred by the DI container.

However, as I was willing to build a not too lengthy example, I went with this.

While working on #36, it's even possible we'll have to build something similar (or an abstraction of it) and make it available to the users so that they don't have to reinvent the wheel.

Especially what if,during Register and Dispose there is an exception? The example should be somewhat idiomatic and lead developers to a pit of success.

This is an experimental feature. It's not finished. It may not work and/or change. It comes with rough corners and sharp edges 😉

The final developer experience will be ready when this feature is deemed stable.

The only goal of that sample is to showcase how the pieces move together and give a chance to potential early adopters to "play" with it.

It's even described as such in the previous paragraph ("Below a very simple approach....").

If you prefer, I can emphasize this with some words putting under the light this isn't a "copy-and-paste-production-ready" code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't get me wrong, this is totally fine to merge. That is why I already approved.

This is an experimental feature. It's not finished. It may not work and/or change. It comes with rough corners and sharp edges

True - but for me, it was more about how the user code is written. That is totally independent of our stability with the feature.

The code handles the disposable in a suboptimal way - but I get your verbosity here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True - but for me, it was more about how the user code is written. That is totally independent of our stability with the feature.

The code handles the disposable in a suboptimal way

I agree and we'll refine later that part of the documentation

  • but I get your verbosity here.

Thanks for coping with me 😉

Don't get me wrong, this is totally fine to merge. That is why I already approved.

Let's get this in then!

Copy link
Member

@linkdotnet linkdotnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nulltoken nulltoken force-pushed the ntk/monitor branch 2 times, most recently from 55caa71 to fd4c7e0 Compare January 13, 2025 12:04
@nulltoken nulltoken merged commit b814db8 into main Jan 13, 2025
4 checks passed
@nulltoken nulltoken deleted the ntk/monitor branch January 13, 2025 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make it easy to run integration tests against complex job orchestrations
2 participants