Feature request: Passing the scheduled start time to jobs #297

Closed
AdrianTeng opened this issue Mar 20, 2018 · 22 comments

@AdrianTeng

AdrianTeng commented Mar 20, 2018

Currently there is no way for a job to know its scheduled start time (it can only get the current time by calling datetime.now()). This feature would be useful for time-sensitive jobs (e.g. extra handling if the job is delayed).

@DeeVeX

DeeVeX commented Mar 23, 2018

@AdrianTeng
Author

@DeeVeX Nope. From what I understand from the docs, start_date sets the CronTrigger's first trigger time. What I want is to know, for each run of a job, its scheduled trigger time, and to have that passed into the job.

@DeeVeX

DeeVeX commented Mar 25, 2018

Oh that would be nice.
I could use the same thing for a job's last run to do cleanups!

@zhipcui

zhipcui commented Aug 29, 2018

Is anyone working on this?

@viiicky

viiicky commented Sep 7, 2018

Yes, this feature is useful for time-sensitive jobs. I also need it in my current project. Is there any work going on for this? I am also open to submitting a PR if someone can point me in the right direction.

Until then, a workaround I can think of is:

  1. When adding the job to the scheduler with add_job(), pass an explicit id and store it somewhere.
  2. When the job runs, fetch the job from the scheduler with get_job() using the stored id, and then compute:
    job.next_run_time - trigger's value
    Something of that sort should work.

But I am not sure if it is foolproof. Also, what would happen when using a DB job store if the scheduler goes down, jobs are missed, and then the scheduler comes back up? Would the calculation still be correct in that case?

Either way, waiting for this feature...

@viiicky

viiicky commented Sep 7, 2018

Also, I guess calling it scheduled_time/nominal_time instead of scheduled_start_time would be less confusing.
@AdrianTeng @DeeVeX

@viiicky

viiicky commented Sep 7, 2018

My use case, by the way, is explained below:

Suppose a job is scheduled to poll data from a server every 10 minutes.
The request made to the server might need a start_time and an end_time. Something like fetch_messages_from_smtp_server(start_time, end_time), or an equivalent REST API.

So, I will add the job to the scheduler with add_job, giving a start time of, say, 7th Sept 2018 12:00:00 UTC and an interval of 10 minutes.
Inside the job definition I would currently do something along these lines (pseudocode):

from datetime import datetime, timedelta

start_time = datetime.utcnow()
end_time = start_time + timedelta(minutes=10)
fetch_messages_from_smtp_server(start_time, end_time)

The catch is that, in an ideal world, I would expect start_time to take the following values in subsequent runs:
7th Sept 2018 12:00:00 UTC
7th Sept 2018 12:10:00 UTC
7th Sept 2018 12:20:00 UTC
.
.
.
But chances are that the actual times will be delayed for various reasons, in which case I might get undesired results from the server.

For example, if the actual first run time is 7th Sept 2018 12:00:30 UTC instead of the scheduled 7th Sept 2018 12:00:00 UTC, my request would ask for all messages between 7th Sept 2018 12:00:30 UTC and 7th Sept 2018 12:10:30 UTC. I would miss the messages from the first 30 seconds and would possibly get some extra messages from the last 30 seconds. Then, if the next run starts exactly on schedule, I would get duplicate messages for its first 30 seconds, since they were already retrieved by the previous run (a problem caused by the previous run, not this one).

If there were a way to access the actual scheduled_time/nominal_time instead of calling datetime.utcnow() for each run, this would not be a problem.
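
For a fixed-interval job like this one, a possible stopgap is to derive the nominal time from the known schedule instead of the clock. The following is only a minimal sketch under the assumption that the job knows its own first scheduled time and interval; `nominal_window` is a hypothetical helper and not part of APScheduler.

```python
from datetime import datetime, timedelta, timezone


def nominal_window(now, first_run, interval):
    """Return the (start, end) window of the most recent scheduled slot.

    Hypothetical helper: it floors `now` to the schedule grid defined by
    `first_run` and `interval`, so a late start still maps to its slot.
    """
    if now < first_run:
        raise ValueError("now is before the first scheduled run")
    elapsed_slots = (now - first_run) // interval  # whole intervals elapsed
    start = first_run + elapsed_slots * interval
    return start, start + interval


first_run = datetime(2018, 9, 7, 12, 0, 0, tzinfo=timezone.utc)
interval = timedelta(minutes=10)

# A run that actually started 30 seconds late still gets the 12:00 slot.
late_now = datetime(2018, 9, 7, 12, 0, 30, tzinfo=timezone.utc)
start_time, end_time = nominal_window(late_now, first_run, interval)
print(start_time.isoformat(), end_time.isoformat())
```

This only holds while a run is delayed by less than one interval; a run that starts after the next slot has begun would be mapped to that later slot, and missed runs are not detected at all.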

@richwifunds

I need this feature too for measuring latency. Seems like it would be very easy to implement.

Just add { "scheduled_run_time": run_time } or something like that to **kwargs in line 125 of the BaseExecutor class - "retval = job.func(*job.args, **job.kwargs)".

Anyway, I'm not sure of the best way to do it, but since the value is right there, including it in the job.func call should be easy enough.
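
The idea of adding the run time to the keyword arguments could look roughly like this. It is a stand-alone sketch, not APScheduler's actual executor code: `call_with_run_time` and the `scheduled_run_time` keyword are hypothetical names, and the value is only passed to functions that can accept it so that existing jobs keep working.

```python
import inspect
from datetime import datetime, timedelta, timezone


def call_with_run_time(func, args, kwargs, run_time):
    """Call func, adding the scheduled run time as a keyword argument.

    Hypothetical stand-in for the executor's
    `job.func(*job.args, **job.kwargs)` call.
    """
    params = inspect.signature(func).parameters
    accepts = "scheduled_run_time" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts:
        # Copy the dict so the job's stored kwargs are not mutated.
        kwargs = {**kwargs, "scheduled_run_time": run_time}
    return func(*args, **kwargs)


def measure_latency(actual_start, scheduled_run_time):
    # Latency is how far the actual start lags behind the schedule.
    return (actual_start - scheduled_run_time).total_seconds()


def legacy_job(name):
    # A job that does not know about the new keyword.
    return name


run_time = datetime(2018, 9, 7, 12, 0, tzinfo=timezone.utc)
actual_start = run_time + timedelta(seconds=2)
latency = call_with_run_time(measure_latency, (actual_start,), {}, run_time)
unchanged = call_with_run_time(legacy_job, ("poll",), {}, run_time)
print(latency, unchanged)
```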

@vision57

I think this is a widely needed feature in cron-like schedulers such as this one, but it's quite strange that neither apscheduler nor quartz provides it.

@agronholm
Owner

This will be implemented in v4.0 in such a manner that the target function can receive all sorts of information about the job.

@richwifunds

Any update on this?

@richwifunds

@agronholm - has this been implemented and if not do you still plan to implement it and when?

@agronholm
Owner

Preliminary work has been done but I'm mostly focusing on the highlight feature (data store sharing) now. It will take some time before 4.0 is in any usable state.

@5uw1st

5uw1st commented Jul 21, 2020

I hope a new event (JOB_EVENT_START) can be added for this.

@Dzeri96

Dzeri96 commented May 16, 2022

Is there any news on this? Do you need contributors, since the feature doesn't seem to be moving forward?

@agronholm
Owner

The scheduled start time will not be passed to the scheduled function directly, but will be available through a context variable. I'm not sure if this will make it into 4.0.0a1, but it should make it into the first beta.

@Dzeri96

Dzeri96 commented May 16, 2022

@agronholm Thanks for the quick response. Is there an ETA on v4? I've just started using this library.

@agronholm
Owner

I've stopped giving out ETAs as they have passed me by one by one. It's best to just follow #465 for progress updates. The first alpha only requires a couple more pushes, as soon as I can muster the willpower. I have quite a few other projects to maintain too, and those take their own share of my free time.

@Dzeri96

Dzeri96 commented May 16, 2022

Alright, thanks for the update. I'm guessing that the onboarding process might be so difficult at the moment that it's not worth asking for contributions from your side.

@agronholm
Owner

If you really need this feature, you need to fork the project and modify your executor of choice to add this information to a context variable (or thread-local). That way the eventual transition to v4.0 should be relatively painless. Unfortunately, I'm not accepting contributions to the 4.0 code until the code base is stable enough.
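
A context-variable approach along these lines might be sketched as follows. This uses only the standard library and is not taken from APScheduler; `scheduled_run_time` and `run_with_context` are hypothetical names for what a forked executor could set around each job call.

```python
import contextvars
from datetime import datetime, timezone

# Hypothetical context variable; a forked executor would set it per run.
scheduled_run_time = contextvars.ContextVar("scheduled_run_time")


def run_with_context(func, args, kwargs, run_time):
    """Run func with the scheduled run time visible via the context variable."""
    token = scheduled_run_time.set(run_time)
    try:
        return func(*args, **kwargs)
    finally:
        # Restore the previous state so runs do not leak into each other.
        scheduled_run_time.reset(token)


def my_job():
    # The job reads the value instead of receiving it as an argument.
    return scheduled_run_time.get()


run_time = datetime(2022, 5, 16, 12, 0, tzinfo=timezone.utc)
result = run_with_context(my_job, (), {}, run_time)
print(result.isoformat())
```

Because the variable is set inside the same call that runs the job, the job's signature stays unchanged, and the value is confined to the thread or task executing that run.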

@Dzeri96

Dzeri96 commented May 16, 2022

I'll just write a function that parses the DB and inserts missing data for now. Looking forward to the release though.

@agronholm
Owner

Implemented in v4.0.0a1 via the contextvar apscheduler.current_job.
