cron job stops after certain hours. #232

Closed
mek-omkar opened this issue May 15, 2016 · 49 comments

@mek-omkar

mek-omkar commented May 15, 2016

var CronJob = require('cron').CronJob;

var job = new CronJob({
  cronTime: '*/1 * * * * *', // every second
  onTick: function () {
    console.log(new Date(), 'tick triggered');
  },
  onComplete: function () { /* */ },
  start: true,
  runOnInit: true
});
job.start();

This job hung, and I stopped seeing any logs after some hours of execution. I am using Node.js 6.1.0 and cron 1.1.0.

Does cron support every-second jobs? And if we don't specify a time zone, which time zone does the job use?

Thanks

@frabbit

frabbit commented May 18, 2016

I can confirm this bug. It was a pain to debug because I thought the problem was on my side.

@TJNevis

TJNevis commented Jun 9, 2016

I had the same issue...for me, it runs for a few days just fine - I have an hourly cron - and 2 days ago it stopped with no errors in the node log.

@KKKSzili

I can confirm this as well. I have a cron job running every second, but it hangs after a while. The issue is critical for me because I use it to switch off a water pump (when I do not have enough water in my well).
I'm running it on a Raspberry Pi with node v4.2.1 and cron 1.1.0.

// Implementation
new CronJob('* * * * * *', function () {
  // My code here
}, null, true, 'Europe/Budapest', null, true);

Any help would be appreciated.
Thank you!

@jeanmatthieud

Same problem here. I did not find any reason for this in my code (every error is logged with Sentry, and nothing in particular happens).

@colmaengus

I have seen this problem too. We have an every-minute cron job that runs for days and days and then just stops. I'm tracking the duration between ticks, and it's normally 60 seconds. The last gap before it stopped was 147 seconds. Maybe this has something to do with the root cause?
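Tracking the gap between ticks, as described above, could look roughly like the following. This is only a minimal sketch: `createTickMonitor`, its parameters, and the 5-second tolerance are hypothetical choices, not part of the cron module, and the job's onTick handler would have to call `recordTick()` itself.

```javascript
// Hypothetical helper (not part of cron): records tick times and reports
// gaps that exceed the expected interval by more than a tolerance.
function createTickMonitor(expectedMs, toleranceMs, onLateTick) {
  let lastTick = null;
  return {
    // Call this at the start of onTick. `now` is injectable for testing.
    recordTick(now = Date.now()) {
      const gap = lastTick === null ? null : now - lastTick;
      lastTick = now;
      if (gap !== null && gap > expectedMs + toleranceMs) {
        onLateTick(gap);
      }
      return gap;
    },
    // Milliseconds since the last tick, or null if none was recorded yet.
    sinceLastTick(now = Date.now()) {
      return lastTick === null ? null : now - lastTick;
    }
  };
}

// Example: a once-a-minute job where the last gap was 147 seconds.
const late = [];
const monitor = createTickMonitor(60000, 5000, (gap) => late.push(gap));
monitor.recordTick(0);
monitor.recordTick(60000);
monitor.recordTick(207000); // 147 s after the previous tick
console.log(late); // [ 147000 ]
```

An external check could poll `sinceLastTick()` to notice a job that has gone silent, since a dead job never calls `recordTick()` again.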

@mek-omkar
Author

Yes, I suspect that too, @colmaengus.

@Hithim

Hithim commented Jul 31, 2016

Hello guys.
I've also hit this bug. It's really hard to reproduce; for me it happens under intense CPU load and memory allocation.
So far I've figured out that the task is stopped here: https://github.com/ncb000gt/node-cron/blob/master/lib/cron.js#L453 .
So for some reason the timeout becomes a negative value. I tested with the */1 * * * * * pattern.

@colmaengus

We are seeing this quite a lot now. After adding some extra logging, it looks like it happens when node is running so slowly that the next tick time arrives while the library is still working out what that next time should be:
start of getTimeout generated "now" as 12:45:01 at 12:45:08.329
exit of _getNextDateFrom logged next tick to be 12:46:00 at 12:45:52.372
exit of sendAt logged next tick to be 12:46:00 at 12:46:00.399

Apart from a discrepancy of a few seconds in the logs, it appears that _getNextDateFrom took so long to run that the next tick time had already come around. The cron job was set to run every minute, so there should have been at most 60 iterations through the while loop. I don't know what would cause such an extreme slowdown, other than the node.js process in general being starved of CPU cycles.

[2016-08-18 12:45:08.329] [INFO] scheduler - getTimeout: 2016-08-18T12:45:01+01:00
[2016-08-18 12:45:52.372] [INFO] scheduler - _getNextDateFrom: 2016-08-18T12:46:00+01:00
[2016-08-18 12:46:00.399] [INFO] scheduler - sendAt: 2016-08-18T12:46:00+01:00
[2016-08-18 12:46:01.032] [INFO] scheduler - timeout:-1
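To make the race in these logs concrete, here is a small illustration using the timestamps above. It is not the library's code: `delayUntil` is a hypothetical stand-in for "target time minus the clock reading taken when the delay is computed".

```javascript
// Illustrative only: mimics the race in the logs above, not the library's code.
// The delay is the target time minus "now" as read when the delay is computed.
function delayUntil(nextTickMs, nowMs) {
  return nextTickMs - nowMs;
}

const nextTick = Date.parse('2016-08-18T12:46:00+01:00');

// Computed promptly: 59 seconds remain.
const promptDelay = delayUntil(nextTick, Date.parse('2016-08-18T12:45:01+01:00'));

// Computed after the stall: the target is already 1.032 s in the past.
const stalledDelay = delayUntil(nextTick, Date.parse('2016-08-18T12:46:01.032+01:00'));

console.log(promptDelay, stalledDelay); // 59000 -1032
```

Once the delay is computed after the target has passed, the result is negative, which matches the negative timeout reported earlier in this thread.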

@akhare22sandeep

akhare22sandeep commented Oct 5, 2016

We are also facing the same issue. Our job runs every second and works perfectly for a day or two at most, but after that we don't see any logs, and connections to the DB are also lost. It just hangs... Restarting the job works fine. Until now I thought it was an issue in our code, but other people are facing the same problem. Please let us know if there is a workaround for this; otherwise we will have to change the module. Help is appreciated.

@ncb000gt
Member

ncb000gt commented Oct 7, 2016

This issue is the same as #231 - I'm looking into this now.

Sorry for the delayed response. I've had no time to look into my open source projects. Thanks for digging into it - I'll share anything I find here.

@anthonywebb

Keep us posted, thanks!

@soundslocke

Glad to see some more info being uncovered. This has been going on a while, see #141. A fix was attempted with #147.

@medisoft

Is this fixed in 1.3? With 1.1 I'm still having the problem.

@Shayko94

Unfortunately the error is still there. Cron just stops after some hours without any error message :(

@colmaengus

As a workaround, I'm using the onComplete event: if the job is meant to be still running, I call start again.
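A rough sketch of that workaround follows. It uses a stand-in job object so it runs without the cron module; `makeRestartOnComplete`, `wanted`, and `fakeJob` are hypothetical names. It assumes the library invokes onComplete whenever the job is stopped, including when it stops itself, which is what the workaround above relies on.

```javascript
// Sketch of the workaround, using a stand-in job object so it runs without
// the cron module. With a real CronJob, the function returned here would be
// passed as the onComplete argument. All names below are hypothetical.
function makeRestartOnComplete(getJob, shouldBeRunning) {
  return function onComplete() {
    // stop() was called. If we did not ask for it, start the job again.
    if (shouldBeRunning()) {
      getJob().start();
    }
  };
}

// Stand-in with the same start/stop/onComplete shape as a job.
let wanted = true;
const fakeJob = {
  starts: 0,
  start() { this.starts += 1; },
  stop() { this.onComplete(); }
};
fakeJob.onComplete = makeRestartOnComplete(() => fakeJob, () => wanted);

fakeJob.start();
fakeJob.stop();   // unexpected stop: restarted
wanted = false;
fakeJob.stop();   // intentional stop: left stopped
console.log(fakeJob.starts); // 2
```

The `wanted` flag is what separates a deliberate stop from the silent one discussed in this thread; without it the job could never be shut down on purpose.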

@gonscenna

Error is still there on version 1.3

@Alessy

Alessy commented Jul 19, 2018

I really don't understand why the start function calls stop when timeout < 0.

@ncb000gt
Member

I've merged in a few PRs that should help with this. Please let me know if you're still having the issue.

@Alessy negative timeouts aren't valid. I could change the behavior to just keep going, but it would likely cause a skip. Really this should either send a warning to the console or throw. Which behavior would you prefer to see?

@ncb000gt
Member

Closing for now. If this is still an issue we can take it as a new issue.

@hebo-hebo

Does anyone know which commit contains this fix?

@hebo-hebo

hebo-hebo commented Oct 17, 2018

@gonscenna

> Error is still there on version 1.3

Does 1.4.1 make a difference in your case? Thanks

@dptole

dptole commented Dec 4, 2018

@hebo-hebo Same problem here.

The cronjob._timeout property is very similar to the return value of the setTimeout function. Sometimes it has property values that indicate a dead cronjob._timeout that will never run, for example:

cronjob._timeout._idlePrev = null
cronjob._timeout._idleTimeout = -1

In my experience this problem seems to happen when there are CPU-intensive operations running. node-cron tries to create a new setTimeout but fails and can't recover. That's my hypothesis at least.

My temporary solution was to create another cronjob that resurrects dead cronjobs. This is how I try to find dead cronjobs:

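// FILE: cronjob-a.js
const CronJob = require('cron').CronJob;
let cronjob = new CronJob(CRON_TIME, CRON_JOB_FUNCTION);

// FILE: ressurect-cronjobs.js
// GET_ALL_CRONJOBS and RESSURECT_FUNCTION are placeholders for your own code.
for (const cronjob of GET_ALL_CRONJOBS()) {
  // A negative _idleTimeout marks a timer that will never fire.
  if (cronjob._timeout._idleTimeout < 0) {
    RESSURECT_FUNCTION(cronjob);
  }
}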
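For anyone who wants to try the check above without wiring up real jobs, here is a self-contained sketch with stub objects. `isTimerDead` and `sweep` are hypothetical names, and the `_timeout._idleTimeout` field is an undocumented timer internal, so treat this as an assumption that may not hold across Node.js versions.

```javascript
// Relies on the undocumented _timeout._idleTimeout field described above;
// it is a Node.js timer internal and may change between versions.
function isTimerDead(job) {
  const timer = job._timeout;
  // A job with no timer object (never started) is not reported as dead.
  return Boolean(timer) && timer._idleTimeout < 0;
}

// Runs `resurrect` (caller-supplied, hypothetical) on each dead job and
// returns how many were found.
function sweep(jobs, resurrect) {
  const dead = jobs.filter(isTimerDead);
  dead.forEach((job) => resurrect(job));
  return dead.length;
}

// Stub objects standing in for cron jobs.
const jobs = [
  { name: 'a', _timeout: { _idleTimeout: 1000 } },
  { name: 'b', _timeout: { _idleTimeout: -1, _idlePrev: null } },
  { name: 'c' } // never started: no timer to inspect
];
const revived = [];
const count = sweep(jobs, (job) => revived.push(job.name));
console.log(count, revived); // 1 [ 'b' ]
```

A job that was stopped on purpose may well look the same as a dead one under this check, so the caller would probably need to track which jobs are supposed to be running.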

@ncb000gt
Member

ncb000gt commented Dec 4, 2018

@dptole Interesting. It would be great to have something like this built into the library, but if it can't get the timeout then it likely won't succeed at getting a new one a few ticks later, and it doesn't make sense to use a timeout to try to get a new timeout.

For now it makes sense to me to use the approach you're using here, and I'll mull this over a bit.

@colmaengus

How about the following?

  1. Add a check to determine whether a cron job is periodic or one-shot
  2. Add an internal _restart function that does stop/start without calling onComplete
  3. If the cron job is periodic and the timeout goes below 0, call _restart() instead of stop()

I'm doing pretty much this in an external wrapper.
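The decision in those three steps could be sketched as below. `actionForTimeout` is a hypothetical helper written for illustration; it is neither library code nor the wrapper mentioned above.

```javascript
// Hypothetical decision helper for the three steps above; not library code.
// `periodic` says whether the schedule repeats; `timeoutMs` is the computed
// delay until the next tick.
function actionForTimeout(periodic, timeoutMs) {
  if (timeoutMs >= 0) return 'schedule';  // normal case: arm the timer
  return periodic ? 'restart' : 'stop';   // negative delay: the tick was missed
}

console.log(actionForTimeout(true, 59000)); // schedule
console.log(actionForTimeout(true, -1));    // restart
console.log(actionForTimeout(false, -1));   // stop
```

Keeping the decision separate from the timer code makes the negative-timeout branch easy to unit test, which matters for a bug that is hard to reproduce under real load.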

@dptole

dptole commented Dec 5, 2018

@colmaengus correct me if I'm wrong, but what I took from what @ncb000gt said was:

  • This is a good provisional solution, but
  • the fact that setTimeout/setInterval sometimes fail means our wrappers aren't candidates for production code.

Unless you are doing these checks without the setTimeout/setInterval functions (although I think it is being done internally).

In my opinion, if the setTimeout/setInterval failures are core bugs, the ultimate solution would require spawning a new process to keep an eye on that. But that would require a lot more work to fix an issue that happens in very specific use cases.

Maybe updating the README.md to warn people about this issue would help... maybe an issue should be created on the nodejs core library... I don't know. But a final solution, I think, should come from v8/nodejs releases.

@ncb000gt
Member

ncb000gt commented Dec 6, 2018

@colmaengus @dptole I'd be very skeptical that timers and intervals were somehow not working in node code. I would definitely assume it's the library before suspecting the node implementation.

@colmaengus I think that sounds reasonable. It should already check to see if the job is a one shot or periodic job and there is some checking to determine if the timeout is too large. Clearly, we need the opposite.

@hebo-hebo

I saw it happen three times on my two different servers. Is there a utility like pstack or jstack that can peek into the node.js application to see where it got stuck? Or would it be possible to emit log messages when it runs into this situation (no more cron jobs launched), so we can confirm the root cause for sure?

@zhangxiang958

Hello guys, is this bug still not fixed in 2.0.3? Or does anyone know which commit contains the fix?

I'm using version 1.1.0, and found that the cron job stops when set to run every second:

(new CronJob('* * * * * *', () => {}, null, true, 'Asia/Shanghai')).start()

Does anyone know why this bug happens? It is really hard to reproduce. I tried creating 5000 processes on my server to keep the CPU and memory busy, but still couldn't reproduce it. Does anyone know how to reproduce this bug?

@ncb000gt
Member

ncb000gt commented Jan 8, 2019

@zhangxiang958 The module is at 1.6.0. I'd recommend trying that version.

As far as reproduction, that's part of the problem. People are/were hitting it, but it's not easy to recreate in a test.

I made some changes to the module with the latest version in the middle of December. Try that version and let me know if you're still running into this.

@hebo-hebo

My two servers are using the old cron version 1.1.0. There are 4 tasks configured, and each runs every four minutes. Cron stops firing jobs after 25 days. This has already happened 3 times on my two servers. I'm curious to know whether there was a known bug like that.

@ncb000gt
Member

@hebo-hebo There have been reports like that in the past. The causes have varied, from time zones to a couple of other possibilities. So far we haven't seen reports of this in the latest version, so presumably the issue is resolved. Please let us know if you see this behavior on the latest version. Thanks!

@eranbetzalel

Our production tests show that a job failed to execute because a node-cron job did not 'tick'. The cron was defined as "14 */1 * * * *" and executed successfully every minute, but stopped at 4am for some reason.
I had to fall back to setInterval, as I can't trust node-cron to be resilient enough.
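For illustration, a timer-only fallback for a schedule like "14 */1 * * * *" (second 14 of every minute) might look like the sketch below. Note that it swaps in a self-re-arming setTimeout rather than the setInterval mentioned above, so each delay is recomputed from the clock. `msUntilSecond` and `runAtSecond` are hypothetical names, and this is not the code used in the report.

```javascript
// Milliseconds from `nowMs` until the next time the clock's seconds field
// equals `second` (0-59). Always returns a value in (0, 60000].
function msUntilSecond(second, nowMs) {
  const intoMinute = nowMs % 60000; // ms elapsed in the current minute
  const delta = second * 1000 - intoMinute;
  return delta > 0 ? delta : delta + 60000;
}

// Re-arms itself before every run, recomputing the delay from the clock so
// that a late tick does not shift later ones. Returns a cancel function.
function runAtSecond(second, task) {
  let timer = null;
  function arm() {
    timer = setTimeout(() => {
      arm();  // schedule the next run first, so a throwing task cannot stop the chain
      task();
    }, msUntilSecond(second, Date.now()));
  }
  arm();
  return () => clearTimeout(timer);
}

console.log(msUntilSecond(14, Date.parse('2019-04-04T04:00:10Z'))); // 4000
console.log(msUntilSecond(14, Date.parse('2019-04-04T04:00:14Z'))); // 60000

// Usage: start the schedule, then cancel it right away so this example exits.
const cancel = runAtSecond(14, () => console.log('tick'));
cancel();
```

A more careful version might also guard against a timer firing a millisecond early, which could otherwise run the task twice in quick succession.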

@ncb000gt
Member

ncb000gt commented Apr 4, 2019

@eranbetzalel You can do what you feel is best obviously. Sorry that the job didn't execute. Which version of the module were you using? What were the conditions on the system, high load and what kind of processing was the ontick handling for you?

@eranbetzalel

You're right, forgot to mention that.

Version 1.6.0

I didn't see any CPU high-load in Google's CPU graphs.

The onTick ran a lambda expression that runs some job execution...

@ncb000gt
Member

ncb000gt commented Apr 4, 2019

@eranbetzalel The latest version is 1.7.0 to fix an issue related to DST found in GH-408.

Given the time frame I suspect this may be what happened. Would you be interested in confirming?

@eranbetzalel

I'll look into it whenever I have some free time, probably not in the near future.

@ncb000gt
Member

ncb000gt commented Apr 4, 2019

@eranbetzalel ok. regardless, thanks for letting me know you ran into an issue.

@aaxc

aaxc commented Feb 2, 2021

Problem is still there, just to let you all know

@abrar71

abrar71 commented Aug 8, 2021

I can also confirm the issue still exists

@ChrisvanChip

Just experienced this issue; it still needs fixing.

@Sir-hennihau

Another one to confirm this exists.

I thought my server had a memory leak and I was investigating why it crashed. Took me weeks to fix. In the end I replaced node-cron and all my problems disappeared. Sorry to say that, but that's a really bad bug, especially if no error is thrown when the server crashes.

@LucCADORET

I think I experienced this bug today also. Hard to know since I couldn't set a breakpoint and debug, but everything matches: 1 second cron job, no error thrown.

@iamkhalidbashir

Same problem for me on tiny AWS instances with low CPU and RAM.

@spandey1296

spandey1296 commented Jan 20, 2023

I have scheduled a cron job on the server, but it stops automatically after 1-2 days, or sometimes later.
@ncb000gt please help look into it.
Using "cron": "1.7.1",
"cron-parser": "^3.5.0",

@spandey1296

@eranbetzalel You can do what you feel is best obviously. Sorry that the job didn't execute. Which version of the module were you using? What were the conditions on the system, high load and what kind of processing was the ontick handling for you?

@TinyDinosaur

TinyDinosaur commented May 25, 2023

Hey guys, this is still happening on version 3.0.0. Same symptoms: no exceptions, no warnings, nothing. It simply stops executing the cron job. This usually happens after 4 to 6 hours for me.

@intcreator
Collaborator

that's crazy seeing as we're only on version 2.3.1. what do you have installed in your package.json?

@TinyDinosaur

TinyDinosaur commented May 27, 2023

Omg, I have to say I was wrong. This was happening with the node-cron library; I had two tabs open and made the comment on the wrong one. I'm so sorry for the trouble. All is well, this is the one that actually works. Again, sorry.

@intcreator
Collaborator

no worries haha. feel free to switch if you want

@kelektiv kelektiv locked as resolved and limited conversation to collaborators Jun 7, 2023