"status" command results in error with too many jobs #234

Open
yhal003 opened this issue Mar 21, 2012 · 13 comments

Comments

@yhal003
Contributor

yhal003 commented Mar 21, 2012

Our old friend "413 Entity Too Large" happens because the method runs for too long.

@makkus

makkus commented Apr 2, 2012

What do you suggest? Kinda hard to solve this....

@yhal003
Contributor Author

yhal003 commented Apr 2, 2012

Can we break the request into batches of (say) 50 jobs per call?
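A minimal sketch of that batching idea on the client side, assuming a hypothetical StatusService as a stand-in for whatever the real Grisu service call would be (the names here are illustrative, not the actual API):

```java
// Sketch only: query job statuses client-side in chunks of 50, so no single
// request (or response) grows large enough to trigger "413 Entity Too Large".
// StatusService is a hypothetical stand-in for the real service call.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchedStatusClient {

    private static final int BATCH_SIZE = 50;

    /** Hypothetical remote interface: returns a status per job name. */
    public interface StatusService {
        Map<String, String> getStatuses(List<String> jobNames);
    }

    /** Splits the job list into batches and merges the per-batch results. */
    public static Map<String, String> queryAll(StatusService service, List<String> jobNames) {
        Map<String, String> result = new HashMap<String, String>();
        for (int start = 0; start < jobNames.size(); start += BATCH_SIZE) {
            int end = Math.min(start + BATCH_SIZE, jobNames.size());
            List<String> batch = new ArrayList<String>(jobNames.subList(start, end));
            result.putAll(service.getStatuses(batch));
        }
        return result;
    }
}
```

The point is simply that each call carries at most 50 job names, so neither the request nor the response should grow large enough to trip the 413 limit.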

@makkus

makkus commented Apr 11, 2012

Not easily at all. I'd say this should be considered if we decide to (re-)add batch support to gricli/grisu, but on its own I think it'd be too big a change for only a limited number of users. Those users could, for example, use a local backend, which would speed things up for them anyway, since a local backend is quicker than a ws-based one...

@vladimir-mencl-eresearch
Member

I remember getting that one quite a few times when running large jobs. I would vote for a proper fix ... when Gricli was managing a large number of jobs, things were falling apart (random errors). But I haven't tried running a large batch recently....

@makkus

makkus commented Apr 19, 2012

Like I said: we'd need to implement proper, stable batch support. At the moment we are using loads of single jobs, created by outside scripts, to deal with batches of jobs. It's just not possible to cater for that in a viable way.

If we had batch support in Grisu we could "hide" the child jobs in the job list and only show the "parent" job, and get more details on it if necessary.
But until we have that, having too many jobs (whatever that number is) active in Grisu is just not supported, and something like this is nowhere to be seen in the NeSI milestones.
Happy to implement batch support, but, as always, it's a matter of priorities.
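For illustration, a rough sketch of how a listing could hide child jobs behind their parent batch job if batch support existed; the Job shape here is invented and does not reflect the actual Grisu model:

```java
// Sketch only: list standalone jobs plus one entry per batch, hiding the
// individual child jobs. The Job shape is made up for illustration.
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BatchListingExample {

    public static class Job {
        final String name;
        final String parentBatch; // null if the job is not part of a batch

        public Job(String name, String parentBatch) {
            this.name = name;
            this.parentBatch = parentBatch;
        }
    }

    /** Returns only top-level entries: standalone jobs and one line per batch. */
    public static List<String> topLevelListing(List<Job> jobs) {
        Set<String> seenBatches = new LinkedHashSet<String>();
        List<String> listing = new ArrayList<String>();
        for (Job job : jobs) {
            if (job.parentBatch == null) {
                listing.add(job.name);
            } else if (seenBatches.add(job.parentBatch)) {
                listing.add("[batch] " + job.parentBatch);
            }
        }
        return listing;
    }
}
```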

@vladimir-mencl-eresearch
Member

Aha. OK, I agree proper batch support would be the real solution.

I'm just not sure we should settle for saying "Grisu doesn't support large numbers of jobs" - that just makes our infrastructure flaky....

@yhal003
Contributor Author

yhal003 commented Apr 19, 2012

We did a lot of work to make the backend stable. It should support 10,000 jobs per day or more without much trouble, and gricli is fine working with thousands of jobs even now; it is just the status command...


@makkus

makkus commented Apr 19, 2012

Ah, right. I see. Sorry, misunderstood.

Totally forgot about this command :-)

Yes, I think that should be possible. Will do.

@makkus

makkus commented Apr 19, 2012

Hm. Actually, thinking about it, it's not all that easy; it will require some change to the serviceinterface.

What about having a status command in the API? I guess that would be useful, and it could be processed on the backend itself. Might have to play with how to implement it (whether to use cached job statuses and such), but that would be easier....
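To make that concrete, a rough sketch of what a backend-side status call might return, assuming it only aggregates statuses the backend already holds (cached or not); StatusSummaryExample is hypothetical and not part of the current serviceinterface:

```java
// Sketch only: a backend-side status summary that aggregates (possibly cached)
// job statuses into counts, so the response stays small regardless of how many
// jobs the user has. This is a hypothetical method, not an existing API.
import java.util.Map;
import java.util.TreeMap;

public class StatusSummaryExample {

    /** Hypothetical summary: number of jobs per status, computed server-side. */
    public static Map<String, Integer> buildSummary(Iterable<String> jobStatuses) {
        Map<String, Integer> countsByStatus = new TreeMap<String, Integer>();
        for (String status : jobStatuses) {
            Integer current = countsByStatus.get(status);
            countsByStatus.put(status, current == null ? 1 : current + 1);
        }
        return countsByStatus;
    }
}
```

Returning only per-status counts would keep the response size roughly constant no matter how many jobs the user has; whether the underlying statuses are cached or refreshed is left open, as in the comment above.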

@yhal003
Contributor Author

yhal003 commented Apr 19, 2012

But that command would still take a lot of time when the user has lots of jobs. Or do you want to have some "status" state that gets updated when the user calls other methods, and the "status" method just returns that value/data structure?


@makkus

makkus commented Apr 19, 2012

Not sure I understand what you mean. You are saying that whenever another call is made (or every 5 minutes), all job statuses should be updated, and when the status call is made, only a current snapshot of all jobs (with partly cached/outdated statuses) is used?

@yhal003
Contributor Author

yhal003 commented Apr 19, 2012

Yeah, maybe not every 5 minutes, but on event notifications (we do have working event notifications now, right?)
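Roughly what that could look like, assuming some notification hook existed; JobStatusListener is a made-up callback, and how it would be wired into the JobSubmitter/GT5Submitter is exactly the open question in the next comment:

```java
// Sketch only: a server-side cache that is updated from job event
// notifications, so the "status" call just returns the current snapshot
// instead of polling every job. JobStatusListener is a made-up hook.
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class JobStatusCache {

    /** Hypothetical callback a submitter could invoke on status changes. */
    public interface JobStatusListener {
        void onStatusChange(String jobName, String newStatus);
    }

    private final Map<String, String> snapshot = new ConcurrentHashMap<String, String>();

    /** Listener to register with the submitter; keeps the snapshot current. */
    public JobStatusListener listener() {
        return new JobStatusListener() {
            public void onStatusChange(String jobName, String newStatus) {
                snapshot.put(jobName, newStatus);
            }
        };
    }

    /** The status command only reads the snapshot; it never polls jobs itself. */
    public Map<String, String> currentSnapshot() {
        return new HashMap<String, String>(snapshot);
    }
}
```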


@makkus

makkus commented Apr 19, 2012

Not sure; if we do, then only for the GT5Submitter. It would require a change to the JobSubmitter interface, I guess.
