JOBS shouldn't be a sub-resource of Processes #69

matthias-mueller · 2020-05-11T12:22:02Z

The current proposal is to expose jobs as a sub-resource of a process:

GET /processes/{process-id}/jobs/{job-id}/results

I think jobs should be on the same level as processes, e.g.:

GET /jobs/{job-id}/results

Why?

Algorithms (= processes) should be managed independently from actual computations (= jobs)
The process description might be updated after job execution/completion; keeping the job as a sub-resource would then be misleading, because the process definition has changed
The process might be deleted from the server, but the job's results are still stored
Jobs are usually pooled at the service level, not per-process/algorithm


bpross-52n · 2020-05-11T14:06:01Z

Hi Matthias, I see your point. The idea was that you send a POST request to /processes/{process-id}/jobs to create a new job. Do you suggest changing this execute endpoint, too?

matthias-mueller · 2020-05-12T13:59:53Z

I would probably post new jobs to /jobs and expect a link to the newly created job (/jobs/{job-id}) in the response.
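
As a rough illustration of that flow (not part of the draft), here is a minimal sketch assuming an in-memory store. The `create_job` helper, the `processId` body field, and the `accepted` status value are all hypothetical names chosen for the example:

```python
import uuid

# Hypothetical in-memory job store; a real server would persist this.
JOBS = {}

def create_job(body):
    """Handle POST /jobs: the process id comes from the request body."""
    process_id = body["processId"]  # assumed field name, not from the draft
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {
        "processId": process_id,
        "inputs": body.get("inputs", {}),
        "status": "accepted",  # assumed initial status
    }
    # 201 Created plus a Location header pointing at the new job resource.
    return 201, {"Location": f"/jobs/{job_id}"}

status, headers = create_job({"processId": "buffer", "inputs": {"distance": 10}})
print(status, headers["Location"])
```

The client only needs the returned link to monitor the job; the process id is stored as a plain attribute of the job rather than as part of its URL.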

bpross-52n · 2020-05-14T09:47:09Z

Got it, this would require us to re-introduce the process id in the execute JSON. I would postpone discussion of this to after the release of the draft documents.

ghobona · 2020-06-15T16:58:09Z

SWG decision on 2020-06-15 that this will be discussed after the draft document has been released for Public Comment. Consistent with the suggestion in #69 (comment)

m-mohr · 2020-07-28T09:51:03Z

I agree with this, we do it very similarly in openEO: https://api.openeo.org/#tag/Batch-Jobs
Once you go into process chaining (#47) the per process jobs will be a problem anyway.

fmigneault · 2020-09-15T15:49:56Z

Both variants to retrieve jobs and their underlying results are also supported by CRIM's ADES/EMS.

GET /jobs/{jobId}
GET /processes/{id}/jobs/{jobId}

Adding the POST directly on jobs would only require providing the process ID via the body rather than via the path.

Contrary to @m-mohr, I don't feel per-process job references are a problem during chaining, as that chain should already know about the chained processes being executed one after the other anyway (we are executing workflows in this manner without issue). Each job can easily maintain a reference to the process that created it, so both links should be equivalent.

christophenoel · 2020-09-16T07:08:04Z

I'm open to this different approach, but an argument in favour of the process sub-resources is that the Job POST request itself depends on the specific process.

Nevertheless "The process might be deleted from the server, but the job's results are still stored" is also a good point :)

bpross-52n · 2020-10-30T14:00:38Z

We discussed this in the SWG telecon on Monday, October 19th.

We propose to add an additional endpoint /processes/jobs that lists the jobs independent of the process identifier.

@matthias-mueller would you accept this solution?

matthias-mueller · 2020-10-30T14:41:30Z

Listing job IDs in some place is only a small part of this issue (see all of comments 1 and 3). More important is the overall process from job creation, via job monitoring and result retrieval, to job dismissal. Which endpoints are involved in these steps, and where do you POST new jobs?

pvretano · 2020-10-30T16:05:44Z

Hmm ... perhaps two top-level resources, one for process management and one for job management, as @matthias-mueller proposes, would not be bad. Just thinking out loud here ...

For the /processes resource:

GET /processes gets the list of processes
GET /processes/{processId} get the process description
POST /processes create a new process (aka Transactional WPS)
PUT /processes/{processId} updates a process definition
DELETE /processes/{processId} deletes the process

For the /jobs resource:

GET /jobs - gets a list of all jobs
GET /jobs/{processId} gets a list of jobs for a specific {processId}
GET /jobs/{processId}/{jobId} get the status of a specific job
GET /jobs/{processId}/{jobId}/results get the results of a job
POST /jobs/{processId} create a new job
DELETE /jobs/{processId}/{jobId} deletes or cancels a job

For the /jobs resource, the {processId} identifier must be from the list of identifiers obtained by accessing the /processes resource.

This organization has the added side effect of making process invocation more natural with a GET in addition to being able to invoke the job with a POST.

Example: http://www.someserver.com/ogcapi/jobs/MyProcess?mode=async&input01=val01&input02=val02&bbox=1,2,3,4&output=tiffImage

Just thinking out loud based on @matthias-mueller's comment above. Comments?
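
To make the GET-style invocation concrete, here is a minimal sketch of how a server might take the example URL apart. The `parse_get_execute` helper and the `sync` default for `mode` are assumptions made for illustration; the thread does not define either:

```python
from urllib.parse import urlsplit, parse_qsl

def parse_get_execute(url):
    """Split a GET-style execute URL into process id, mode, and inputs."""
    parts = urlsplit(url)
    # Under the /jobs/{processId} layout the last path segment is the process id.
    process_id = parts.path.rstrip("/").split("/")[-1]
    params = dict(parse_qsl(parts.query))
    mode = params.pop("mode", "sync")  # assumed default, not from the thread
    return process_id, mode, params

url = ("http://www.someserver.com/ogcapi/jobs/MyProcess"
       "?mode=async&input01=val01&input02=val02&bbox=1,2,3,4&output=tiffImage")
process_id, mode, inputs = parse_get_execute(url)
print(process_id, mode, inputs)
```

Every remaining query parameter is treated as a process input or output selector here, which only works for inputs that can be encoded as simple key-value strings.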

sptillma · 2020-10-30T16:36:15Z

Hi all,

I was the one that proposed putting the /jobs resource under the /processes root. I did this because many of us have "jobs" coming from all directions, even some geoprocessing that is not involved in OGC definitions at all. I like the idea of job management being pulled to a higher level, but I would also like to be able to see /processes/jobs to help identify those defining OGC-specific jobs. Just thought I would explain the logic behind the proposal.

Regards,
Stan Tillman

pvretano · 2020-10-30T17:07:25Z

@sptillma OGC API does not impose any requirement on the path ahead of the / so one could arrange to put all the OGC-specific resources in one sub-tree and other non-OGC stuff in other sub-trees. For example, at CubeWerx, we segregate the OGC API resource from other resources by placing an "ogcapi/" in the path: https://eratosthenes.pvretano.com/cubewerx/cubeserv/default/ogcapi/wpstest/processes
The "wpstest" path element maps to the dataset / data store / distribution as per DCAT so /ogcapi// is the landing page as per OGC API Common and after that ... well we all know the story! ;)

fmigneault · 2020-10-30T19:34:39Z

I find /jobs/{processId}/... extremely counter-intuitive: since jobs is placed first, one would expect a jobId after it and not a process reference.
I would much rather have something like /jobs/{jobId}?processId={processId} if a root /jobs was absolutely needed while also providing some processId in the request.

matthias-mueller · 2020-10-30T21:11:45Z

I think there are two distinct types of scenarios/setups to consider:

If you operate a WPS service with a static set of immutable process definitions, you and your clients probably have some sympathy for associating each job with the corresponding processId, because it is a permanent relation in this kind of setup and it seems natural to craft the API in this direction.
If you acknowledge concepts like WPS-T, or operate a WPS service that changes its process offerings frequently (in particular removal of process ids or modification of the related process descriptions and I/O parameters), or just do not communicate the update policy of your process offerings to your clients, you and your clients will probably prefer a more generic layout that does not suggest a permanent relationship between processIds and jobs. In that case it seems more natural to have separate endpoints for processes and jobs, to separate concerns and avoid false expectations on the client side.
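
The second scenario can be sketched as follows, assuming two independent in-memory stores. All names, ids, and values here are hypothetical and only illustrate that result retrieval need not depend on the process still existing:

```python
# Hypothetical in-memory stores, for illustration only.
processes = {"ndvi": {"version": "1.0"}}
jobs = {
    "job-1": {"processId": "ndvi", "status": "successful", "results": {"mean": 0.42}},
}

def delete_process(process_id):
    """DELETE /processes/{processId}: removes the definition only."""
    processes.pop(process_id, None)

def get_results(job_id):
    """GET /jobs/{jobId}/results: never looks up the process."""
    return jobs[job_id]["results"]

delete_process("ndvi")
print("ndvi" in processes)
print(get_results("job-1"))
```

With a /processes/{processId}/jobs/{jobId} layout, the same request would have to pass through a path segment that no longer resolves to a process.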

christophenoel · 2020-11-02T07:29:27Z

Hi Peter,

I think somebody reported that the job resource might exist while the corresponding process resource has been deleted (or even deleted, then replaced with a new process with the same id). I would suggest the following:

GET /jobs get all jobs
POST /jobs/ create a new job
DELETE /jobs/{jobId} deletes or cancels a job
GET /jobs/{jobId} get the status of a specific job
GET /jobs/{jobId}/results get the results of a job
GET /jobs/{processId} get a list of jobs for a specific {processId}

Regards,
Christophe.

pvretano · 2020-11-02T07:37:07Z

@spacebel Yes, DELETE /jobs/{jobId}; I mistyped that.
GET /jobs/{jobId} ... sure.
GET /jobs/{jobId}/results ... sure.
However, I am not seeing POST /jobs. How does the server know which process is being invoked? Is the process id encoded in the execute request? As far as I remember, that is not currently the case. I think it needs to be POST /jobs/{processId} no?

christophenoel · 2020-11-02T08:15:45Z

Dear Peter, I have no clear-cut opinion, but from a REST perspective you might consider that the job is the resource, and thus POST /jobs adds a new job resource with the processId inside. I consider /jobs/{processId} an alias/shortcut for advanced info. Just some thoughts…

pvretano · 2020-11-02T08:21:17Z

@spacebel yeah I think you are right.
So POST /jobs and the execute request contains the process id or POST /jobs?processId=MyProcess.

matthias-mueller · 2020-11-02T08:41:47Z

... or put it in the body of the POST request? (But maybe you already ruled that one out in previous discussions)

pvretano · 2020-11-02T08:43:35Z

@matthias-mueller yes, that is what I meant by "... the execute request contains the process id ...".

bpross-52n · 2020-11-02T08:54:42Z

... or put it in the body of the POST request? (But maybe you already ruled that one out in previous discussions)

I would not say it was ruled out. In fact, the process id was part of the execute json earlier. But it was removed, as it was kind of duplicate due to the execute endpoint, which included the process id. It would not be a big change to re-add the id to the execute json.

matthias-mueller · 2020-11-02T08:57:43Z

Got it. The nice thing about a self-sufficient jobs endpoint would be that it could be used as a means to add asynchronous processing to other parts of the OGC API (e.g. long-running WFS/WCS operations). Not sure if that one is still on the agenda, but it used to pop up randomly when we discussed WPS with other SWGs.

fmigneault · 2020-11-02T19:23:10Z

@spacebel
I find all of the routes in #69 (comment) are valid except:

GET /jobs/{processId} get a list of jobs for a specific {processId}

There is no real safe way to differentiate between processId and jobId of GET /jobs/{jobId} in that case, especially if they are both UUIDs.

I also see POST /jobs as a completely valid use case, whether processId is provided via query parameter or body is equivalent, but I think via body is better since inputs of the process are provided this way anyway. Makes sense to have the process ID matching the inputs specified together.
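
The ambiguity can be shown with a minimal sketch in which both route templates are written as regular expressions. The `matching_routes` helper and the sample identifier are hypothetical:

```python
import re

# Both templates reduce to the same pattern: one opaque path segment.
JOB_BY_ID = re.compile(r"^/jobs/(?P<jobId>[^/]+)$")
JOBS_BY_PROCESS = re.compile(r"^/jobs/(?P<processId>[^/]+)$")

def matching_routes(path):
    """Return the names of all route templates that match the path."""
    routes = {
        "GET /jobs/{jobId}": JOB_BY_ID,
        "GET /jobs/{processId}": JOBS_BY_PROCESS,
    }
    return [name for name, pattern in routes.items() if pattern.match(path)]

# A UUID-like segment matches both templates, so the router cannot choose.
print(matching_routes("/jobs/3f2b8c1e-0d4a-4c57-9a51-6f0e2d7b9a10"))
```

The server could only disambiguate by looking the segment up in both stores, which fails as soon as a process id and a job id happen to collide.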

christophenoel · 2020-11-03T07:26:47Z

Hi Francis,

I see your point, even if processId is not a UUID.

What do you think about adding an optional filter parameter to the HTTP GET /jobs ?

/jobs?parentProcess={processId}

fmigneault · 2020-11-03T14:17:08Z

Yes. That would work. Maybe simply /jobs?processId={processId}, or maybe even just identifier (as in older WPS).
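
A minimal sketch of such a filter, assuming jobs are plain dictionaries and the query string has already been parsed into a mapping. The `list_jobs` helper and the sample data are hypothetical:

```python
def list_jobs(jobs, query):
    """GET /jobs with an optional processId filter."""
    process_id = query.get("processId")
    if process_id is None:
        # No filter given: return every job.
        return list(jobs)
    return [job for job in jobs if job["processId"] == process_id]

jobs = [
    {"jobId": "a1", "processId": "buffer"},
    {"jobId": "b2", "processId": "ndvi"},
    {"jobId": "c3", "processId": "buffer"},
]
print([job["jobId"] for job in list_jobs(jobs, {"processId": "buffer"})])
```

Because the filter is a query parameter, /jobs/{jobId} stays unambiguous and the per-process listing remains available.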

bpross-52n · 2020-11-10T11:22:21Z

In our last telecon, we agreed that it would be ok to remove the process id from the /jobs endpoint.

The question is whether the /jobs endpoint should be moved

(1) to the root level, i.e. the same level as /processes

{root}/processes
{root}/jobs

or (2) under /processes

{root}/processes
{root}/processes/jobs

Some discussion points:
In case of (1), the {root}/jobs endpoint could be used with other OGC APIs that offer asynchronous functionality. Eventually this could be moved to API Common. On the other hand, the /jobs endpoint at root level could interfere with already existing endpoints of vendor-specific APIs that are used for job control.
Approach (2) would couple the /jobs endpoint more tightly to the API - Processes and could be used when no other OGC API used the jobs concept.

christophenoel · 2020-11-10T12:23:58Z

It is not clear to me why 2 could be used with other OGC APIs offering asynchronous functionalities. Could you please elaborate a little?

bpross-52n · 2020-11-10T12:59:53Z

Sorry, but I am not sure that I understand the question correctly. The baseline is that

(1) {root}/jobs could also be used by other OGC APIs that offer asynchronous functionality

whereas

(2) {root}/processes/jobs could be used when the other OGC APIs do not offer asynchronous functionality or at least do not make use of the jobs concept or URL endpoint

christophenoel · 2020-11-10T14:06:22Z

Ok, I got it.

pvretano · 2020-11-10T14:13:00Z

@bpross-52n I don't think that interference with vendor-specific endpoint is an issue. That argument could be made for any endpoint off the root (conformance, collections, etc.). My feeling is that vendors will segregate APIs using upstream path elements. For example, for CubeWerx all ogcapi endpoints live under ".../ogcapi/{datastore}/".
@bpross-52n with your answer to @spacebel are you suggesting that we support both /jobs and /processes/jobs? Generally I find adding too many degrees of freedom like that gets confusing.

bpross-52n · 2020-11-10T14:46:19Z

@pvretano I agree with you. I do not suggest to use both approaches. I merely wanted to start a discussion here about which one we should choose.

sptillma · 2020-11-10T14:50:50Z

@pvretano I'm not sure your argument regarding the interference with vendor-specific endpoints is valid if it amounts to "because it doesn't affect things the way we do it" :). I brought this argument up trying to resolve conflicts that "might" happen in the future. However, during the last meeting, you gave the best argument for /jobs when you said it might be moved into Common and apply across other standards as well. In that case, we define a standard that can be pointed to in order to justify a top-level definition. But if it is self-contained and only applies to OGC API Processes, then I would argue it should stay under /processes. I'm good either way as long as we have merit behind the decision.

bpross-52n · 2021-01-15T13:28:08Z

The respective changes should be merged now.

bpross-52n added the draft.4 label May 14, 2020

bpross-52n added the change request label Jun 8, 2020

bpross-52n added 1.0-draft.5 Draft version for after the public review and removed draft.4 labels Aug 17, 2020

fmigneault mentioned this issue Sep 15, 2020

[Feature] Conformance with OGC-API Processes crim-ca/weaver#200

Closed

1 task

matthias-mueller mentioned this issue Nov 2, 2020

Privacy for Job IDs #104

Closed

bpross-52n closed this as completed Jan 15, 2021

pvretano mentioned this issue Apr 22, 2021

Improve consistancy for the jobs ressources #177

Closed

MartinPontius mentioned this issue Jan 3, 2022

jobs should be a top-level resource geopython/pygeoapi#838

Closed
