Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support proper bulk update via PATCH #1959

Open
ghost opened this issue Sep 21, 2021 · 14 comments · Fixed by #2311
Open

Support proper bulk update via PATCH #1959

ghost opened this issue Sep 21, 2021 · 14 comments · Fixed by #2311
Labels
enhancement a feature, ready for implementation

Comments

@ghost
Copy link

ghost commented Sep 21, 2021

Environment

  • PostgreSQL version: 11
  • PostgREST version: 8.0.0
  • Operating system: Linux

Description of issue

Schema:

drop table if exists example;
create table if not exists example (
	pk integer PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
	name   varchar(40) NOT NULL CHECK (name <> ''),
	value	varchar(40) NOT NULL CHECK (value <> '')
);

insert into example (name, value) values ('one', '1');
insert into example (name, value) values ('two', '2');
insert into example (name, value) values ('three', '3');

Postgrest usage:

#!/bin/bash

set -e

data='
    [
        {
            "pk": 1,
            "name": "example"
        },
        {
            "pk": 2,
            "name": "second"
        }
    ]
'

# returns all three
curl \
    -X GET "http://localhost:9000/example"

echo -e ""
echo -e "--"

# null value in column \"value\" violates not-null constraint
curl \
    -X POST "http://localhost:9000/example" \
    -H "Prefer: resolution=merge-duplicates" \
    -d "${data}"

echo -e ""
echo -e "--"

# null value in column \"value\" violates not-null constraint
curl \
    -X POST "http://localhost:9000/example?columns=name" \
    -H "Prefer: resolution=merge-duplicates" \
    -d "${data}"

Bulk update on partial set of columns triggers the NOT NULL constraint of the rest of the columns which are set as NOT NULL. Currently we relaxed the constraints in our database to fix that, and looked at doing this via a stored procedure, however it would be nice to have an official suggestion / fix for this.

@ghost ghost changed the title Bulk Update creates new row and causes conflict with NOT NULL Bulk Update on partial columns causes conflict with NOT NULL of rest of columns Sep 21, 2021
@wolfgangwalther
Copy link
Member

The reason is that you are not doing a bulk update at all. You are doing an INSERT. I guess you wanted to use PUT instead of POST to make it a INSERT ... ON CONFLICT DO UPDATE ..., but that would still yield the same error. The error happens at the INSERT stage, before PostgreSQL ever reaches the UPDATE part.

PUT is not a proper "bulk update".

Related: #1945.

@ghost
Copy link
Author

ghost commented Sep 22, 2021

Thanks for the clarification, makes perfect sense. Is a PATCH the HTTP verb that should be used for this? Should I close this issue and move the discussion over at the Q&A?

@wolfgangwalther
Copy link
Member

Is a PATCH the HTTP verb that should be used for this?

PATCH is the proper way to do a real UPDATE. However, it does not have bulk capabilities.

Should I close this issue and move the discussion over at the Q&A?

Since we already have a Q&A item for that topic, I'd keep this issue. It highlights why PUT is not a full replacement for a bulk update, so we can track this feature request here.

@wolfgangwalther wolfgangwalther changed the title Bulk Update on partial columns causes conflict with NOT NULL of rest of columns Support proper bulk update via PATCH Sep 22, 2021
@wolfgangwalther wolfgangwalther added enhancement a feature, ready for implementation idea Needs of discussion to become an enhancement, not ready for implementation labels Sep 22, 2021
@wolfgangwalther
Copy link
Member

This is currently an idea only. I can see the benefits of being able to do a bulk UPDATE - but I don't see how the REST API interface would look like, yet.

Maybe a query could somehow look like:

UPDATE table
SET column = pgrst_body.column
FROM (SELECT * FROM   json_populate_recordset(NULL::table, $1)) AS pgrst_body
WHERE table.pk_column = pgrst_body.pk_column

@wolfgangwalther
Copy link
Member

but I don't see how the REST API interface would look like, yet.

Actually... Do we currently support PATCHing with an array body? I don't think so. If not, this could be the non-breaking interface:

  • PATCH with json object as body: Current UPDATE with WHERE determined by querystring filters only.
  • PATCH with json array of objects as body: Bulk UPDATE with filters taken from PK columns + request body (as suggested above). Tbd: What happens with querystring filters in this case? Could be either ignored or added additionally.

@steve-chavez
Copy link
Member

Actually... Do we currently support PATCHing with an array body? I don't think so.

An array is accepted, but only the first element is taken into account for the UPDATE.

There was also another option for the query with an UPDATE..FROM, mentioned on #1816

(Not sure which query would be best atm)

@wolfgangwalther
Copy link
Member

There was also another option for the query with an UPDATE..FROM, mentioned on #1816

(Not sure which query would be best atm)

Ah, thanks for linking that other issue, too. I think the queries are basically the same, except for when the json body is parsed. The proposal in #1816 looks like this would be done in haskell, while the one here is doing it on the database side. I think we generally prefer the latter and try to avoid the former, right?

@steve-chavez
Copy link
Member

Bulk UPDATE with filters taken from PK columns

This could work as a first iteration of this feature. If a different payload column is desired as the filter, perhaps we could reuse the on_conflict param. Or like requested on #2123, perhaps the more clear on_constraint param?

@steve-chavez steve-chavez added idea Needs of discussion to become an enhancement, not ready for implementation and removed idea Needs of discussion to become an enhancement, not ready for implementation labels Jun 10, 2022
@steve-chavez
Copy link
Member

steve-chavez commented Jun 11, 2022

Our query already mostly works for this, it's only a matter of when to apply the payload fields as filters.

WITH
pgrst_payload AS ( SELECT '[{"id": 4, "name": "xxx"}, {"id":5, "name": "yyy"}]'::json AS json_data),
pgrst_body AS ( SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val FROM pgrst_payload)
UPDATE "test"."projects" SET "name" = x."name" FROM (SELECT * FROM json_populate_recordset (null::"test"."projects" , (SELECT val FROM pgrst_body) )) x
WHERE  "test"."projects"."id" = x.id RETURNING "test"."projects".*;

 id | name | client_id
----+------+-----------
  4 | xxx  |         2
  5 | yyy  |

Tbd: What happens with querystring filters in this case? Could be either ignored or added additionally.

We could add filters(id=gt.4):

WITH
pgrst_payload AS ( SELECT '[{"id": 4, "name": "xxx"}, {"id":5, "name": "yyy"}]'::json AS json_data),
pgrst_body AS ( SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val FROM pgrst_payload)
UPDATE "test"."projects" SET "name" = x."name" FROM (SELECT * FROM json_populate_recordset (null::"test"."projects" , (SELECT val FROM pgrst_body) )) x
WHERE  "test"."projects"."id" = x.id AND "test"."projects"."id" > 4 RETURNING "test"."projects".*;

 id | name | client_id
----+------+-----------
  5 | yyy  |

But it doesn't help for knowing when to apply the payload filters. I mean we cannot always apply that condition. If we did it, it could cause confusing situations like:

-- the payload body has no id
WITH
pgrst_payload AS ( SELECT '[{"name": "xxx"}, {"name": "yyy"}]'::json AS json_data),
pgrst_body AS ( SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val FROM pgrst_payload)
UPDATE "test"."projects" SET "name" = x."name" FROM (SELECT * FROM json_populate_recordset (null::"test"."projects" , (SELECT val FROM pgrst_body) )) x
WHERE  "test"."projects"."id" = x.id AND "test"."projects"."id" > 4 RETURNING "test"."projects".*;

 id | name | client_id
----+------+-----------
(0 rows)

-- Nothing gets updated

That would also break existing PATCH requests.


I think we should apply the payload filter whenever there are no query param filters. So:

PATCH /projects

[
  {"id": 5, "name": "xxx"},
  {"id": 6, "name": "yyy"}
]

Would generate the query with the id filter applied.

This would break the ability to PATCH the whole table in one request(deliberately or accidentaly), I'd argue that was always unwanted behavior(ref). So this:

PATCH /projects

{"name": "xxx"}

Will always result in 0 updates done(with no error from the db) and a 404 will be returned, same as doing an PATCH /projects?id=eq.non-existent request.

To make that operation possible, we can enforce a limit is present(taking advantage of #2211):

PATCH /projects?limit=10&order=id

{"name": "xxx"}

Only 10 updates will happen.

@steve-chavez

This comment was marked as outdated.

@steve-chavez steve-chavez mentioned this issue Jun 12, 2022
1 task
@steve-chavez steve-chavez added breaking change A bug fix or enhancement that would cause a breaking change and removed idea Needs of discussion to become an enhancement, not ready for implementation labels Jun 12, 2022
@laurenceisla
Copy link
Member

Reopened, now that is #2424 is merged.

@laurenceisla laurenceisla reopened this Aug 12, 2022
@steve-chavez steve-chavez removed the breaking change A bug fix or enhancement that would cause a breaking change label Aug 26, 2022
@steve-chavez

This comment was marked as outdated.

@steve-chavez
Copy link
Member

Reusing the syntax from the discussion on #465 (comment), I think this feature should be implemented in the following way.

Having:

GET /items?id=lte.3

[
            { "id": 1, "name": "Item 1" }
          , { "id": 2, "name": "Item 2" }
          , { "id": 3, "name": "Item 3" }
]

Doing:

PATCH /items?id=lte.3&name=set.$body->name
Prefer: return=representation

[
            { "id": 1, "name": "Updated Item 1" }
          , { "id": 2, "name": "Updated Item 2" }
          , { "id": 3, "name": "Updated Item 3" }
]

Will result on:

[
            { "id": 1, "name": "Updated Item 1" }
          , { "id": 2, "name": "Updated Item 2" }
          , { "id": 3, "name": "Updated Item 3" }
]

After this, #465 should be easier to implement too.

@steve-chavez steve-chavez added the difficulty: medium Haskell task involving PostgreSQL IO label Sep 17, 2023
@taimoorzaeem
Copy link
Collaborator

taimoorzaeem commented Oct 27, 2023

@steve-chavez @laurenceisla
Now that #2926 is merged, POST returns 200 on an update. So, maybe this can be done with POST. e.g

POST /items
Prefer: return=representation, resolution=merge-duplicates

[
            { "id": 1, "name": "Updated Item 1" }
          , { "id": 2, "name": "Updated Item 2" }
          , { "id": 3, "name": "Updated Item 3" }
]
HTTP/1.1 200 OK

[
            { "id": 1, "name": "Updated Item 1" }
          , { "id": 2, "name": "Updated Item 2" }
          , { "id": 3, "name": "Updated Item 3" }
]

@steve-chavez steve-chavez removed the difficulty: medium Haskell task involving PostgreSQL IO label Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement a feature, ready for implementation
Development

Successfully merging a pull request may close this issue.

4 participants