Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow data-modifying CTEs in Ecto #532

Merged
merged 8 commits into from
Jul 3, 2023

Conversation

mpotra
Copy link
Contributor

@mpotra mpotra commented Jun 28, 2023

Apparently Ecto does not interpret UPDATE statements inside CTEs, because it's calling on all/2 for each CTE query.
Sadly this turns all data-modifying CTEs into read only (SELECT) CTEs.

For example, given the following:

WITH update_categories as (
  UPDATE "categories"
  SET desc = 'Root Category'
  WHERE parent_id IS NULL
)
SELECT * from update_categories
;

would be turned into this undesirable SQL

WITH update_categories as (
  SELECT "categories"
  WHERE parent_id IS NULL
)
SELECT * from update_categories
;

Apparently it was somewhat easy to enable support for data-modifying CTEs (UPDATE only) inside Postgres Adapter, by matching cte_query/3 on %Ecto.Query{updates: [_ | _]} and replacing all/2 with update_all/2, which this PR does.

I got stuck at doing this for inserts and deletes, and for writing a quick tests for it (apparently plan/1 in tests complains about references)

Hopefully this provides a good starting point for allowing data-modifying CTEs in Ecto.

@josevalim
Copy link
Member

This looks good to me. We need to write tests in test/ecto/adapters and also implement the same for other adapters.

@mpotra
Copy link
Contributor Author

mpotra commented Jun 28, 2023

Thank you @josevalim for your prompt interest in this! ❤️

This is as far as I got with my attempt at writing a test for the above:

defmodule Category do
  use Ecto.Schema
  schema "categories" do
    belongs_to :parent, __MODULE__

    field :name, :string
    field :desc, :string
  end
end

test "CTE with update statement" do
  cte_query =
    Category
    |> where([s], is_nil(:parent_id))
    |> update([s], set: [desc: "Root category"])
    |> select([s], s)

  query =
    {"update_categories", Category}
    |> with_cte("update_categories", as: ^cte_query)
    |> select([c], c)
    |> plan()

  assert all(query) == ""
end

And I get this error

** (ArgumentError) unsupported type `{0, :desc}`. The type can either be an atom, a string or a tuple of the form `{:map, t}` or `{:array, t}` where `t` itself follows the same conditions.
     code: assert all(query) ==

@greg-rychlewski
Copy link
Member

It looks like the cte query didn't go through the planner so the types are not being resolved. Similar to what you do with the main query you could pipe the cte query into plan(). But it might have to be plan(:update_all) in this case.

@greg-rychlewski
Copy link
Member

Actually sorry I don't think ^ makes sense the planner on the main query should be enough. Maybe there is an issue with the planner not handling update in cte.

@greg-rychlewski
Copy link
Member

Yeah this appears to be it. When the cte is planned it only looks at the :all fields:

https://github.com/elixir-ecto/ecto/blob/master/lib/ecto/query/planner.ex#L924

{planned_query, _params, _key} = cte_query |> attach_prefix(query) |> plan(:all, adapter)

https://github.com/elixir-ecto/ecto/blob/master/lib/ecto/query/planner.ex#L1784

@all_exprs [with_cte: :with_ctes, distinct: :distinct, select: :select, from: :from, join: :joins,
              where: :wheres, group_by: :group_bys, having: :havings, windows: :windows,
              combination: :combinations, order_by: :order_bys, limit: :limit, offset: :offset]

Maybe we need a special CTE operation that is basically ^ plus :updates. But I didn't try it out yet.

@mpotra
Copy link
Contributor Author

mpotra commented Jun 28, 2023

I've just added a test for the feature.

So far all the documentation I found for MySQL and MSSQL doesn't show any possibility of data-modifying CTEs; most likely it's supported by PostgreSQL only.
If anyone else can confirm this, I suppose we can leave this PR to only allow it for PostgreSQL and for all others it would keep the read-only behavior we see today.

@greg-rychlewski you are right. Adding update: :updates to the list here https://github.com/elixir-ecto/ecto/blob/master/lib/ecto/query/planner.ex#L1784 makes my test pass.

LE: this is my commit for the above in ecto: mpotra/ecto@2c144d1

@josevalim
Copy link
Member

Please send a PR for Ecto too! For MySQL/TDS, you can raise if the cte has updates :)

@mpotra
Copy link
Contributor Author

mpotra commented Jun 28, 2023

  • Added PR for ecto to allow updates in CTE expressions
  • Added error raising for Tds and MyXQL
  • Added tests for error raised in Tds and MyXQL

@josevalim
Copy link
Member

This LGTM! Here is what we need to do:

  1. Merge the Ecto PR
  2. Change this repo to depend on Ecto from git
  3. Ship this!

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Jun 29, 2023

I was thinking about how to generalize this to inserts and deletes. I'm not sure about inserts yet but I think we could pretty easily handle deletes and updates together if we add a new :operation option to with_cte. Then we would know to call delete_all vs update_all vs all.

I'm not sure there is a good way to handle insert because it doesn't go through the planner. But maybe it's at least worth handling delete?

Deletes are a pretty good use case for this functionality. Delete from the CTE then take the values from returning and insert it into an archive table.

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Jun 29, 2023

I forgot we allow insert_all from a query. So we could support that right away. And then when we support value lists that would kind of be equivalent to supporting lists of maps.

edit: But there are more complications with conflict/conflict target/returning so maybe this is better evaluated in a separate issue/pr.

@mpotra
Copy link
Contributor Author

mpotra commented Jun 29, 2023

Based on @greg-rychlewski's feedback I've made some changes to elixir-ecto/ecto#4220 (introduce :operation option to with_cte/3) and updated this PR to also support :delete_all.

I haven't done :insert_all for CTEs, because that requires a bit more work as Greg mentioned in the comment above.

Comment on lines 316 to 321
operation = case opts[:operation] do
:update_all -> :update_all
:delete_all -> :delete_all
:insert_all -> :insert_all
_ -> :all
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per my comment on Ecto, we should guarantee the .operation field always exist and it is never nil. Once we tidy up the Ecto side, we can revisit this one. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just updated the code to ensure .operation and handle nil, now that Ecto PR has been merged

@josevalim
Copy link
Member

@mpotra excellent. Can you please update mix.exs to point to Ecto's main branch so CI passes? :)

@josevalim josevalim merged commit 53d73b7 into elixir-ecto:master Jul 3, 2023
@josevalim
Copy link
Member

💚 💙 💜 💛 ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants