Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBAL-163: Upsert support in DBAL #1320

Closed
doctrinebot opened this issue Sep 10, 2011 · 18 comments
Closed

DBAL-163: Upsert support in DBAL #1320

doctrinebot opened this issue Sep 10, 2011 · 18 comments
Assignees

Comments

@doctrinebot
Copy link

Jira issue originally created by user @beberlei:

Upsert support in DBAL (replace, insert into usw..)

@doctrinebot
Copy link
Author

Comment created by mvrhov:

FYI: http://www.depesz.com/2012/06/10/why-is-upsert-so-complicated/

@doctrinebot doctrinebot added this to the 2.6 milestone Dec 6, 2015
@billschaller
Copy link
Member

Upsert syntax varies widely between platforms. This would be a lot of work to implement. A beautiful and comprehensive PR with a great test suite and support for all vendors that support the feature is welcome.

@alpharder
Copy link

Current Doctrine seems to be inconsistent because it has implementation of managing DB schema for different platforms, but the Query Builder lacks this.

@ricardofiorani
Copy link

Will this be a thing someday ??

@Ocramius
Copy link
Member

Ocramius commented Mar 5, 2019

@ricardofiorani if you can build and test it, sure.

@chrisharrisonkiwi
Copy link

With Postgres 9.5 upsert support, it would be good to see Doctrine come to the party with this option.
Any chance this is coming or already has some kind of support?
(Cant find any docs on it yet)

@GromNaN
Copy link
Member

GromNaN commented Oct 11, 2021

The necessary SQL queries for several platforms has been implemented in Symfony Cache: https://github.com/symfony/symfony/blob/5.3/src/Symfony/Component/Cache/Adapter/PdoAdapter.php#L352-L381

With a fallback to update or insert if the platform does not support any form of upsert.

@greg0ire
Copy link
Member

greg0ire commented Oct 11, 2021

@GromNaN looks like you can now use the same syntax for both PG and SQLite: https://www.sqlite.org/lang_UPSERT.html and that seems to be better anyway: https://stackoverflow.com/questions/418898/sqlite-upsert-not-insert-or-replace

@greg0ire
Copy link
Member

greg0ire commented Oct 11, 2021

We should start thinking about the interface.

If we were to be consistent with insert(), here is how the usage could look:

$this->connection->upsert(
    'movie',
    ['title' => 'No time to die', 'year' => 2021, 'director' => 'Cary Joji Fukunaga'],
    ['year' => ParameterType::INT] // types default to `ParameterType::STRING`
);
$this->connection->upsert(
    'movie',
    ['title' => 'Dune', 'year' => 2021, 'director' => 'Denis Villeneuve'],
    ['year' => ParameterType::INT] // types default to `ParameterType::STRING`
);

It might be more powerful and convenient to be able to do something like this though:

$this->connection->upsert(
    'movie',
    [
        ['title' => 'No time to die', 'year' => 2021, 'director' => 'Cary Joji Fukunaga'],
        ['title' => 'Dune', 'year' => 2021, 'director' => 'Denis Villeneuve']
    ],
    ['year' => ParameterType::INT] // types default to `ParameterType::STRING`
);

and generate a single SQL statement (or several? Maybe we should have 2 methods?) from that.

@GromNaN
Copy link
Member

GromNaN commented Oct 12, 2021

It would be more like the update method, with a list of key fields instead of a complete criteria. Some DB engine rely on the existing unique constraints (mysql), while other needs the keys to be specified.

An unoptimized implementation can be like this:

function upsert($table, array $data, array $keys, array $types = [])
{
    $criteria = array_filter($data, fn ($value, $key) => in_array($key, $keys, true));
    $affectedRows = $this->update($table, $data, $criteria, $types);
    if (0 === $affectedRows) {
        // Row does not exist, create it.
        try {
            $this->insert($table, $data, $types);
        } catch (UniqueConstraintViolationException $e) {
            // Concurrent update
           $this->update($table, $data, $criteria, $types);
        }
    }
}

@derrabus
Copy link
Member

I'd love to see this. 🤩

@greg0ire
Copy link
Member

greg0ire commented Oct 12, 2021

@GromNaN you mean that if there was a column with no uniqueness constraint, you would update it when UPSERT usually does an INSERT in that kind of case? Wouldn't that be kind of unexpected?

Some DB engine rely on the existing unique constraints (mysql), while other needs the keys to be specified.

Which DB engines need the keys to be specified? Oracle and sqlsrv maybe?

@GromNaN
Copy link
Member

GromNaN commented Oct 12, 2021

I cannot find a formal definition of UPSERT ; the best are on PostgreSQL wiki and MongoDB Documentation

"UPSERT" is a DBMS feature that allows a DML statement's author to atomically either insert a row, or on the basis of the row already existing, UPDATE that existing row instead, while safely giving little to no further thought to concurrency. One of those two outcomes must be guaranteed, regardless of concurrent activity, which has been called "the essential property of UPSERT". Examples include MySQL's INSERT...ON DUPLICATE KEY UPDATE, or VoltDB's UPSERT statement.

The operation will either update the document(s) matched by the specified query or if no documents match, insert a new document. The new document will have the fields indicated in the operation.

The UPSERT operation is normally used with a unique key. We have to recommend not to use non-unique fields as key.

Which DB engines need the keys to be specified? Oracle and sqlsrv maybe?

Yes, and also Postgresql requires the key (INSERT ... ON CONFLICT (id) DO UPDATE SET ...). SQLite support both syntaxes (INSERT OR UPDATE ... and pgsql)

@derrabus
Copy link
Member

On MySQL it would be INSERT … ON DUPLICATE KEY UPDATE … but it should be fine if the conflicting key also appears in the UPDATE clause. Distinguishing between key and non-key fields would also mean that we can anticipate exactly which unique key(s) will cause a conflict.

@greg0ire
Copy link
Member

Distinguishing between key and non-key fields would also mean that we can anticipate exactly which unique key(s) will cause a conflict.

Can you please elaborate? It's unclear to me where/how you would make that distinction and how we would act on it.

@derrabus
Copy link
Member

Can you please elaborate? It's unclear to me where/how you would make that distinction and how we would act on it.

It's unclear to me as well. That was kinda my point. @GromNaN suggested a signature that would distinguish between data and key fields:

It would be more like the update method, with a list of key fields instead of a complete criteria. Some DB engine rely on the existing unique constraints (mysql), while other needs the keys to be specified.

I don't believe that this is feasible. I think, upsert() should have the same signature as insert(), as suggested by @greg0ire already.

@morozov
Copy link
Member

morozov commented Nov 2, 2021

Closing. See #2939 (comment) for details.

@morozov morozov closed this as completed Nov 2, 2021
@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests