Add support for batch insert #259

Open

Description

@deblockt

I have tried to perform a batch insert using DatabaseClient, but I have not found a solution.

I have a workaround using a Statement, but to get a Statement I have to create my own Connection.

Do you plan to support batch inserts via add(…), as with the Statement SPI?

Activity

istarion commented on Dec 19, 2019

You can obtain a connection with the ConnectionFactoryUtils helper method:
val connection = ConnectionFactoryUtils.getConnection(connectionFactory).awaitSingle()
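
A rough Java sketch of that workaround, using the Spring-managed connection with the raw Statement SPI for a parametrized batch (table and column names are made up, the ConnectionFactoryUtils package varies by version, and connection release is omitted for brevity):

import io.r2dbc.spi.ConnectionFactory;
import io.r2dbc.spi.Result;
import org.springframework.data.r2dbc.connectionfactory.ConnectionFactoryUtils;
import reactor.core.publisher.Mono;

Mono<Integer> insertAll(ConnectionFactory connectionFactory) {
    // One prepared statement, two bindings; add() saves the current binding and starts the next one.
    return ConnectionFactoryUtils.getConnection(connectionFactory)
        .flatMapMany(connection -> connection
            .createStatement("INSERT INTO person (name) VALUES ($1)")
            .bind(0, "Alice")
            .add()
            .bind(0, "Bob")
            .execute())
        .flatMap(Result::getRowsUpdated) // Publisher<Integer> in R2DBC SPI 0.8.x
        .reduce(0, Integer::sum);
}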

deblockt (Author) commented on Dec 19, 2019

Thanks, so far I have used ConnectionAccessor to get the same connection as DatabaseClient.

mp911de (Member) commented on Jan 22, 2020

Currently, we don't have support for batching. There are two types of batches that R2DBC supports:

  • Unparametrized via Connection.createBatch()
  • Parametrized via Statement.add()

Unparametrized batches could be supported through execute(List<String>). However, the execute interfaces expose bind(…) methods that do not seem appropriate in that context, and we would need to introduce another set of interfaces.
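
For reference, an unparametrized SPI batch looks roughly like this today (a sketch against the plain R2DBC Connection API, assuming an io.r2dbc.spi.Connection is in scope; it is not a DatabaseClient feature):

Flux.from(connection.createBatch()
        .add("INSERT INTO foo VALUES ('one')")
        .add("INSERT INTO foo VALUES ('two')")
        .execute())
    .flatMap(Result::getRowsUpdated); // each add(...) contributes one complete SQL statement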

Parametrized statements seem straightforward, but there is a caveat due to named parameter resolution. Named parameter support checks whether a bound value is a collection type; if so, the SQL expansion creates a parameter placeholder for each element of the Collection.

Example:

client.execute("INSERT INTO foo VALUES(:my_param)")
    .bind("my_param", "a-value")

Resulting SQL (Postgres syntax):

INSERT INTO foo VALUES($1)

client.execute("INSERT INTO foo VALUES(:my_param)")
    .bind("my_param", Arrays.asList("one", "two"))

Resulting SQL (Postgres syntax):

INSERT INTO foo VALUES($1, $2)

For batching, this example makes little sense, as binding a collection to an INSERT has very little use. The point I'm trying to make is that if the parameter multiplicity changes across bindings with named parameter processing enabled, the resulting SQL is no longer the same and we cannot use batching.

client.execute("INSERT INTO foo VALUES(:my_param)")
    .bind("my_param", Arrays.asList("one"))
    .add()
    .bind("my_param", Arrays.asList("one", "two"))

gjgarryuan commented on Jun 18, 2020

@mp911de

I tried the code sample you provided above:

client.execute("INSERT INTO foo VALUES(:my_param)")
    .bind("my_param", Arrays.asList("one"))
    .add()
    .bind("my_param", Arrays.asList("one", "two"))

I assume client is the DatabaseClient, whose bind method returns a BindSpec rather than a Statement, so Statement#add is not available.

Is it currently possible to use parameterized statements to form batches via DatabaseClient?

In addition, is it possible to use a parameterized insert statement with multiple rows of values? For instance, INSERT INTO foo VALUES ('foo', 'bar'), ('FOO', 'BAR'). Can I do something like:

final List<Object[]> tuples = new ArrayList<>();
tuples.add(new Object[] {"foo", "bar"});
tuples.add(new Object[] {"FOO", "BAR"});

client.execute("INSERT INTO foo VALUES :tuples")
         .bind("tuples", tuples)

spachip commented on Jul 15, 2020

Hi, is there any update on this? I couldn't find support for batch operations using DatabaseClient. Can you please confirm whether batch support is currently limited to the Statement and Connection objects?

abhinaba-chakraborty-by commented on Sep 15, 2020

Hey @mp911de,
I have a scenario where my table has an auto-generated id column, and I need to bulk insert items into the database and fetch the generated ids. Is there any way I can achieve that?

This is my table:

CREATE TABLE test_table (
  `id` SERIAL NOT NULL,
  `name` VARCHAR(100) NOT NULL,
  `created_date` DATETIME NOT NULL,
  PRIMARY KEY (`id`)
);

To save a list of items, this is the code I am using:

String initialSql = "INSERT INTO test_table(`name`,`created_date`) VALUES ";

List<String> values =
    dummyEntities.stream()
        .map(dummyEntity -> "('" + dummyEntity.getName() + "','"
            + dummyEntity.getCreatedDate().atZoneSameInstant(ZoneId.of("UTC")).toLocalDateTime().toString() + "')")
        .collect(Collectors.toList());

String sqlToExecute = initialSql + String.join(",", values);
client.execute(sqlToExecute)
    .//Then what?

The generated SQL statement (from DEBUG Logs):

2020-09-15 18:59:32.613 DEBUG 44801 --- [actor-tcp-nio-1] o.s.d.r2dbc.core.DefaultDatabaseClient   : Executing SQL statement [INSERT INTO test_table(`name`,`created_date`) VALUES ('Abhi57','2020-09-15T13:29:29.951964'),('Abhi92','2020-09-15T13:29:29.952023')]

I even tried using the ConnectionFactory directly, but still no clue:

Mono.from(connectionFactory.create())
    .map(Connection::createBatch)
    .map(batch -> {
      dummyEntities.forEach(dummyEntity -> {
        String sql = String.format("INSERT INTO `test_table` (`name`,`created_date`) VALUES ('%s','%s');", dummyEntity.getName(),
            dummyEntity.getCreatedDate().atZoneSameInstant(ZoneId.of("UTC")).toLocalDateTime().toString());
        batch.add(sql);
      });
      return batch;
    })
    .flatMap(batch -> Mono.from(batch.execute()))
    .//Then what?

mp911de (Member) commented on Sep 22, 2020

Depending on the database type, you need to tell the database to return generated keys (see the Postgres documentation on RETURNING). Then, extract the generated keys by consuming Result.map((row, metadata) -> row.get("name_of_id_column", Long.class)).
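
A minimal sketch along those lines, assuming Postgres syntax and the test_table from the comment above (names and created dates are passed as parallel lists just to keep the example self-contained):

import io.r2dbc.spi.Connection;
import io.r2dbc.spi.Statement;
import reactor.core.publisher.Flux;

import java.time.LocalDateTime;
import java.util.List;

Flux<Long> insertReturningIds(Connection connection, List<String> names, List<LocalDateTime> createdDates) {
    Statement statement = connection.createStatement(
        "INSERT INTO test_table (name, created_date) VALUES ($1, $2) RETURNING id");

    for (int i = 0; i < names.size(); i++) {
        if (i > 0) {
            statement.add(); // save the previous binding and start a new one
        }
        statement.bind(0, names.get(i));
        statement.bind(1, createdDates.get(i));
    }

    // One Result per binding; RETURNING id exposes the generated key of each inserted row.
    return Flux.from(statement.execute())
        .flatMap(result -> result.map((row, metadata) -> row.get("id", Long.class)));
}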

luiccn commented on Dec 17, 2020

Any updates on this? It's quite an important feature to be overlooked for so long. Tomorrow is the one-year anniversary of no batch inserts :(

mp911de (Member) commented on Dec 17, 2020

Thanks @luiccn for reminding us. Meanwhile, DatabaseClient has moved into Spring Framework, so we need to move this ticket there.

Keep in mind that contributing to open source is essential if you want something to happen sooner than it would otherwise. We'd be more than happy to work with you on the final design if you find the time to come up with a proposal and a pull request.

aoudiamoncef commented on Jun 7, 2021

Hi @mp911de,

I'm interested in working on this issue. Could you please give me more information about it?

Thanks

markusheiden commented on Jul 9, 2023

Is batching still not possible, except directly via the Connection?

Using raw batches on the connection is risky because all escaping has to be done by hand.

Batches perform far better than multiple individual inserts, even ones issued via Statement.add().add()….execute(). I wonder why Statement.add() does not use a batch under the hood?

