Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[api design] Encoding values for LOAD DATA LOCAL INFILE #1416

Open
Jille opened this issue Apr 22, 2023 · 11 comments
Open

[api design] Encoding values for LOAD DATA LOCAL INFILE #1416

Jille opened this issue Apr 22, 2023 · 11 comments

Comments

@Jille
Copy link

Jille commented Apr 22, 2023

Hi. I'm working on a feature for sqlc to do a bulk insert. sqlc generates a struct with the columns of the table, and I want to do a bulk insert of those.

Now my challenge is how to encode those values into a TSV as supported by https://dev.mysql.com/doc/refman/8.0/en/load-data.html#load-data-field-line-handling.

I wrote a small library for this (https://github.com/hexon/mysqltsv/blob/main/mysqltsv.go) but to correctly encode time.Time I need to know the Location from https://pkg.go.dev/github.com/go-sql-driver/mysql#Config.

What would be the cleanest way to support this?

  • Expose a Config() method on mysqlConn which people could access through https://pkg.go.dev/database/sql#Conn.Raw
  • Expose a Config(v any) function that takes the value from https://pkg.go.dev/database/sql#Conn.Raw (same as the previous, but as a function that takes an argument rather than a method's receiver)
  • RegisterStructsHandler that is responsible for converting structs to TSV and can be used instead of RegisterReaderHandler.
  • Pass the Config to the callback of RegisterReaderHandler (in a backwards compatible way)

I'd love to hear it if you any other suggestions.

@methane
Copy link
Member

methane commented Apr 23, 2023

Most simple and quick way is just user pass the location to sqlc.
User knows dsn or mysql.Config so that user knows the correct loc.

For future, my idea is:

// in mysql driver
type MySQLConn interface {
  private()  // Other module can not implement this interface. We can add more APIs in the future.

  Location() *time.Location  // Get Config.Location of this connection. (This may differ from `time_zone` connection variable.)
  LoadFromReader(r *io.Reader) error // Use this instead of RegisterReaderHandler.
}

// in user code.
var f io.Reader // some data.
err := conn.Raw(func (dc any) error {
  mc := dc.(MySQLConn)
  return mc.LoadFromReader(f)
})

Jille added a commit to Jille/go-sql-driver-mysql that referenced this issue Apr 25, 2023
It is the same as RegisterReaderHandler, but receives a copy of the
Config which is be needed to correctly encode the TSV. (For example
for the Location).

issue go-sql-driver#1416
Jille added a commit to Jille/go-sql-driver-mysql that referenced this issue Apr 25, 2023
It is the same as RegisterReaderHandler, but receives a copy of the
Config which is be needed to correctly encode the TSV. (For example
for the Location).

issue go-sql-driver#1416
Jille added a commit to Jille/go-sql-driver-mysql that referenced this issue Apr 25, 2023
It is the same as RegisterReaderHandler, but receives a copy of the
Config which is be needed to correctly encode the TSV. (For example
for the Location).

issue go-sql-driver#1416
@Jille
Copy link
Author

Jille commented Apr 25, 2023

I feel that would be a burden on the user. It requires them to plumb through the connection settings to every place that needs it.

I like that interface. (Though I'm not entirely sure how LoadFromReader would work. Would that run a LOAD DATA INFILE query itself?) (And it wouldn't be perfect for sqlc, because it would require implementing the Raw method for any middleware.

How do you feel about #1419? That'd allow me to encode correctly.

@methane
Copy link
Member

methane commented Apr 25, 2023

I find my old idea:

#1060 (comment)

@methane
Copy link
Member

methane commented Apr 25, 2023

My old idea doesn't require new interface.

@Jille
Copy link
Author

Jille commented Apr 25, 2023

While I like the idea of avoiding a register+unregister and passing the data to the call, it does have a few downsides:

@methane
Copy link
Member

methane commented Apr 26, 2023

I feel that would be a burden on the user. It requires them to plumb through the connection settings to every place that needs it.

I feel it is far better than #1419. #1419 add ugly API only for getting Config.Loc.

  • Config.Clone() is bit heavy. Why need to clone all config only for a *Location?

  • Config contains password. Why need to add surface to leak password only for a *Location?

  • Many users (including me) just use only UTC. No need to pass location around.

  • Some users using Location just need to pass it to bulk insert. It is not big burden.

    • If it is burden, user can store it in Context.

Jille added a commit to Jille/go-sql-driver-mysql that referenced this issue Apr 26, 2023
It is the same as RegisterReaderHandler, but receives some configuration
settings which are needed to correctly encode the TSV. (For now only the
Location).

issue go-sql-driver#1416
@Jille
Copy link
Author

Jille commented Apr 26, 2023

  • Config.Clone() is bit heavy. Why need to clone all config only for a *Location?

I don't mind changing that. With public libraries I tend to try to make them user-proof and make sure they can't break things by changing an internal value.

  • Config contains password. Why need to add surface to leak password only for a *Location?

Fair enough.

  • Many users (including me) just use only UTC. No need to pass location around.

Me too, sadly sqlc can't assume all users do that.

  • Some users using Location just need to pass it to bulk insert. It is not big burden.

    • If it is burden, user can store it in Context.

I'd totally do that in my own project, but it will break sqlc users that switch from simple INSERT INTOs to a LOAD DATA INFILE and suddenly correctly encoding time.Time's depends on a context variable.

I've updated #1419 with a new proposal, to only pass a struct with the *Location. (A struct so we can later extend it without breaking compatibility.) What do you think?

Alternatively, what do you think of adding an API that handles all the TSV encoding? That API would be really useful for other go-sql-driver/mysql users.

@methane
Copy link
Member

methane commented Apr 26, 2023

I've updated #1419 with a new proposal, to only pass a struct with the *Location. (A struct so we can later extend it without breaking compatibility.) What do you think?

I still hate it. It increase public APIs only for your use case.

Alternatively, what do you think of adding an API that handles all the TSV encoding? That API would be really useful for other go-sql-driver/mysql users.

API compatible to pg or pgx is better than registry.

@Jille
Copy link
Author

Jille commented Apr 26, 2023

Sorry, I'm not sure what you mean with your last sentence. (Particularly, what do you mean with "registry"?)

@methane
Copy link
Member

methane commented Apr 26, 2023

I meant RegisterXxx() APIs. I don't like them and avoid adding more such APIs.

Jille added a commit to Jille/go-sql-driver-mysql that referenced this issue Jun 26, 2023
Based on the design of @methane

We can later add a LoadLocalInfile() method instead of the Register...()
functions.

for issue go-sql-driver#1416
Jille added a commit to Jille/go-sql-driver-mysql that referenced this issue Jun 26, 2023
Based on the design of @methane

We can later add a LoadLocalInfile() method instead of the Register...()
functions.

for issue go-sql-driver#1416
@Jille
Copy link
Author

Jille commented Jun 26, 2023

Ok, I've thought of a way to make your proposed API work with sqlc, so I've sent a PR for that: #1454

Jille added a commit to Jille/go-sql-driver-mysql that referenced this issue Jun 26, 2023
Based on the design of @methane

We can later add a LoadLocalInfile() method instead of the Register...()
functions.

for issue go-sql-driver#1416
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants