Implement sql.Scanner and driver.Valuer interfaces #5

gust1n · 2014-08-12T12:40:34Z

This adds support for mapping a struct to a database with, e.g., https://github.com/jmoiron/modl. Since these are standard interfaces, they are not specific to that implementation.

gust1n · 2014-08-18T11:53:36Z

Any thoughts on this? Otherwise I might start migrating all my projects to use my fork. But I'd rather use your implementation (and give credit to you).

satori · 2014-08-18T17:38:59Z

@gust1n I'm sorry to have kept you waiting so long. Thank you for your pull request; I think SQL driver support is a great enhancement for go.uuid.

satori · 2014-08-18T21:09:58Z

Generally, a UUID can be represented in at least two ways: as a 32- or 36-character ASCII string, or as a 16-byte array ([16]byte in Go). Although storing a UUID as a string has its pros and cons, I believe that []byte is the more technically correct database representation for the UUID datatype.

Let's take a look at the string representation. The advantages are simple: a UUID string is recognizable as a UUID and easy for a human to read. The disadvantage is mainly one of space: a 36-character string takes at least 36 bytes to store.

Let's take a look at the database world now.

On the one hand, PostgreSQL has a built-in UUID datatype, represented as a 16-byte array. The PostgreSQL command-line client (psql) is able to display UUIDs in human-readable form (just like the String method). In that case the Value method could simply be implemented in terms of MarshalBinary.

On the other hand, MySQL and SQLite lack a built-in UUID datatype. Looking at SQLite, for example, there are at least two ways to store a UUID: as a TEXT column or as a BLOB column. The first takes at least 36 bytes of storage, the second at least 16 bytes. Why does the size matter? SQLite needs to read data from disk during a query, the OS reads from the disk in blocks (4 KB is a common size), and the more rows that fit in a block, the more likely it is that a row is already cached in memory when SQLite requests it from disk. Bigger rows mean fewer rows per block and fewer rows cached in memory. Although a BLOB column has the strong disadvantage that its data is not human-readable, I think that the Value method should still be implemented in terms of MarshalBinary.
The same holds for MySQL and the BINARY(16) column datatype.
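To put rough numbers on the size argument, here is a small sketch that compares the 36-character text form with the 16 raw bytes and counts how many of each would fit in a 4 KB block. The helper names `canonical` and `perBlock` are made up for this illustration, and the count deliberately ignores every kind of per-row and per-page overhead, so it is an upper bound and not what any real database would report.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// canonical renders 16 raw bytes in the 36-character hyphenated text form.
// (Illustrative helper, not part of go.uuid.)
func canonical(b [16]byte) string {
	h := hex.EncodeToString(b[:]) // 32 hex characters
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32]
}

// perBlock reports how many values of valueSize bytes fit in one block,
// ignoring all row and page overhead.
func perBlock(blockSize, valueSize int) int {
	return blockSize / valueSize
}

func main() {
	var u [16]byte // the zero UUID; only its size matters here
	text := canonical(u)

	const block = 4096 // the "4 KB is a common size" assumption from the text
	fmt.Println("text bytes:", len(text), "per block:", perBlock(block, len(text)))
	fmt.Println("raw bytes: ", len(u), "per block:", perBlock(block, len(u)))
}
```

Under these simplified assumptions, a block holds 113 text-form UUIDs versus 256 binary ones.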

Given the above, the Value method should be just:

return u.MarshalBinary()

As a UUID should not be written to the database as a string, there is no need to try to scan it from text input, so the Scan method could be simplified to:

return u.UnmarshalBinary(src.([]byte))

Is there anything I missed?
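For reference, the two one-liners above can be fleshed out into a self-contained sketch. The `UUID` type and its `MarshalBinary`/`UnmarshalBinary` methods below are simplified stand-ins written for this example, not the library's actual code. One deliberate difference is that Scan uses a checked type assertion, since a bare `src.([]byte)` panics when the driver hands back anything else (such as a string).

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
)

// UUID is a simplified stand-in for the library's 16-byte UUID type.
type UUID [16]byte

// MarshalBinary returns the 16 raw bytes.
func (u UUID) MarshalBinary() ([]byte, error) {
	return u[:], nil
}

// UnmarshalBinary copies exactly 16 bytes into the receiver.
func (u *UUID) UnmarshalBinary(data []byte) error {
	if len(data) != len(u) {
		return fmt.Errorf("uuid: expected %d bytes, got %d", len(u), len(data))
	}
	copy(u[:], data)
	return nil
}

// Value implements driver.Valuer in terms of MarshalBinary.
func (u UUID) Value() (driver.Value, error) {
	return u.MarshalBinary()
}

// Scan implements sql.Scanner in terms of UnmarshalBinary.
// The checked assertion returns an error instead of panicking.
func (u *UUID) Scan(src interface{}) error {
	b, ok := src.([]byte)
	if !ok {
		return fmt.Errorf("uuid: cannot scan %T", src)
	}
	return u.UnmarshalBinary(b)
}

// Compile-time checks that both interfaces are satisfied.
var (
	_ driver.Valuer = UUID{}
	_ sql.Scanner   = (*UUID)(nil)
)

func main() {
	in := UUID{0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
		0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8}

	v, err := in.Value()
	if err != nil {
		panic(err)
	}

	var out UUID
	if err := out.Scan(v); err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", in == out)
}
```

This only round-trips through the interfaces in memory; whether a given driver accepts the 16 raw bytes depends on the driver and the column type.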

gust1n · 2014-08-18T21:54:07Z

Nice walkthrough! I'll test this tomorrow. The reason for my type switch was an earlier experience where pg returned a string for the UUID type (yours). But that might have been because the type didn't implement Valuer then. Let me test tomorrow.

satori · 2014-08-18T22:38:17Z

Oh, I see now: the problem is definitely lib/pq#263 and lib/pq#209. Unlike the native libpq, the pure-Go lib/pq library doesn't implement the PostgreSQL binary protocol.

gust1n · 2014-08-19T10:40:20Z

Yeah, OK, so it seems that using return u.MarshalBinary() for the Valuer interface with lib/pq causes: ERROR: invalid byte sequence for encoding "UTF8": 0xba (probably as you predicted).

So my code actually works because the inserted UUID is a string (which Postgres doesn't complain about); hence Scan takes the path case []byte -> len(b) != 16 -> FromString(string(b)).
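The Scan path described here can be sketched roughly as follows. This is a guess at the shape of that type switch, not the code from the pull request: the `UUID` type and the `FromString` parser below are simplified stand-ins written for this example, and only the 36-character hyphenated form is handled.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// UUID is a simplified stand-in for the library's 16-byte UUID type.
type UUID [16]byte

// FromString parses the 36-character hyphenated text form.
// (Stand-in for the library function of the same name.)
func FromString(s string) (UUID, error) {
	var u UUID
	if len(s) != 36 || s[8] != '-' || s[13] != '-' || s[18] != '-' || s[23] != '-' {
		return UUID{}, fmt.Errorf("uuid: invalid text form %q", s)
	}
	h := strings.ReplaceAll(s, "-", "")
	if len(h) != 32 {
		return UUID{}, fmt.Errorf("uuid: invalid text form %q", s)
	}
	if _, err := hex.Decode(u[:], []byte(h)); err != nil {
		return UUID{}, err
	}
	return u, nil
}

// Scan accepts either 16 raw bytes or the text form, which is what a
// text-only driver hands back as []byte or string.
func (u *UUID) Scan(src interface{}) error {
	switch v := src.(type) {
	case []byte:
		if len(v) == len(u) {
			copy(u[:], v) // binary representation
			return nil
		}
		return u.Scan(string(v)) // fall through to the text path
	case string:
		parsed, err := FromString(v)
		if err != nil {
			return err
		}
		*u = parsed
		return nil
	}
	return fmt.Errorf("uuid: cannot scan %T", src)
}

func main() {
	var fromText, fromBinary UUID
	if err := fromText.Scan([]byte("6ba7b810-9dad-11d1-80b4-00c04fd430c8")); err != nil {
		panic(err)
	}
	if err := fromBinary.Scan(fromText[:]); err != nil {
		panic(err)
	}
	fmt.Println("same value from both paths:", fromText == fromBinary)
}
```

Accepting both forms keeps Scan usable regardless of whether the driver speaks the text or the binary format, at the cost of a slightly longer method.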

It actually seems the Postgres uuid type (http://www.postgresql.org/docs/9.1/static/datatype-uuid.html) is a string, or am I reading it wrong? Should we instead use the approach you suggested for MySQL and SQLite and save it as a bytea?

satori · 2014-08-19T12:53:11Z

@gust1n The documentation describes only the human-readable representation, not the implementation. In PostgreSQL, UUID is actually implemented as unsigned char[16]:

/* uuid size in bytes */
#define UUID_LEN 16

/* pg_uuid_t is declared to be struct pg_uuid_t in uuid.h */
struct pg_uuid_t
{
    unsigned char data[UUID_LEN];
};

I'll check whether u.MarshalBinary() works for MySQL and SQLite with BINARY(16) and BLOB column types.

gust1n · 2014-08-20T12:34:23Z

I didn't understand: do you want me to check and investigate further, or will you do it?

satori · 2014-08-20T13:02:13Z

@gust1n I'll check it soon; I'm a bit busy right now with my full-time duties.

satori · 2014-08-25T23:25:02Z

I've investigated that problem a bit more.

Both MySQL and SQLite work just fine with BINARY(16) and BLOB columns. During my tests, I used go-sql-driver/mysql and mattn/go-sqlite3, respectively. Both were able to deal with the binary UUID representation.

Although PostgreSQL has a built-in UUID datatype, the driver implementations do not support the binary wire protocol, only its text form. I've tested this against jbarham/gopgsqldriver and lib/pq.

The first one is a simple cgo binding for the native libpq library, and it shouldn't be too difficult to make it use the binary format instead of the text one. Actually, I was able to make it read and scan UUIDs from the binary protocol with just a one-line fix. Write support is a bit harder and requires a complete library refactoring. Anyway, this project seems to be a bit dead; the last commit was pushed almost 3 years ago.

I've also discovered deafbybeheading/femebe, a library implementing PostgreSQL's FEBE protocol, unfortunately the text version again. It looks like adding binary protocol support would not be too hard. It seems to me a somewhat cleaner protocol implementation than lib/pq's, and lib/pq could be reimplemented on top of it. Besides, femebe's author is one of the active lib/pq committers.

Anyway, in my opinion the sql.Scanner and driver.Valuer interfaces should be implemented in terms of the MarshalBinary and UnmarshalBinary methods, and I'd like to merge a pull request with those changes. The inability to handle binary data is definitely a bug in lib/pq.

satori · 2014-08-25T23:38:23Z

Here is the example code I used during my tests.

deoxxa · 2014-12-02T22:43:20Z

Oh my god, how did I miss this? I'm going to close my PR now haha!

cbandy · 2015-03-01T07:15:54Z

Can this be closed now that #10 is merged?

satori · 2015-03-05T13:59:18Z

This pull request also addresses the driver.Valuer interface implementation. I'm reopening it as a reminder.

satori · 2015-10-28T23:22:57Z

The driver.Valuer interface implementation has been merged in #16. Thanks to all of you who participated in the discussion. If you need the binary representation of a UUID, you can use .Bytes(), as suggested by @cbandy in #10.

Implement sql.Scanner and driver.Valuer interfaces

07fddc4

satori self-assigned this Aug 18, 2014

satori added the enhancement label Aug 18, 2014

Joakim Gustin added 3 commits October 2, 2014 13:33

Merge remote-tracking branch 'upstream/master'

63789f4

Support gogoprotobuf

7c964d1

Better copying buffers

ec959e1

satori mentioned this pull request Dec 24, 2014

Add sql functionality #8

Closed

Joakim Gustin added 2 commits January 21, 2015 22:22

Add support for Marshal to support gogoprotobuf

e69b493

Update gogoprotobuf support

58d26c9

cbandy mentioned this pull request Feb 5, 2015

Implement sql.Scanner interface #10

Merged

gust1n closed this Mar 1, 2015

satori reopened this Mar 5, 2015

satori closed this Oct 28, 2015

satori mentioned this pull request Oct 9, 2016

uuid.Value []byte vs string #38

Open
