Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add generic DbCommand.ExecuteScalar<T>() #26511

Open
roji opened this issue Jun 15, 2018 · 24 comments
Open

Add generic DbCommand.ExecuteScalar<T>() #26511

roji opened this issue Jun 15, 2018 · 24 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Data
Milestone

Comments

@roji
Copy link
Member

roji commented Jun 15, 2018

The DbCommand API currently has a non-generic ExecuteScalar() which returns an object. To modernize the API and promote better performing code, we can add DbCommand.ExecuteScalar<T>(). The default implementation would simply call ExecuteReader() and call GetFieldValue<T>(), providing a default implementation that would work on all providers.

The idea is to collect ideas for improving .NET data access APIs, not necessary to implement right away.

@Brar
Copy link

Brar commented Jun 15, 2018

When drafting an initial implementation for this i recognized that in cases where T is a value type, we don't have a way to signal that the query didn't return a value.

For DbCommand.ExecuteScalar() we currently have three alternatives to return:

  1. An instance of the actual object if we got one from the database
  2. DBNull.Value if the database returned NULL
  3. null if the database didn't even return a row

For DbCommand.ExecuteScalar<T>() this would be:

  1. An instance of the actual object if we got one from the database
    2. DBNull.Value if the database returned NULL It can't be DBNull.Value as it wouldn't be compatible with T
  2. default(T) if the database didn't even return a row

In case of the 3rd option there is no way to tell the difference between default(T) and a perfectly valid value that was returned from the database.

We might need something like bool DbCommand.TryExecuteScalar<T>(out T value) to differentiate between 'no row was returned' and 'a row containing NULLwas returned'.

@Brar
Copy link

Brar commented Jun 15, 2018

...and in case we want to provide an async API the next problem would be that async methods cannot have 'out' parameters.

@roji
Copy link
Member Author

roji commented Jun 15, 2018

Agree that returning default(T) if no row was returned isn't a viable option. An alternative to TryExecuteScalar which is cumbersome would be to throw an exception; as a general rule users are aware of the query they are executing (and whether it returns a row or not).

@Brar
Copy link

Brar commented Jun 15, 2018

An alternative to TryExecuteScalar which is cumbersome would be to throw an exception

I think this would be the way to go.
Actually also for option 2 as DbNull.Value isn't compatible with T

@divega
Copy link
Contributor

divega commented Jun 16, 2018

I think the bool-Try pattern has become less cumbersome since we are now able to declare the output variable inline. My vote would be for having both Try and non-Try versions.

@roji
Copy link
Member Author

roji commented Feb 5, 2019

@divega @ajcvickers this issue is very similar to the discussions we've been having about an alternative to DbDataReader.GetFieldValue<T>(). If dotnet/csharplang#2194 is done, then we can have a T? ExecuteNullableScalar<T>() alongside a T ExecuteScalar<T>() which throws.

@Wraith2
Copy link
Contributor

Wraith2 commented Feb 26, 2019

(bool success, T value) = ExecuteScalar<T>() ?

@roji
Copy link
Member Author

roji commented Feb 26, 2019

@Wraith2 that's possible although pretty clunky to use, e.g. how to test if a value was returned and then use it... It seems better to have a bool TryExecuteScalar<T>(out T result), which could be used as follows:

if (cmd.TryExecuteScalar(out var result)) {
    // Do something with result
}

However, the idea is to hold off a bit until the status of dotnet/csharplang#2194 becomes clearer - if that issue makes it through we wouldn't really need these tricks. This also seems pretty non-urgent, as we're discussing what's basically a sugar method.

@roji
Copy link
Member Author

roji commented May 29, 2019

Note conversation happening in npgsql/npgsql#1997 (comment)

@Wraith2
Copy link
Contributor

Wraith2 commented May 29, 2019

  • An instance of the actual object if we got one from the database
  • DBNull.Value if the database returned NULL
  • null if the database didn't even return a row

Aren't cases 2 and 3 really the same thing if you're expecting an instance of T, which is that they're not a T so whether it's null or DBNull.Value it should throw. If you want to use the tristate capability just use the non-generic overload. If users want performance they can pay the development time price for ensuring it is possible.

@roji
Copy link
Member Author

roji commented May 29, 2019

Returning DBNull.Value isn't possible because the method is generic and returns T.

Aside from that, I'm not sure I see a conceptual reason for restricting the API from returning null. There are a lot of APIs out there which return a value or null (e.g. Dictionary's indexer); null really is an OK value to have in your database (or dictionary).

If we we get to a point where we tell people to use a non-generic API just to know whether there's a null, then IMHO our design isn't very good. Again, asking to know about nulls seems like a basic thing, and forcing people to go through a boxing, non-generic interface doesn't seem right.

Once again, if dotnet/csharplang#2194 gets implemented, then we can simply return a T? and be done with it...

If users want performance they can pay the development time price for ensuring it is possible.

What are you referring to? Not using ExecuteScalar and using DbDataReader instead?

@Wraith2
Copy link
Contributor

Wraith2 commented May 29, 2019

Success is getting an instance of whatever T is, right? So what's failure? what does it mean to get a false back from the TryExecuteScalar or whatever we end up with?

It's one of two things as enumerated above, either the query didn't return a result or it returned a database null result. If you expect either of those things to be able to happen (and to be able to cope with them) you probably need to know which one it is and in that case you have to typecheck and null check which means any fast path we introduce is irrelevant to this case, it'll box and cast anyway.

If you don't expect either of those no-data cases to happen then you're always expecting to get back an instance of the type and in that case you can just call the new version and get a fast response or an exception if something violates the rules of "will not return DBNull and will return a row"

If you're a calling database providers you should know your result shape and rules, and if you do you can go faster using the new generic version. If you don't then you continue along the slow path we already have.

@Brar
Copy link

Brar commented May 29, 2019

The way I see It, the classic non generic API will stay around, so it can be used to fill in gaps of the generic API.
The new generic API is mostly about performance because it can't do much more than help you to avoid boxing.
I envision three methods:

  1. the classic object DbCommand.ExecuteScalar()
  2. the new T DbCommand.ExecuteScalar<T>()
  3. the new bool DbCommand.TryExecuteScalar<T>(out T value)

They all have their place and serve different purposes:

  1. Use the classic API if you have no idea what you're going to get. It might return null if the query didn't return a row, it might return DbNull.Value if the query returned NULL and it might return an actual value. There's no way to avoid boxing or even return something else than object if you want to return null, DbNull.Value and a value type in the same field anyways.
  2. Use the simple generic ExecuteScalar method in cases where you know exactly that you will get a value that is not DbNull.Value. This method will throw an exception if your assertion is wrong (no row or NULL case) and this is probably what you'd want.
  3. Use the TryExecuteScalar method if you expect a valid value from the database which might include DbNull.Value. You can still avoid boxing of value types and you also get to know if the query returned NULL. This method will still throw an exception if the query didn't return a row as this probably isn't what you'd expect from a query anyways.

@Brar
Copy link

Brar commented May 29, 2019

Also:
Both new generic methods (2. and 3.) would obviously also throw on query results that are type-incompatible to T.

@roji
Copy link
Member Author

roji commented May 29, 2019

First, I'm not necessarily completely against TryExecuteScalar(). My problem is that it's needlessly clunky when null values are possible, and that it gives no meaning to the Try* pattern. I also don't really understand the objection to a generic ExecuteScalar which would return T? (aside from this current limitation in C# 8 nullability, which should not be a deciding factor. Can you please explain why TryExecuteScalar() seems better to you?

@Wraith2:

So what's failure? what does it mean to get a false back from the TryExecuteScalar or whatever we end up with?

The fact that there isn't an obvious meaning for "failure" doesn't mean it's a free slot we can use to represent null.

If you expect either of those things to be able to happen (and to be able to cope with them) you probably need to know which one it is and in that case you have to typecheck and null check which means any fast path we introduce is irrelevant to this case, it'll box and cast anyway.

I'm not sure what you mean here - why does a null check mean something boxes or that a fast path is irrelevant? For example, let's say I read an int column that can contain nulls, why not receive an int? and check that for nullability? No boxing, very efficient - I don't see the issue...

If you don't expect either of those no-data cases to happen [...]

Null is not a no-data scenario - it's a valid value to have in your database and it should be easy to read and to check for it.

If you're a calling database providers you should know your result shape and rules, and if you do you can go faster using the new generic version. If you don't then you continue along the slow path we already have.

I again don't understand the point about result shape and rules. This is about reading nullable columns; the result shape is nullable. Nothing about that should mean that anyone needs to go slower.

@roji
Copy link
Member Author

roji commented May 30, 2019

@Brar I agree with most of what you say. Specifically, if you don't know the type of expected data (as opposed to its nullability), then you should be using the non-generic API. The same holds for DbDataReader, where you would call the non-generic GetValue() and not generic GetFieldValue<T>().

As I wrote above, the only point where I have trouble is with TryExecuteScalar vs. ExecuteScalar that can return T?. I even agree that there's probably value in a version that throws on null, for cases where the user knows there should never be one.

Both new generic methods (2. and 3.) would obviously also throw on query results that are type-incompatible to T.

Agreed.

@Brar
Copy link

Brar commented May 30, 2019

I also don't really understand the objection to a generic ExecuteScalar which would return T?

There's no objection on my part aside from this current limitation in C# 8 nullability, which has to be a deciding factor unless we know for sure that the feature will come.

@roji
Copy link
Member Author

roji commented May 30, 2019

There's no objection on my part aside from this current limitation in C# 8 nullability, which has to be a deciding factor unless we know for sure that the feature will come.

OK, thanks.

In that case, IMHO it makes sense to wait a bit and see how things develop on that side. At the end of the day, ExecuteScalar<T>() (and TryExecuteScalar<T>()) are only sugar - whatever they do can always be done by calling ExecuteReader() and GetFieldValue<T>(). This means it isn't urgent to figure this out right now - we can do so later when the language nullability picture clarifies...

@kronic
Copy link
Contributor

kronic commented May 30, 2019

You can add more methods

ValueTask<T> DbCommand.ExecuteScalarAsync<T>();
ValueTask<T> DbCommand.ExecuteScalarAsync<T>(CancellationToken);

@roji
Copy link
Member Author

roji commented May 30, 2019

@kronic at the moment we're discussing the general shape of the API - async version(s) would definitely get added once we stabilize. BTW I'm not sure this is a good candidate for returning ValueTask as opposed to Task (pretty much always performs I/O, so never returns synchronously).

@kronic
Copy link
Contributor

kronic commented May 30, 2019

@kronic at the moment we're discussing the general shape of the API - async version(s) would definitely get added once we stabilize. BTW I'm not sure this is a good candidate for returning ValueTask as opposed to Task (pretty much always performs I/O, so never returns synchronously).

@roji This is wrong for example for Sqlite.

@roji
Copy link
Member Author

roji commented May 30, 2019

SQLite us definitely an outlier here - almost all databases involve some sort of networking I/O; according to this logic every single method on the ADO.NET API would return ValueTask just so that it can be used with SQLite.

Thé claim can also be made that sqlite will still typically do I/O to complete the call, and so the overhead of Task is likely to be quite negligible.

But nothing is closed for discussion - in any case it seems premature to think about Task vs. ValueTask before having an agreed-upon API shape.

@bgrainger
Copy link
Contributor

SQLite's lack of async I/O has always felt like a bug, not a feature, to me. Oracle's MySQL Connector/NET also uses 100% synchronous I/O but I don't think we should tailor the ADO.NET API to those providers' inability to perform async I/O.

(I realise that the engineering effort required to support async I/O in SQLite and Connector/NET is very large and so they may both remain synchronous-only for a very long time, but it seems to me that the API should design for the standard/ideal case.)

@roji
Copy link
Member Author

roji commented May 31, 2019

@bgrainger I agree.

FWIW for Sqlite doing async there would mean plumbing that through native, which indeed seems like a lot of effort/complexity with not that much necessary gain. Oracle's driver really has no excuse.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@ajcvickers ajcvickers modified the milestones: 5.0.0, Future Jun 23, 2020
@ajcvickers ajcvickers removed the untriaged New issue has not been triaged by the area owner label Jun 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Data
Projects
None yet
Development

No branches or pull requests

9 participants