Expose sqlite's sqlite3_create_collation() #2495
diesel/src/sqlite/connection/raw.rs
Outdated
let f = match user_ptr.as_mut() {
    Some(f) => f,
    None => {
        //FIXME
I'm not really sure what to do here if sqlite passes us null pointers. Can it just be assumed that this will not happen (i.e. `unwrap()` here)?
It seems to me that `unreachable!` in the `None` case could be appropriate.
diesel/diesel/src/sqlite/connection/raw.rs
Lines 307 to 309 in 44d0096
unreachable!(
    "We've written the aggregator above to that location, it must be there"
)
It wouldn't be, since "A panic! across an FFI boundary is undefined behavior." (https://doc.rust-lang.org/nomicon/ffi.html#ffi-and-panics).
According to https://www.sqlite.org/c3ref/create_collation.html we can be sure that we get our pointer passed to the callback here. Following that reasoning I would use `unreachable!` or something similar, plus a comment explaining why this cannot happen, to indicate that this is guaranteed by other things.
Then I'd either go to "full undefined behavior" territory and use `std::hint::unreachable_unchecked()`, or stay on the safe route and call `std::process::abort()` (with an `eprintln!()` in front).
IMO `unreachable!()` gives the impression that we can just panic and it's fine. Which it may eventually be (rust-lang/rust#74990), but at the moment it is not.
Using a comment + `eprintln!` + `abort()` sounds like the better solution to me. Can you have a quick look at the other callbacks for normal and aggregate functions and change that there as well?
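For readers following along, here is a minimal sketch of the comment + `eprintln!` + `abort()` pattern agreed on above. It is not the code from this PR: `abort_with_message`, `compare_callback` and `call_like_sqlite` are hypothetical names, and the callback compares raw bytes instead of `&str` to keep the example free of UTF-8 handling.

```rust
use std::cmp::Ordering;
use std::os::raw::{c_int, c_void};

/// Hypothetical helper: report an "impossible" state, then abort.
/// Aborting never unwinds, so no panic can cross the FFI boundary.
fn abort_with_message(msg: &str) -> ! {
    eprintln!("{}", msg);
    std::process::abort()
}

/// Sketch of a comparison callback in the shape sqlite calls it.
/// `user_ptr` is assumed to be the `*mut F` handed over at registration.
extern "C" fn compare_callback<F>(
    user_ptr: *mut c_void,
    lhs_len: c_int,
    lhs_ptr: *const c_void,
    rhs_len: c_int,
    rhs_ptr: *const c_void,
) -> c_int
where
    F: Fn(&[u8], &[u8]) -> Ordering,
{
    // sqlite documents that the application data pointer is passed through to
    // the callback, so `None` should be impossible; abort instead of panicking
    // if it happens anyway.
    let f = match unsafe { (user_ptr as *const F).as_ref() } {
        Some(f) => f,
        None => abort_with_message("collation callback received a null user pointer"),
    };
    // Simplification: assumes both pointers are non-null and valid for their lengths.
    let lhs = unsafe { std::slice::from_raw_parts(lhs_ptr as *const u8, lhs_len as usize) };
    let rhs = unsafe { std::slice::from_raw_parts(rhs_ptr as *const u8, rhs_len as usize) };
    match f(lhs, rhs) {
        Ordering::Less => -1,
        Ordering::Equal => 0,
        Ordering::Greater => 1,
    }
}

/// Hypothetical driver that invokes the callback the way C code would.
fn call_like_sqlite<F>(f: &F, lhs: &[u8], rhs: &[u8]) -> c_int
where
    F: Fn(&[u8], &[u8]) -> Ordering,
{
    compare_callback::<F>(
        f as *const F as *mut c_void,
        lhs.len() as c_int,
        lhs.as_ptr() as *const c_void,
        rhs.len() as c_int,
        rhs.as_ptr() as *const c_void,
    )
}
```

The point of the diverging helper is that the `None` arm documents the invariant and still cannot unwind, whereas `unreachable!()` would start a panic inside a function called from C.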
Yes, will do!
I should mention, my use-case is allowing for proper case-insensitive matching, since the |
79a45f1 to 548ece5
In general I'm in favor of this change. I've left some comments at places where I think things can be improved. Additionally, such a change requires an entry in our changelog.
The question is how often would such a function be used?
I guess it would only be of potential use when dropping tables or some such. It certainly won't find use from me (at least at the moment).
I hit a snag while making We can wrap it in a |
I think |
548ece5 to 7af7f5c
That should address all requested changes.
7af7f5c to d694a55
Looks really good, modulo some small stylistic things. Thanks for working on this 👍
}
let args = unsafe { slice::from_raw_parts(value_ptr, num_args as _) };
let args = build_sql_function_args::<ArgsSqlType, Args>(args);
let mut aggregator = std::panic::AssertUnwindSafe(aggregator);
I had some time to think a bit more about this, so here is some reasoning why this should actually be OK and users do not need to care about it:
- We catch the panic and get an `Err`.
- We return an error message to sqlite, so sqlite will abort executing this aggregate and the old instance will not be used anymore.
- On the next call to this aggregate function we will create a new instance of `A`, so we cannot have compromised state here either.
(Maybe add that as a comment here.)
From that point of view, I'm not sure we really need `UnwindSafe` bounds on the corresponding types.
Ah, I didn't really follow how the aggregator is initialized.
Unwind safety is still a useful property though, because `finalize()` will be called whether or not `step()` errored out.
I'm not certain the `UnwindSafe` bound actually does much in any case, since this `AssertUnwindSafe` is where the unwind-safety rubber hits the road. Without the bound, though, we'd need another `AssertUnwindSafe` wrapper in `run_aggregator_final_function()`. And that may be relevant for `Drop` implementations, even if we don't use the aggregator any further.
/// The `step()` method is called once for every record of the query.
///
/// This is called through a C FFI, as such panics do not propagate to the caller. Panics are
/// caught and cause a return with an error value. The implementation must still ensure that
As explained above, I think we do not actually need that here, because I see no way the state could be reused.
Panics across FFI boundaries cause undefined behavior in current Rust. The aggregator functions are callbacks invoked from C (libsqlite). To safeguard against panics, std::panic::catch_unwind() is used. On panic the functions now return with an error result indicating that an unexpected panic occurred.
std::panic::catch_unwind() requires types to implement std::panic::UnwindSafe, a marker trait indicating that care must be taken since panics introduce control flow that is not very visible. Refer to https://doc.rust-lang.org/std/panic/trait.UnwindSafe.html for a more detailed explanation.
For SqliteAggregatorFunction::step() we must use std::panic::AssertUnwindSafe, since &mut references are never considered UnwindSafe, and the requirement to ensure unwind-safety is documented on the method.
Of note is that in safe Rust, even if the method is not unwind-safe, the language still guarantees memory-safety. The marker trait is mainly there to prevent logic bugs.
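A minimal sketch of what the commit message above describes, assuming a hypothetical `CheckedSum` aggregate and a hypothetical `guarded_step` helper (neither is part of diesel). It only shows the `catch_unwind` + `AssertUnwindSafe` shape; passing the error on to sqlite is left out.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical aggregate used only for illustration: a checked sum.
struct CheckedSum(i32);

impl CheckedSum {
    fn step(&mut self, value: i32) {
        // Panics on overflow, standing in for any bug in user code.
        self.0 = self.0.checked_add(value).expect("sum overflowed");
    }
}

/// Runs one `step()` and turns a panic into an error message, so that a
/// surrounding `extern "C"` callback can report it instead of unwinding into C.
fn guarded_step(aggregator: &mut CheckedSum, value: i32) -> Result<(), String> {
    // `&mut` references are never `UnwindSafe`, hence the explicit assertion.
    let result = panic::catch_unwind(AssertUnwindSafe(|| aggregator.step(value)));
    result.map_err(|_| format!("{} panicked", std::any::type_name::<CheckedSum>()))
}
```

The panic message is still printed to stderr by the default panic hook; only the unwinding stops at `catch_unwind`. This relies on the default `panic = "unwind"` strategy, since nothing can be caught when panics are configured to abort.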
d694a55 to c2f4e7f
A: SqliteAggregateFunction<Args, Output = Ret>
    + 'static
    + Send
    + std::panic::UnwindSafe
I've changed it to both `RefUnwindSafe` and `UnwindSafe`, since `::step()` passes by reference and `::terminate()` by value. As stated before, I'm not sure if the `RefUnwindSafe` bound for `::step()` provides any practical safety, since we just use `AssertUnwindSafe` instead to pass the `&mut` across, and `RefUnwindSafe` is only for non-`mut` references.
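To make the by-reference versus by-value distinction concrete, here is a small sketch under assumptions: `Aggregate`, `run_step`, `run_finalize` and `Count` are hypothetical stand-ins, not diesel's trait or functions. It illustrates that the `&mut` path needs `AssertUnwindSafe` whatever bounds are placed on the type, while the by-value path is satisfied by an `UnwindSafe` bound alone.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe, UnwindSafe};

/// Hypothetical stand-in for the aggregate trait discussed in this thread.
trait Aggregate {
    fn step(&mut self, value: i64);
    fn finalize(self) -> i64;
}

/// By reference: no bound on `A` makes `&mut A` unwind-safe, so the closure
/// has to be wrapped in `AssertUnwindSafe` either way.
fn run_step<A: Aggregate>(aggregator: &mut A, value: i64) -> std::thread::Result<()> {
    catch_unwind(AssertUnwindSafe(|| aggregator.step(value)))
}

/// By value: the closure owns `A`, so an `UnwindSafe` bound on `A` is enough
/// and no wrapper is needed.
fn run_finalize<A: Aggregate + UnwindSafe>(aggregator: A) -> std::thread::Result<i64> {
    catch_unwind(move || aggregator.finalize())
}

/// Minimal implementation used to exercise both helpers.
struct Count(i64);

impl Aggregate for Count {
    fn step(&mut self, value: i64) {
        self.0 += value;
    }

    fn finalize(self) -> i64 {
        self.0
    }
}
```

Removing `AssertUnwindSafe` from `run_step` makes it fail to compile even if an `UnwindSafe` or `RefUnwindSafe` bound is added to `A`, which is the concern raised in the comment above.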
c2f4e7f to bf4c952
This encompasses the pattern of eprintln!() + std::process::abort() and adds a request to open an issue if the message is observed.
This is essentially the same treatment that custom aggregate functions got in ee2f792.
The failed CI job looks to me like a transient failure; it seems MySQL didn't start up correctly there.
@@ -204,7 +251,8 @@ extern "C" fn run_custom_function<F>(
) where
    F: FnMut(&RawConnection, &[*mut ffi::sqlite3_value]) -> QueryResult<SerializedValue>
        + Send
        + 'static,
        + 'static
        + std::panic::RefUnwindSafe,
As this is a `FnMut` and not a `Fn`, I would guess that `RefUnwindSafe` is not the right trait here. We do potentially mutate `F` here.
}
Err(_) => {
    let msg = format!("{} panicked", std::any::type_name::<F>());
    unsafe { context_error_str(ctx, &msg) };
Is there any reason why we need a nested unsafe block here?
    context_error_str(ctx, NULL_AG_CTX_ERR)
}

unsafe fn context_error_str(ctx: *mut ffi::sqlite3_context, error: &str) {
We should cite https://sqlite.org/c3ref/result_blob.html here, stating that sqlite will copy our error message, so it's fine to free it afterwards. (Just to make that clear for any future person looking at this code.)
Also, maybe we can take a `NonNull<ffi::sqlite3_context>` here so that this function is no longer `unsafe` to use?
A: SqliteAggregateFunction<Args, Output = Ret> + 'static + Send,
Args: FromSqlRow<ArgsSqlType, Sqlite>,
A: SqliteAggregateFunction<Args, Output = Ret> + 'static + Send + std::panic::RefUnwindSafe,
Args: FromSqlRow<ArgsSqlType, Sqlite> + std::panic::UnwindSafe,
I'm not sure why the `Args` need to be `UnwindSafe`. They get reconstructed every time the aggregator is called, so a potential panic cannot corrupt anything there.
A: SqliteAggregateFunction<Args, Output = Ret> + 'static + Send,
Args: FromSqlRow<ArgsSqlType, Sqlite>,
A: SqliteAggregateFunction<Args, Output = Ret> + 'static + Send + std::panic::UnwindSafe,
Args: FromSqlRow<ArgsSqlType, Sqlite> + std::panic::UnwindSafe,
We do not even construct the arguments here, so `UnwindSafe` is not required here.
F: FnMut(&RawConnection, &[*mut ffi::sqlite3_value]) -> QueryResult<SerializedValue>
    + Send
    + 'static,
F: Fn(&str, &str) -> std::cmp::Ordering + Send + 'static,
I think this should be restricted to `RefUnwindSafe`, as we cannot mutate the closure.
    )
};

// It doesn't matter if f is UnwindSafe, since we abort on panic.
I think that, similar to the sql function handler, we should just return an error here instead of aborting the program.
This looks almost good. I noted some minor improvements; otherwise I'm fine with this now.
@TaKO8Ki This is another candidate for rebasing + fixing the remaining comments if you are interested
@weiznich
Likely relevant: rusqlite/rusqlite#839
Closed by #2564
This PR adds `diesel::sqlite::SqliteConnection::register_collation()`, which wraps `sqlite3_create_collation_v2()` (ref https://sqlite.org/c3ref/create_collation.html).
The registered collations can be used via the `COLLATE` constraint on table columns, as is demonstrated in the tests.
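As an illustration of the case-insensitive use-case mentioned earlier in the thread, here is a sketch of a comparison function with the `Fn(&str, &str) -> std::cmp::Ordering` shape shown in the diff above. The function name `case_insensitive` and the collation name `RUSTNOCASE` are hypothetical, and the registration call in the comment is an assumption based on the PR description rather than a verified signature.

```rust
use std::cmp::Ordering;

/// Compares two strings after Unicode lowercasing.
/// This allocates two temporary `String`s per comparison, which is simple
/// but not tuned for performance.
fn case_insensitive(lhs: &str, rhs: &str) -> Ordering {
    lhs.to_lowercase().cmp(&rhs.to_lowercase())
}

// Assumed usage, not checked against the final API:
//     conn.register_collation("RUSTNOCASE", case_insensitive)
// after which a column could be declared as `TEXT COLLATE RUSTNOCASE`.
```

Lowercasing both sides is only one possible definition of case-insensitivity; locale-aware comparison would need a dedicated Unicode collation library.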