Skip to content

Add Send + Sync traits for Datum#5587

Closed
viirya wants to merge 1 commit intoapache:masterfrom
viirya:scalar_send_sync
Closed

Add Send + Sync traits for Datum#5587
viirya wants to merge 1 commit intoapache:masterfrom
viirya:scalar_send_sync

Conversation

@viirya
Copy link
Member

@viirya viirya commented Apr 3, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

While working on apache/iceberg-rust#295, I faced an issue that a Datum (it is a Scalar behind the trait) cannot be moved into a closure.

error[E0277]: `dyn arrow_array::Datum` cannot be sent between threads safely
   --> crates/iceberg/src/arrow/reader.rs:451:17
    |
449 |               PredicateOperator::NotEq => Ok(Box::new(ArrowPredicateFn::new(
    |                                                       --------------------- required by a bound introduced by this call
450 |                   self.projection_mask.clone(),
451 | /                 move |batch| {
452 | |                     let left = batch.column(term_index);
453 | |                     neq(left, literal.as_ref())
454 | |                 },
    | |_________________^ `dyn arrow_array::Datum` cannot be sent between threads safely
    |

Because the function to return Datum may return Result to propagate error, I cannot put it into the closure.

I'm wondering if we can make Datum also Send + Sync?

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Apr 3, 2024
@tustvold tustvold added the api-change Changes to the arrow API label Apr 3, 2024
@tustvold
Copy link
Contributor

tustvold commented Apr 3, 2024

Are you able to just add these constraints where necessary? This would avoid making a breaking change

@viirya
Copy link
Member Author

viirya commented Apr 3, 2024

Are you able to just add these constraints where necessary? This would avoid making a breaking change

Hmm, but the Datum is basically Scalar. It is a struct not a trait and it is not implemented with Send. Can I add Send as constraint for it outside this crate? I think it doesn't work if I add it like Datum + Send?

@tustvold
Copy link
Contributor

tustvold commented Apr 4, 2024

You can add constraints for marker traits at point of usage

@viirya
Copy link
Member Author

viirya commented Apr 4, 2024

Sorry, I still don't know how can I add Send as a constraint to Scalar outside this crate. 🤔

I can let the function return Datum + Send so Send is a constraint of the returned type. But Scalar doesn't implement Send, how can I make it implement Send outside this crate?

@tustvold
Copy link
Contributor

tustvold commented Apr 5, 2024

I'm confused, Scalar is just a wrapper around a T: Array that implements Datum, whether or not Datum has additional constraints or not has no bearing on whether Scalar is Send?

Perhaps you could provide a small code reproducer?

@viirya
Copy link
Member Author

viirya commented Apr 5, 2024

I'm confused, Scalar is just a wrapper around a T: Array that implements Datum, whether or not Datum has additional constraints or not has no bearing on whether Scalar is Send?

Perhaps you could provide a small code reproducer?

Hmm, do you mean to return Array (a Send) instead of Scalar?

Because Scalar is not a Send, although it is simply a wrapper of Array, I guess it cannot be used as Send automatically? I think it cannot be but let me try it first. Thanks.

@tustvold
Copy link
Contributor

tustvold commented Apr 5, 2024

Send is an auto-trait that will be automatically derived if the contents of the type are themselves Send. The same is true for Sync.

@viirya
Copy link
Member Author

viirya commented Apr 5, 2024

I just tried it. Looks like it works.

Because Scalar is not a Send, although it is simply a wrapper of Array, I guess it cannot be used as Send automatically? I think it cannot be but let me try it first. Thanks.

Interesting. So even Scalar is not a Send, if Rust compiler knows it is a wrapper of Array, it will be automatically a Send?

fn get_arrow_datum(datum: &Datum) -> Result<Box<dyn ArrowDatum + Send>> {
    match datum.literal() {
        PrimitiveLiteral::Boolean(value) => Ok(Box::new(BooleanArray::new_scalar(*value))),
        PrimitiveLiteral::Int(value) => Ok(Box::new(Int32Array::new_scalar(*value))),
        PrimitiveLiteral::Long(value) => Ok(Box::new(Int64Array::new_scalar(*value))),
        PrimitiveLiteral::Float(value) => Ok(Box::new(Float32Array::new_scalar(value.as_f32()))),
        PrimitiveLiteral::Double(value) => Ok(Box::new(Float64Array::new_scalar(value.as_f64()))),
        l => Err(Error::new(
            ErrorKind::DataInvalid,
            format!("Unsupported literal type: {:?}", l),
        )),
    }
}

fn some_func(...) {
    let literal = get_arrow_datum(&literal)?;

    ....
    move |batch| {
        let left = batch.column(term_index);
         lt(left, literal.as_ref())
    },
    ....
}

@viirya
Copy link
Member Author

viirya commented Apr 5, 2024

Send is an auto-trait that will be automatically derived if the contents of the type are themselves Send. The same is true for Sync.

Ah, got it. I thought it needs to explicitly implement it. Learn it. Thank you @tustvold !

@viirya viirya closed this Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change Changes to the arrow API arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants