Error evaluating manually constructed physical expression: <column> = left(<literal>, 1) #7959

Closed
suremarc opened this issue Oct 27, 2023 · 2 comments · Fixed by #7967
Labels
bug Something isn't working

Comments

@suremarc
Contributor

Describe the bug

When passed to create_physical_expr, the following expression works as expected on DataFusion 32.0.0 but fails on the master branch:

col("column").eq(left("value".lit(), 1i64.lit()))

The following error appears:

Error: ArrowError(InvalidArgumentError("Cannot compare arrays of different lengths, got 4 vs 1"))

To Reproduce

Compile the following piece of Rust code against the current master branch of DataFusion:

use std::sync::Arc;

use arrow::{array::StringArray, record_batch::RecordBatch};
use arrow_schema::{DataType, Field, Schema};
use datafusion::physical_expr::{create_physical_expr, execution_props::ExecutionProps};
use datafusion_common::{DFSchema, Result};
use datafusion_expr::{col, left, Literal};

#[tokio::main]
async fn main() -> Result<()> {
    let expr = col("letter").eq(left("AAPL".lit(), 1i64.lit()));
    println!("{expr}");

    let schema = Schema::new(vec![Field::new("letter", DataType::Utf8, false)]);
    let df_schema = DFSchema::try_from_qualified_schema("data", &schema)?;
    let p = create_physical_expr(&expr, &df_schema, &schema, &ExecutionProps::new())?;
    println!("{p}");

    let batch = RecordBatch::try_new(
        Arc::new(schema),
        vec![Arc::new(StringArray::from_iter_values(vec![
            "A", "B", "C", "D",
        ]))],
    )?;
    let z = p.evaluate(&batch)?;
    println!("{z:?}");

    Ok(())
}

// letter = left(Utf8("AAPL"), Int64(1))
// letter@0 = left(AAPL, 1)
// Error: ArrowError(InvalidArgumentError("Cannot compare arrays of different lengths, got 4 vs 1"))

Expected behavior

create_physical_expr should produce a valid physical expression from the provided logical expression, and evaluating that expression against the batch should succeed.
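
For reference, the result one would expect from the reproducer can be sketched in plain Rust without DataFusion. The helpers `left` and `eq_scalar` below are hypothetical stand-ins written for this illustration, not DataFusion functions; `left` assumes the usual SQL semantics (first n characters, or all but the last |n| characters when n is negative).

```rust
/// Toy version of SQL `left(s, n)`: the first `n` characters of `s`,
/// or, for negative `n`, everything except the last `|n|` characters.
/// (Illustrative helper, not the DataFusion implementation.)
fn left(s: &str, n: i64) -> String {
    let len = s.chars().count() as i64;
    let take = if n >= 0 { n.min(len) } else { (len + n).max(0) };
    s.chars().take(take as usize).collect()
}

/// Compare every row of `column` against one scalar value,
/// producing one boolean per row.
fn eq_scalar(column: &[&str], scalar: &str) -> Vec<bool> {
    column.iter().map(|v| *v == scalar).collect()
}

fn main() {
    // Same data as the reproducer: a four-row "letter" column.
    let letters = ["A", "B", "C", "D"];
    // The right-hand side is a single value, independent of the batch size.
    let rhs = left("AAPL", 1);
    let mask = eq_scalar(&letters, &rhs);
    println!("{mask:?}"); // prints [true, false, false, false]
}
```

The point of the sketch is that the right-hand side is one value broadcast across all four rows, so the comparison should return four booleans rather than fail on a length check.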

Additional context

No response

@suremarc suremarc added the bug Something isn't working label Oct 27, 2023
@alamb
Contributor

alamb commented Oct 28, 2023

Thank you @suremarc for the report and the reproducer. It happens for me as well.

I did some initial investigation, and it seems that left('AAPL', 1) gets evaluated into a single-element array (rather than a scalar ColumnarValue), which is what causes the error: the 4-row column ends up being compared against a 1-row array.

The function dispatch code is a bit complicated, and it seems like it could be simplified. Maybe someone else has some thoughts on how to do so.
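
The single-element-array problem described in this comment can be illustrated with a toy model. The `Value` enum and `eq` function below are hypothetical simplifications written for this sketch; they only mimic the array-versus-scalar distinction and are not DataFusion's or Arrow's actual types or kernels.

```rust
/// Toy stand-in for a columnar value: either a full array with one entry
/// per row, or a single scalar that applies to every row.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Array(Vec<String>),
    Scalar(String),
}

/// Element-wise equality. Two arrays must have the same length;
/// a scalar is broadcast against an array of any length.
fn eq(left: &Value, right: &Value) -> Result<Vec<bool>, String> {
    match (left, right) {
        (Value::Array(l), Value::Array(r)) => {
            if l.len() != r.len() {
                return Err(format!(
                    "Cannot compare arrays of different lengths, got {} vs {}",
                    l.len(),
                    r.len()
                ));
            }
            Ok(l.iter().zip(r).map(|(a, b)| a == b).collect())
        }
        (Value::Array(a), Value::Scalar(s)) | (Value::Scalar(s), Value::Array(a)) => {
            Ok(a.iter().map(|v| v == s).collect())
        }
        (Value::Scalar(a), Value::Scalar(b)) => Ok(vec![a == b]),
    }
}

fn main() {
    let column = Value::Array(["A", "B", "C", "D"].iter().map(|s| s.to_string()).collect());

    // Function result wrapped as a one-element array: the length check fails.
    let as_array = Value::Array(vec!["A".to_string()]);
    println!("{:?}", eq(&column, &as_array));

    // Same result kept as a scalar: it is broadcast across all four rows.
    let as_scalar = Value::Scalar("A".to_string());
    println!("{:?}", eq(&column, &as_scalar));
}
```

Under this model, the same function result either triggers the "4 vs 1" length error or compares cleanly, depending only on whether it is represented as a one-element array or as a scalar.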

@alamb
Contributor

alamb commented Oct 29, 2023

Not only did @viirya have thoughts on how to fix this issue, he also made a PR to do so: #7967 ❤️
