Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DataFrame TableProvider Circular Reference #2659

Closed
tustvold opened this issue May 30, 2022 · 4 comments · Fixed by #4778
Closed

DataFrame TableProvider Circular Reference #2659

tustvold opened this issue May 30, 2022 · 4 comments · Fixed by #4778
Labels
bug Something isn't working

Comments

@tustvold
Copy link
Contributor

Describe the bug

Currently DataFrame implements TableProvider allowing it to be registered on a SessionContext, this will often be the same SessionContext from which its Arc<RwLock<SessionState>> came. For example

#[tokio::test]
    async fn register_table() -> Result<()> {
        let df = test_table().await?.select_columns(&["c1", "c12"])?;
        let ctx = SessionContext::new();
        let df_impl = Arc::new(DataFrame::new(ctx.state.clone(), &df.plan.clone()));
    
        // register a dataframe as a table
        ctx.register_table("test_table", df_impl.clone())?;

This will result in a circular reference that will prevent destruction of the SchemaProvider and the DataFrame.

To Reproduce

Inspect code

Expected behavior

It should not be possible to introduce circular dependencies. On a more holistic level, I'm not entirely sure what the purpose of this API is. Perhaps it could be removed and replaced with a combination of DataFrame::to_logical_plan and ViewTable (once #2657 is fixed)

Additional context

#2658

FYI @yjshen @xudong963

@tustvold tustvold added the bug Something isn't working label May 30, 2022
@matthewmturner
Copy link
Contributor

FYI some additional context on this is #1698 and #1699. Basically the idea was to enable registering a DataFrame as a table i.e. with a name.

@tustvold
Copy link
Contributor Author

What did you think about this?

Perhaps it could be removed and replaced with a combination of DataFrame::to_logical_plan and ViewTable

@matthewmturner
Copy link
Contributor

Sounds good to me - is your idea to add a method like a ctx.register_dataframe method that would convert the DataFrame to a logical plan and register as ViewTable?

@tustvold
Copy link
Contributor Author

I was thinking more a DataFrame::to_table_provider method, the fact ViewTable currently requires a SessionContext in its constructor is fixed by #2660

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants