-
Notifications
You must be signed in to change notification settings - Fork 875
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change pretty_format_batches
to return Result<impl Display>
#975
Conversation
@jimexist FYI ive started the work on this if you wan to check it out. |
Codecov Report
@@ Coverage Diff @@
## master #975 +/- ##
==========================================
- Coverage 82.31% 82.31% -0.01%
==========================================
Files 168 168
Lines 48761 48763 +2
==========================================
+ Hits 40136 40137 +1
- Misses 8625 8626 +1
Continue to review full report at Codecov.
|
arrow/src/util/pretty.rs
Outdated
@@ -42,6 +43,12 @@ pub fn print_batches(results: &[RecordBatch]) -> Result<()> { | |||
Ok(()) | |||
} | |||
|
|||
pub fn write_batches<W: Write>(buf: &mut W, results: &[RecordBatch]) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what you both would think about implementing Display
instead of a new free function?
Something like
fn displayable(batches: &[RecordBatch]) -> impl Display {
...
}
Then we could use that directly in format calls, like
println!("Batches: {}", displayable(batches));
As well as writing:
write!(&mut w, "{}", displayable(batches));
Basically it feels to me like if we are going to add something new to pretty print it would be nice to make it as flexible as possible.
You can do this displayable trick with a newtype wrapper, for example https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/logical_plan/display.rs#L96-L119
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb thanks for comment - that actually got me thinking about this differently.
first, couldnt we just do:
write!(&mut buf, "{}", pretty_format_batches(batches))
if so, im not even sure this PR is needed?
second, assuming that doesnt work could you just explain what the difference is between pretty_format_batches
and the displayable
function you proposed?
@jimexist what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an excellent point @matthewmturner
second, assuming that doesnt work could you just explain what the difference is between pretty_format_batches and the displayable function you proposed?
The only difference I am aware of is that pretty_format_batches
requires a String
(so allocates some memory and puts the formatted batches there). Thus it is not as efficient
Though now that you mention this, perhaps we could change pretty_format_batches
to something like the following (basically to get rid of the to_string()
:
///! Create a visual representation of record batches
pub fn pretty_format_batches(results: &[RecordBatch]) -> Result<impl Display> {
create_table(results)
}
///! Create a visual representation of columns
pub fn pretty_format_columns(col_name: &str, results: &[ArrayRef]) -> Result<impl Display> {
create_column(col_name, results)
}
This would be a API change because callers would now have to call to_string()
if they wanted a string, so instead of
let s = pretty_format_batches(&batches)?;
It would look more like
let s = pretty_format_batches(&batches)?.to_string();
But the benefit is that the String
would no longer be created unless it was necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx @alamb, I think I agree and it makes sense.
And just to confirm, is the reason to return a type that implements Display
instead of the Table
or Column
from comfy_table
to give us more flexibility around that in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And just to confirm, is the reason to return a type that implements Display instead of the Table or Column from comfy_table to give us more flexibility around that in the future?
Yes, I think so -- for example when we updated from pretty-table
to comfy-table
or upgraded versions of comfy-table
the arrow
API wouldn't change
im a little confused why a test is failing in CI. locally i get an unused warning on Ill need to look into this more. |
I think that is due to the fact that the So if you change the code with something like diff --git a/arrow/src/util/pretty.rs b/arrow/src/util/pretty.rs
index 144eb6b601..e0f3574ed2 100644
--- a/arrow/src/util/pretty.rs
+++ b/arrow/src/util/pretty.rs
@@ -19,7 +19,7 @@
//! available unless `feature = "prettyprint"` is enabled.
use crate::{array::ArrayRef, record_batch::RecordBatch};
-use std::fmt::{Display, Write};
+use std::fmt::Display;
use comfy_table::{Cell, Table};
@@ -120,6 +120,7 @@ mod tests {
use super::*;
use crate::array::{DecimalBuilder, Int32Array};
use std::sync::Arc;
+ use std::fmt::Write;
#[test]
fn test_pretty_format_batches() -> Result<()> { I pushed a fix to your branch and resolved a conflict (I caused) |
@alamb ugh sry should have picked that up. thank you. |
No problem -- that hit me many times when I was newer to rust and I found it very confusing |
@matthewmturner -- interestingly the error reported in the tests only appeared for me when this branch got merged with master (there was a test with a logical conflict that was added for fixed size list formatting). Fixed (hopefully) in a696026 |
pretty_format_batches
to return Result<impl Display>
Any final thoughts @jimexist ? |
thanks for the change - it looks much more flexible now |
Thanks for your reply. |
DataFusion is effectively using the code from the |
thanks |
Which issue does this PR close?
Closes #951
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?