add truncate option to show and update default width of output #116
try:
    columns = os.get_terminal_size().columns
except OSError:
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #116      +/-   ##
==========================================
+ Coverage   85.55%   85.68%   +0.12%
==========================================
  Files          93       93
  Lines        9469     9477       +8
  Branches     1889     1891       +2
==========================================
+ Hits         8101     8120      +19
+ Misses       1039     1025      -14
- Partials      329      332       +3
aa1a39d to 22ffae6
22ffae6 to 30852a2
30852a2 to 784964e
src/datachain/lib/dc.py
Outdated
if columns > 0:
    options.extend(["display.width", columns])
display.width : int
Width of the display in characters. In case python/IPython is running in
a terminal this can be set to None and pandas will correctly auto-detect
the width.
Note that the IPython notebook, IPython qtconsole, or IDLE do not run in a
terminal and hence it is not possible to correctly detect the width.
[default: 80] [currently: 80]
Pandas can auto-detect width. So I don't think you need to set them yourself, or use os.get_terminal_size().
However, you probably should move (display.max_columns, None) to the if not truncate condition.
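One way to read this suggestion, as a minimal sketch: build the flat option list that would be handed to pandas, leave display.width at None so pandas can try to auto-detect it, and only lift the column limit when truncation is turned off. The helper name build_display_options is hypothetical and does not come from the PR; display.width and display.max_columns are the pandas option names quoted above.

```python
from typing import Any


def build_display_options(truncate: bool = True) -> list[Any]:
    """Return a flat [name, value, ...] list of pandas display options.

    Hypothetical helper illustrating the review suggestion; not the PR's code.
    """
    # None asks pandas to auto-detect the width when running in a terminal.
    options: list[Any] = ["display.width", None]
    if not truncate:
        # None removes the limit on how many columns are displayed.
        options.extend(["display.max_columns", None])
    return options


print(build_display_options(truncate=True))
print(build_display_options(truncate=False))
```

The resulting list could then be unpacked into pandas' option context manager, for example with pd.option_context(*options), around the call that prints the DataFrame.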
Pandas can auto-detect width. So I don't think you need to set them yourself, or use os.get_terminal_size().
As the demo shows, this behaviour is not working for me.
Running the examples/get_started/udfs/parallel.py script from #111 without the change, I get:
Listing gs://datachain-demo: 400 objects [00:01, 339.57 objects/s]
Processed: 400 rows [00:00, 16351.90 rows/s]
Processed: 400 rows [00:42, 9.40 rows/s]
file file file file \
source parent name size
0 gs://datachain-demo dogs-and-cats cat.1.jpg 16880
1 gs://datachain-demo dogs-and-cats cat.1.json 99
2 gs://datachain-demo dogs-and-cats cat.10.jpg 34315
3 gs://datachain-demo dogs-and-cats cat.10.json 100
4 gs://datachain-demo dogs-and-cats cat.100.jpg 28377
5 gs://datachain-demo dogs-and-cats cat.100.json 101
6 gs://datachain-demo dogs-and-cats cat.1000.jpg 5944
7 gs://datachain-demo dogs-and-cats cat.1000.json 102
8 gs://datachain-demo dogs-and-cats cat.1001.jpg 23099
9 gs://datachain-demo dogs-and-cats cat.1001.json 102
10 gs://datachain-demo dogs-and-cats cat.1002.jpg 16999
11 gs://datachain-demo dogs-and-cats cat.1002.json 102
12 gs://datachain-demo dogs-and-cats cat.1003.jpg 13996
13 gs://datachain-demo dogs-and-cats cat.1003.json 102
14 gs://datachain-demo dogs-and-cats cat.1004.jpg 41052
15 gs://datachain-demo dogs-and-cats cat.1004.json 102
16 gs://datachain-demo dogs-and-cats cat.1005.jpg 33372
17 gs://datachain-demo dogs-and-cats cat.1005.json 102
18 gs://datachain-demo dogs-and-cats cat.1006.jpg 23571
19 gs://datachain-demo dogs-and-cats cat.1006.json 102
file file file file \
version etag is_latest last_modified
0 1721494538128219 CNuWtvOKtocDEAE= 1 1970-01-01 00:00:00+00:00
1 1721494541157069 CM2F7/SKtocDEAE= 1 1970-01-01 00:00:00+00:00
2 1721494540482739 CLPxxfSKtocDEAE= 1 1970-01-01 00:00:00+00:00
3 1721494537938657 COHNqvOKtocDEAE= 1 1970-01-01 00:00:00+00:00
4 1721494542320150 CJaEtvWKtocDEAE= 1 1970-01-01 00:00:00+00:00
5 1721494541917698 CIK8nfWKtocDEAE= 1 1970-01-01 00:00:00+00:00
6 1721494540506694 CMasx/SKtocDEAE= 1 1970-01-01 00:00:00+00:00
7 1721494542191364 CISWrvWKtocDEAE= 1 1970-01-01 00:00:00+00:00
8 1721494538757000 CIjH3POKtocDEAE= 1 1970-01-01 00:00:00+00:00
9 1721494539671362 CMKulPSKtocDEAE= 1 1970-01-01 00:00:00+00:00
10 1721494539926153 CIn1o/SKtocDEAE= 1 1970-01-01 00:00:00+00:00
11 1721494538606028 CMyr0/OKtocDEAE= 1 1970-01-01 00:00:00+00:00
12 1721494540292260 CKShuvSKtocDEAE= 1 1970-01-01 00:00:00+00:00
13 1721494538083602 CJK6s/OKtocDEAE= 1 1970-01-01 00:00:00+00:00
14 1721494539902670 CM69ovSKtocDEAE= 1 1970-01-01 00:00:00+00:00
15 1721494541290725 COWZ9/SKtocDEAE= 1 1970-01-01 00:00:00+00:00
16 1721494538590041 CNmu0vOKtocDEAE= 1 1970-01-01 00:00:00+00:00
17 1721494540901128 CIi23/SKtocDEAE= 1 1970-01-01 00:00:00+00:00
18 1721494539610043 CLvPkPSKtocDEAE= 1 1970-01-01 00:00:00+00:00
19 1721494540202751 CP/ltPSKtocDEAE= 1 1970-01-01 00:00:00+00:00
file file path_len
location vtype
0 None 9
1 None -1
2 None 10
3 None -1
4 None 11
5 None -1
6 None 12
7 None -1
8 None 12
9 None -1
10 None 12
11 None -1
12 None 12
13 None -1
14 None 12
15 None -1
16 None 12
17 None -1
18 None 12
19 None -1
[Limited by 20 rows]
and then after the change I get the output spread across my terminal:
Processed: 400 rows [00:00, 23722.10 rows/s]
Processed: 400 rows [00:43, 9.28 rows/s]
file file file file file file file file file file path_len
source parent name size version etag is_latest last_modified location vtype
0 gs://datachain-demo dogs-and-cats cat.1.jpg 16880 1721494538128219 CNuWtvOKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 9
1 gs://datachain-demo dogs-and-cats cat.1.json 99 1721494541157069 CM2F7/SKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
2 gs://datachain-demo dogs-and-cats cat.10.jpg 34315 1721494540482739 CLPxxfSKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 10
3 gs://datachain-demo dogs-and-cats cat.10.json 100 1721494537938657 COHNqvOKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
4 gs://datachain-demo dogs-and-cats cat.100.jpg 28377 1721494542320150 CJaEtvWKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 11
5 gs://datachain-demo dogs-and-cats cat.100.json 101 1721494541917698 CIK8nfWKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
6 gs://datachain-demo dogs-and-cats cat.1000.jpg 5944 1721494540506694 CMasx/SKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 12
7 gs://datachain-demo dogs-and-cats cat.1000.json 102 1721494542191364 CISWrvWKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
8 gs://datachain-demo dogs-and-cats cat.1001.jpg 23099 1721494538757000 CIjH3POKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 12
9 gs://datachain-demo dogs-and-cats cat.1001.json 102 1721494539671362 CMKulPSKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
10 gs://datachain-demo dogs-and-cats cat.1002.jpg 16999 1721494539926153 CIn1o/SKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 12
11 gs://datachain-demo dogs-and-cats cat.1002.json 102 1721494538606028 CMyr0/OKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
12 gs://datachain-demo dogs-and-cats cat.1003.jpg 13996 1721494540292260 CKShuvSKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 12
13 gs://datachain-demo dogs-and-cats cat.1003.json 102 1721494538083602 CJK6s/OKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
14 gs://datachain-demo dogs-and-cats cat.1004.jpg 41052 1721494539902670 CM69ovSKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 12
15 gs://datachain-demo dogs-and-cats cat.1004.json 102 1721494541290725 COWZ9/SKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
16 gs://datachain-demo dogs-and-cats cat.1005.jpg 33372 1721494538590041 CNmu0vOKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 12
17 gs://datachain-demo dogs-and-cats cat.1005.json 102 1721494540901128 CIi23/SKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
18 gs://datachain-demo dogs-and-cats cat.1006.jpg 23571 1721494539610043 CLvPkPSKtocDEAE= 1 1970-01-01 00:00:00+00:00 None 12
19 gs://datachain-demo dogs-and-cats cat.1006.json 102 1721494540202751 CP/ltPSKtocDEAE= 1 1970-01-01 00:00:00+00:00 None -1
[Limited by 20 rows]
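The difference between the two outputs comes down to how many columns fit on one line at the configured width. As a simplified illustration only (a greedy grouping written for this thread, not pandas' actual wrapping algorithm), the hypothetical function below splits a list of column widths into blocks that each fit within a given display width. The widths are made-up values, not measurements of the output above.

```python
def wrap_columns(widths: list[int], display_width: int, sep: int = 2) -> list[list[int]]:
    """Greedily group column indices into blocks that fit in display_width."""
    blocks: list[list[int]] = []
    current: list[int] = []
    used = 0
    for index, width in enumerate(widths):
        # Every column after the first in a block also needs a separator.
        needed = width if not current else width + sep
        if current and used + needed > display_width:
            # The column does not fit: close this block and start a new one.
            blocks.append(current)
            current, used, needed = [], 0, width
        current.append(index)
        used += needed
    if current:
        blocks.append(current)
    return blocks


# Illustrative widths only (assumed values).
widths = [20, 15, 15, 8, 18, 18, 10, 27, 10, 8]
print(len(wrap_columns(widths, 80)))   # narrow display: several blocks
print(len(wrap_columns(widths, 200)))  # wide display: a single block
```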
Is the width correctly set on your machine?
No, I get the same output as you (the first one). I think it's due to (max_columns, None), which sets it to be in unlimited mode.
The downside of removing that is that it's going to collapse columns, which is what's going to happen with your PR too.
It seems as though setting the width to None gives nowrap behaviour, but using os.get_terminal_size().columns gives the desired result:
Screen.Recording.2024-07-22.at.2.14.56.PM.mov
I am going to revert to the original implementation.
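The approach being reverted to can be sketched with the standard library alone. This is a minimal sketch based on the diff hunks quoted in this thread, not the PR's full code: detect_columns is a hypothetical helper name, and the starting contents of options are assumed from the discussion of (max_columns, None).

```python
import os
from typing import Any, Optional


def detect_columns() -> Optional[int]:
    """Return the terminal width in characters, or None if it is unknown."""
    try:
        columns = os.get_terminal_size().columns
    except OSError:
        # Raised when stdout is not attached to a terminal (for example a pipe).
        return None
    # Treat a reported width of 0 as unknown rather than passing it on.
    return columns if columns > 0 else None


# Assumed starting options; only the width handling mirrors the quoted diff.
options: list[Any] = ["display.max_columns", None]
columns = detect_columns()
if columns is not None:
    options.extend(["display.width", columns])
print(options)
```

When no terminal is attached, the width option is simply left out, so pandas falls back to its own default.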
If you don't set ("display.width", None), you will get the same behaviour as os.get_terminal_size(), no?
The width defaults to 80.
Ahh, right. It's the other way around. None does the same thing as os.get_terminal_size(), so you are right.
4bbd514 to eae98dc
Looks great
eae98dc to 25f175d
This PR introduces a truncate flag into DataChain.show and updates the output's default width to match the terminal's (if available). The default behaviour of DataChain.show remains the same. Users can now force the command to expand all column output using truncate=False (seems a bit late to introduce pagination before the release).
Demo
Screen.Recording.2024-07-22.at.10.14.33.AM.mov
Here is a script that I used to test the code along with the output of that script: