Adapt schema-print for notebook use #342

johnkerl · 2022-01-07T15:45:55Z

Problem to solve

Currently the "show" generic for ArraySchema (and for its nested components, including Domain, Dimension, Attribute, FilterList, and Filter) relies on libtiledb_array_schema_dump et al. In turn, those TileDB core functions print to a FILE*, which is stdout. This works fine at the R CLI. In notebooks, however, stdout is not plumbed to the user, so the user sees nothing when doing schema(arr).

Options

1. We could modify TileDB core to have parallel functions which return strings rather than writing to FILE*. This was attempted in TileDB#2773 ("Splitting up #2769 to diagnose CI issues [WIP]"). However, the scoping of that work has other dependencies, as narrated by @eric-hughes-tiledb there. While that work is worth doing, it is of greater scope than the current task.
2. We could somehow manipulate Rcpp::Rcout (an ostream which is plumbed to the user in both CLI and notebook contexts) and graft it onto core's FILE* interface.
3. We can simply imitate what TileDB-Py does when presented with the same situation, namely, have __repr__ methods (Python's equivalent of "show" generics in R) make the relevant API calls.

This PR implements option three.
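
As a rough illustration of the idea behind option three, here is a minimal, self-contained sketch. It does not use the TileDB-R API: the ToySchema and ToyAttribute classes and their slots are hypothetical stand-ins, chosen only to show "show" methods that assemble their text in R and emit it with cat(), recursing into nested objects.

```r
library(methods)

# Hypothetical stand-ins for the real schema classes; names are illustrative only.
setClass("ToyAttribute", representation(name = "character", type = "character"))
setClass("ToySchema", representation(sparse = "logical", attrs = "list"))

# Each show method builds its text in R and emits it with cat(), so the output
# goes through R's own output stream rather than a C-level FILE*.
setMethod("show", "ToyAttribute", function(object) {
  cat("  Attribute(name=\"", object@name, "\", type=\"", object@type, "\")\n", sep = "")
})

setMethod("show", "ToySchema", function(object) {
  cat("ToySchema(sparse=", object@sparse, ")\n", sep = "")
  for (a in object@attrs) show(a)  # recurse into nested objects
})

sch <- new("ToySchema", sparse = FALSE,
           attrs = list(new("ToyAttribute", name = "a", type = "INT32")))

# capture.output() collects only R-level output; text written directly to
# C stdout would not be collected here.
out <- capture.output(print(sch))
```

Because capture.output() can see the text, the same output should also be reachable by front ends that collect R-level output, which is my understanding of how notebook kernels display results.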

Validation

At the CLI:

library(tiledb)

for (uri in c(
  "~/demo/palmer_penguins2",
  "~/demo/quickstart_dense",
  "~/demo/quickstart_sparse",
  "~/demo/test-array-200x200x30",
  "~/demo/writing_dense_global",
  "~/demo/writing_sparse_global",
  "tiledb://TileDB-Inc/quickstart_dense",
  "tiledb://TileDB-Inc/quickstart_sparse",
  "tiledb://TileDB-Inc/gtex-analysis-rnaseqc-gene-tpm"
  )) {

  arr <- tiledb_array(uri, query_type="READ", as.data.frame=TRUE)
  sch <- schema(arr)
  cat("\n")
  cat("================================================================\n")
  cat("URI", uri, "\n")
  print(sch)
}

This should print the same information as before -- since it's to the CLI's stdout.

The more compelling test is to make a Jupyter notebook using a local build of this PR's code.

Notes

This will be rebased atop UINT32 mods in support of schema-printer PR #342 #343.

shortcut-integration · 2022-01-07T15:45:58Z

This pull request has been linked to Shortcut Story #12282: Expand R-UDF validation coverage: make sure essential API calls work.

eddelbuettel

I was a little tied up earlier so belated comments but I would like to disentangle the PR into the (simpler, limited scope) UINT32 additions to some switch statements (maybe with unit tests hitting them) and the more fluid-in-motion issue of pretty printing schemas etc which is already in my lap and where the PR helps a little less.

eddelbuettel · 2022-01-07T18:10:19Z

And, if I may, from a purely procedural standpoint: over the years I have added a GitHub PR stanza / form / suggestion in one or two open source projects where I recommend 'issue ticket and discussion first' before entering with a PR, which (personal view here) can add more friction than needed unless coordinated. I still think that helps for PRs going beyond trivial ones.

johnkerl · 2022-01-07T19:29:01Z

Thanks @eddelbuettel ! I'm looking forward to scheduling (currently unscheduled) of SC 13272 and 13273 which we had discussed, among other things, at some length.

Would you prefer I abandon this PR at this point?

eddelbuettel · 2022-01-07T19:33:30Z

We can probably throw a few PRs together in the blender because I'll do some work on SC 13272 and 13273, which are close in spirit and scope. So we'll see where we end up. Nothing wrong with adding a few show methods; then again, I do not know how much high-level stuff is actually needed, given that arr <- tiledb_array(uri, options) followed by work on arr covers most uses.

I have this overarching fundamental problem that we have "so many functions" already, making it hard at times to find functionality (!!), and the only thing we seem to be able to offer is more functions still. Tricky issue.

johnkerl · 2022-01-07T19:36:41Z

Thanks @eddelbuettel ! @antalakas and I are of the viewpoint that invisible schema-printing in notebook contexts is a blocker for promoting cloud-R in notebooks. Personally I would prefer to merge these "show" generics sooner, rather than waiting for a future refactor. I'll be happy to red-PR this stuff out in the future once your refactor work is complete -- ?

eddelbuettel · 2022-01-07T19:48:49Z

Maybe I am not seeing the forest for the trees but is there a reason why you cannot add pretty-printers you need now into the cloud package you are in control of?

johnkerl · 2022-01-07T20:01:23Z

@eddelbuettel our cloud docs show examples of things like printing schemas. Python support for doing this in notebooks exists; R has a gap for notebooks which this PR addresses. While I've done significant work in TileDB-Cloud-R over the last few months, the affected package for this particular task is indeed TileDB-R, not TileDB-Cloud-R.

See for example https://docs.tiledb.com/cloud/tutorials/start-here

eddelbuettel · 2022-01-09T16:20:47Z

This PR is good: it gets us almost to the finish line of not relying on core code (printing to stdout) for object display, and it adds a number of missing show() methods. It tickled an error on a real penguins data set (with NA values, as opposed to the cleansed one in cloud use). I fixed that, made the code a little tighter, and removed the use of try.

Please see #345 (a new PR into this branch for your review) and have a look with your test arrays; it is looking OK on the ones I tried.

eddelbuettel · 2022-01-10T14:52:24Z

GitHub Actions for R forces a qpdf installation which we do not even need or use but which currently times out :-/ I relaunched; that helped earlier on the weekend.

eddelbuettel

Looks good to me -- but then I also tainted the review by adding some code myself :)

Anybody else?

johnkerl requested review from eddelbuettel and aaronwolen January 7, 2022 16:05

johnkerl force-pushed the kerl/sc-12282-array-schema-show branch 2 times, most recently from 95b63f8 to 3acd0b8 Compare January 7, 2022 16:07

johnkerl marked this pull request as ready for review January 7, 2022 16:08

johnkerl requested a review from Shelnutt2 January 7, 2022 16:15

johnkerl force-pushed the kerl/sc-12282-array-schema-show branch from 3acd0b8 to c44fedb Compare January 7, 2022 16:18

eddelbuettel reviewed Jan 7, 2022

johnkerl added a commit that referenced this pull request Jan 7, 2022

uint32 mods in support of schema-printer PR #342

8fe730b

johnkerl mentioned this pull request Jan 7, 2022

UINT32 mods in support of schema-printer PR #342 #343

Merged

johnkerl marked this pull request as draft January 7, 2022 18:36

johnkerl force-pushed the kerl/sc-12282-array-schema-show branch 3 times, most recently from 2f8a6d9 to 6119c68 Compare January 7, 2022 19:27

johnkerl marked this pull request as ready for review January 7, 2022 19:29

eddelbuettel pushed a commit that referenced this pull request Jan 7, 2022

UINT32 mods in support of schema-printer PR #342 (#343)

7c9af4c

* uint32 mods in support of schema-printer PR #342
* unit-test cases for UINT32 attribute get/set fill value
* code-review feedback

johnkerl added 6 commits January 7, 2022 15:50

Adapt schema-print for notebook use

8e10cbe

rebase 341 in

14ec424

roxygenise

1a4f600

fix typo sch -> object

b9664df

Proofreading against core schema-printer

ae3aa3a

rebase in #343

dc073b4

johnkerl force-pushed the kerl/sc-12282-array-schema-show branch from d36f8dd to dc073b4 Compare January 7, 2022 20:50

eddelbuettel added 7 commits January 9, 2022 08:10

array_schema show method

745abe3

attr show method

a5d1b74

dimemsion show method

3c702b0

domain show method, plus some linebreak improvements

99e039b

filter show method

1122b30

filterlist show method

0021281

switch test from deprecated tiledb_dense to tiledb_array

0d7adb0

eddelbuettel mentioned this pull request Jan 9, 2022

Additional polish on show methods #345

Merged

Merge pull request #345 from TileDB-Inc/de/sc-12282/show_methods

28f91ad

Additinal polish on show methods

eddelbuettel approved these changes Jan 10, 2022

johnkerl merged commit 2a1a6e6 into master Jan 10, 2022

johnkerl deleted the kerl/sc-12282-array-schema-show branch January 10, 2022 15:07

eddelbuettel mentioned this pull request Jan 10, 2022

Only call fill value getter if TileDB 2.1.0 or later #347

Merged

eddelbuettel pushed a commit that referenced this pull request Jan 10, 2022

Additional show methods for TileDB objects (#342, #345)

ef27748

johnkerl mentioned this pull request Jan 13, 2022

Make array-schema prints visible from notebooks [WIP] TileDB-Inc/TileDB#2769

Closed

eddelbuettel mentioned this pull request Jan 24, 2022

Release 0.11.0 #356

Merged
