[python] Use 31-bit-friendly default shape for ingest #1440
Codecov Report
Patch coverage has no change and project coverage change: +26.87%
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1440       +/-   ##
===========================================
+ Coverage   64.44%   91.32%   +26.87%
===========================================
  Files         102       30       -72
  Lines        8336     2547     -5789
===========================================
- Hits         5372     2326     -3046
+ Misses       2964      221     -2743
One bug to be resolved. The other question: in some cases, this is a breaking change, so do we want to ensure it is only released when we have the remainder of the issues resolved?
Other related issues to be proposed by @mojaveazure , as we discussed today. I'm happy to wait on merging this PR until then, if folks prefer. |
- Requested change: The fixtures in ./apis/python/src/tiledbsoma and ./apis/python/notebooks/data need to be regenerated.
- Nice-to-have: I think letting the user specify a shape would be ideal, with the default set to 32-bit. For reference, see https://github.com/single-cell-data/TileDB-SOMA/pull/1358/files#diff-dc1f1c04136350fc3134f40bd69db3715831b77579de8a425a1eacac03effcc7
@pablo-gar two points about passing user-specified shape all the way from top level (see also #1445): (1) I want to do that on a separate PR -- not this one (2) You point to a diff for passing the desired shape to a single array. But
There are 12 different arrays here, with varying shapes. What parameterization do you propose?
Something like the above? Or something else? |
@johnkerl Thanks! Important details on the many different arrays. As I think about it, there are two main use cases we want to fulfill with this:
I'm not sure I have the right answer, since non-default options add UX complexity. |
Interop CI is failing with |
It passes on my laptop with this branch checked out, so it must be a CI resource-configuration problem that needs solving ... 🤔 |
@pablo-gar my position is that the above
is prohibitively complex to require people to type. Just not OK. Another option is we require them to do
or some such -- let the num-PCAs axes etc be tight. The best option, I believe, is:
|
Summary
Re the CI fail:
This is due to increased runtime and a combination of several factors:
Details
Output for the
Output for the
From the above program output we see that in the
From the above program output we see that in the
Action items
|
Can you isolate where the |
@eddelbuettel it's at this line: |
Good. If you put a line with
> set.seed(123); N <- 100; M <- as(rsparsematrix(N, N, 0.2), "dgCMatrix"); M[N,N]
[1] 0
> set.seed(123); N <- 100; M <- as(rsparsematrix(N, N, 0.2), "dgCMatrix"); M[N+1,N+1]
Error in intI(i, n = d[1L], dn[[1L]], give.dn = FALSE) :
index larger than maximal 100
>
> for (i in 4:7) { set.seed(123); N <- 10^i; M <- as(rsparsematrix(N, N, 1000/(N*N)), "dgCMatrix"); print(M[N,N]) }
[1] 0
[1] 0
[1] 0
[1] 0
> |
@eddelbuettel https://gist.github.com/johnkerl/567bdfc1a6c750be1b7fec8792605eb3 I believe the 31-bit shape, not the
|
I just rebased on now-merged #1504 |
I'm not sure where to go next from here. I'm not willing to merge a PR that turns interop CI (or any CI, for that matter) from green to red. |
Files were regenerated. Extra non-default-shape arguments are a great idea, and outside the scope of this PR.
@aaronwolen @pablo-gar after rebase on #1504 #1508 #1521 this is now a simple, green-CI PR |
🚀
Issue and/or context: As discussed this week and last week. Room for growth is good. Room for roughly two billion elements in each dimension is quite a roomy default, and it works better for R interop.
Tracking issue: #1445
Changes: The default domain in Python leaves plenty of room for growth after creation. Here we retain that (there is still plenty of room for growth) but make it 32-bit friendly, so that R code is better able to read Python-created SOMA data.
Notes for Reviewer: This is a non-breaking change at the API level.
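To make the "two billion" figure in the description concrete, here is a minimal sketch of what a 32-bit-friendly default might look like. It assumes the limit comes from R's native integer type being a signed 32-bit integer, whose largest value is 2**31 - 1. The names `INT32_MAX`, `fits_in_int32`, and `room_for_growth_shape` are hypothetical and are not part of the tiledbsoma API; the exact default chosen in this PR may differ slightly from the constant used here.

```python
# Largest value of a signed 32-bit integer (the range of R's native integer type).
INT32_MAX = 2**31 - 1  # 2_147_483_647, roughly two billion
# Largest value of a signed 64-bit integer, shown for comparison only.
INT64_MAX = 2**63 - 1


def fits_in_int32(shape):
    """Return True if every extent in `shape` is a positive signed 32-bit value."""
    return all(0 < extent <= INT32_MAX for extent in shape)


def room_for_growth_shape(data_shape, capacity=INT32_MAX):
    """Hypothetical helper: give every dimension `capacity` slots.

    Raises ValueError if the data being ingested is already larger than
    the capacity, since the default could not hold it.
    """
    for extent in data_shape:
        if extent > capacity:
            raise ValueError(f"extent {extent} exceeds capacity {capacity}")
    return tuple(capacity for _ in data_shape)


# Illustrative (made-up) shape of an ingested matrix: observations x variables.
ingested_shape = (2638, 1838)
default_shape = room_for_growth_shape(ingested_shape)

print(default_shape)
print(fits_in_int32(default_shape))           # 32-bit-friendly default
print(fits_in_int32((INT64_MAX, INT64_MAX)))  # a 64-bit-sized shape does not fit
```

The point of the sketch is only that capping each dimension near 2**31 - 1 still leaves a great deal of headroom over typical ingested data while keeping every extent within the range a 32-bit reader can index.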