Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempting to copy 100,000 rows only copies 50,000, does not throw an error. #4977

Closed
mofojed opened this issue Dec 22, 2023 · 6 comments
Closed
Assignees
Labels
bug Something isn't working jsapi triage

Comments

@mofojed
Copy link
Member

mofojed commented Dec 22, 2023

Description

When you try to copy 100,000 rows, it only copies 50,000.

Steps to reproduce

  1. Create a table with 100,000 rows, e.g.
from deephaven import empty_table
t = empty_table(100000).update(["X=i", "Y=i*i"])
  1. In the table, press Ctrl+A to select all, then Ctrl+C to copy. Click "Copy" when prompted to confirm the copy of 100,000 rows.
  2. In a text pad, paste the result

Expected results
3. 100,000 rows pasted

Actual results
3. Only 50,000 rows are pasted

Versions

Engine Version: 0.32.0-SNAPSHOT
Web UI Version: 0.57.1
Java Version: 11.0.21
Barrage Version: 0.6.0

@mofojed
Copy link
Member Author

mofojed commented Dec 22, 2023

Reproducible in just the JSAPI. Run this in the browser console on the JSAPI page (http://localhost:10000/jsapi/)
It's requesting a snapshot of 100,000 rows, and only getting 50,000.

// REPLACE THE TOKEN WITH YOUR PSK
var token = 'iris';
var dh = (await import('./dh-core.js')).default

var c = new dh.CoreClient(window.location.protocol + "//" + window.location.host)
await c.login({type: 'io.deephaven.authentication.psk.PskAuthenticationHandler', token})

console.log('Logged in...')

var connection = await c.getAsIdeConnection()
var ide = await connection.startSession('python')

console.log('Session started...')

// Create a table with 100000 rows

var code = `
from deephaven import empty_table
t = empty_table(100000).update(["X=i", "Y=i*i"])
`;

var result = await ide.runCode(code);
var updates = [...result.changes.created, ...result.changes.updated]

var t = await connection.getObject(updates[0])

console.log('Got table t of size', t.size)

var subscription = t.setViewport(0, 100)

var rangeSet = dh.RangeSet.ofRange(0, t.size);

console.log('Fetching snapshot with rangeSet', rangeSet);

var snapshot = await subscription.snapshot(rangeSet, t.columns)

console.log('Snapshot is ', snapshot);
console.log('Snapshot added size is', snapshot.added.size, 'fullIndex', snapshot.fullIndex.size)

@mofojed mofojed transferred this issue from deephaven/web-client-ui Dec 22, 2023
@mofojed mofojed added the jsapi label Dec 22, 2023
@mofojed
Copy link
Member Author

mofojed commented Dec 22, 2023

It seems to cap at 50,000, even if I build a table with more columns, or only request 60,000 rows.
Also this is affecting Core workers on Enterprise, would upgrade this so a severe issue as it's erroneous data (think you're copying the whole table when you're not).

@mofojed
Copy link
Member Author

mofojed commented Dec 22, 2023

Should patch this in 0.30.5 for Enterprise/Vermilion as well.

@niloc132
Copy link
Member

This seems to be tied to BarrageUtil.minSnapshotCellCount defaulting to 50000, but oddly enough while that is a cell count, we're seeing it as a row count, regardless of how many columns (and so cells) there are.

JS API is wrong here, #188 will enable it to handle more than one separate snapshot before having the initial amount of data. We should be able to detect this case from the JS API and emit an error instead of claiming success with partial data. Raising the min and max snapshot cell count should make it possible for the server and client to get along here.

Another solution would/should be the web UI asking for smaller snapshots - very wide rows will end up eventually breaking with the above limits raised. This is what the csv export does, even with "smaller" exports.

@mofojed
Copy link
Member Author

mofojed commented Jan 9, 2024

@niloc132 can we please get a PR to throw an error if the snapshot is too big? This is marked as a blocker for Enterprise.

@niloc132
Copy link
Member

This should be fixed by #4985, at least until #188 is complete when we can revert the server-side workaround.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working jsapi triage
Projects
None yet
Development

No branches or pull requests

3 participants