
Fix source_id updates and rewriting points of SubscriberScanModel #381

Merged (1 commit, Apr 4, 2024)

Conversation

pmldrmota
Contributor

Without this, the applet of a TopLevelRunner with a constant dataset prefix doesn't update the source_id, the title, or the cached data (e.g. averaged points) upon resubmitting the next scan.
Internal example: SPAM measurement.

Without this, the applet of a TopLevelRunner with a constant prefix wouldn't update
the source_id or the cached data (averaged points) upon resubmitting the same scan.

```python
# Check if points were appended or rewritten.
if name in self._point_data:
    imax = min(len(point_values), len(self._point_data[name]))
    if point_values[:imax] != self._point_data[name][:imax]:
```
Member

We never accidentally end up with the lists being ndarrays here, right? (Then this would give a bool array.)

Contributor Author

Currently, they are lists, but it wouldn't hurt to use np.array_equal here to avoid issues in the future.
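For illustration, a minimal sketch of the pitfall being discussed (the data and the `point_data` dict are made up; `point_data` stands in for `self._point_data`):

```python
import numpy as np

point_data = {"channel_0": [1.0, 2.0, 3.0]}  # cached points (a plain list)
point_values = np.array([1.0, 2.0, 4.0])     # incoming points (an ndarray)

name = "channel_0"
imax = min(len(point_values), len(point_data[name]))

# Plain `!=` between an ndarray and a list is element-wise and yields a
# bool array, so `if a != b:` would raise "truth value ... is ambiguous".
# np.array_equal compares lists and ndarrays uniformly:
rewritten = not np.array_equal(point_values[:imax], point_data[name][:imax])
```

Here `rewritten` ends up True, since the third cached point differs from the incoming one.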

@dnadlinger
Member

dnadlinger commented Jan 16, 2024

Thanks! This has been on my list as well – displaying the wrong source id is a high-priority bug, as it is a data traceability issue!

Looking at the rest of data_changed, wouldn't we also need to completely re-initialise the series/online analyses? Also, in theory one run could be a scan, the other a Repeat non-scan (necessitating a SubscriberSinglePointModel vs SubscriberScanModel switch).

I believe my original intention here was not to deal with all that complexity in the UI code, but have the model layer just completely replace the Model if the same prefix was reused. Clearly, I failed to implement that in the SubscriberRoot code.
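A rough sketch of that replace-the-whole-Model approach. SubscriberRoot is a real ndscan class, but the `Model` stand-in, the `on_new_source` method, and its signature are invented here purely for illustration:

```python
class Model:
    """Stand-in for SubscriberScanModel; holds per-scan state."""
    def __init__(self, source_id):
        self.source_id = source_id
        self.point_data = {}


class SubscriberRoot:
    """Hypothetical sketch: replace the Model wholesale on prefix reuse."""
    def __init__(self):
        self.prefix = None
        self.model = None

    def on_new_source(self, prefix, source_id):
        if self.model is not None and prefix == self.prefix:
            # Same prefix reused for a new run: discard the old model
            # entirely rather than patching its source_id/title/cached
            # points in place, so no stale series or analyses survive.
            self.model = None
        self.prefix = prefix
        self.model = Model(source_id)
```

The point of the design is that the UI layer never has to reason about partially updated state; it just observes a brand-new Model.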

@pmldrmota
Contributor Author

> I believe my original intention here was not to deal with all that complexity in the UI code, but have the model layer just completely replace the Model if the same prefix was reused.

Is this still your preferred approach? What would be the indication that a new scan has been issued? Should the TopLevelRunner also broadcast the date/time when the experiment was run? Or just use seed?

@dnadlinger
Member

> Is this still your preferred approach?

It seems cleaner, as otherwise you might e.g. get a brief moment where the old data, including annotations, is still displayed but the source_id has already been updated (for subscans, that moment might be quite long, probably lasting until the first complete point comes in?).

Not sure what the best marker of a new scan would be; perhaps something like ndscan_schema_revision. Separating things like that cleanly was a large part of my initial motivation for the server-side dataset namespacing. Without it, we might need to manually handle this in the subscriber code (such that the old data isn't used to get the schemata/points immediately after the refresh).

Another option would be to give up and accept that there might sometimes be mixed states. If the user explicitly specifies the same top-level runner dataset root from multiple experiments (can be multiple instances of the same experiment), there is nothing we can do to avoid the mess on the applet side anyway.

@dnadlinger
Member

Does sipyco emit mods when datasets are set via set_dataset() from a new experiment, even if the value has not changed?

Then a clean solution would be to designate one dataset that takes the role of beginning a new scan (e.g. ndscan_schema_revision); when a change to it is observed in the subscriber code, throw away all the current datasets and append the new ones only as they come in.

(The problem here is to avoid reusing stale data after a new scan has begun, while immediately loading in any new data that might already be present.)
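One way to sketch that subscriber-side logic. The dataset name ndscan_schema_revision comes from the discussion above, but the class and the mod-handling shape are assumptions, not sipyco's actual API:

```python
class ScanSubscriber:
    """Hypothetical subscriber that resets on a new-scan marker dataset."""

    REVISION_KEY = "ndscan_schema_revision"

    def __init__(self):
        self.revision = None
        self.datasets = {}

    def on_mod(self, key, value):
        if key == self.REVISION_KEY:
            if value != self.revision:
                # A new scan has begun: throw away all current datasets
                # so stale schemata/points are never mixed in ...
                self.datasets.clear()
                self.revision = value
            return
        # ... but append any new data immediately as it comes in.
        self.datasets[key] = value
```

A revision bump wipes the cache exactly once, while datasets that arrive after it (possibly already present on the master) are picked up straight away.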

@dnadlinger dnadlinger added this pull request to the merge queue Apr 4, 2024
Merged via the queue into master with commit 751752c Apr 4, 2024
3 checks passed
@dnadlinger dnadlinger deleted the source-id+points-rewritten branch April 4, 2024 22:24
@dnadlinger
Member

Merged this anyway to avoid more incorrect source_ids making it into lab book entries, but I'd still like to address this more properly/generally: #387
