
MMCoreJ.i: added code to make adding the system cache to tagged images conditional. #305

Merged
nicost merged 4 commits into main from cacheConditional on Nov 25, 2022

Conversation

nicost
Member

@nicost nicost commented Nov 24, 2022

That was remarkably simple. However, do not merge (yet); the consequences are unexpected. In both Live and Snap mode, the "user" tags contain the system state cache, which disappears after running mmc.setIncludeSystemCache(true) in the Script Panel. However, the system cache is also contained in the "device" tags, presumably entered somewhere upstream in the code. That makes me believe we should remove the addition of the systemStateCache altogether. There is some work needed to do so, because running an MDA with 3 Z positions after mmc.setIncludeSystemCache(true) results in:

2022-11-24T07:59:59.783280 tid33644 [IFO,App] 
                                    [       ] java.lang.IllegalArgumentException: Failed to convert TaggedImage tags to metadata in Thread[TaggedImage sink thread,6,main]
                                    [       ]   at org.micromanager.data.internal.DefaultImage.<init>(DefaultImage.java:113)
                                    [       ]   at org.micromanager.data.internal.DefaultImage.<init>(DefaultImage.java:82)
                                    [       ]   at org.micromanager.acquisition.internal.DefaultTaggedImageSink$1.run(DefaultTaggedImageSink.java:68)
                                    [       ] Caused by: java.lang.NullPointerException
                                    [       ]   at org.micromanager.data.internal.PropertyKey$48.extractFromGsonObject(PropertyKey.java:1256)
                                    [       ]   at org.micromanager.internal.propertymap.NonPropertyMapJSONFormats$MetadataFormat.fromGson(NonPropertyMapJSONFormats.java:212)
                                    [       ]   at org.micromanager.data.internal.DefaultImage.<init>(DefaultImage.java:111)
                                    [       ]   at org.micromanager.data.internal.DefaultImage.<init>(DefaultImage.java:82)
                                    [       ]   at org.micromanager.acquisition.internal.DefaultTaggedImageSink$1.run(DefaultTaggedImageSink.java:68)

Working on that now.

@nicost nicost marked this pull request as draft November 24, 2022 16:19
@marktsuchida
Member

marktsuchida commented Nov 24, 2022

I think the new methods should have SystemStateCache, not SystemCache, in their name. I know it is long, but consistency (with existing methods) is helpful. Also MMCore API minor version should be incremented.

However, the system cache is also contained in the "device" tags, presumably entered somewhere upstream in the code. That makes me believe we should remove the addition of the systemStateCache altogether.

I don't think we should completely remove it from MMCoreJ; that would be a gratuitous compatibility break with no real benefit. And I suspect the device property tags are creeping in downstream (in MMStudio and acqEngine), not upstream.

If you are getting device properties in the metadata even when this new flag is set false, it probably means that MMStudio (and acqEngine) calls getSystemStateCache() to get the data. See DefaultAcquisitionManager. Also in acqEngine, where it is sometimes known as get-system-config-cached.

I would need to read the code to remind myself what acqEngine uses it for, but the reason MMStudio does this in DefaultAcquisitionManager.generateMetadata() (used by Snap and Live but not MDA) is probably related to the problem of segregating device properties from other (standard and user-defined) metadata fields. In early MM2 development we wanted to establish a separation, because mixing poorly-specified string keys in one bag was a mess and made the metadata impossible to reliably parse. The way DefaultAcquisitionManager does it should be faster than going through JSONObject (though I don't know if it is fast enough), and there is already a flag to disable it in generateMetadata(), which SnapLiveManager can decide to make configurable (currently it is hard-coded to true).

(In this sense not getting device properties via getTaggedImage or popNextTaggedImage actually plays well with the MM2 way of thinking.)
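For illustration, a minimal sketch (not MMStudio's actual code) of harvesting device properties from getSystemStateCache() instead of the image tags, assuming the usual MMCoreJ Configuration/PropertySetting accessors:

import java.util.LinkedHashMap;
import mmcorej.Configuration;
import mmcorej.PropertySetting;

// Copy the cached device properties into a plain map keyed "Device-Property",
// keeping them segregated from standard and user-defined metadata fields.
Configuration cache = mmc.getSystemStateCache();
LinkedHashMap deviceProps = new LinkedHashMap();
for (long i = 0; i < cache.size(); i++) {
   PropertySetting s = cache.getSetting(i);
   deviceProps.put(s.getDeviceLabel() + "-" + s.getPropertyName(), s.getPropertyValue());
}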

@nicost
Member Author

nicost commented Nov 24, 2022

I think the new methods should have SystemStateCache, not SystemCache, in their name. I know it is long, but consistency (with existing methods) is helpful. Also MMCore API minor version should be incremented.

Done and done.

I don't think we should completely remove it from MMCoreJ; that would be a gratuitous compatibility break with no real benefit.

Fair enough, will leave as is.

If you are getting device properties in the metadata even when this new flag is set false, it probably means that MMStudio (and acqEngine) calls getSystemStateCache() to get the data. See DefaultAcquisitionManager. Also in acqEngine, where it is sometimes known as get-system-config-cached.

Further experimenting shows that Snap/Live indeed inserts the "device" tags, whereas the Clojure acquisition engine does not seem to insert such a thing. Going forward (i.e. after merging this PR), I would like Snap/Live to call mmc.setIncludeSystemStateCache(false) before it snaps/goes live, and return it to its original setting when it is done. It makes a lot of sense not to touch the Clojure acquisition engine and let it do its thing, but it would be great for the new AcqEngJ to insert the system state cache with the first image, and from then on only the properties that changed for each image.
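A minimal sketch of that save-and-restore; only the setter is named in this PR, so the matching getter used here (getIncludeSystemStateCache()) is an assumption:

// Hypothetical Snap/Live wrapper; getIncludeSystemStateCache() is an assumed getter.
boolean previous = mmc.getIncludeSystemStateCache();
mmc.setIncludeSystemStateCache(false);
try {
   mmc.snapImage();
   img = mmc.getTaggedImage();
   // ... hand the image to the pipeline/display ...
} finally {
   mmc.setIncludeSystemStateCache(previous);  // restore the caller's setting
}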

I believe this is OK to merge now, but I'll leave it up to you to judge.

@nicost nicost marked this pull request as ready for review November 24, 2022 21:41
@nicost
Member Author

nicost commented Nov 24, 2022

Tested the cost of adding the system state cache with the script below. Set the DemoCamera to a size of 32x32 pixels, FastImage to 1, and the exposure time to 0.001 ms (there is a huge difference with an exposure time of 0.01 ms, which I do not understand). Result:

mmc.setIncludeSystemStateCache(true):  It took: 47823 ms, popped: 100000 images.
mmc.setIncludeSystemStateCache(false): It took: 4849 ms, popped: 100000 images.

Beanshell script:

import java.time.Instant;
import java.time.Duration;

mmc.setIncludeSystemStateCache(false); // set to true for the comparison run
count = 0;
Instant start = Instant.now();
mmc.clearCircularBuffer();
mmc.startSequenceAcquisition(100000, 0, true);
// Wait for the first image to arrive in the circular buffer.
while (mmc.getRemainingImageCount() == 0) {}
// Pop images as fast as possible until the sequence is done and the buffer is empty.
while (mmc.isSequenceRunning() || mmc.getRemainingImageCount() > 0) {
   if (mmc.getRemainingImageCount() > 0) {
      img = mmc.popNextTaggedImage();
      count++;
   }
}
Instant finish = Instant.now();
long timeElapsed = Duration.between(start, finish).toMillis();
mm.scripter().message("It took: " + timeElapsed + " ms, popped: " + count + " images.");

Rough estimate for adding the system state cache: ~40 s / 100,000 images ≈ 0.4 ms per image.

Member

@marktsuchida marktsuchida left a comment

Aside from the one minor comment, I think this is ready to merge.

Review comment on MMCoreJ_wrap/MMCoreJ.i (outdated; resolved)
@henrypinkard
Member

Rough estimate for adding the system state cache: ~40 s / 100,000 images ≈ 0.4 ms per image.

Wow, what a waste of CPU cycles.

it would be great for the new AcqEngJ to insert the system state cache with the first image, and from then on only the properties that changed for each image.

Should be easy. You mean just in sequences or in general?

@marktsuchida
Member

I would like Snap/Live to call mmc.setIncludeSystemStateCache(false) before it snaps/goes live, and return it to its original setting when it is done.

I think this makes sense.

there is a huge difference with an exposure time of 0.01 ms, which I do not understand

Just speculating, but it might be the difference between DemoCamera calling Sleep() (which may schedule another thread, possibly the "idle" thread that pauses the CPU, to run even for a brief time) versus not calling Sleep() at all because the time interval has already passed. If so, the 0.001 ms you used is the correct way to test the efficiency of popping.

@nicost
Member Author

nicost commented Nov 25, 2022

Should be easy. You mean just in sequences or in general?

I think the answer is "in general". It should always be possible to reconstruct all properties just from the changed ones. The main difficulty I see is for the acquisition engine to keep track of what changed.
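A minimal sketch of that bookkeeping, assuming the cache has been flattened into "Device-Property" → value maps (as in the sketch earlier in this thread): keep the previous snapshot and emit only the entries that are new or changed.

import java.util.HashMap;
import java.util.Map;

// Return only the entries of 'current' that are new or whose value differs from 'previous';
// the caller keeps 'current' around as the 'previous' snapshot for the next image.
Map changedProperties(Map previous, Map current) {
   Map changed = new HashMap();
   for (Object key : current.keySet()) {
      Object value = current.get(key);
      if (!value.equals(previous.get(key))) {
         changed.put(key, value);
      }
   }
   return changed;
}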

@nicost nicost merged commit 0cb2be1 into main Nov 25, 2022
@nicost nicost deleted the cacheConditional branch November 25, 2022 19:49
@henrypinkard
Member

The main difficulty I see is for the acquisition engine to keep track of what changed.

If you want it to work in general whenever properties change, I think this will need to happen in the core, because how can the acquisition engine check what has changed without first reading the metadata itself?

@nicost
Member Author

nicost commented Nov 28, 2022

It would be a great start to include the properties the Acquisition Engine changed itself. It can also subscribe to the OnPropertyChanged callbacks. That may cover most cases.
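A minimal sketch of such a subscription via MMCoreJ's MMEventCallback (the exact onPropertyChanged signature should be verified against MMEventCallback; mmc is the CMMCore instance, as in the Script Panel):

import java.util.concurrent.ConcurrentHashMap;
import mmcorej.MMEventCallback;

// Collect every property change the Core reports; the acquisition engine can merge
// 'changes' into the metadata of the next image and then clear it.
final ConcurrentHashMap changes = new ConcurrentHashMap();
MMEventCallback callback = new MMEventCallback() {
   public void onPropertyChanged(String device, String property, String value) {
      changes.put(device + "-" + property, value);
   }
};
mmc.registerCallback(callback);  // keep 'callback' referenced so it is not garbage collected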

@henrypinkard
Member

Why not just do this directly in the core? There it is most likely that nothing will slip through the cracks, and it removes the burden on code other than the acquisition engine of having to implement the same thing to solve the same problem.

@nicost
Member Author

nicost commented Nov 29, 2022

What should that look like? Something like: startCollectingPropertyChanges(), getPropertyChanges(), stopCollectingPropertyChanges(), clearCollectedPropertyChanges()?
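For concreteness, a hypothetical usage sketch of that shape; none of these Core methods exist yet, the names are just the ones proposed above:

// Hypothetical API; none of these methods exist in MMCore yet.
mmc.clearCollectedPropertyChanges();
mmc.startCollectingPropertyChanges();
mmc.snapImage();
img = mmc.getTaggedImage();
changedProps = mmc.getPropertyChanges();   // only the properties that changed since start
mmc.stopCollectingPropertyChanges();
// merge changedProps into the image metadata before handing the image downstream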

@marktsuchida
Member

I think theoretically it makes sense to do it in the Core, so that data is reduced at the source. The challenge is that in order to report changes to the properties, the Core will need to keep a diff since the last time it reported to the client. This can rapidly get complicated if there are multiple such clients, assuming the reporting is done by the client periodically pulling the diff.

If the Core pushes changes to subscribed clients, it will look much like a variant of onPropertyChanged, but with more frequent calls; we'd need to check that this doesn't negate the performance gain (plus, this means more work on the client side, which kind of defeats the whole point).

I think it is entirely possible that periodically copying the entire system state cache to the client (and performing a diff there) can end up being more efficient, depending on the parameters. This may especially be true if we store the cache in a more efficient data structure (sorted vector rather than map). Hard to discuss without defining the range of frequencies at which it will be used (and finding out how many properties "typically" change in one period).

There are still ways to get a pull interface to work. For example, we could say that the Core may, at its sole discretion, spuriously include unchanged properties in the diffs it produces (I'm assuming the "diffs" only include the new values, not previous values). That would allow it to keep a bounded log of changes: for example, it could maintain a list/queue of all properties that changed in the last 500 ms (a threshold to be tuned). When the client retrieves a diff, the diff should come with a monotonic timestamp. Next time the client requests a diff, it should pass the timestamp in a since parameter. Then the Core's job is pretty simple: if the since value is less than 500 ms old, it can return precisely the changed properties. Otherwise it can dump the whole cache. Of course this may not be the kind of interface that application programmers will be most comfortable with.
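A sketch of that bounded-log idea (in Java for readability; a real implementation would live in the C++ Core). The class name, the 500 ms window, and the full-cache supplier are illustrative assumptions, and a real version would use a monotonic clock rather than currentTimeMillis():

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class PropertyChangeLog {
   private static final long WINDOW_MS = 500;  // log window; threshold to be tuned

   private static class Entry {
      final long timeMs; final String key; final String value;
      Entry(long t, String k, String v) { timeMs = t; key = k; value = v; }
   }

   private final Deque<Entry> log = new ArrayDeque<>();
   private final Supplier<Map<String, String>> fullCache;  // e.g. a copy of the system state cache

   PropertyChangeLog(Supplier<Map<String, String>> fullCache) { this.fullCache = fullCache; }

   // Called whenever a property value changes.
   synchronized void record(String key, String value) {
      long now = System.currentTimeMillis();
      log.addLast(new Entry(now, key, value));
      while (!log.isEmpty() && now - log.peekFirst().timeMs > WINDOW_MS) {
         log.removeFirst();  // keep only changes from the last WINDOW_MS
      }
   }

   // Changes since 'sinceMs'. If 'sinceMs' is older than the window, fall back to dumping
   // the whole cache: spuriously includes unchanged properties, but stays correct.
   synchronized Map<String, String> changesSince(long sinceMs) {
      if (System.currentTimeMillis() - sinceMs > WINDOW_MS) {
         return new HashMap<>(fullCache.get());
      }
      Map<String, String> diff = new HashMap<>();
      for (Entry e : log) {
         if (e.timeMs > sinceMs) {
            diff.put(e.key, e.value);
         }
      }
      return diff;
   }
}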

@nicost
Member Author

nicost commented Nov 30, 2022

I like the interface I described in the previous comment better (since there is no auto-magic and the client decides when the recording should start and stop), although it does not deal with multiple clients. Do we need to deal with multiple clients?

@marktsuchida
Member

If we don't deal with multiple clients, then all sorts of complexity will be necessary on the client side. For example, currently we support running Live mode in between timepoints of a (slow) MDA time-lapse. The acquisition engine would have to have (even more) hard-coded behavior to support this if it cannot know whether somebody else might mess up the sequence of changes. The same applies to runnables and hooks, which may also wish to access this data. It also precludes other potential use cases, like subscribing to these changes to implement a GUI indicator panel that can coexist with other things. None of these concerns arise if the Core handles multiple clients cleanly.

In other words, the situation faced by a given client is actually more complicated with the single-client version, at least if one wants to write a correct client, because the client will need to make fragile guesses as to whether it can use this API at any given moment.

The solution I suggested above is definitely not the only one possible. For example, there could be a global "property change reporting interval" setting in the Core, and the Core could generate a change event containing all changes during each interval. This way, arbitrary numbers of clients can subscribe to the change stream as long as they don't attempt to modify the interval without coordination. But I don't know if this is enough -- we need to think about what intervals will be used in what cases and who will need to decide the interval.

It is also possible to design an API in which each client creates a change log in the Core, so that each client can retrieve the "changes since the last time I checked" independently.
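For comparison, a hypothetical shape of that per-client variant (nothing like this exists in the Core; all names are illustrative):

import java.util.Map;

// Hypothetical per-client change log; each client gets its own subscription object.
interface PropertyChangeSubscription extends AutoCloseable {
   // Properties that changed since the previous call on this subscription
   // (a first call would return the full system state cache).
   Map<String, String> takeChanges();
   // Discard the Core-side log kept for this client.
   void close();
}
// Usage (hypothetical): PropertyChangeSubscription sub = core.subscribeToPropertyChanges();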

(And finally, a more fundamental discussion to have is whether we are happy with the system state cache mechanism itself. Its behavior can only be described as "best effort", and it has the confusing characteristic of updating the values of properties when getProperty is called (which makes the update behavior practically non-deterministic, since it's impossible to know when other parts of the program may read property values). I never really trust the values from it unless I have confirmed that the specific properties I'm interested in are actually updated in a timely manner under the specific usage situation (which is often true but not always). Things are a little better for properties that reliably call onPropertyChanged, but users have no way of knowing which ones have this (or that such a distinction even exists). But maybe this should be a more long-term question.)

It would probably help to define the requirements (and how exactly this API will be used) first. If the goal is just to give the appearance that device properties are continuously updated in the image metadata, then even a fixed 1-second update interval may "work" (remembering that exact timing of updates was never guaranteed anyway). But I suspect we will find that changes will need to be retrieved in a way that is in sync with snaps (so that the devices appear in the right positions for channels in a non-triggered acquisition -- something that currently holds, I think) and other events. If 99% of uses will be retrieving property changes once per image frame, then maybe it does make sense to have an API that delivers them together with the image -- although I don't favor this, it does side-step the problem of multiple clients (sort of, because we already put lots of complexity into client code so that multiple acquisitions are not run simultaneously).

@nicost
Member Author

nicost commented Dec 1, 2022

It is also possible to design an API in which each client creates a change log in the Core, so that each client can retrieve the "changes since the last time I checked" independently.

I kind of favor that idea.

Having some kind of interval over which changes are accumulated seems very error-prone, and will surely lead to problems.

Is it a better path forward for the acquisition engine to insert the changes that it is aware of?

@marktsuchida
Member

Is it a better path forward for the acquisition engine to insert the changes that it is aware of?

The simplest speedup that I think we can make in acqEngine is to start disabling system state cache data in popNextTaggedImage and instead call getSystemStateCache (which doesn't use JSONObject) separately to get the same data more efficiently. But this only helps if acqEngine switches to using org.micromanager.data.Image internally rather than TaggedImage. This change doesn't require anything new in MMCore(J) beyond #305.

I'm not sure what can easily be done beyond that without affecting the resulting data, given that our file formats require all fields to be present in the metadata of each frame. We could offer the user an option to turn off device metadata completely (or to update only once per sequence acquisition, once every N frames, every frame but no more than 10 Hz, etc.), or come up with an alternative way to save property values (though the latter is complex and unproven).

As for AcqEngJ and other new code, the answer depends on what format we want the data to be in at the destination.

Having some kind of interval over which changes are accumulated seems very error prone, and surely will lead to problems.

Only if it is misunderstood as being a change notification mechanism (I didn't present it super clearly). That idea is something that could be described as a (unidirectional) data synchronization protocol: the Core maintains the full set of current values, and the client also maintains the full set of values, and we periodically (at the client's discretion) sync the values from Core to client.

The naive way to do this is by copying all the values; this is the baseline case and is good enough when the sync interval is long. We can then optimize certain cases by taking advantage of the knowledge of when the sync last happened, but the optimization is guaranteed not to change the result (new values in client) of each sync. This can be much more robust than a lot of the event/notification stuff in MM.

However, all that it offers (over the existing getSystemStateCache) is a reduction in the amount of data to copy into Java, and that could potentially be negated by the added complexity of performing partial updates, so I'm not sure it is worth doing. It is definitely not worth doing without first having benchmarks that actually tell us where the bottlenecks are.

@nicost
Member Author

nicost commented Dec 2, 2022

But this only helps if acqEngine switches to using org.micromanager.data.Image internally rather than TaggedImage. This change doesn't require anything new in MMCore(J) beyond #305.

Henry rightfully does not want AcqEngJ to depend on MMStudio. So, the only way this could happen is if org.micromanager.data.Image lived in its own jar, distributed through Maven, with minimal dependencies.

I'm not sure what can easily be done beyond that without affecting the resulting data, given that our file formats require all fields to be present in the metadata of each frame. We could offer the user an option to turn off device metadata completely (or to update only once per sequence acquisition, once every N frames, every frame but no more than 10 Hz, etc.), or come up with an alternative way to save property values (though the latter is complex and unproven).

We owe the microscope operators at least the properties that we are trying to set ourselves during the acquisition. So, in a z-stack, each image should be annotated with the "correct" z position (even if it is only the z position that the Acquisition Engine thought we were going to be at). We are currently not doing that in sequences. Likewise, if the Acquisition Engine changes a channel preset, the metadata should contain the properties of that channel preset. Maybe this is best done by the Acquisition Engine after all, as that is the entity changing things.

@marktsuchida
Member

But this only helps if acqEngine switches to using org.micromanager.data.Image internally rather than TaggedImage. This change doesn't require anything new in MMCore(J) beyond #305.

Henry rightfully does not want AcqEngJ to depend on MMStudio. So, the only way this could happen is if org.micromanager.data.Image lived in its own jar, distributed through Maven, with minimal dependencies.

In my post "acqEngine" (with that specific capitalization) refers to the Clojure acquisition engine. Also, the point is not that MMStudio Image must be used (it's just what the Clojure engine should use in this context if we want to update it); the point is that TaggedImage cannot be used to hold device properties. AcqEngJ can do whatever it wants to do as long as it doesn't use TaggedImages containing device properties.

We owe the microscope operators at least the properties that we are trying to set ourselves during the acquisition. So, in a z-stack, each image should be annotated with the "correct" z position (even if it is only the z position that the Acquisition Engine thought we were going to be at). We are currently not doing that in sequences. Likewise, if the Acquisition Engine changes a channel preset, the metadata should contain the properties of that channel preset. Maybe this is best done by the Acquisition Engine after all, as that is the entity changing things.

The Z position is not part of the system state cache (because it's not a property), so that's a separate (and important) problem. But I generally agree that we should decide what the acquisition engines need to do, and only then design Core features to provide any missing information that is needed.
