Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata handling, export, and real-time visualization #35

Merged
merged 50 commits into from
Aug 27, 2024

Conversation

t-sasatani
Copy link
Collaborator

@t-sasatani t-sasatani commented Jul 16, 2024

Some updates for interfacing metadata and accompanying updates for the CLI.

Changes:

  • Patch error _buffer_to_frame #34
  • Add metadata CSV export and circular buffer queue shared list (for use in other processes)
  • Add BufferedCSVWriter for buffered CSV exports.
  • Change CLI to output files with the same stem names (for -o sample, you get sample.avi, sample.csv, sample.bin)

Notes:

  • I used the runtime config module to add some environment variables, but I am unsure if I used it as @sneakers-the-rat intended. I would appreciate feedback.
  • Now, the video export is locked to .avi, which is good for now, but it might be better to define this by environment variables or something. There's no point doing this now, so we can probably leave it until we build an interface for video export options.

@t-sasatani t-sasatani added the enhancement New feature or request label Jul 16, 2024
@t-sasatani t-sasatani requested review from sneakers-the-rat and MarcelMB and removed request for MarcelMB July 16, 2024 13:41
@coveralls
Copy link
Collaborator

coveralls commented Jul 16, 2024

Coverage Status

coverage: 73.757% (+1.7%) from 72.069%
when pulling 479363c on feat-metadata-handle
into 2290874 on main.

@sneakers-the-rat
Copy link
Collaborator

Ope sorry missed this when u opened it. I got some notes but like what yr thinkin here :)

@t-sasatani
Copy link
Collaborator Author

No problem. This isn't a hurry at all!

@t-sasatani t-sasatani force-pushed the feat-metadata-handle branch from 18f9a52 to adc6e21 Compare August 15, 2024 10:02
…ncomplete, rest of work in following commits)
…vel `capture` method, pass headers rather than using a shared list, store writer objects as private attrs for testing, don't write a blank frame at the start of capturing
@sneakers-the-rat
Copy link
Collaborator

ok @t-sasatani see my updates in the above commits, tried to split them up so they are a little easier to follow, but they are sorta intrinsically related to each other.

Basically i am trying to have as light of a touch as possible here bc i know y'all are under a deadline while trying to avoid this package getting too spaghettilike from that rush against deadline.

  • Move streamdaq runtime config to its own configuration model: since these are parameters that should be generally true for this device, moved them there - the top-level Config object is for behavior that should be true across the entire package. as this package grows we want to avoid putting things at the top level if they are only used in some part of the module. We also want to avoid having too many ways to configure things without a clear structure in place: streamdaq config is already set in those yaml files, so kept it there.
  • Move csvwriter out of the inner loops which are perf sensitive to the outer loop which shouldn't be. We'll need a bigger refactor here eventually, but as i said above trying to not touch too much while we're on deadline. those inner methods should be super minimal and only do exactly one thing. the capture method is already a junk drawer method (hence need for refactor) but just trying to keep similar things in similar places for now.
  • Make plotter handle its own timing - simplify the API to that class so it's just "give me objects to plot" instead of every object that uses it needing to manage draw timing. I think you might also like some of the other tidying there, hopefully that makes sense.
  • Make basic e2e tests for these functions - we'll need a great deal more testing for this whole thing, the entirety of the inner methods are untested atm which i suspect is the source of continual regressions, but at least we can keep up with top-level tests.

There are some test errors that are only showing up in CI on ubuntu that i'm gonna fix, but other than that i'll take one final look and approve

@sneakers-the-rat
Copy link
Collaborator

ok here's a fun one for you - for the csv writer test, where since i didn't know exactly what we would want there, i just did a simple shape test, where the header metadata that gets written should be the same every time (for a given test data file), but we are getting...

  • linux: 893 rows
  • windows: 901 rows
  • mac: 901 rows
    (and, locally, on my machine, getting 855 rows, which is why the value is set that way).

I'm not really sure what that value should be since idk the source data, but the video is 20fps, 5.5s long = 110 frames, and there should be 8 buffers per frame = 880 total, which is actually none of those values.

so... ya...

@sneakers-the-rat
Copy link
Collaborator

OK fixed those bugs, and fixed the early termination bugs - removed ability to exit with escape while displaying videos, now there are two ways to quit a capture:

  • terminate event is the underlying holder of quitting logic
  • ctrl+c causes terminate to get flipped to true.

but instead of prior behavior where all the loops quit immediately when getting the terminate event, refactored the top-level capture frame processing logic into a separate method so it was possible to call in two places, so now we explicitly wait for the queues to drain before fully quitting (it's also possible to force quit with a second ctrl+c).

very very strong need to break apart those inner methods so that we have better access to them for unit testing, bc we are gonna keep getting mystery errors as long as we only have end-to-end tests and can't directly test them as pure functions, given some expected input, get some expected output, separated from the rest of the acquisition process.

@pytest.mark.parametrize(
"config,data,video_hash_list,show_video",
[
(
"stream_daq_test_200px.yml",
"stream_daq_test_fpga_raw_input_200px.bin",
"[\"f878f9c55de28a9ae6128631c09953214044f5b86504d6e5b0906084c64c644c\",\"8a6f6dc69275ec3fbcd69d1e1f467df8503306fa0778e4b9c1d41668a7af4856\",\"3676bc4c6900bc9ec18b8387abdbed35978ebc48408de7b1692959037bc6274d\"] ",
'["f878f9c55de28a9ae6128631c09953214044f5b86504d6e5b0906084c64c644c","8a6f6dc69275ec3fbcd69d1e1f467df8503306fa0778e4b9c1d41668a7af4856","3676bc4c6900bc9ec18b8387abdbed35978ebc48408de7b1692959037bc6274d"] ',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure when this got switched to just being a string that looks like a list of strings instead of being an actual list of strings, but this makes the test considerably weaker.

in particular, str(x) in str(y) checks if that substring is anywhere in the other string, but str(x) in list(y) checks whether exactly str(x) is found in the list of strings. So eg.

>>> "" in '["some other string", "that shouldn\'t include", "the empty string literal"]'
True
>>> "" in ["some other string", "that shouldn\'t include", "the empty string literal"]
False

@sneakers-the-rat
Copy link
Collaborator

Added tests that confirm that varying buffer size doesn't change output to guard against the problems @t-sasatani was describing here: #35 (comment) which was because of the termination logic

@sneakers-the-rat
Copy link
Collaborator

Down with merging this btw :)

@t-sasatani
Copy link
Collaborator Author

Thanks sounds great. I'm thinking about making some changes to save more messed-up metadata, but I will bump up a minor version after I decide what to do with that.

@sneakers-the-rat
Copy link
Collaborator

Merge this then do that?

Copy link
Collaborator Author

@t-sasatani t-sasatani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look, and never mind about the metadata stuff: I thought the plotted and saved metadata might differ, but it was because I was looking at them on different scales. I just have one question in this review, which can also be a quick patch in the next PR.

@@ -404,10 +396,11 @@ def _buffer_to_frame(
continue

# push previous frame_buffer into frame_buffer queue
frame_buffer_queue.put(frame_buffer)
frame_buffer_queue.put((frame_buffer, header_list))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why you want to couple these two tightly? I would instead like to have these in separate queues and separate for loops (pasted below) because we want to know metadata more when we don't get images.

for image, header_list in exact_iter(imagearray.get, None):

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pasted below -> pasted the corresponding line in the current code
because I don't know if referring multiple code blocks in one comment is possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess coupling these might be better for unifying the data flow. Just added a line to rescue some metadata even when an image can't be recovered.

@t-sasatani
Copy link
Collaborator Author

t-sasatani commented Aug 27, 2024

The refactored version is running very slow compared to 5d33e3b, and I'm looking why. I assume it's blocking/non-blocking stuff in plots, and here is an example binary to see this.
https://drive.google.com/open?id=1QoLB1F7HUau6TDrHQWxpivR48RhpTfWN&usp=drive_fs

@t-sasatani
Copy link
Collaborator Author

This was just the window refresh (cv2.waitKey(1)) getting deleted with the ESC key escape. Also, the plot window becomes unstable when moved (changing the GUI backends of matplotlib didn't fix it), but I guess I'll merge it with an experimental notice.

It might be worth considering GUI tests, but I'm unsure if there's a good way to do non-web GUI tests on GitHub actions.

@t-sasatani t-sasatani merged commit 4f4f788 into main Aug 27, 2024
15 checks passed
@t-sasatani t-sasatani deleted the feat-metadata-handle branch December 13, 2024 21:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants