Trim tracking data while loading #169

Open
LeonardoLupori opened this issue Sep 23, 2024 · 6 comments

@LeonardoLupori

Hi all,
First of all thanks a lot for the great software!

I'm using the software to cluster mouse open-field tracking data. The videos contain, at the beginning and the end, parts that should not be included in the analysis (e.g., placing and removing the mouse from the arena or adjusting light intensity).

For each tracking file, I have the frame numbers for when the analysis should start and end in that file, and I would like to include this information while loading the tracking data so that irrelevant patterns do not influence behavioral clustering.
Looking at the source code, it seems that this is not possible with the default data loaders.

I wanted to ask if there's a different way of doing this (e.g., trimming the data after loading) or if I should look into writing my own data loader that has this functionality.

thanks a lot!

all the best,
leo

@calebweinreb
Contributor

Hi Leo,

How do you have these frame numbers stored?

It would be fairly easy to crop the detections after loading them. For example, if you had a dictionary bounds with the same keys as coordinates and start/end frames as values, you could trim as follows right after the loading step:

coordinates = {k: coords[bounds[k][0]:bounds[k][1]] for k, coords in coordinates.items()}

and the same thing for confidences. However, one issue with this approach is that it will create a misalignment between the coordinates and the underlying videos, so you won't be able to run the calibration step or generate grid movies. To do that, you would also have to trim the videos in the same way.

Assuming time/compute aren't super limited, probably the easiest thing is to just trim the videos and then rerun keypoint inference from that starting point.
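The trimming described above can be sketched as a small helper that is applied to both dictionaries and that rejects out-of-range bounds instead of silently truncating. This is only an illustration on toy arrays: the helper name trim_recordings, the recording keys, and the array shapes are assumptions, and the end bound is treated as exclusive, as in a Python slice.

```python
import numpy as np

def trim_recordings(data, bounds):
    """Slice each recording's array to its [start, end) frame range.

    `data` maps recording names to arrays whose first axis is time;
    `bounds` maps the same names to (start, end) frame numbers.
    """
    trimmed = {}
    for key, array in data.items():
        start, end = bounds[key]
        # A plain slice would silently clip an out-of-range end frame,
        # so check the bounds explicitly.
        if not 0 <= start < end <= len(array):
            raise ValueError(f"invalid bounds {bounds[key]} for {key!r}")
        trimmed[key] = array[start:end]
    return trimmed

# Toy stand-ins for the loaded data: (frames, keypoints, dims) and (frames, keypoints)
coordinates = {"mouse1": np.zeros((100, 4, 2)), "mouse2": np.zeros((80, 4, 2))}
confidences = {"mouse1": np.ones((100, 4)), "mouse2": np.ones((80, 4))}
bounds = {"mouse1": (10, 90), "mouse2": (5, 70)}

coordinates = trim_recordings(coordinates, bounds)
confidences = trim_recordings(confidences, bounds)
```

Using one function for both dictionaries keeps coordinates and confidences the same length for every recording.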

@LeonardoLupori
Author

Thanks a lot for the quick feedback!

Yes, I tried cropping the keypoint locations after loading them and it works perfectly!

I see the problem with the calibration and the generation of grid movies. Still, I'm analyzing a relatively big dataset, and cropping the videos would mean almost doubling the space we use on our storage solution, since I want to keep the raw videos containing the beginning and the end of the experiment.
For the generation of grid movies, I can improvise something myself but the calibration seems to be relatively critical. I'll try to see if it's possible to hack something to restore the sync when cropping the coordinates data.
Thanks a lot!!

@calebweinreb
Contributor

I think for calibration you could just use the original (untrimmed) coordinates as input and it would work. For grid movies you could either pad the contents of results (and use the original keypoints) or just set keypoints_only=True which would still give a sense of the syllables.

@LeonardoLupori
Author

LeonardoLupori commented Sep 25, 2024

Thanks a lot, that works.

Just for future reference for others who might run into this:
For the calibration step, if I use the original untrimmed tracking data, the frame selection algorithm mostly selects frames at the beginning or the end of the video where there's no mouse. This is expected (but inconvenient) since it's biased toward frames with low likelihood scores, and when the mouse is not even in the arena, the likelihood is understandably low.
I solved it by making a copy of confidences where the frames to be trimmed are forced to likelihood=1. This way, most of the frames automatically picked for the calibration are actually useful, as they are mostly frames where one or more bodyparts are predicted with low confidence. This copy of confidences is used only for this step.
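A minimal sketch of this workaround, assuming confidences is a dictionary of (frames, keypoints) arrays and bounds holds (start, end) frame numbers with an exclusive end. The function name mask_outside_bounds and the toy data are illustrative and not part of keypoint-MoSeq.

```python
import numpy as np

def mask_outside_bounds(confidences, bounds, fill_value=1.0):
    """Return a copy of `confidences` with frames outside [start, end) set to `fill_value`."""
    masked = {}
    for key, conf in confidences.items():
        start, end = bounds[key]
        # Work on a copy so the original confidences stay untouched.
        conf_copy = np.array(conf, dtype=float, copy=True)
        conf_copy[:start] = fill_value
        conf_copy[end:] = fill_value
        masked[key] = conf_copy
    return masked

# Toy example: 10 frames, 3 keypoints, uniformly low confidence
confidences = {"mouse1": np.full((10, 3), 0.2)}
bounds = {"mouse1": (2, 8)}

# Copy to be used for the calibration step only
calibration_confidences = mask_outside_bounds(confidences, bounds)
```

Because the masked copy has the same length as the untrimmed data, it stays aligned with the original videos.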

To generate grid movies, do you know if I only need to pad the syllable vectors in results or if I also need to pad coordinates? Just asking because it's one of the arguments.

kpms.generate_grid_movies(
    results,
    project_dir,
    model_name,
    coordinates=coordinates,
    keypoints_only=True,
    keypoints_scale=1,
    use_dims=[0, 1],  # controls projection plane
    **config());

@calebweinreb
Contributor

Nice workaround for calibration!

I actually just had some free time this morning so I decided to bite the bullet and add more formal support for trimming. Would you be willing to beta test? Based on your feedback, I'll merge it into the next release. Here's how you can test it (let me know if I should clarify any of the following steps).

  1. Clone this repo
  2. Check out the branch support_trimmed_videos
  3. Install locally (pip install -e .)
  4. Follow the instructions in the docs, which I've copied below

It would be useful if you could test calibration, but to avoid laboriously re-annotating, you can just grab the slope/intercept params that you derived last time and enter them directly into the config. You'll also have to delete the file error_annotations.csv in the project directory, or you might get an error (alternatively, you can just do this test in a different project directory).

Docs for trimming

In some datasets, the animal is missing at the beginning and/or end of each video. In these cases, the easiest solution is to trim the videos before running keypoint detection. However, it's also possible to directly trim the inputs to keypoint-MoSeq. Let's assume that you already have a dictionary called bounds that has the same keys as coordinates and contains the desired start/end frames for each recording. The next step would be to trim coordinates and confidences:

coordinates = {k: coords[bounds[k][0]:bounds[k][1]] for k,coords in coordinates.items()}
confidences = {k: confs[bounds[k][0]:bounds[k][1]] for k,confs in confidences.items()}

You'll also need to generate a dictionary called video_frame_indexes that maps the timepoints of coordinates and confidences to frame indexes from the original videos.

import numpy as np
video_frame_indexes = {k : np.arange(bounds[k][0], bounds[k][1]) for k in bounds}

After this, the pipeline can be run as usual, except for steps that involve reading the original videos, in which case video_frame_indexes should be passed as an additional argument.

# Calibration step
kpms.noise_calibration(..., video_frame_indexes=video_frame_indexes)

# Making grid movies
kpms.generate_grid_movies(..., video_frame_indexes=video_frame_indexes)

# Overlaying keypoints
kpms.overlay_keypoints_on_video(..., video_frame_indexes=video_frame_indexes)
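Before passing video_frame_indexes to these functions, it may be worth checking that it lines up with the trimmed arrays. The sketch below uses toy data in which keypoint 0's x-value stores the original frame number, so the mapping can be verified directly. The helper check_alignment and the recording names are hypothetical, and bounds are assumed to be (start, end) with an exclusive end.

```python
import numpy as np

# Toy data: the x-value of keypoint 0 stores the original frame number,
# so the mapping can be checked after trimming.
n_frames = {"mouse1": 100, "mouse2": 80}
coordinates = {}
for key, n in n_frames.items():
    coords = np.zeros((n, 4, 2))
    coords[:, 0, 0] = np.arange(n)
    coordinates[key] = coords

bounds = {"mouse1": (10, 90), "mouse2": (5, 70)}

trimmed = {k: c[bounds[k][0]:bounds[k][1]] for k, c in coordinates.items()}
video_frame_indexes = {k: np.arange(bounds[k][0], bounds[k][1]) for k in bounds}

def check_alignment(trimmed, video_frame_indexes):
    """Raise if any recording's frame indexes differ in length from its trimmed data."""
    for key, coords in trimmed.items():
        n_idx, n_coords = len(video_frame_indexes[key]), len(coords)
        if n_idx != n_coords:
            raise ValueError(f"{key!r}: {n_idx} frame indexes for {n_coords} timepoints")

check_alignment(trimmed, video_frame_indexes)
```

A length mismatch typically means an end bound exceeds the length of the recording: the slice clips it silently, while np.arange does not.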

@LeonardoLupori
Author

Sorry for the delay, I had a batch of experiments to follow.

Thanks a lot for the update! I'll install it, test it, and get back to you soon!
