Performance of only-keypoints augmentations #635
Hm, are you sure you have 10k points in your input? How do you call your augmentation routines? According to the computed performance numbers (see https://imgaug.readthedocs.io/en/latest/source/performance.html#keypoints-and-bounding-boxes), the library is able to process around 700k keypoints per second with […]
Thanks for looking into this! This sequence of 10k keypoints works out to roughly 100 keypoints per frame across 100 frames. I want to augment an entire video at once, since that is probably the most correct way to do it (the augmentation is applied identically to every frame).

You can see from my code (in the Google Colab) that augmenting keypoints is far too slow, and that doesn't even account for the creation of the Keypoint objects (I added a speed test to the Colab). Creating the Keypoint objects (for 10,000 items) takes 0.02498 seconds on average, which means that even without any augmentation this is limited to 40 runs per second.

I highly recommend going to the Colab, uploading the keypoints.txt file, and doing "Runtime -> Run All" to really see how slow it is. Unless I have something wrong then, as you said, it is what it is... The Colab replicates all of the arguments here.
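For reference, a minimal timing sketch of the construction overhead described above, assuming random stand-in coordinates for keypoints.txt and the same 10k-point count (this is not the exact Colab code):

```python
import time
import numpy as np
from imgaug.augmentables.kps import Keypoint, KeypointsOnImage

# Random stand-in for the ~10k points loaded from keypoints.txt.
xy = np.random.rand(10_000, 2).astype(np.float32) * 500

start = time.perf_counter()
# One Python object per point, then one container object for the frame.
kps = [Keypoint(x=float(x), y=float(y)) for x, y in xy]
kpsoi = KeypointsOnImage(kps, shape=(500, 500, 3))
elapsed = time.perf_counter() - start
print(f"constructing 10k Keypoint objects: {elapsed:.5f} s")
```

If I read the docs right, KeypointsOnImage.from_xy_array(xy, shape=...) builds the same container directly from an [N, 2] array, though it may still create one Keypoint object per row internally.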
Real-life use case: […] The main difference is that the […] Here is a performance test: […]
Following up: #621
I want to run data augmentation on poses alone (I have an interesting scenario, imo), and I want to do it fast: right now, augmenting a batch takes upwards of 30 seconds, while the entire training loop takes 1 second.
I believe that augmenting keypoints is an easier task than augmenting images, and the slowness might just be a matter of poor data representation.
I'm loading an image (500x500 = 250,000 pixels).
I'm also loading a list of keypoints (10,564 points = 4.2% of the pixel count).
Now, for example, if we run different augmentations on these two data sources, we get a huge time disparity: images are much faster, even though they contain more data.
(Times in seconds; the full comparison is in the Colab linked below, and a timing sketch follows.)
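For context, here is a hedged sketch of how such a comparison can be timed with imgaug's public API; the augmenter choice (Fliplr) and the iteration count are arbitrary assumptions, and the real numbers live in the Colab linked below:

```python
import time
import numpy as np
import imgaug.augmenters as iaa
from imgaug.augmentables.kps import Keypoint, KeypointsOnImage

image = np.zeros((500, 500, 3), dtype=np.uint8)  # 500x500 = 250,000 pixels
xy = np.random.rand(10_564, 2) * 500             # stand-in coordinates
kpsoi = KeypointsOnImage(
    [Keypoint(x=float(x), y=float(y)) for x, y in xy],
    shape=image.shape,
)
aug = iaa.Fliplr(1.0)  # always flip horizontally

# Time the same augmenter on the image vs. on the keypoints.
for name, call in [("image", lambda: aug.augment_image(image)),
                   ("keypoints", lambda: aug.augment_keypoints(kpsoi))]:
    start = time.perf_counter()
    for _ in range(10):
        call()
    print(name, (time.perf_counter() - start) / 10, "s per call")
```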
However, if we instead represent keypoints as a NumPy array of shape [N, 2], every operation on them is much faster!
A change of representation here would be a huge win for runtime: we are talking at least two orders of magnitude for flips and matrix (affine) transformations, as sketched below.
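To make the proposal concrete, here is a minimal sketch (plain NumPy, not imgaug's API) of a horizontal flip and an affine transform applied to keypoints stored as an [N, 2] array; both are single vectorized operations over all points:

```python
import numpy as np

# Keypoints as one float array of shape [N, 2], columns (x, y).
kps = np.random.rand(10_564, 2).astype(np.float32) * 500

# Horizontal flip inside a width-W image: x -> (W - 1) - x.
W = 500
flipped = kps.copy()
flipped[:, 0] = (W - 1) - flipped[:, 0]

# Affine transform (rotation about the image center) as one matrix multiply.
theta = np.deg2rad(15)
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, -s], [s, c]], dtype=np.float32)
center = np.array([W / 2.0, W / 2.0], dtype=np.float32)
rotated = (kps - center) @ R.T + center
```

Neither operation creates a Python object per point, which is where most of the two-orders-of-magnitude gap would come from.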
To reproduce everything, here is a Colab notebook:
https://colab.research.google.com/drive/1R5Kh3ryfHcurLnKVGMcNSP3BNGOgZSVv
The keypoints.txt file is attached here if you want to run it yourself:
keypoints.txt