Questions on supervised training part code and performance #2
I see. I saw that your config sets the z voxel size to the full z range, so the z dimension of the grid has length 1 and every z index is 0. But why do that? Is a 2D grid better? |
The student is the FastFlow3D model, which uses a PointPillars feature encoder that turns everything into a 2D bird's-eye-view pseudoimage. The voxelization function that I am using is provided by MMCV; it is more general and can be used for 3D voxelization (e.g. SECOND / VoxelNet). I set the minimum and maximum point height in the config as you referenced, and the point clouds should be chopped accordingly, so every point should land in a single very tall voxel (a point pillar, hence the name PointPillars) to form the pseudoimage. I added the referenced assert to validate this assumption when doing the voxelization for FastFlow3D; if something had a z index other than zero, it would mean the assumption is being violated, and that assert should trigger. |
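For concreteness, here is a minimal sketch of the check described above (not the repo's exact code). It assumes MMCV-style voxel coordinates in (z, y, x) order; with the voxel z-size spanning the full configured z range, every point should land at z index 0, i.e. in one tall pillar.

```python
import torch

def validate_pillar_coords(voxel_coords: torch.Tensor) -> None:
    # voxel_coords: (N, 3) integer (z, y, x) indices from the voxelizer.
    # If the point cloud was cropped to [z_min, z_max] before voxelization
    # and the voxel z-size covers that whole range, the z index must be 0
    # everywhere; anything else violates the PointPillars pseudoimage
    # assumption.
    assert (voxel_coords[:, 0] == 0).all(), "found a voxel outside the pillar"
```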
Thanks for your reply. 🥰 |
Download the pre-trained model weights and run the evaluation on those to ensure that everything else is set up correctly. If you're able to reproduce those test numbers, then there's something going on with the training run that we can dig into further. |
Thanks for your help! I appreciate it. By the way, I saw there is a new ZeroFlow eval on the leaderboard.
|
ZeroFlow XL is the ZeroFlow pipeline with two changes: more unlabeled training data and a larger student model.
To be clear, like ZeroFlow, ZeroFlow XL uses zero human labels. Our results come simply from using more unlabeled data and adding more parameters! We are able to beat the teacher model's performance and achieve state-of-the-art on the AV2 test set because our model has seen enough diverse data, and is expressive enough, to learn to distinguish noise from signal in the teacher labels. On Sunday when we got this result, I tweeted a cool updated graph showing we are doing better than our scaling laws predicted for the normal student model. To further drive home this point, here are the raw results from our submissions to the AV2 Scene Flow test split:
If you look at the linked results, our XL model outperforms the teacher across all three categories of the Threeway EPE, but it makes particularly large gains in the static foreground category. This means our model has learned to recognize a lack of motion better than NSFP is able to represent, because it has seen enough data to know that, in expectation, static objects should have zero flow, even if there's a bit of noise in the teacher labels, while also, in expectation, extracting what correct movement vectors look like for moving objects. I also have more good news! I'm an idiot and forgot to run the XL model with our Speed Scaling feature enabled (Equation 5 from the paper), and I stopped training this model after only 5 epochs (akin to seeing ~10 epochs' worth of frames). This means the XL model is undertrained and is missing a feature that provides free Foreground Dynamic EPE improvements (which substantially improves Threeway EPE). We are training a new XL model with these features enabled, and for more epochs, so we should hopefully get even better performance from our new student model. |
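For readers unfamiliar with the metric, here is a hedged sketch of Threeway EPE as described in the scene flow literature: the per-point endpoint error is averaged within three disjoint categories, and the category means are then averaged, so a gain in any one bucket (e.g. static foreground) moves the overall number. The mask names here are illustrative, not from the ZeroFlow codebase.

```python
import torch

def threeway_epe(pred_flow: torch.Tensor, gt_flow: torch.Tensor,
                 fg_dynamic: torch.Tensor, fg_static: torch.Tensor,
                 background: torch.Tensor) -> torch.Tensor:
    # pred_flow / gt_flow: (N, 3); the masks are disjoint (N,) boolean tensors.
    epe = torch.linalg.norm(pred_flow - gt_flow, dim=-1)  # (N,) endpoint error
    buckets = [epe[fg_dynamic], epe[fg_static], epe[background]]
    return sum(b.mean() for b in buckets) / len(buckets)
```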
Thank you so much for sharing these! Looking forward to your updates. |
One more question 😊: does building the XL dataset also require the ZeroFlow paper pipeline, i.e. running NSFP to produce pseudolabels on the LiDAR dataset first? Or a new pipeline? |
That's correct, I used NSFP to pseudolabel the Argoverse 2 LiDAR dataset subset. We have a large SLURM cluster with a bunch of old 2080 Tis, so the pseudolabeling only took a few days because I could parallelize across them all. I used the |
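For context on what the teacher is doing, here is a hedged, toy-scale sketch of NSFP-style test-time optimization: a small MLP flow field fit from scratch to each frame pair against a Chamfer objective. The real method uses a bidirectional loss, early stopping, and efficient nearest-neighbor search; none of this is the author implementation.

```python
import torch
import torch.nn as nn

def nsfp_flow(pc0: torch.Tensor, pc1: torch.Tensor,
              iters: int = 500, lr: float = 1e-3) -> torch.Tensor:
    # A tiny coordinate MLP maps each (x, y, z) point in pc0 to a flow
    # vector; its weights are optimized from scratch for this frame pair.
    mlp = nn.Sequential(
        nn.Linear(3, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, 3),
    )
    opt = torch.optim.Adam(mlp.parameters(), lr=lr)
    for _ in range(iters):
        warped = pc0 + mlp(pc0)
        # One-sided Chamfer loss: each warped point to its nearest neighbor
        # in pc1. cdist is O(N*M) memory, fine only for toy point counts.
        loss = torch.cdist(warped, pc1).min(dim=1).values.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return mlp(pc0)
```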
@kylevedder Sorry to bother you again, but could you specify which three models were used to get these three results? Or is there just one set of base ZeroFlow results? Base ZeroFlow results: Threeway EPE of 0.0814. |
As we discuss in the paper, our reported Threeway EPE for ZeroFlow is an average of three runs. These weights are the ones highlighted in the weight repo README. NSFP doesn't have trained weights; it's a test-time optimization method. We have not uploaded the ZeroFlow XL weights; they are too large (1.2GB) and require me to set up git LFS. |
Hi kylevedder, could you please let me know the number of samples in the processed Argo/Waymo datasets, for the train/val/test splits respectively? There seem to be several versions of Waymo scene flow datasets, such as PCAccumulation (ECCV 2022). The strategies for calculating GT flows are similar, but I wonder if there is a difference in scale; I cannot find any info in the paper or the supplementary. It also seems ego motion compensation is used when creating the scene flow dataset, as mentioned in the supplementary. Could you please share results WITHOUT ego-motion compensation, if you have any? So far my results show NSFP performs worse on dynamic objects when using ego motion compensation on Waymo, and I am not sure to what extent this impacts the distillation. Any insights from your side? Thank you! |
**Dataset Details**
For Argoverse 2, I read the dataset straight from disk as downloaded, sans the minor folder rearrangement I discuss in GETTING_STARTED.md. For Waymo Open, the exact dataset version and labels are detailed in my GETTING_STARTED.md: we use 1.4.2 with the standard flow labels provided on the Waymo Open website, and we preprocess the data out of the annoying TFRecord format it ships in.
**ZeroFlow without ego motion compensation**
We do not have any results for ZeroFlow / FastFlow3D without ego motion compensation. In principle we could train our feedforward model without ego compensation, but it's reasonable to assume decent-quality ego compensation is available at test time on modern service robot / autonomous vehicle stacks. Chodosh et al. 2023 makes a fairly compelling case that ego compensation is broadly useful in general, so we decided to use it.
**NSFP without ego motion compensation**
I don't have direct head-to-head NSFP results with and without ego compensation. In my early work using NSFP on Argoverse 2, I saw that Threeway EPE was better with compensation (which makes sense, it's an easier problem), and we ran with that on Waymo. How much worse is NSFP on the dynamic bin? Do you have more details on what kinds of dynamic objects it's performing worse on, and when? Are you doing ground removal? (This is basically mandatory to get NSFP to work; otherwise it fits a bunch of zero vectors to the lidar returns on the ground.) I also found that NSFP performance is very dependent upon dataloading details: ZeroFlow's implementation integrates the author implementation of NSFP, but we use our own data loaders. The NSFP authors actually reached out to me to discuss dataloader details because our NSFP implementation (listed as the baseline NSFP implementation on the Argoverse 2 Scene Flow leaderboard) significantly outperformed their own implementation. Their entry is NP (NSFP); my NSFP implementation is the Host_67820_Team NSFP entry (the challenge organizers asked me to send them results for a strong baseline). |
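As an aside on the ground removal point above: ground returns have zero flow and dominate the Chamfer objective, so they are stripped before optimization. A real pipeline is more careful (e.g. map- or plane-fitting-based); this z-threshold version is only an illustration of the idea.

```python
import numpy as np

def remove_ground(points: np.ndarray, ground_z: float = 0.3) -> np.ndarray:
    # points: (N, 3) xyz in the ego frame, where z is roughly height above
    # the ground; keep only points clearly above the assumed ground plane.
    return points[points[:, 2] > ground_z]
```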
Hi Yancong,
Thanks for posting your result table. I don't know why ego-compensated NSFP performs so much worse on your dynamic objects. My recommendation is that you actually visualize the results for both the ego-compensated and uncompensated runs on the same point clouds, to make sure that 1) your data is set up properly, 2) NSFP is actually optimizing the right things, and 3) NSFP has actually converged. A sketch of such a check follows below.
I'm curious to find out what you discover from this, please keep me posted!
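A minimal sketch of the suggested sanity check: plot bird's-eye-view flow vectors for the same frame pair with and without ego compensation and compare them side by side. The arrays in the usage comments (`pc`, `flow_comp`, `flow_raw`) are hypothetical placeholders you would load yourself.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_bev_flow(ax, points: np.ndarray, flow: np.ndarray,
                  title: str, stride: int = 50) -> None:
    # points: (N, 3) xyz, flow: (N, 3) per-point flow; subsample for legibility.
    p, f = points[::stride], flow[::stride]
    ax.quiver(p[:, 0], p[:, 1], f[:, 0], f[:, 1],
              angles="xy", scale_units="xy", scale=1.0, width=0.002)
    ax.set_title(title)
    ax.set_aspect("equal")

# Usage:
# fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(12, 6))
# plot_bev_flow(ax0, pc, flow_comp, "NSFP, ego compensated")
# plot_bev_flow(ax1, pc, flow_raw, "NSFP, uncompensated")
# plt.show()
```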
…On Tue, Oct 24, 2023 at 8:28 AM CVisioner wrote:
Hi, kylevedder. This is what I got on the Waymo dataset from PCAccumulation, ECCV 2022 (https://github.com/prs-eth/PCAccumulation): scene flow estimation between two nearby frames. It looks weird that NSFP with ego compensation performs worse than NSFP w/o ego compensation on dynamic objects, and the margin is substantially large.
[Screenshot: results table, https://user-images.githubusercontent.com/26435516/277656843-7fa8ef8b-eac7-44ad-a2ca-152315513ebb.png]
|
I also found a gap when I tried to reproduce the FastFlow3D result (the architecture of the ZeroFlow student network). Since I used the official dataloader inside av2-api, some score was lost. I will try to find out why in the following days. |
Thanks to @kylevedder, who mentioned one of the reasons in an email (in case someone after me has the same problem, I have attached his words here):
Thanks again to Kyle! |
Thanks for your work and for open sourcing the code!
When I read the following code, I wondered why an assert is set here requiring the z index to be 0. Since the voxelization is 3D, shouldn't z be allowed to be nonzero?
zeroflow/models/heads/fast_flow_decoder.py
Lines 15 to 32 in 58a93f7