Exporter Custom Models Fix #77
base: main
Conversation
You reversed VOXEL_SIZE_X and VOXEL_SIZE_Y in the definition of simplify_preprocess, defined in simplifier_onnx.py as: |
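For context, the voxel-grid dimensions come straight from the point cloud range and the voxel size, so swapping the X and Y sizes only changes the exported graph when the grid is not square. A minimal illustrative sketch (not the actual simplify_preprocess code; the range and voxel values below are just the common KITTI settings):

    def grid_dims(point_cloud_range, voxel_size):
        # point_cloud_range = [x_min, y_min, z_min, x_max, y_max, z_max], voxel_size = [vx, vy, vz]
        x_min, y_min, _, x_max, y_max, _ = point_cloud_range
        vx, vy, _ = voxel_size
        grid_x = int(round((x_max - x_min) / vx))  # voxels along X
        grid_y = int(round((y_max - y_min) / vy))  # voxels along Y
        return grid_x, grid_y

    # KITTI-style example: X spans 69.12 m, Y spans 79.36 m with 0.16 m voxels
    print(grid_dims([0, -39.68, -3, 69.12, 39.68, 1], [0.16, 0.16, 4]))  # (432, 496)

On a square cloud (equal X and Y extents) the two values coincide, which is why reversed arguments can go unnoticed.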
@GuillaumeAnoufa I will update this; I was rushing and didn't realise because my cloud is square. |
@GuillaumeAnoufa I have fixed this. I reversed it twice, so it actually should not have affected the final model, but it is better to have correctly named variables/args... |
Hi, I have used this commit to successfully export my model to ONNX format. However, when I perform predictions using TensorRT, I am seeing different results compared to when I just run eval on the trained .pth file. More about my issue can be found in #82. Please let me know if I am missing something. |
Hi @Allamrahul, have you verified the point cloud information is being loaded correctly and in the right order? |
Could you elaborate further if possible? What do you mean by right order? I was able to use my custom data, train the model to detect a single object, and validate the results using the demo.py file: the boxes look right on the eval set and the results look really good. After that, I tried to export but realized that everything in the export script was hardcoded for 3 classes. I then referred to your PR, made those changes, and thankfully they unblocked me and allowed me to export the model. I later moved the generated params.h to the include folder and the .onnx file to the model folder and followed the instructions in https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars, under compile and run. If the point cloud information was not being loaded correctly, I think my results on the eval set would have been terrible. I compared exporter.py, the file responsible for exporting to ONNX, and demo.py, the script that performs the eval and helps me visualize predictions on my eval set: both process the data in the same manner. I am using the following command for export: I have also changed line 157 in main.py to let me predict on .npy files instead of .bin files. If you need further information to guide me in the right direction, please let me know. |
exporter.py file |
simplifier_onnx.py |
By this, do you mean how main.py is loading the .npy file? The script is meant for .bin files but it should work for .npy files as well. Please let me know if I am missing something. |
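For readers hitting the same .bin/.npy question, here is a hypothetical loader sketch (load_points is not a function from the repo, and the .bin layout is assumed to be the usual KITTI 4 x float32 per point) showing how both formats can be normalized to the same (N, 4) float32 array before they reach the network:

    import numpy as np

    def load_points(path):
        # .npy stores its dtype/shape in a small header; a raw .bin does not,
        # so the .bin layout (here: x, y, z, intensity as float32) must be assumed.
        if path.endswith(".npy"):
            pts = np.load(path)
        else:
            pts = np.fromfile(path, dtype=np.float32).reshape(-1, 4)
        return np.ascontiguousarray(pts.astype(np.float32).reshape(-1, 4))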
Hi, I used this commit, but when I compared my results using the .pth file vs. TensorRT inference, my predictions matched in box sizes, z dimension, and confidence, but not in X and Y coordinates. I tweaked the code the following way: This is at least allowing me to get the same results across both eval using the .pth file and TensorRT inference using the .onnx file. Not sure why this is working. That being said, I am getting slightly fewer predictions when I run inference with TensorRT. Not sure why this is. I would really like some help to understand whether what I am doing is right. |
Hi, this is the same as my original commit before the suggestion was made by @GuillaumeAnoufa to change it. I guess I was right all along, as I inspected the model in Netron. I'll revert the commit suggested by @GuillaumeAnoufa. |
Hi, I tried your 1st commit but that's not working. In your 1st iteration: call: X, Y; conclusion: 2nd iteration (according to the commit suggested by @GuillaumeAnoufa): call: X, Y; conclusion: What works for me: conclusion: I have just retried iterations 1 and 2 again and they don't solve the issue because inherently they are both doing the same thing. The mapping gets reversed if I try it the way I suggested. Could you confirm this? |
That seems right; in my implementation my voxel shape is (600, 600), so I would not notice this issue. I will fix this as soon as I can. |
One more question: the boxes I get during TensorRT inference are just a subset of the boxes I get during the evaluation phase using the .pth file. For example, for a .npy file where I get 4 bounding boxes during eval, I get 1, 2, or 3 during TensorRT inference, and the number changes every time I run it. Any way to get all the detections during TensorRT inference? |
@Allamrahul are you using the same score and NMS thresholds? I guess I would start by adjusting these in Params.h. I haven't directly compared my PyTorch results to the ones from TensorRT, but they seem the same for me. |
I just removed that entirely as I require very fast performance. I also implemented a better way of loading params from a YAML file that exporter.py generates, if you'd be interested. For guidance: in my Params.h I find that a score threshold of 0.3-0.4 and an NMS threshold of 0.01 work well. |
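The YAML idea mentioned above could look roughly like this. This is a hypothetical sketch only (field names and values are assumptions, not the actual implementation), dumping what Params.h would otherwise hard-code so the C++ side can read it at startup:

    import yaml  # pip install pyyaml

    deploy_cfg = {
        "class_names": ["Vehicle"],            # assumption: single-class model
        "point_cloud_range": [0.0, -39.68, -3.0, 69.12, 39.68, 1.0],
        "voxel_size": [0.16, 0.16, 4.0],
        "max_voxels": 40000,
        "max_points_per_voxel": 32,
        "score_thresh": 0.4,                   # 0.3-0.4 suggested above
        "nms_thresh": 0.01,
    }
    with open("pointpillar_deploy.yaml", "w") as f:
        yaml.safe_dump(deploy_cfg, f, default_flow_style=False)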
Will check that. Additionally, when I enable FP16, I am getting hundreds of bounding boxes (anywhere from 5 to 350) during TensorRT inference. When I disable FP16, recompile, and run, the number of detections is back to normal. Let me know the right way of doing this and whether I am missing something here. |
This worked for me. Obviously FP16 can incur an accuracy penalty. |
Could you specify what worked for you? It's not clear from your comment. Thanks. |
I mean FP16 worked normally for me when commenting out the lines you suggested above. |
Perhaps try a score_thresh of 0.4. |
By normally, you mean you too are getting hundreds of detections? Sorry, I don't have much experience in deployment and this is the first time I am dealing with FP16. |
No worries, I meant that I did not observe hundreds of detections with FP16, but my confidence threshold is set to 0.3. Perhaps look at the detections you are getting, and if you are receiving lots of low scores, increase the threshold. |
Got it, let me check that. |
Also, one more thing: I am using .npy files since I am using a custom dataset. I observed that there is a 32-byte offset when I load the same .npy file via Python/NumPy vs. when I load it through C++. Could this be a factor? |
I am using ROS, so I do not have to load any files and can't fully recommend a solution. However, I do write the binary files I use for training with OpenPCDet. The only thing I could recommend is trying to make the dtype of the NumPy array you are using np.float16, although I seem to get good results in my implementation without explicitly using float16 when I convert from the ROS msg. |
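One thing worth noting about the offset question: an .npy file starts with a NumPy header (magic string, version, and a dict describing dtype/shape), so reading it from C++ as a raw float blob will misinterpret those leading bytes as point data. A hypothetical workaround sketch (file names are placeholders) is to rewrite the cloud as the header-less binary the C++ demo expects:

    import numpy as np

    pts = np.load("frame_000001.npy").astype(np.float32).reshape(-1, 4)  # x, y, z, intensity
    pts.tofile("frame_000001.bin")  # raw header-less binary, 16 bytes per point

    # Optional, per the float16 suggestion above; the C++ reader must then expect float16:
    # pts.astype(np.float16).tofile("frame_000001_fp16.bin")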
@rjwb1, could you point me to the exact TensorRT inference files you are using at the moment? As mentioned before, my FP16 numbers are out of whack: 300 detections in some cases and 5 in others. I am expecting a number between 3 and 5 for every point cloud. I would like to cross-reference the exact commit or group of commits you are using for inference, just to make sure I am not missing anything of importance. After analyzing the results, I found that the model is too confident on some examples, giving confidence values of 90 to 100% for many of the detections, but on some examples it gives the right output. |
@rjwb1, in regards to #85, I see that MAX_VOXELS is hardcoded to 10000 in the export script exporter.py, but when I examine pointpillar.yaml, I see this: I tried this out: when I set MAX_VOXELS to 40000 and exported the ONNX file, the multiple false positives I get with FP16 TensorRT inference go away. Can anyone confirm that what I did makes sense? |
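A possible way to avoid the hard-coded value is to read the voxelization limits from the training cfg when exporting. A sketch under the assumption that the cfg follows the usual OpenPCDet layout (verify the keys against your own pointpillar.yaml; the path is a placeholder):

    import yaml

    with open("tools/cfgs/kitti_models/pointpillar.yaml") as f:
        cfg = yaml.safe_load(f)

    for proc in cfg["DATA_CONFIG"]["DATA_PROCESSOR"]:
        if proc.get("NAME") == "transform_points_to_voxels":
            max_voxels = proc["MAX_NUMBER_OF_VOXELS"]["test"]    # e.g. 40000 at inference time
            max_points_per_voxel = proc["MAX_POINTS_PER_VOXEL"]  # e.g. 32
            break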
Hi, I can export my custom ONNX model, but the results seem incorrect. Can you give me some help? |
Hello, thank you for your work on the custom model conversion. Currently, everything is working fine after making a modification to swap the positions of VOXEL_SIZE_X and VOXEL_SIZE_Y. Before the modification:
VOXEL_ARRAY = np.array([int(VOXEL_SIZE_X),int(VOXEL_SIZE_Y)])
After the modification:
VOXEL_ARRAY = np.array([int(VOXEL_SIZE_Y),int(VOXEL_SIZE_X)])
Hope this helps to solve the problem! |
@Allamrahul |
Hello everyone, and thanks @rjwb1 for the amazing updates. I successfully trained my custom data with 3 classes like KITTI (vehicle, pedestrian, cyclist) in OpenPCDet, and the results looked fine on the Python side. Then I converted the model with exporter.py and also ran it successfully in my C++ code. Then I trained the same custom data with 12 classes (I separated the vehicle class into bus, van, truck); the results also looked fine on the Python side, but after exporting the model with exporter.py, the results on the C++ side were completely random, with lots of large false detections. Has anyone encountered a problem like this before, or trained with different class counts? I would be glad if anyone can help. |
I found out that the "MAX_POINTS_PER_VOXEL" parameter in the pointpillar.yaml file is the problem. When I change the parameter from the default 32 to something different, it causes the problem I described above. I am looking for a solution. |
Hi, great job. So you mean the reason is that MAX_POINTS_PER_VOXEL was changed? If you set it to 32, will the C++ results be the same as your Python-side results? This bothers me a lot. |
Hi @zzt007, the point cloud in my dataset is very dense at close range, so I set the MAX_POINTS_PER_VOXEL parameter to 128. But after I trained my data with that parameter and exported with these functions, the bounding box results were completely random on the C++ side. Then I started training with the default MAX_POINTS_PER_VOXEL: 32, and after a couple of epochs the model started to detect the objects with correct bounding box sizes. I'm still in the early stages of training and optimizing parameters, but as soon as I get proper results I will compare them. |
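A quick way to catch this class of mismatch is to compare the training cfg with the generated header before building. A hypothetical check (paths and the params.h field name are assumptions based on this thread):

    import re
    import yaml

    with open("tools/cfgs/custom_models/pointpillar.yaml") as f:
        cfg = yaml.safe_load(f)
    train_val = next(p["MAX_POINTS_PER_VOXEL"]
                     for p in cfg["DATA_CONFIG"]["DATA_PROCESSOR"]
                     if p.get("NAME") == "transform_points_to_voxels")

    header = open("include/params.h").read()
    m = re.search(r"max_num_points_per_pillar\s*=\s*(\d+)", header)
    deployed = int(m.group(1)) if m else None
    assert deployed == train_val, f"params.h says {deployed}, training cfg says {train_val}"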
Hi guys, I had to make some small changes as I work in a different private repository, so I haven't fully tested everything. For my application I use a single class; however, I have tried with multiple. I also use a custom voxel size and count (XYZ), and this works for me. I'm not at my computer right now, but when I return I'd be happy to help 👍🏼 |
Just to confirm, you're correctly copying the Params.h header over? In my version I generate a config file that does not need to be rebuilt, but I haven't done this here. |
@Acuno41 I have discovered that MAX_POINTS_PER_VOXEL is also hardcoded in kernel.h. Did you change it here? CUDA-PointPillars/include/kernel.h, lines 28 to 33 in 092affc |
That doesn't sound bad; I'm sincerely looking forward to your results and reply. Thanks. |
Hi @rjwb1,
Yes, I correctly copied params.h to the C++ side and checked that it loaded correctly in the C++ code. I also updated kernel.h a little to remove the hardcoded params.h-dependent parameters; kernel.h looks like this in my code:
    const int THREADS_FOR_VOXEL = 256;   // threads number for a block
    const int POINTS_PER_VOXEL = Params::max_num_points_per_pillar;  // depends on "params.h"
    const int WARP_SIZE = 32;            // one warp (32 threads) for one pillar
    const int WARPS_PER_BLOCK = 4;       // four warps for one block
    const int FEATURES_SIZE = 10;        // number of feature maps, depends on "params.h"
    const int PILLARS_PER_BLOCK = 64;    // one thread deals with one pillar and a block has PILLARS_PER_BLOCK threads
    const int PILLAR_FEATURE_SIZE = Params::num_feature_scatter;     // feature count for one pillar, depends on "params.h"
And I changed max_num_points_per_pillar and num_feature_scatter to static const in params.h. Considering that the MAX_POINTS_PER_VOXEL parameter is used in the preprocessing part, I suspect something there might be causing the problem while preparing the data to feed to the model. |
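For anyone reproducing the static const change, here is a hypothetical generation sketch (not the actual code above; the scatter channel count of 64 is an assumption from the standard PointPillars config) showing how exporter.py could emit those two fields so kernel.h never needs a hard-coded copy:

    # Values taken from the training cfg / model at export time
    max_points_per_voxel = 32   # MAX_POINTS_PER_VOXEL from the cfg
    num_bev_features = 64       # assumption: scatter feature channels of the model

    params_snippet = (
        f"    static const int max_num_points_per_pillar = {max_points_per_voxel};\n"
        f"    static const int num_feature_scatter = {num_bev_features};\n"
    )
    print(params_snippet)  # have exporter.py write these lines into the generated params.h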
Hi @rjwb1, I also followed your forked repository and this PR #77, but it does not show the same result compared to my PyTorch (*.pth) inference. Here's my overall procedure:
1. Train my custom model with a custom dataset
   INPUT_RANGE: [-80, -80, -10, 80, 80, 10] (square!)
   VOXEL_SIZE: [0.4, 0.4, 20]
2. Convert my custom model *.pth into *.onnx with exporter.py
3. Change include/param.h: apply the newly created param.h from step 2
5. Modify the hardcoded value in kernel.h (POINTS_PER_VOXEL)
6. Build and infer, visualize with open3d
Here's the result of PyTorch + ROS inference, and here's the result of CUDA-PointPillars. Can you give me some advice? Also, have you wrapped this package into ROS? |
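One sanity check for a setup like the one above: the BEV grid implied by INPUT_RANGE and VOXEL_SIZE must match what ends up in the generated param.h, otherwise the scatter step places pillars at the wrong coordinates. With the numbers quoted above (a rough check, not code from the repo):

    # INPUT_RANGE: [-80, -80, -10, 80, 80, 10], VOXEL_SIZE: [0.4, 0.4, 20]
    x_min, y_min, x_max, y_max = -80.0, -80.0, 80.0, 80.0
    vx, vy = 0.4, 0.4
    grid_x = int(round((x_max - x_min) / vx))  # 400
    grid_y = int(round((y_max - y_min) / vy))  # 400
    print(grid_x, grid_y)  # both 400 here; the grid size in param.h should agree

Since this range is square, the X/Y ordering issue discussed earlier would not show up here, so a mismatch would have to come from elsewhere (for example, the hard-coded kernel.h values discussed above).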
I also had the same problem with the CUDA version, and I need to work with ROS too. If you have any ways or ideas to solve it, please contact me. Thanks a lot. |
Correctly applies params from the model cfg to the onnx exporter