perf(autoware_detected_object_validation): reduce lanelet_filter processing time #8240



@a-maumau a-maumau commented Jul 29, 2024

Description

This PR speeds up the lanelet filtering process by building a local R-tree from the intersected lanelets and searching it, which reduces the number of loop iterations needed per object.

Profiling the lanelet filter node with TimeKeeper showed that the main bottleneck was the for-loop that filters objects without a bounding box (unknown objects).
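The idea of searching a small spatial index instead of looping over every lanelet can be sketched as follows. This is not the code from this PR: the PR relies on an R-tree (the review mentions `boost::geometry`), whereas this dependency-free stand-in uses a two-level packed bounding-box index that applies the same pruning principle. `Box`, `PackedBoxIndex`, and `node_capacity` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Axis-aligned bounding box (hypothetical type for this sketch).
struct Box
{
  double min_x, min_y, max_x, max_y;
};

inline bool intersects(const Box & a, const Box & b)
{
  return a.min_x <= b.max_x && b.min_x <= a.max_x && a.min_y <= b.max_y && b.min_y <= a.max_y;
}

inline Box merge(const Box & a, const Box & b)
{
  return Box{
    std::min(a.min_x, b.min_x), std::min(a.min_y, b.min_y), std::max(a.max_x, b.max_x),
    std::max(a.max_y, b.max_y)};
}

// Two-level packed index: entries are grouped into nodes, and each node keeps
// the bounding box of its entries so a whole group can be skipped in a query.
class PackedBoxIndex
{
public:
  explicit PackedBoxIndex(const std::vector<Box> & boxes, std::size_t node_capacity = 4)
  {
    const std::size_t capacity = std::max<std::size_t>(node_capacity, 1);
    // Sort entry ids by box centre x so that each node covers a compact x-range.
    std::vector<std::size_t> order(boxes.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(), [&boxes](std::size_t l, std::size_t r) {
      return boxes[l].min_x + boxes[l].max_x < boxes[r].min_x + boxes[r].max_x;
    });
    for (std::size_t begin = 0; begin < order.size(); begin += capacity) {
      const std::size_t end = std::min(begin + capacity, order.size());
      Node node{boxes[order[begin]], {}};
      for (std::size_t k = begin; k < end; ++k) {
        node.bounds = merge(node.bounds, boxes[order[k]]);
        node.entries.push_back(Entry{boxes[order[k]], order[k]});
      }
      nodes_.push_back(node);
    }
  }

  // Returns the ids (positions in the input vector) of all boxes that
  // intersect `area`, in ascending order.
  std::vector<std::size_t> query(const Box & area) const
  {
    std::vector<std::size_t> hits;
    for (const Node & node : nodes_) {
      if (!intersects(node.bounds, area)) continue;  // prune the whole group
      for (const Entry & entry : node.entries) {
        if (intersects(entry.box, area)) hits.push_back(entry.id);
      }
    }
    std::sort(hits.begin(), hits.end());
    return hits;
  }

private:
  struct Entry
  {
    Box box;
    std::size_t id;
  };
  struct Node
  {
    Box bounds;
    std::vector<Entry> entries;
  };
  std::vector<Node> nodes_;
};
```

In a filter like the one described here, the index would be built once per callback from the bounding boxes of the intersected lanelets, and each unknown object would then run its exact overlap test only against the ids returned by `query`. A real R-tree nests this grouping over several levels, so the pruning scales better when there are many lanelets.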

processing times

Example of a worst case.

  • The following values come from a different run than the one shown in the figure.

`no bbox filter` is the part that takes most of the time.

Before this PR

objectCallback: 136.19 [ms]
    ├── getConvexHull: 0.35 [ms]
    ├── getIntersectedLanelets: 1.62 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 2.24 [ms]
    │   └── isObjectOverlapLanelets: 2.24 [ms]
    │       └── no bbox filter: 2.24 [ms]
    ├── filterObject: 1.77 [ms]
    │   └── isObjectOverlapLanelets: 1.68 [ms]
    │       └── no bbox filter: 1.68 [ms]
    ├── filterObject: 1.60 [ms]
    │   └── isObjectOverlapLanelets: 1.60 [ms]
    │       └── no bbox filter: 1.60 [ms]
    ├── filterObject: 0.75 [ms]
    │   └── isObjectOverlapLanelets: 0.74 [ms]
    │       └── no bbox filter: 0.74 [ms]
    ├── filterObject: 1.11 [ms]
    │   └── isObjectOverlapLanelets: 1.11 [ms]
    │       └── no bbox filter: 1.11 [ms]
    ├── filterObject: 3.11 [ms]
    │   └── isObjectOverlapLanelets: 3.11 [ms]
    │       └── no bbox filter: 3.10 [ms]
    ├── filterObject: 1.29 [ms]
    │   └── isObjectOverlapLanelets: 1.29 [ms]
    │       └── no bbox filter: 1.28 [ms]
    ├── filterObject: 2.51 [ms]
    │   └── isObjectOverlapLanelets: 2.51 [ms]
    │       └── no bbox filter: 2.51 [ms]
    ├── filterObject: 1.55 [ms]
    │   └── isObjectOverlapLanelets: 1.55 [ms]
    │       └── no bbox filter: 1.55 [ms]
    ├── filterObject: 1.11 [ms]
    │   └── isObjectOverlapLanelets: 1.10 [ms]
    │       └── no bbox filter: 1.10 [ms]
    ├── filterObject: 0.54 [ms]
    │   └── isObjectOverlapLanelets: 0.54 [ms]
    │       └── no bbox filter: 0.54 [ms]
    ├── filterObject: 2.34 [ms]
    │   └── isObjectOverlapLanelets: 2.34 [ms]
    │       └── no bbox filter: 2.33 [ms]
   	...

After this PR

objectCallback: 12.90 [ms]
    ├── getConvexHull: 0.09 [ms]
    ├── getIntersectedLanelets: 0.32 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.00 [ms]
    ├── filterObject: 0.22 [ms]
    │   └── no bbox filter: pre process: 0.21 [ms]
    │       └── no bbox filter: main loop: 0.21 [ms]
    ├── filterObject: 0.58 [ms]
    │   └── no bbox filter: pre process: 0.57 [ms]
    │       └── no bbox filter: main loop: 0.57 [ms]
    ├── filterObject: 0.19 [ms]
    │   └── no bbox filter: pre process: 0.18 [ms]
    │       └── no bbox filter: main loop: 0.18 [ms]
    ├── filterObject: 0.01 [ms]
    │   └── no bbox filter: pre process: 0.01 [ms]
    │       └── no bbox filter: main loop: 0.00 [ms]
    ├── filterObject: 0.23 [ms]
    │   └── no bbox filter: pre process: 0.22 [ms]
    │       └── no bbox filter: main loop: 0.22 [ms]
    ├── filterObject: 0.20 [ms]
    │   └── no bbox filter: pre process: 0.20 [ms]
    │       └── no bbox filter: main loop: 0.20 [ms]
    ├── filterObject: 0.23 [ms]
    │   └── no bbox filter: pre process: 0.23 [ms]
    │       └── no bbox filter: main loop: 0.22 [ms]
    ├── filterObject: 0.18 [ms]
    │   └── no bbox filter: pre process: 0.17 [ms]
    │       └── no bbox filter: main loop: 0.17 [ms]
    ├── filterObject: 0.20 [ms]
    │   └── no bbox filter: pre process: 0.19 [ms]
    │       └── no bbox filter: main loop: 0.19 [ms]
    ├── filterObject: 0.12 [ms]
    │   └── no bbox filter: pre process: 0.12 [ms]
    │       └── no bbox filter: main loop: 0.12 [ms]
    ├── filterObject: 0.53 [ms]
    │   └── no bbox filter: pre process: 0.53 [ms]
    │       └── no bbox filter: main loop: 0.52 [ms]
    ├── filterObject: 0.17 [ms]
    │   └── no bbox filter: pre process: 0.16 [ms]
    │       └── no bbox filter: main loop: 0.16 [ms]
    ├── filterObject: 0.16 [ms]
    │   └── no bbox filter: pre process: 0.15 [ms]
    │       └── no bbox filter: main loop: 0.15 [ms]
    ├── filterObject: 0.05 [ms]
    │   └── no bbox filter: pre process: 0.04 [ms]
    │       └── no bbox filter: main loop: 0.04 [ms]
    ├── filterObject: 0.00 [ms]
    │   └── no bbox filter: pre process: 0.00 [ms]
    │       └── no bbox filter: main loop: 0.00 [ms]
    ├── filterObject: 0.13 [ms]
    │   └── no bbox filter: pre process: 0.13 [ms]
    │       └── no bbox filter: main loop: 0.13 [ms]
    ...

`no bbox filter: pre process` measures from the R-tree search to the end.
`no bbox filter: main loop` measures from the for-loop check to the end.
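The nested timings in the trees above come from scoped measurements, where an inner scope is always contained in its parent. A minimal sketch of that pattern is shown below. It is not the TimeKeeper API from `autoware_universe_utils`; `TimingLog`, `ScopedTimer`, and `TimingRecord` are hypothetical names used only to illustrate RAII-style scope timing.

```cpp
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// One finished measurement: scope name, nesting depth, elapsed milliseconds.
struct TimingRecord
{
  std::string name;
  int depth;
  double elapsed_ms;
};

// Collects records from nested ScopedTimer objects (hypothetical helper).
class TimingLog
{
public:
  // Called when a scope starts; returns the depth of the new scope.
  int enter() { return depth_++; }

  // Called when a scope ends; records are appended in order of scope exit,
  // so an inner scope appears before the scope that contains it.
  void leave(std::string name, int depth, double elapsed_ms)
  {
    --depth_;
    records_.push_back(TimingRecord{std::move(name), depth, elapsed_ms});
  }

  const std::vector<TimingRecord> & records() const { return records_; }

private:
  int depth_ = 0;
  std::vector<TimingRecord> records_;
};

// RAII timer: starts on construction and reports to the log on destruction.
class ScopedTimer
{
public:
  ScopedTimer(TimingLog & log, std::string name)
  : log_(log),
    name_(std::move(name)),
    depth_(log.enter()),
    start_(std::chrono::steady_clock::now())
  {
  }

  ~ScopedTimer()
  {
    const std::chrono::duration<double, std::milli> elapsed =
      std::chrono::steady_clock::now() - start_;
    log_.leave(std::move(name_), depth_, elapsed.count());
  }

  ScopedTimer(const ScopedTimer &) = delete;
  ScopedTimer & operator=(const ScopedTimer &) = delete;

private:
  TimingLog & log_;
  std::string name_;
  int depth_;
  std::chrono::steady_clock::time_point start_;
};
```

Because a timer that starts later and stops earlier is fully contained in its parent, a child entry can never report more time than the entry above it, which matches how `main loop` never exceeds `pre process` in the "After this PR" tree.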

VS. previous implementations

compare_processing_time

The time axes are not guaranteed to match the actual timeline, and the runs are not guaranteed to correspond to each other.

original: before #8109
current: after #8109 (before this PR)
optimized: after this PR

"play in x0.5" means the bag was played back at a rate of 0.5.

mean processing time

original mean: 34.401 ms
current mean: 30.980 ms
optimized best comb. mean: 3.899 ms

(The mean processing time was calculated only from records longer than 1 ms.)
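From the reported means, the optimized version is roughly 7.9 times faster than current (30.980 / 3.899) and roughly 8.8 times faster than original (34.401 / 3.899). The sketch below shows one way such a thresholded mean and the resulting ratio could be computed. The function names are hypothetical, and treating the 1 ms threshold as a strict "greater than" is an assumption, since the PR does not state how the boundary is handled.

```cpp
#include <cstddef>
#include <vector>

// Mean of the samples strictly greater than `threshold_ms` (assumed boundary
// handling). Returns 0.0 when no sample qualifies.
inline double mean_above(const std::vector<double> & samples_ms, double threshold_ms)
{
  double sum = 0.0;
  std::size_t count = 0;
  for (const double sample : samples_ms) {
    if (sample > threshold_ms) {
      sum += sample;
      ++count;
    }
  }
  return count == 0 ? 0.0 : sum / static_cast<double>(count);
}

// Ratio of two mean processing times; values above 1.0 mean `after_ms` is faster.
inline double speedup(double before_ms, double after_ms)
{
  return before_ms / after_ms;
}
```

Dropping the short records keeps callbacks with little or no filtering work from pulling the mean toward zero, so the comparison reflects the frames in which filtering actually takes time.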

Related links

Parent Issue:

  • Link

How was this PR tested?

Processing time

Processing time was gathered from TimeKeeper: https://github.com/autowarefoundation/autoware.universe/tree/main/common/autoware_universe_utils

Degradation

Qualitative testing was done by manually comparing the detected and filtered unknown objects against those of the original implementation.

No quantitative evaluation was done.

Notes for reviewers

None.

Interface changes

None.

Effects on system behavior

None.

a-maumau added 2 commits July 29, 2024 16:04
Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

change to _func__

add more debug

use local rtree

fix

tmp update

fix bug

clean unused

clean up

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>
Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>
@github-actions github-actions bot added the component:perception Advanced sensor data processing and environment understanding. (auto-assigned) label Jul 29, 2024

github-actions bot commented Jul 29, 2024

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

@technolojin technolojin self-assigned this Jul 29, 2024
@technolojin technolojin added the run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Jul 29, 2024
@technolojin

TEST under TIER IV cloud


codecov bot commented Jul 29, 2024

Codecov Report

Attention: Patch coverage is 0% with 39 lines in your changes missing coverage. Please review.

Project coverage is 29.24%. Comparing base (6719f5f) to head (d617ffb).
Report is 15 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| ...t_validation/src/lanelet_filter/lanelet_filter.cpp | 0.00% | 39 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8240      +/-   ##
==========================================
- Coverage   29.25%   29.24%   -0.01%     
==========================================
  Files        1600     1600              
  Lines      117744   117775      +31     
  Branches    50716    50723       +7     
==========================================
  Hits        34447    34447              
- Misses      74072    74103      +31     
  Partials     9225     9225              
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| differential | 2.75% <0.00%> (?) | |
| total | 29.26% <ø> (+<0.01%) ⬆️ | Carriedforward from 6719f5f |

*This pull request uses carry forward flags. Click here to find out more.


@technolojin technolojin changed the title from "perf(lanelet_filter): reduce lanelet_filter processing time" to "perf(autoware_detected_object_validation): reduce lanelet_filter processing time" Jul 29, 2024

@technolojin technolojin left a comment


LGTM! Thank you for your contribution.

I have not confirmed whether the Autoware coding guidelines allow an abbreviated namespace alias:
namespace bg = boost::geometry;


@YoshiRi YoshiRi left a comment


LGTM


YoshiRi commented Jul 30, 2024

LGTM! Thank you for your contribution.

I have not confirmed whether the Autoware coding guidelines allow an abbreviated namespace alias: namespace bg = boost::geometry;

This seems to be fine, because some planning code already uses this alias.
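For readers unfamiliar with the construct under discussion, a namespace alias is a compile-time shorthand that introduces no new types. The sketch below uses `std::chrono` instead of `boost::geometry` so that it has no external dependency. The alias name `chr` and the helper `to_milliseconds` are hypothetical.

```cpp
#include <chrono>
#include <type_traits>

// Namespace alias: `chr::X` names exactly the same entity as `std::chrono::X`.
namespace chr = std::chrono;

static_assert(
  std::is_same<chr::nanoseconds, std::chrono::nanoseconds>::value,
  "an alias refers to the original namespace, not a copy of it");

// Converts an integer nanosecond count to fractional milliseconds.
inline double to_milliseconds(chr::nanoseconds elapsed)
{
  return chr::duration<double, std::milli>(elapsed).count();
}
```

Declaring the alias inside a source file or a function keeps the short name local. Placing it in a widely included header would expose it to every file that includes that header, which is usually the concern behind style-guide questions like this one.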

The code changes in `lanelet_filter.cpp` optimize the object filtering process by using the `empty()` function instead of checking the size of the `transformed_objects.objects` vector. This change improves performance and simplifies the code logic.

Refactor the code to use `empty()` instead of `size()` for checking if the `transformed_objects.objects` vector is empty. This change improves readability and performance.

Signed-off-by: Taekjin LEE <taekjin.lee@tier4.jp>
@technolojin technolojin enabled auto-merge (squash) July 30, 2024 02:31
@technolojin technolojin merged commit a1a22a6 into autowarefoundation:main Jul 30, 2024
22 of 24 checks passed
technolojin added a commit to technolojin/autoware.universe that referenced this pull request Jul 30, 2024
…essing time (autowarefoundation#8240)

* add local r-tree for fast searching

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

change to _func__

add more debug

use local rtree

fix

tmp update

fix bug

clean unused

clean up

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

* clean up

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

* style(pre-commit): autofix

* chore: Optimize object filtering and improve performance

The code changes in `lanelet_filter.cpp` optimize the object filtering process by using the `empty()` function instead of checking the size of the `transformed_objects.objects` vector. This change improves performance and simplifies the code logic.

Refactor the code to use `empty()` instead of `size()` for checking if the `transformed_objects.objects` vector is empty. This change improves readability and performance.

Signed-off-by: Taekjin LEE <taekjin.lee@tier4.jp>

---------

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>
Signed-off-by: Taekjin LEE <taekjin.lee@tier4.jp>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Taekjin LEE <taekjin.lee@tier4.jp>
shmpwk pushed a commit to tier4/autoware.universe that referenced this pull request Jul 30, 2024
…essing time (autowarefoundation#8240) (#1433)

@a-maumau a-maumau deleted the mau/perf/perception/lanelet_filter branch July 31, 2024 05:24
esteve pushed a commit to esteve/autoware.universe that referenced this pull request Aug 13, 2024
…essing time (autowarefoundation#8240)
