Allocations for a large mesh are reaching 1GB of allocation to generate nav mesh. #61

Closed
Doprez opened this issue Mar 4, 2024 · 7 comments

Contributor

Doprez commented Mar 4, 2024

I was messing around with the library and wanted to test a larger scene for an RTS example, using a default cube in Blender that I scaled to 500x1x500.

I definitely expected a regression in performance, but are these allocations correct, or is there a known issue with the navmesh generation?

I collected some stack traces of the memory allocations with dotMemory:
Singles being allocated, mostly from DividePoly.

System.Single[]
  Objects : n/a
  Bytes   : 1603298776

>99.9%  DividePoly • 1.49 GB / 1.49 GB • DotRecast.Recast.RcRasterizations.DividePoly(Single[], Int32, Int32, Int32, Int32, Int32, Int32, Single, Int32)
  >99.9%  RasterizeTri • 1.49 GB / - • DotRecast.Recast.RcRasterizations.RasterizeTri(Single[], Int32, Int32, Int32, Int32, RcHeightfield, RcVec3f, RcVec3f, Single, Single, Single, Int32)
    >99.9%  RasterizeTriangles • 1.49 GB / - • DotRecast.Recast.RcRasterizations.RasterizeTriangles(RcContext, Single[], Int32[], Int32[], Int32, RcHeightfield, Int32)
      >99.9%  BuildSolidHeightfield • 1.49 GB / - • DotRecast.Recast.RcVoxelizations.BuildSolidHeightfield(RcContext, IInputGeomProvider, RcBuilderConfig)
        >99.9%  Build • 1.49 GB / - • DotRecast.Recast.RcBuilder.Build(IInputGeomProvider, RcBuilderConfig)
          >99.9%  BuildRecastResult • 1.49 GB / - • DotRecast.Recast.Toolset.Builder.SoloNavMeshBuilder.BuildRecastResult(DemoInputGeomProvider, RcConfig)
            >99.9%  Build • 1.49 GB / - • DotRecast.Recast.Toolset.Builder.SoloNavMeshBuilder.Build(DemoInputGeomProvider, RcPartition, Single, Single, Single, Single, Single, Single, Int32, Int32, Single, Single, Int32, Single, Single, Boolean, Boolean, Boolean)
              >99.9%  Build • 1.49 GB / - • DotRecast.Recast.Toolset.Builder.SoloNavMeshBuilder.Build(DemoInputGeomProvider, RcNavMeshBuildSettings)
                >99.9%  OnNavMeshBuildBegan • 1.49 GB / - • DotRecast.Recast.Demo.RecastDemo.OnNavMeshBuildBegan(NavMeshBuildBeganEvent)
                  >99.9%  OnMessage • 1.49 GB / - • DotRecast.Recast.Demo.RecastDemo.OnMessage(IRecastDemoMessage)
                    >99.9%  OnWindowUpdate • 1.49 GB / - • DotRecast.Recast.Demo.RecastDemo.OnWindowUpdate(Double)
                      >99.9%  <Run>b__0 • 1.49 GB / - • Silk.NET.Windowing.WindowExtensions+<>c__DisplayClass2_0.<Run>b__0()
                        >99.9%  Run • 1.49 GB / - • Silk.NET.Windowing.Internals.ViewImplementationBase.Run(Action)
                          >99.9%  Run • 1.49 GB / - • Silk.NET.Windowing.WindowExtensions.Run(IView)
                            >99.9%  Run • 1.49 GB / - • DotRecast.Recast.Demo.RecastDemo.Run()
                              >99.9%  StartDemo • 1.49 GB / - • DotRecast.Recast.Demo.Program.StartDemo()
                                >99.9%  Main • 1.49 GB / - • DotRecast.Recast.Demo.Program.Main(String[])
                                  ► >99.9%  [AllThreadsRoot] • 1.49 GB / - • [AllThreadsRoot]

#stacktrace
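
To put the Single[] figure in perspective, here is a rough back-of-the-envelope sketch. It assumes a 64-bit .NET runtime (24 bytes of overhead per array) and that every array counted is the 12-element scratch buffer allocated inside DividePoly; neither assumption is confirmed by the trace, and `AllocationEstimate` is a hypothetical helper, not part of DotRecast.

```csharp
public static class AllocationEstimate
{
    // Assumption: on a 64-bit .NET runtime a single-dimensional array carries
    // 24 bytes of overhead (object header, type pointer, length plus padding).
    public const long ArrayOverheadBytes = 24;

    // Size in bytes of one float[length] under the assumption above.
    public static long FloatArrayBytes(int length) => ArrayOverheadBytes + 4L * length;

    // Number of whole float[length] arrays that fit in totalBytes.
    public static long EstimateArrayCount(long totalBytes, int length) =>
        totalBytes / FloatArrayBytes(length);
}
```

Under those assumptions, `AllocationEstimate.EstimateArrayCount(1603298776, 12)` comes out at about 22 million short-lived arrays of 72 bytes each, which is why a tiny per-call allocation can add up to well over a gigabyte.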

RcSpans being allocated by AddSpan

DotRecast.Recast.RcSpan
  Objects : n/a
  Bytes   : 890084272

 100%  AddSpan • 848.85 MB / 848.85 MB • DotRecast.Recast.RcRasterizations.AddSpan(RcHeightfield, Int32, Int32, Int32, Int32, Int32, Int32)
   100%  RasterizeTri • 848.85 MB / - • DotRecast.Recast.RcRasterizations.RasterizeTri(Single[], Int32, Int32, Int32, Int32, RcHeightfield, RcVec3f, RcVec3f, Single, Single, Single, Int32)
     100%  RasterizeTriangles • 848.85 MB / - • DotRecast.Recast.RcRasterizations.RasterizeTriangles(RcContext, Single[], Int32[], Int32[], Int32, RcHeightfield, Int32)
       100%  BuildSolidHeightfield • 848.85 MB / - • DotRecast.Recast.RcVoxelizations.BuildSolidHeightfield(RcContext, IInputGeomProvider, RcBuilderConfig)
         100%  Build • 848.85 MB / - • DotRecast.Recast.RcBuilder.Build(IInputGeomProvider, RcBuilderConfig)
           100%  BuildRecastResult • 848.85 MB / - • DotRecast.Recast.Toolset.Builder.SoloNavMeshBuilder.BuildRecastResult(DemoInputGeomProvider, RcConfig)
             100%  Build • 848.85 MB / - • DotRecast.Recast.Toolset.Builder.SoloNavMeshBuilder.Build(DemoInputGeomProvider, RcPartition, Single, Single, Single, Single, Single, Single, Int32, Int32, Single, Single, Int32, Single, Single, Boolean, Boolean, Boolean)
               100%  Build • 848.85 MB / - • DotRecast.Recast.Toolset.Builder.SoloNavMeshBuilder.Build(DemoInputGeomProvider, RcNavMeshBuildSettings)
                 100%  OnNavMeshBuildBegan • 848.85 MB / - • DotRecast.Recast.Demo.RecastDemo.OnNavMeshBuildBegan(NavMeshBuildBeganEvent)
                   100%  OnMessage • 848.85 MB / - • DotRecast.Recast.Demo.RecastDemo.OnMessage(IRecastDemoMessage)
                     100%  OnWindowUpdate • 848.85 MB / - • DotRecast.Recast.Demo.RecastDemo.OnWindowUpdate(Double)
                       100%  <Run>b__0 • 848.85 MB / - • Silk.NET.Windowing.WindowExtensions+<>c__DisplayClass2_0.<Run>b__0()
                         100%  Run • 848.85 MB / - • Silk.NET.Windowing.Internals.ViewImplementationBase.Run(Action)
                           100%  Run • 848.85 MB / - • Silk.NET.Windowing.WindowExtensions.Run(IView)
                             100%  Run • 848.85 MB / - • DotRecast.Recast.Demo.RecastDemo.Run()
                               100%  StartDemo • 848.85 MB / - • DotRecast.Recast.Demo.Program.StartDemo()
                                 100%  Main • 848.85 MB / - • DotRecast.Recast.Demo.Program.Main(String[])
                                  ►  100%  [AllThreadsRoot] • 848.85 MB / - • [AllThreadsRoot]

#stacktrace

There are a couple more call sites that each allocate a couple hundred MB, but these two seem to be by far the most aggressive.

Contributor Author

Doprez commented Mar 4, 2024

So I think I found at least one solution for the Singles being allocated?

The old stacktrace with 1.5GB allocations:
image

The new stacktrace with 266KB allocations:
image

The change made here was to move the float array to be a static reference instead of making a new float array on each method call.
from:

        /// Divides a convex polygon of max 12 vertices into two convex polygons
        /// across a separating axis.
        /// 
        /// @param[in]	inVerts			The input polygon vertices
        /// @param[in]	inVertsCount	The number of input polygon vertices
        /// @param[out]	outVerts1		Resulting polygon 1's vertices
        /// @param[out]	outVerts1Count	The number of resulting polygon 1 vertices
        /// @param[out]	outVerts2		Resulting polygon 2's vertices
        /// @param[out]	outVerts2Count	The number of resulting polygon 2 vertices
        /// @param[in]	axisOffset		The offset along the specified axis
        /// @param[in]	axis			The separating axis
        private static void DividePoly(float[] inVerts, int inVertsOffset, int inVertsCount,
            int outVerts1, out int outVerts1Count,
            int outVerts2, out int outVerts2Count,
            float axisOffset, int axis)
        {
            // How far positive or negative away from the separating axis is each vertex.
            float[] inVertAxisDelta =  new float[12];
            for (int inVert = 0; inVert < inVertsCount; ++inVert)
            {
                inVertAxisDelta[inVert] = axisOffset - inVerts[inVertsOffset + inVert * 3 + axis];
            }
....

to:

        private static readonly float[] _inVertAxisDelta = new float[12];

        /// Divides a convex polygon of max 12 vertices into two convex polygons
        /// across a separating axis.
        /// 
        /// @param[in]	inVerts			The input polygon vertices
        /// @param[in]	inVertsCount	The number of input polygon vertices
        /// @param[out]	outVerts1		Resulting polygon 1's vertices
        /// @param[out]	outVerts1Count	The number of resulting polygon 1 vertices
        /// @param[out]	outVerts2		Resulting polygon 2's vertices
        /// @param[out]	outVerts2Count	The number of resulting polygon 2 vertices
        /// @param[in]	axisOffset		The offset along the specified axis
        /// @param[in]	axis			The separating axis
        private static void DividePoly(float[] inVerts, int inVertsOffset, int inVertsCount,
            int outVerts1, out int outVerts1Count,
            int outVerts2, out int outVerts2Count,
            float axisOffset, int axis)
        {
            // How far positive or negative away from the separating axis is each vertex.
            float[] inVertAxisDelta = _inVertAxisDelta;
            for (int inVert = 0; inVert < inVertsCount; ++inVert)
            {
                inVertAxisDelta[inVert] = axisOffset - inVerts[inVertsOffset + inVert * 3 + axis];
            }
....

I'm not sure if this is a proper solution, but the allocation numbers are clearly better at least. The only issue I found is that the performance test fails with this change, so a bit more investigation is needed here.
image
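
The thread does not say why the test fails with the static array. One possibility, offered as an assumption rather than a diagnosis, is that a single static buffer is shared by every thread, so concurrent builds would overwrite each other's scratch values (a later trace in this thread does show rasterization running on thread-pool workers via BuildMultiThreadAsync). The sketch below only illustrates the difference between a shared static buffer and a per-thread one; `ScratchBuffers` and `ScratchBufferDemo` are hypothetical names, not DotRecast code.

```csharp
using System;
using System.Threading;

public static class ScratchBuffers
{
    // One array for the whole process: every thread sees the same instance.
    private static readonly float[] Shared = new float[12];

    // One array per thread. It is created lazily because a field initializer
    // on a [ThreadStatic] field would only run for the first thread.
    [ThreadStatic] private static float[] _perThread;

    public static float[] GetShared() => Shared;

    public static float[] GetPerThread() => _perThread ??= new float[12];
}

public static class ScratchBufferDemo
{
    // Runs the accessor on a dedicated thread and returns the array it saw.
    public static float[] CaptureOnThread(Func<float[]> accessor)
    {
        var box = new float[1][];
        var thread = new Thread(() => box[0] = accessor());
        thread.Start();
        thread.Join();
        return box[0];
    }
}
```

Calling `CaptureOnThread(ScratchBuffers.GetShared)` twice returns the same array instance, while calling it twice with `ScratchBuffers.GetPerThread` returns two different instances, so per-thread scratch data cannot be clobbered by another thread.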

Contributor Author

Doprez commented Mar 4, 2024

Switching to ArrayPool instead of the static reference seems to fix the test issue, and the allocations look just as good as far as I can tell.
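
For readers unfamiliar with the pattern, a minimal sketch of renting a scratch array from `System.Buffers.ArrayPool<float>` might look like the following. This is not the change that was merged into DotRecast; `PooledScratch` and `CountOnOrBeforeOffset` are hypothetical names that only mimic the first loop of DividePoly.

```csharp
using System;
using System.Buffers;

public static class PooledScratch
{
    // Counts the vertices whose coordinate on the given axis is at or before
    // axisOffset, using a rented scratch array instead of `new float[12]`.
    // Assumes vertCount <= 12, matching the documented polygon limit.
    public static int CountOnOrBeforeOffset(float[] verts, int vertCount, float axisOffset, int axis)
    {
        // Rent may return a larger array that still holds stale data,
        // so only indices written below are ever read.
        float[] delta = ArrayPool<float>.Shared.Rent(12);
        try
        {
            for (int i = 0; i < vertCount; ++i)
            {
                delta[i] = axisOffset - verts[i * 3 + axis];
            }

            int count = 0;
            for (int i = 0; i < vertCount; ++i)
            {
                if (delta[i] >= 0) count++;
            }

            return count;
        }
        finally
        {
            // Always hand the array back, even if an exception is thrown.
            ArrayPool<float>.Shared.Return(delta);
        }
    }
}
```

Because each call rents its own array, two threads never share scratch data, and the pool reuses returned arrays so steady-state allocations stay low.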

The next thing to look into is the RcSpan allocations. I can't seem to apply the same fix there, so I will need something different for that issue.

Owner

ikpil commented Mar 5, 2024

@Doprez Hello!

I wanted to let you know that we have received PRs regarding this matter. One has been merged, and the other is currently under review. I've been quite busy with work lately and haven't been able to pay much attention, but I'll try to make the necessary adjustments as soon as possible.

Could you please send me the test obj model that I can review?

Owner

ikpil commented Mar 5, 2024

#62
#63

Contributor Author

Doprez commented Mar 5, 2024

> @Doprez Hello!
>
> I wanted to inform you that we received a PR regarding this matter. One has been merged, and the other one is currently under review. I've been quite busy with work lately and haven't been able to pay much attention, but I'll try to make the necessary adjustments as soon as possible.
>
> Could you please send me the test obj model that I can review?

I added the cube to a repo here: https://github.com/Doprez/RandomFileShare/blob/main/500x500-cube.obj

It's the default cube from Blender scaled up. I also tried with your large terrain example and the same issue is there, but with fewer allocations. I am wondering if more verts help in some odd way, but I haven't been able to properly test that thought.

If I have some time after work today I will try to make a PR for the Single/float allocations. I was happy to see someone took care of the RcSpan because that one was stumping me a bit lol.

edit: never mind about my PR, that person beat me to it lol

ikpil added a commit that referenced this issue Apr 20, 2024
Owner

ikpil commented Apr 20, 2024

@Doprez
I've resolved the issue you reported!

50ea674 - used option keepInterResults to save memory

Contributor Author

Doprez commented Apr 20, 2024

Oh wow, after some quick testing with my previous project the difference is huge!

System.Single[]
  Objects : n/a
  Bytes   : 534944

 20.6%  DividePoly • 107.7 KB / 107.7 KB • DotRecast.Recast.RcRasterizations.DividePoly(Single[], Int32, Int32, Int32, Int32, Int32, Int32, Single, Int32)
   20.6%  RasterizeTriangles • 107.7 KB / - • DotRecast.Recast.RcRasterizations.RasterizeTriangles(RcContext, Single[], Int32[], Int32[], Int32, RcHeightfield, Int32)
     20.6%  BuildSolidHeightfield • 107.7 KB / - • DotRecast.Recast.RcVoxelizations.BuildSolidHeightfield(RcContext, IInputGeomProvider, RcBuilderConfig)
       20.6%  <BuildMultiThreadAsync>b__0 • 107.7 KB / - • DotRecast.Recast.RcBuilder+<>c__DisplayClass6_1.<BuildMultiThreadAsync>b__0()
         20.6%  RunFromThreadPoolDispatchLoop • 107.7 KB / - • System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread, ExecutionContext, ContextCallback, Object)
           20.6%  ExecuteWithThreadLocal • 107.7 KB / - • System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task, Thread)
             20.6%  Dispatch • 107.7 KB / - • System.Threading.ThreadPoolWorkQueue.Dispatch()
               20.6%  WorkerThreadStart • 107.7 KB / - • System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart()
                ►  20.6%  [AllThreadsRoot] • 107.7 KB / - • [AllThreadsRoot]

#stacktrace

I don't even see AddSpan in the allocations list anymore, and the total allocations are 147MB compared to almost 900MB before.

I think this is good to close for the initially reported issue. I'll open a new one if something comes up. Thank you!

Doprez closed this as completed Apr 20, 2024