Manifold Performance #383
Replies: 12 comments 35 replies
-
@ochafik made this excellent performance comparison of OpenSCAD running Manifold compared to CGAL's fast-csg, finding a 5-30x improvement. And fast-csg is itself a vast improvement over CGAL's Nef routines (baseline OpenSCAD), showing its own 30-150x improvement. This validates that the 1,000x improvement shown above was not a unique case. |
Beta Was this translation helpful? Give feedback.
-
There is also this lovely analysis comparing Manifold to JSCAD. Some quotes:
|
Beta Was this translation helpful? Give feedback.
-
Some of my previous Issues/PRs regarding performance:
|
Beta Was this translation helpful? Give feedback.
-
Ideas about further optimization:
|
Beta Was this translation helpful? Give feedback.
-
Just to report on the hashtable changes: I tried implementing SparseMap with Hashtable for boolean3, and sorts are mostly removed (except for the sort for winding numbers). The problem is that Hashtable is really slow comparing with our previous array approach (basically double the time). I tried various approaches including
While the time for Intersect12 is significantly faster, the time required for hashtable insert is just too slow. There are several reasons:
And I think these issues are quite fundamental for hashtable, so I don't think there is a way out of this. We probably need to stay with the existing method. |
Beta Was this translation helpful? Give feedback.
-
I've been asking performance questions that you may find a bit odd... like why would I think simple ops take a while. In OpenSCAD:$fn = 100;
module rounded_rect(r, dim) {
hull() {
translate([r,r]) circle(r);
translate([dim.x-r,r]) circle(r);
translate([r,dim.y-r,r]) circle(r);
translate([dim.x-r,dim.y-r]) circle(r);
}
}
module rounded_slice(r, dim, h) {
linear_extrude(h) rounded_rect(r, dim);
}
module rounded_slices(r, dim) {
h = 0.1;
union() {
for(iota = [0.0: 0.1: 2.0]) {
translate([iota, iota, iota]) rounded_slice(r-iota,[dim.x-2*iota, dim.y-2*iota], h);
}
}
}
rounded_slices(5, [10, 20]); In Python:import time
import manifold3d as m
def rounded_rect(r, dim):
x, y = dim
circ = m.CrossSection.circle(r, 100)
return m.CrossSection.batch_hull(
[
circ.translate((0, 0)),
circ.translate((x-r, 0)),
circ.translate((0, y-r)),
circ.translate((x-r, y-r)),
]
)
def rounded_slice(r, dim, h):
return m.Manifold.extrude(rounded_rect(r, dim), h)
def rounded_slices(r, dim):
x, y = dim
h = 0.1
l=[]
iota = 0.0
while iota < 2.09:
l.append(
rounded_slice(r-iota, (x-2*iota, y-2*iota), h).
translate([iota, iota, iota])
)
iota += 0.1
return m.Manifold.batch_boolean(l, m.OpType.Add)
ts = time.perf_counter()
rs =rounded_slices(5, (10, 20))
rs.num_vert() # Make sure all is evaluated.
te = time.perf_counter()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "Created round_slices created.")
import piecad
piecad.view(piecad.Obj3d(rs)) |
Beta Was this translation helpful? Give feedback.
-
I ran several timing tests to determine several things.
2D timing tests
import time
import manifold3d as m
from random import random
ts = time.perf_counter()
for i in range(1000000):
c = m.CrossSection.square((10.0, 10.0))
c.num_vert()
te = time.perf_counter()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "1,000,000 creations of CS with square member function")
p = c.to_polygons()
ts = time.perf_counter()
for i in range(1000000):
c = m.CrossSection(p)
c.num_vert()
te = time.perf_counter()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "1,000,000 creations of CS with points from square")
r = 5
s = 36
ts = time.perf_counter()
for i in range(100000):
circ = m.CrossSection.circle(r, s)
c = m.CrossSection.batch_hull(
[
circ,
circ.translate((10, 0)),
circ.translate((0, 10)),
circ.translate((10, 10)),
]
)
c.num_vert()
te = time.perf_counter()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "100,000 creations of roundrect using hull and 4 circles")
l = []
ts = time.perf_counter()
for i in range(1000):
c1 = c.translate((random() * 200, random() * 200))
c1.num_vert()
l.append(c1)
te = time.perf_counter()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "Creation of 1,000 randomly placed roundrect")
c1 = m.CrossSection()
ts = time.perf_counter()
for i in range(1000):
c1 = m.CrossSection.__add__(c1, l[i])
c1.num_vert()
te = time.perf_counter()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "1000 unions of 1000 randomly placed roundrect")
ts = time.perf_counter()
c = m.CrossSection.batch_boolean(l, m.OpType.Add)
c.num_vert()
te = time.perf_counter()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "batch_boolean union of 1,000 randomly placed roundrect") 3D timing tests
import time
import manifold3d as m
from random import random
import numpy as np
ts = time.time()
for i in range(100000):
c = m.Manifold.cube((10.0, 10.0, 10.0))
c.num_vert()
te = time.time()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "100,000 creations of M with cube member function")
mesh = c.to_mesh()
tv=np.array(mesh.tri_verts.tolist().copy(), np.int32)
vp=np.array(mesh.vert_properties.tolist().copy(), np.float32)
ts = time.time()
for i in range(100000):
mesh = m.Mesh(tri_verts=tv, vert_properties=vp)
c = m.Manifold(mesh)
c.num_vert()
te = time.time()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "100,000 creations of M with mesh from cube")
def cube(dim):
x, y, z = dim
c = m.CrossSection.square([x, y])
return m.Manifold.extrude(c, z)
ts = time.time()
for i in range(100000):
c = cube((10.0, 10.0, 10.0))
c.num_vert()
te = time.time()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "100,000 creations of M with python function")
r = 5
s = 36
ts = time.time()
for i in range(1000):
sph = m.Manifold.sphere(r, s)
c = m.Manifold.batch_hull(
[
sph,
sph.translate((10, 0, 0)),
sph.translate((0, 10, 0)),
sph.translate((10, 10, 0)),
sph.translate((0, 0, 10)),
sph.translate((10, 0, 10)),
sph.translate((0, 10, 10)),
sph.translate((10, 10, 10)),
]
)
c.num_vert()
te = time.time()
tdiff = te - ts
print(f"{tdiff: 5.3f}", "1,000 creations of roundedbox using hull and 8 spheres")
n = 1000
l = []
ts = time.time()
for i in range(n):
c1 = c.translate((random() * 200, random() * 200, random() * 200))
c1.num_vert()
l.append(c1)
te = time.time()
tdiff = te - ts
print(f"{tdiff: 5.3f}", f"Creation of {n} randomly placed roundedbox")
c1 = m.Manifold()
ts = time.time()
for i in range(n):
c1 = m.Manifold.__add__(c1, l[i])
c1.num_vert()
te = time.time()
tdiff = te - ts
print(f"{tdiff: 5.3f}", f"{n} unions of {n} randomly placed roundedbox")
ts = time.time()
c = m.Manifold.batch_boolean(l, m.OpType.Add)
c.num_vert()
te = time.time()
tdiff = te - ts
print(f"{tdiff: 5.3f}", f"batch_boolean union of {n} randomly placed roundedbox") |
Beta Was this translation helpful? Give feedback.
-
I have no real world performance numbers here, but just and idea. Would it be beneficial and generally useful for users using extrude to provide the 2d shape, and already triangulated mesh for it instead of relying on manifold to triangulate it ? |
Beta Was this translation helpful? Give feedback.
-
Is there a generally agreed-upon definition of Performance? To most it means execution speed. That's fine, but speed at the expense of robustness/reliability is no gain. Maybe I missed it in my readings, but there have been precious few discussions on reliability/correctness of Manifold's Boolean output. SOTA Boolean libraries with a legacy of many decades are still struggling with this general issue. ParaSolids from Siemens is one. PTC and Dassault Systemes have their own. But, I believe the issue remains. Which explains my curiosity on how/if Manifold addresses this. |
Beta Was this translation helpful? Give feedback.
-
You mentioned "symbolic perturbation" (of vertices, I suppose). |
Beta Was this translation helpful? Give feedback.
-
Stumbled into this libary recently: https://meshlib.meshinspector.com/ Have you folks checked out their 3D Booleans? |
Beta Was this translation helpful? Give feedback.
-
Beta Was this translation helpful? Give feedback.
-
As a partner to #340, I'm starting this discussion as a place to share real-world performance insights on Manifold. I feel a discussion is a better form since performance varies widely based on the chosen benchmark, the parallel architecture, and what library it's compared against, not to mention the fact that our performance is improving over time.
To start, there's the first performance analysis I did, comparing Manifold to baseline OpenSCAD (CGAL), though Manifold has been optimized significantly since then (and likely OpenSCAD has as well). Overall takeaway was Manifold being 100 - 1,000 times faster.
Please post links to any performance analysis you've done and give a top-level summary here. I'm hoping we can build a more complete picture and also get a sense of what optimizations we should focus on going forward.
Beta Was this translation helpful? Give feedback.
All reactions