
Higher auto-refinement limits #285

Merged
merged 37 commits from higher-auto-ref into master on Aug 1, 2024
Conversation

@tzanio (Member) commented Jun 20, 2024

This is meant to address the issue where small meshes are curved but refined versions of them appear to lose the curvature in GLVis. See also this task in #284

Command-line option to ensure a minimal level of refinement so large meshes don't show as linear by default (ping: @dylan-copeland, @psocratis)?

The reasons for that behavior are the auto-refinement limits, which previously were:

  • at most 16 refinement levels (auto_ref_max = 16)
  • at most 20K "refined" elements to be visualized (auto_ref_max_surf_elem = 20000)

Both of these variables are set in lib/vsdata.cpp.

After considering several options, I suggest that we just increase those variables to 32 and 5M, respectively.

These limits seem to work fine on modern hardware, and the waiting time for large meshes is a few seconds on my Mac.

@psocratis and @dylan-copeland -- do you mind trying this on your use cases and reporting if it works well?

@tzanio tzanio self-assigned this Jun 20, 2024
@tzanio tzanio added this to the glvis-4.3 milestone Jun 20, 2024
@tzanio tzanio mentioned this pull request Jun 20, 2024
lib/vsdata.cpp Outdated
Comment on lines 1273 to 1274
-   auto_ref_max = 16;
-   auto_ref_max_surf_elem = 20000;
+   auto_ref_max = 32;
+   auto_ref_max_surf_elem = 5000000;
Member:

I don't think we have a need to increase auto_ref_max -- 16 is more than sufficient for any practical case. For extreme cases it can be increased manually with the keys.

Also, I think increasing auto_ref_max_surf_elem to $5\times 10^6$ is excessive. An increase by 25x, from $2\times 10^4$ to $5\times 10^5$, is more reasonable; however, even that may cause undesired, noticeable slowdowns, especially for lower-order meshes (e.g. linear meshes) -- note that the refinement is applied independently of the degree, so that curvature from the non-linear terms in $Q_1$ meshes/fields can be captured.

Another option is to keep the current value and instead issue a warning in the terminal when the auto-selected refinement factor is less than the max of the mesh order and the field order (we may want to add a feature where terminal messages are shown briefly in the main window and then fade away).
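A minimal sketch of that warning check, assuming ref, mesh_order, and field_order are already computed (hypothetical names and message, not existing GLVis code; the 'o' key is the existing manual-refinement key):

```cpp
#include <algorithm>
#include <iostream>

// Hypothetical check: warn when the auto-selected factor cannot capture
// the curvature of the mesh/field.
void WarnIfUnderRefined(int ref, int mesh_order, int field_order)
{
   if (ref < std::max(mesh_order, field_order))
   {
      std::cerr << "GLVis: auto-refinement factor " << ref
                << " is below the mesh/field order; curved elements may"
                << " appear linear. Press 'o' to increase the subdivision.\n";
   }
}
```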

@tzanio (Member Author) commented Jun 20, 2024:

I think what we ideally want is for these to be set to the maximum values a "reasonable" mesh may need, and then for GLVis to select the actual auto-refinement values based on the order of the mesh (and solution).

For example, auto_ref_max should be at least max{element_order, gf_order}, maybe scaled by a factor of 2?
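As a rough sketch of that heuristic (a hypothetical helper, not the merged code), with the factor of 2 from above:

```cpp
#include <algorithm>

// Hypothetical: lower bound for auto_ref_max based on the element (mesh)
// order and the grid-function order, scaled by 2 as suggested above.
int MinAutoRefMax(int element_order, int gf_order)
{
   return 2 * std::max({element_order, gf_order, 1});
}
```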

Contributor:

Yes, a maximum of some reasonable value and a value based on the grid function + mesh order 👍 , but maybe it could be capped by some higher number? To give the user a chance to change the value and not burn up the computer right away 🔥 😄

@v-dobrev (Member) commented Jun 20, 2024:

For me, a "reasonable" auto-selected refinement factor is mainly driven by the time it takes to display the initial mesh. If possible, displaying the initial mesh should not take longer than ~0.1-0.3 sec on relatively new hardware. If we roughly say that the time to display some surface is proportional to the number of point evaluations we need to perform, then we need to choose an upper bound for auto_ref_max_surf_elem that gives the desired time limit. Of course, achieving this limit may not be possible when the number of drawn surface elements is too big.

In the methods VisualizationSceneSolution::GetAutoRefineFactor() and VisualizationSceneSolution3d::GetAutoRefineFactor(), the number of point evaluations is computed as ne*(ref+1)*(ref+1), where ne is the number of surface elements in the mesh (i.e. the number of elements for 2D meshes, embedded in 2D or 3D, or the number of boundary elements for 3D meshes). So the question is: how many point evaluations can we do in the allowed ~0.1-0.3 sec? We can debate what exact value to aim for in this range. The evaluation speed per point will depend to some extent on the orders of the mesh and the field, so to make things more concrete, let's say we should measure the evaluation times for $Q_2$ meshes/fields.

Does this sound like a "reasonable" approach to determine the upper limit auto_ref_max_surf_elem?
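A back-of-the-envelope version of that sizing (the evaluation rate is a placeholder to be measured for $Q_2$ meshes/fields, not a real number):

```cpp
// Hypothetical sizing: cap the refined-vertex count by the number of
// point evaluations that fit in the display-time budget.
long MaxSurfVertices(double evals_per_sec, double budget_sec)
{
   return static_cast<long>(evals_per_sec * budget_sec);
}
// e.g. an assumed rate of 1e6 evals/sec with a 0.3 sec budget gives a
// cap of 300k -- the actual rate would have to be measured first.
```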

Member:

Regardless of how we do the auto-selection of refinement level, users will need to be aware that for big meshes they may not get any element sub-divisions.

Contributor:

There could be, if order_ref gives a number lower than 20k, no? 🤔

Member:

The order_ref is just an initial value for ref before the while loop (when the if is true). So if order_ref < 16 && ne*(order_ref+1)*(order_ref+1) <= 100000, you will get more refinements than order_ref and no fewer refinements than before.
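Paraphrasing the flow described above (structure and thresholds reconstructed from this discussion, not a verbatim excerpt of GetAutoRefineFactor()):

```cpp
// Sketch of the discussed logic: order_ref only seeds ref; the loop can
// then only increase it, so the result is never less refined than before.
int AutoRefineSketch(int ne, int order_ref)
{
   int ref = 1;
   if (order_ref < 16 && ne*(order_ref+1)*(order_ref+1) <= 100000)
   {
      ref = order_ref;  // order-based initial value
   }
   while (ref < 16 && ne*(ref+1)*(ref+1) <= 100000) { ref++; }
   return ref;
}
```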

Member:

The problem, for me, with @tzanio's suggestion (even when the while loop is moved after the if/else statement) is that it can result in big jumps in speed for a small change in the input. For example, if ne*(order_ref+1)*(order_ref+1) is close to but less than 2M, we get ref=order_ref; if we increase ne just a little, pushing ne*(order_ref+1)*(order_ref+1) beyond 2M, then ref can immediately drop to 1.

Contributor:

Aha, like this, yes. In the modified version it would work, but I thought it more logical to iterate to the same limiting number as in the criterion for order_ref, to keep continuity and not jump down, as you say.

@najlkin (Contributor) commented Jul 10, 2024:

What I was proposing would solve it (updated version to have the same numbers):

int ne = mesh->GetNE(), ref = 1;
int autoref_max = std::min(std::max(ne*(order_ref+1)*(order_ref+1), 100000), 2000000);
while (ref < 16 && ne*(ref+1)*(ref+1) <= autoref_max) { ref++; }

(which implies ref==order_ref if it is in the range and order_ref < 16)

@najlkin najlkin mentioned this pull request Jul 8, 2024
5 tasks
@najlkin najlkin self-requested a review July 16, 2024 18:59
@tzanio (Member Author) commented Jul 17, 2024:

I created a poll for this, please vote here: #292

@najlkin najlkin self-assigned this Jul 22, 2024
@najlkin najlkin removed their request for review July 22, 2024 21:02
@najlkin (Contributor) commented Jul 22, 2024:

How about this? 😉 Please test it; it cannot pass the regression tests for obvious reasons, and I do not have much time for it.

@najlkin (Contributor) commented Jul 22, 2024:

Now all tests run, but interestingly there are some visible differences in the zoom level 🤔.

@justinlaughlin (Contributor) commented:

> Now all tests run, but interestingly there are some visible differences in the zoom level 🤔.

Just making sure -- this isn't the difference between the test.*.saved.png vs test.fail.*.saved.png files, right? Those filenames are misleading (in my opinion) because the test.fail.* files are actually the output of the negative test (making sure that the tests fail when they should). All of the "negative tests" are zoomed in so that they fail.

In order to compare to the baselines, you have to download them separately. I'm looking at the test outputs and see some differences, but I don't see any visible differences in zoom.

e.g. (test.rehmos.saved.png) left = baseline, right = test output. I manually zoomed in so it's not aligned perfectly.
[image: side-by-side comparison of baseline vs. test output]

lib/vsdata.cpp (outdated; thread resolved)
lib/vsdata.cpp Outdated
}

// limit the total number of elements
int auto_ref_surf_elem = ne*(order_ref+1)*(order_ref+1);
Contributor:

Not a critique, just trying to understand for myself...

Since order_ref>=1, this auto refinement function will always try to refine the base number of elements ne by at least 4x. Is that correct?

Contributor:

Dammit, you are right, it is shifted by one as the loop below has <= 😬 .

Contributor:

Thanks for the catch 😅, it is fixed in 73a0174.

Member:

The formula was right, the name of the variable is wrong -- ultimately we determine ref so that the number of vertices in the refined representation is less than or equal to a given max value. So I think you should revert the formula and change the name of the variable to auto_ref_vert_max or something like that.

Contributor:

The formula was wrong, but from a different perspective than @justinlaughlin mentioned. The point of this and the following lines is to set ref=order_ref when possible, but when the loop below gets to the point ref==order_ref, then ne*(ref+1)*(ref+1) <= auto_ref_surf_elem with auto_ref_surf_elem = ne*(order_ref+1)*(order_ref+1), so ref is increased once more and the algorithm ends with ref==order_ref+1.
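A tiny self-contained reproduction of that overshoot (illustrative values only, not GLVis source):

```cpp
#include <cstdio>

int main()
{
   const int ne = 100, order_ref = 3;  // illustrative values
   const int auto_ref_surf_elem = ne*(order_ref+1)*(order_ref+1);  // 1600
   int ref = 1;
   // The '<=' still holds at ref == order_ref, so the loop takes one more step.
   while (ref < 16 && ne*(ref+1)*(ref+1) <= auto_ref_surf_elem) { ref++; }
   std::printf("ref = %d, order_ref = %d\n", ref, order_ref);  // ref = 4, not 3
}
```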

Contributor:

Okay, I understand now. I think your fix makes sense.

@najlkin (Contributor) commented Jul 30, 2024:

Ok, I rewrote the algorithm in terms of vertices and rebalanced the test cases, so how about now, @v-dobrev?

lib/vsdata.cpp (thread resolved)
lib/vsdata.cpp (outdated; thread resolved)
CHANGELOG Outdated
Comment on lines 15 to 19
- Changed the auto refinement algorithm, which is now based on the order of
the grid function and mesh. The total number of elements is limited to the
range between 100k and 2M. If the upper limit is reached a warning is shown
and you may still need to press 'o' if you want to increase the refinement
even further.
Member:

We're actually limiting the total number of vertices used for plotting the subdivided mesh. This is motivated by the observation that the vertices determine the size of the floating-point data pushed to OpenGL; the other data is just connectivity information.

Also, it should be made clear that the limit number between 100k and 2M is chosen based on the order of the solution and the mesh, and on the number of drawn elements.

Also, this limit is additionally restricted so that no more than 16 refinements are actually performed for auto-refinement.

Also, the order-based limit will be exceeded when the number of drawn elements is big enough, i.e. when even a refinement factor of 1 needs more vertices than the order-based limit.

Also, it is good to clarify that the "number of drawn elements" refers to the number of mesh elements for 2D meshes (including 2D meshes embedded in 3D) and to the number of boundary elements for 3D meshes.
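Putting those points together, the selection can be summarized (a paraphrase of this discussion, not a verbatim restatement of the merged code) as picking the largest $\mathrm{ref} \le 16$ with

$$n_e\,(\mathrm{ref}+1)^2 \;\le\; \min\!\left(\max\!\left(n_e\,(o+1)^2,\ 10^5\right),\ 2\times 10^6\right),$$

where $o$ is the maximum of the mesh and field orders and $n_e$ is the number of drawn elements (mesh elements for 2D meshes, boundary elements for 3D meshes).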

Contributor:

I improved the changelog a bit, so it should be clearer now 😉 I use the number of elements, consistent with the comment above ☝️. I am not sure if it addresses all points, but have a look.

Contributor:

@tzanio, could you please reformulate it further? Also with the "vertices terminology" now 🙂.

lib/vssolution.hpp (outdated; thread resolved)
lib/vssolution3d.hpp (outdated; thread resolved)
lib/vsvector.hpp (outdated; thread resolved)
lib/vsvector3d.hpp (outdated; thread resolved)
@v-dobrev (Member) commented:

Here are some observations about the speed of the current auto-refinement choice (on a MacBook Pro with an M2 Max chip):

  • First I build MFEM in debug mode but with -Og optimization.
  • I build GLVis using the defaults which is to use the same build as MFEM (i.e. debug) and the same optimization flags.
  • I generate a mesh + solution by running ./ex1 -m ../data/square-disc-p3.mesh -o 6 (this is a triangular mesh)
  • To measure the time to load and display this mesh + solution in GLVis, I run
    time ./glvis -m ../mfem/examples/refined.mesh -g ../mfem/examples/sol.gf -k q
    and the result is (the subdivision factor chosen by GLVis is 6):
    7.870u 0.088s 0:08.34 95.3%	0+0k 0+0io 14pf+0w
    which means it took 8.34 seconds.
  • Switching to optimized build for both MFEM and GLVis results in the following output:
    3.726u 0.088s 0:04.42 85.9%	0+0k 0+0io 465pf+0w
    so it took 4.42 seconds.
  • Using the output from ./ex1 -m ../data/star.mesh -o 6 (quad mesh), the timing output is (the subdivision factor chosen by GLVis is 6):
    0.983u 0.071s 0:01.40 75.0%	0+0k 0+0io 9pf+0w
    so this took 1.4 seconds. The mesh has 20480 elements, which with a refinement factor of 6 results in 1003520 vertices
    (737280 linear quads).
  • Extrapolating the above numbers to 2M vertices, visualizing the result will take ~2.8 seconds.
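The vertex arithmetic in the last two bullets checks out; a quick verification using the same numbers (no new data, just the $n_e(\mathrm{ref}+1)^2$ formula from earlier in the thread):

```cpp
#include <cstdio>

int main()
{
   const long ne = 20480;                  // quad elements (star.mesh run)
   const long ref = 6;                     // subdivision factor chosen by GLVis
   const long verts = ne*(ref+1)*(ref+1);  // 1003520 refined vertices
   const long quads = ne*ref*ref;          // 737280 linear quads
   const double t_meas = 1.4;              // measured seconds for this case
   std::printf("verts = %ld, quads = %ld, est. time for 2M verts = %.1f s\n",
               verts, quads, t_meas*2.0e6/verts);  // ~2.8 s
}
```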

@justinlaughlin (Contributor) left a comment:

After #285 (comment), I realize the intended algorithm was different from my understanding ($ne * order^2$ rather than $ne * (order+1)^2$).

Overall I think this PR is a big improvement 👍 .

@@ -298,3 +304,46 @@ Key commands
- `x`-component: `v_x`
- `y`-component: `v_y`
- `z`-component: `v_z`

## Auto-refinement
@tzanio (Member Author):

Very nice and clear section, thanks for adding it @v-dobrev !

@tzanio tzanio changed the title [DO-NOT-MERGE] Higher auto-refinement limits Higher auto-refinement limits Jul 31, 2024
@tzanio (Member Author) commented Jul 31, 2024:

Checklist based on #285 (comment)

@tzanio tzanio merged commit 2ffc0ab into master Aug 1, 2024
10 checks passed
@tzanio tzanio deleted the higher-auto-ref branch August 1, 2024 00:07