Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows build guide, optimization flags and verbose exposure script #146

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Skyrion9
Copy link

I had many issues compiling for Windows. I used existing vcpkg from windows ENV to fix/work around the first error. But you need VS 2022 17.8 Preview 2 or later to avoid msys error when building master. I wrote a guide to compile on Windows.

Using non-destructive optimization flags with a bias towards runtime speed over size. In my limited profiling there's no performance degradation, and kernel spends slightly more CPU time (~1%) on apriltag functions within same time frame. Within margin of error, this means either the CPU is being utilized better, or less efficiently. Minor filesize and memory footprint increase due to more aggressive function inlining and double-alignment in memory among other things. Note that MSVC usually doesn't just apply the flags but checks if they would be beneficial first.

You can easily override them by using your own flags from terminal, they are safe fast defaults so to speak. There's more to do on the Linux side of things there but O2 is a good start. There could be potential gains from using (AVX/2/512), Maybe an AVX build could be released along with the normal SSE from here on out since VRChat requires AVX so most of the users already have it.

I've also updated the exposure script to be more verbose. It was annoying not knowing if it changed anything without checking my phone, The script tells you if IP is not found, and doesn't wait on Curl's long timeout.

@funnbot
Copy link
Contributor

funnbot commented Oct 21, 2023

Windows and Linux LTO is set here:

set_target_properties(AprilTagTrackers PROPERTIES

@@ -32,14 +32,18 @@ function(att_bootstrap_vcpkg)
set(vcpkg_bootstrap_cmd "${VCPKG_ROOT}/bootstrap-vcpkg.sh")
endif()

if (NOT EXISTS "${vcpkg_bootstrap_cmd}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only run if vcpkg was not previously installed, as the bootstrap script won't exist.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check https://devblogs.microsoft.com/cppblog/vcpkg-is-now-included-with-visual-studio/#using-the-visual-studio-copy-of-vcpkg

Bootstrap script is no longer shipped with VS installations and requires running a terminal command to run in "classic" mode.

Without this check, the program compiles successfully.

@@ -32,14 +32,18 @@ function(att_bootstrap_vcpkg)
set(vcpkg_bootstrap_cmd "${VCPKG_ROOT}/bootstrap-vcpkg.sh")
endif()

if (NOT EXISTS "${vcpkg_bootstrap_cmd}")
find_program(GIT_CMD git REQUIRED)
execute_process(COMMAND "${GIT_CMD}" clone --filter=tree:0 "https://github.com/microsoft/vcpkg.git" "${VCPKG_ROOT}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming the git error is due to --filter=tree:0? It requires a newish version of git.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've the latest 2.42 maintained version of git, updating it was the first thing I've tried as a fix. What version is necessary exactly?

# MSVC flags for runtime performance over size, and suggest more aggressive inline functioning to compiler. /favor:AMD64 or INTEL64 may benefit.
# Eigen3 note: /arch:AVX causes crash with Refine tracker calibration, due to Eigen3. 16-byte memory alignment might fix, possibly worse performance than default SSE2.
# You can set /arch to AVX, AVX2 or AVX512 if you don't use calibration refinement. This should increase vectorized loop performance noticably.
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /O2 /Ob3 /GA /GL /Qpar /Qpar-report:1 /Qvec-report:1" PARENT_SCOPE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/O2 /Ob3 /GA are already set in this variable, /GL will be set later with the INTERPROCEDURAL_OPTIMIZATION property.

/Qpar is new and interesting, we might need to check msvc version to enable that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the default is /Ob2 which is included with /O2. /Ob3 overrides this and suggests more aggressive inlining to the compiler.

I'm not sure if /GL was already enabled, as before it was set here the compiler didn't recommend enabling LTCG.

@@ -92,10 +96,24 @@ function(att_target_platform_definitions target)

if (MSVC)
set(comp_name "MSVC")
# Note that any flag specified in build command will overwrite these. Treat them as safe defaults.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these optimization options should be in their own function, and use target_compile_options() and target_link_options().

# You can set /arch to AVX, AVX2 or AVX512 if you don't use calibration refinement. This should increase vectorized loop performance noticably.
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /O2 /Ob3 /GA /GL /Qpar /Qpar-report:1 /Qvec-report:1" PARENT_SCOPE)
# Additional linker optimizations as compiler output recommends /LTCG with /GL
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /OPT:REF /OPT:ICF=6 /OPT:LBR /LTCG" PARENT_SCOPE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again these should be removed, they are the defaults.
Except for /OPT:ICF=6 which defaults to 1, (this shouldn't be set for debug builds as it can negatively effect debugging), 6 seems high though? I wonder how it effects compile times.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't change the compile time much, and I doubt higher than 3 does anything but the time difference was minor so I went with 6.

As for LTCG, I'm not sure if it was enabled by default. After I set /GL, the compiler suggested enabling it and the suggestion went away after setting it here.

elseif(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
set(comp_name "CLANG")
# Clang compiler flags for release. Runtime performance over size. O3 is higher level, but placebo in most cases and increases code size unnecessarily. Should be profiled first.
# TODO : Investigate linker flags for CLang and GCC compilers, and test ftree-vectorize along with fvect-cost-model. flto is experimental but could help.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-O2 is also the default set in release builds, and lto is enabled.

-Added project option to enable additional MSVC flags.
Enabled by default, affects only release configuration.
@ju1ce
Copy link
Owner

ju1ce commented Oct 31, 2023

I couldn't get this to compile with my usual workflow, and I don't have the time to go any deeper right now, so I'll leave this open for now. I'll link this pr, your repo and your building guide on the readme though.

Thank you for the contribution and sorry I cannot merge stuff right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants