Build with PGO #548

@helpau

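Building with PGO (profile-guided optimization) speeds up the loda benchmark by 0% to about 35%, depending on the sequence. However, supporting it is quite difficult: at least three profiles are needed, one per compiler (cl, gcc, clang), and possibly more if x86_64 and arm64 need separate ones, and each profile depends on the toolchain version. An example of the required changes is in #547.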
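To illustrate why each compiler needs its own profile, here is a minimal sketch of the usual instrumented PGO steps for clang and gcc. The flags are the standard ones for those compilers. The file name `loda.profdata`, the `pgo_steps` helper, and the idea of listing the steps this way are assumptions for illustration only, not the changes in #547. cl is left out because its PGO options are split between compiler and linker.

```python
# Sketch only: standard instrumented-PGO flags per toolchain.
# "loda.profdata" is a hypothetical output name; "default.profraw" is the
# file clang-instrumented binaries write when LLVM_PROFILE_FILE is not set.
PGO_FLAGS = {
    "clang": {
        "generate": ["-fprofile-instr-generate"],
        "merge": ["llvm-profdata", "merge", "-output=loda.profdata", "default.profraw"],
        "use": ["-fprofile-instr-use=loda.profdata"],
    },
    "gcc": {
        "generate": ["-fprofile-generate"],
        "merge": [],  # gcc reads its .gcda files directly, no merge step
        "use": ["-fprofile-use"],
    },
}


def pgo_steps(toolchain, training_cmd="loda mine -H 1"):
    """Return the ordered PGO steps for one toolchain as plain strings."""
    flags = PGO_FLAGS[toolchain]
    steps = [
        "build with " + " ".join(flags["generate"]),
        "run " + training_cmd,
    ]
    if flags["merge"]:
        steps.append(" ".join(flags["merge"]))
    steps.append("rebuild with " + " ".join(flags["use"]))
    return steps


for name in PGO_FLAGS:
    print(name, pgo_steps(name))
```

The profile formats are not interchangeable between the two compilers, which is why the training run has to be repeated per toolchain.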
Results from Apple M3, clang 17
Without PGO

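| Sequence | Terms | Reg Eval | Inc Eval | Vir Eval |
|---|---|---|---|---|
| A000040 | 1000 | 3.48s | - | 0.43s |
| A000394 | 1000 | 1.72s | - | 0.22s |
| A000401 | 1000 | 2.98s | - | 0.22s |
| A000796 | 300 | 0.70s | - | - |
| A001041 | 300 | 0.73s | - | - |
| A001113 | 300 | 0.63s | - | - |
| A002110 | 300 | 0.77s | - | - |
| A002760 | 200 | 24.71s | - | 1.18s |
| A057552 | 300 | 2.81s | 0.03s | - |
| A079309 | 300 | 2.80s | 0.03s | - |
| A002193 | 400 | 0.56s | 0.23s | - |
| A035856 | 500 | 1.62s | - | - |
| A001609 | 1000 | 0.52s | 0.00s | - |
| A003411 | 1000 | 0.59s | 0.00s | - |
| A012866 | 1000 | 1.00s | 0.00s | - |
| A000045 | 2000 | 1.82s | 0.00s | - |
| A001304 | 3000 | 0.98s | 0.00s | - |
| A000005 | 5000 | 1.04s | - | - |
| A130487 | 5000 | 1.70s | 0.00s | - |
| A000030 | 500000 | 0.38s | - | - |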

With PGO (instrumented profile, generated by running loda mine -H 1)

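| Sequence | Terms | Reg Eval | Inc Eval | Vir Eval |
|---|---|---|---|---|
| A000040 | 1000 | 2.22s | - | 0.32s |
| A000394 | 1000 | 1.41s | - | 0.17s |
| A000401 | 1000 | 2.34s | - | 0.17s |
| A000796 | 300 | 0.67s | - | - |
| A001041 | 300 | 0.72s | - | - |
| A001113 | 300 | 0.60s | - | - |
| A002110 | 300 | 0.73s | - | - |
| A002760 | 200 | 19.51s | - | 0.98s |
| A057552 | 300 | 2.79s | 0.03s | - |
| A079309 | 300 | 2.77s | 0.03s | - |
| A002193 | 400 | 0.56s | 0.21s | - |
| A035856 | 500 | 1.55s | - | - |
| A001609 | 1000 | 0.50s | 0.00s | - |
| A003411 | 1000 | 0.58s | 0.00s | - |
| A012866 | 1000 | 0.99s | 0.00s | - |
| A000045 | 2000 | 1.82s | 0.00s | - |
| A001304 | 3000 | 0.84s | 0.00s | - |
| A000005 | 5000 | 0.74s | - | - |
| A130487 | 5000 | 1.12s | 0.00s | - |
| A000030 | 500000 | 0.28s | - | - |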
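For reference, the per-sequence gain can be read off the two Reg Eval columns as the relative drop in runtime. The sketch below copies those timings from the tables in this issue; the helper name `reduction_percent` is only illustrative.

```python
# Reg Eval timings in seconds, copied from the two tables in this issue:
# sequence -> (without PGO, with PGO)
REG_EVAL = {
    "A000040": (3.48, 2.22), "A000394": (1.72, 1.41), "A000401": (2.98, 2.34),
    "A000796": (0.70, 0.67), "A001041": (0.73, 0.72), "A001113": (0.63, 0.60),
    "A002110": (0.77, 0.73), "A002760": (24.71, 19.51), "A057552": (2.81, 2.79),
    "A079309": (2.80, 2.77), "A002193": (0.56, 0.56), "A035856": (1.62, 1.55),
    "A001609": (0.52, 0.50), "A003411": (0.59, 0.58), "A012866": (1.00, 0.99),
    "A000045": (1.82, 1.82), "A001304": (0.98, 0.84), "A000005": (1.04, 0.74),
    "A130487": (1.70, 1.12), "A000030": (0.38, 0.28),
}


def reduction_percent(before, after):
    """Relative runtime reduction in percent: 100 * (before - after) / before."""
    return 100.0 * (before - after) / before


# Print the sequences sorted from largest to smallest gain.
gains = {seq: reduction_percent(b, a) for seq, (b, a) in REG_EVAL.items()}
for seq, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{seq}: {gain:.1f}%")
```

On these numbers the largest Reg Eval reduction is A000040 at about 36%, and A000045 and A002193 show no change, which matches the 0% to roughly 35% range quoted in the description.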
