Generate MTL and MPS structs and enums with Clang.jl #492
Even if only for enums etc., this would be pretty useful! I imagined having to write our own Clang.jl-based generator in order to generate ObjC calls, but this would probably be a good first step.
I think most of it can be upstreamed to Clang.jl. The changes I've done on my branch are either bug fixes or definitions of the
Metal Benchmarks
Benchmark suite | Current: e47ff83 | Previous: 5056e33 | Ratio |
---|---|---|---|
private array/construct | 26472.25 ns | 25920.083333333336 ns | 1.02 |
private array/broadcast | 460958 ns | 464792 ns | 0.99 |
private array/random/randn/Float32 | 803541 ns | 811812 ns | 0.99 |
private array/random/randn!/Float32 | 660875 ns | 673333 ns | 0.98 |
private array/random/rand!/Int64 | 556041 ns | 553250 ns | 1.01 |
private array/random/rand!/Float32 | 591354.5 ns | 601000 ns | 0.98 |
private array/random/rand/Int64 | 761250 ns | 763896 ns | 1.00 |
private array/random/rand/Float32 | 636666.5 ns | 616375 ns | 1.03 |
private array/copyto!/gpu_to_gpu | 670291 ns | 660854.5 ns | 1.01 |
private array/copyto!/cpu_to_gpu | 822166 ns | 619854 ns | 1.33 |
private array/copyto!/gpu_to_cpu | 591125 ns | 834687.5 ns | 0.71 |
private array/accumulate/1d | 1345146.5 ns | 1328542 ns | 1.01 |
private array/accumulate/2d | 1401125 ns | 1387334 ns | 1.01 |
private array/iteration/findall/int | 2106875 ns | 2070500 ns | 1.02 |
private array/iteration/findall/bool | 1843916 ns | 1824916 ns | 1.01 |
private array/iteration/findfirst/int | 1700958 ns | 1682146 ns | 1.01 |
private array/iteration/findfirst/bool | 1676417 ns | 1637959 ns | 1.02 |
private array/iteration/scalar | 3408875 ns | 3891833 ns | 0.88 |
private array/iteration/logical | 3199459 ns | 3177708.5 ns | 1.01 |
private array/iteration/findmin/1d | 1775959 ns | 1740229 ns | 1.02 |
private array/iteration/findmin/2d | 1346750 ns | 1343375 ns | 1.00 |
private array/reductions/reduce/1d | 1044979.5 ns | 1035708 ns | 1.01 |
private array/reductions/reduce/2d | 664542 ns | 651667 ns | 1.02 |
private array/reductions/mapreduce/1d | 1039292 ns | 1037167 ns | 1.00 |
private array/reductions/mapreduce/2d | 670541 ns | 657917 ns | 1.02 |
private array/permutedims/4d | 2572375 ns | 2537625 ns | 1.01 |
private array/permutedims/2d | 1021666 ns | 1022708 ns | 1.00 |
private array/permutedims/3d | 1600749.5 ns | 1577291.5 ns | 1.01 |
private array/copy | 582667 ns | 621750 ns | 0.94 |
latency/precompile | 5770735750.5 ns | 5243893542 ns | 1.10 |
latency/ttfp | 6564931562.5 ns | 6538101604 ns | 1.00 |
latency/import | 1172940125 ns | 1165440583 ns | 1.01 |
integration/metaldevrt | 707354.5 ns | 705833 ns | 1.00 |
integration/byval/slices=1 | 1581729.5 ns | 1588833.5 ns | 1.00 |
integration/byval/slices=3 | 11089208.5 ns | 10079959 ns | 1.10 |
integration/byval/reference | 1576209 ns | 1568145.5 ns | 1.01 |
integration/byval/slices=2 | 2634791.5 ns | 2643625 ns | 1.00 |
kernel/indexing | 456291 ns | 444959 ns | 1.03 |
kernel/indexing_checked | 462834 ns | 446541 ns | 1.04 |
kernel/launch | 8166 ns | 10666.666666666666 ns | 0.77 |
metal/synchronization/stream | 15000 ns | 15167 ns | 0.99 |
metal/synchronization/context | 15250 ns | 15833 ns | 0.96 |
shared array/construct | 27764 ns | 25503.5 ns | 1.09 |
shared array/broadcast | 475500 ns | 475125 ns | 1.00 |
shared array/random/randn/Float32 | 829791.5 ns | 752770.5 ns | 1.10 |
shared array/random/randn!/Float32 | 666417 ns | 655458 ns | 1.02 |
shared array/random/rand!/Int64 | 570645.5 ns | 560916 ns | 1.02 |
shared array/random/rand!/Float32 | 591542 ns | 598459 ns | 0.99 |
shared array/random/rand/Int64 | 783292 ns | 774208 ns | 1.01 |
shared array/random/rand/Float32 | 647541 ns | 629583.5 ns | 1.03 |
shared array/copyto!/gpu_to_gpu | 88834 ns | 85500 ns | 1.04 |
shared array/copyto!/cpu_to_gpu | 88875 ns | 91792 ns | 0.97 |
shared array/copyto!/gpu_to_cpu | 78916 ns | 77917 ns | 1.01 |
shared array/accumulate/1d | 1355104 ns | 1347021 ns | 1.01 |
shared array/accumulate/2d | 1377334 ns | 1383874.5 ns | 1.00 |
shared array/iteration/findall/int | 1815166 ns | 1783167 ns | 1.02 |
shared array/iteration/findall/bool | 1583292 ns | 1585458 ns | 1.00 |
shared array/iteration/findfirst/int | 1391896 ns | 1392020.5 ns | 1.00 |
shared array/iteration/findfirst/bool | 1367375 ns | 1354916.5 ns | 1.01 |
shared array/iteration/scalar | 159834 ns | 162042 ns | 0.99 |
shared array/iteration/logical | 2976229 ns | 2968958 ns | 1.00 |
shared array/iteration/findmin/1d | 1471125 ns | 1461729.5 ns | 1.01 |
shared array/iteration/findmin/2d | 1367542 ns | 1364792 ns | 1.00 |
shared array/reductions/reduce/1d | 729729.5 ns | 730458 ns | 1.00 |
shared array/reductions/reduce/2d | 677979 ns | 656958 ns | 1.03 |
shared array/reductions/mapreduce/1d | 744062.5 ns | 746250 ns | 1.00 |
shared array/reductions/mapreduce/2d | 675271 ns | 660562.5 ns | 1.02 |
shared array/permutedims/4d | 2566270.5 ns | 2528541.5 ns | 1.01 |
shared array/permutedims/2d | 1011000 ns | 1025917 ns | 0.99 |
shared array/permutedims/3d | 1598583 ns | 1580896 ns | 1.01 |
shared array/copy | 244645.5 ns | 242437.5 ns | 1.01 |
This comment was automatically generated by workflow using github-action-benchmark.
Please ping when this is ready for review; you've marked it as such, but more commits keep appearing 🙂 Also, were the necessary changes to Clang.jl merged?
Also comment out MTL enums and structs
Revert if function split doesn't end up getting merged
I do have a terrible habit of refining my PRs after marking ready. I'll be more mindful of that in the future.
Not yet. JuliaInterop/Clang.jl#519 and JuliaInterop/Clang.jl#522 are general improvements and bug fixes to handle the kinds of enums that Apple's Objective-C headers love so much (elaborated types and attributes). I just created JuliaInterop/Clang.jl#524, which shouldn't be too hard to review once the other two are merged. I just rebased, and any further changes other than review changes will go into a future PR.
Currently works for structs and enums in Metal.framework, with a few hacks and with this branch of Clang.jl dev'ed: https://github.com/christiangnrd/Clang.jl/tree/objectiveC
Any advice, suggestions, or help welcome!