Generate MTL and MPS structs and enums with Clang.jl #492

Open: christiangnrd wants to merge 11 commits into base: main

christiangnrd (Contributor) commented Dec 10, 2024

This currently works for structs and enums in Metal.framework, with a few hacks and with this branch of Clang.jl dev'ed: https://github.com/christiangnrd/Clang.jl/tree/objectiveC

Any advice, suggestions, or help welcome!

maleadt (Member) commented Dec 10, 2024

Even if only for enums etc, this would be pretty useful! I imagined having to write our own Clang.jl-based generator in order to generate ObjC calls, but this would probably be a good first step.

christiangnrd force-pushed the wrapperexplore branch 3 times, most recently from 21b2dd0 to 4b4ed49 (December 10, 2024 19:22)

christiangnrd (Contributor, Author) commented

> I imagined having to write our own Clang.jl-based generator in order to generate ObjC calls, but this would probably be a good first step.

I think most of it can be upstreamed to Clang.jl. The changes I've made on my branch are either bug fixes or definitions of the Generators types for Objective-C constructs.

github-actions bot (Contributor) left a comment

Metal Benchmarks

| Benchmark suite | Current: e47ff83 | Previous: 5056e33 | Ratio |
| --- | --- | --- | --- |
| private array/construct | 26472.25 ns | 25920.083333333336 ns | 1.02 |
| private array/broadcast | 460958 ns | 464792 ns | 0.99 |
| private array/random/randn/Float32 | 803541 ns | 811812 ns | 0.99 |
| private array/random/randn!/Float32 | 660875 ns | 673333 ns | 0.98 |
| private array/random/rand!/Int64 | 556041 ns | 553250 ns | 1.01 |
| private array/random/rand!/Float32 | 591354.5 ns | 601000 ns | 0.98 |
| private array/random/rand/Int64 | 761250 ns | 763896 ns | 1.00 |
| private array/random/rand/Float32 | 636666.5 ns | 616375 ns | 1.03 |
| private array/copyto!/gpu_to_gpu | 670291 ns | 660854.5 ns | 1.01 |
| private array/copyto!/cpu_to_gpu | 822166 ns | 619854 ns | 1.33 |
| private array/copyto!/gpu_to_cpu | 591125 ns | 834687.5 ns | 0.71 |
| private array/accumulate/1d | 1345146.5 ns | 1328542 ns | 1.01 |
| private array/accumulate/2d | 1401125 ns | 1387334 ns | 1.01 |
| private array/iteration/findall/int | 2106875 ns | 2070500 ns | 1.02 |
| private array/iteration/findall/bool | 1843916 ns | 1824916 ns | 1.01 |
| private array/iteration/findfirst/int | 1700958 ns | 1682146 ns | 1.01 |
| private array/iteration/findfirst/bool | 1676417 ns | 1637959 ns | 1.02 |
| private array/iteration/scalar | 3408875 ns | 3891833 ns | 0.88 |
| private array/iteration/logical | 3199459 ns | 3177708.5 ns | 1.01 |
| private array/iteration/findmin/1d | 1775959 ns | 1740229 ns | 1.02 |
| private array/iteration/findmin/2d | 1346750 ns | 1343375 ns | 1.00 |
| private array/reductions/reduce/1d | 1044979.5 ns | 1035708 ns | 1.01 |
| private array/reductions/reduce/2d | 664542 ns | 651667 ns | 1.02 |
| private array/reductions/mapreduce/1d | 1039292 ns | 1037167 ns | 1.00 |
| private array/reductions/mapreduce/2d | 670541 ns | 657917 ns | 1.02 |
| private array/permutedims/4d | 2572375 ns | 2537625 ns | 1.01 |
| private array/permutedims/2d | 1021666 ns | 1022708 ns | 1.00 |
| private array/permutedims/3d | 1600749.5 ns | 1577291.5 ns | 1.01 |
| private array/copy | 582667 ns | 621750 ns | 0.94 |
| latency/precompile | 5770735750.5 ns | 5243893542 ns | 1.10 |
| latency/ttfp | 6564931562.5 ns | 6538101604 ns | 1.00 |
| latency/import | 1172940125 ns | 1165440583 ns | 1.01 |
| integration/metaldevrt | 707354.5 ns | 705833 ns | 1.00 |
| integration/byval/slices=1 | 1581729.5 ns | 1588833.5 ns | 1.00 |
| integration/byval/slices=3 | 11089208.5 ns | 10079959 ns | 1.10 |
| integration/byval/reference | 1576209 ns | 1568145.5 ns | 1.01 |
| integration/byval/slices=2 | 2634791.5 ns | 2643625 ns | 1.00 |
| kernel/indexing | 456291 ns | 444959 ns | 1.03 |
| kernel/indexing_checked | 462834 ns | 446541 ns | 1.04 |
| kernel/launch | 8166 ns | 10666.666666666666 ns | 0.77 |
| metal/synchronization/stream | 15000 ns | 15167 ns | 0.99 |
| metal/synchronization/context | 15250 ns | 15833 ns | 0.96 |
| shared array/construct | 27764 ns | 25503.5 ns | 1.09 |
| shared array/broadcast | 475500 ns | 475125 ns | 1.00 |
| shared array/random/randn/Float32 | 829791.5 ns | 752770.5 ns | 1.10 |
| shared array/random/randn!/Float32 | 666417 ns | 655458 ns | 1.02 |
| shared array/random/rand!/Int64 | 570645.5 ns | 560916 ns | 1.02 |
| shared array/random/rand!/Float32 | 591542 ns | 598459 ns | 0.99 |
| shared array/random/rand/Int64 | 783292 ns | 774208 ns | 1.01 |
| shared array/random/rand/Float32 | 647541 ns | 629583.5 ns | 1.03 |
| shared array/copyto!/gpu_to_gpu | 88834 ns | 85500 ns | 1.04 |
| shared array/copyto!/cpu_to_gpu | 88875 ns | 91792 ns | 0.97 |
| shared array/copyto!/gpu_to_cpu | 78916 ns | 77917 ns | 1.01 |
| shared array/accumulate/1d | 1355104 ns | 1347021 ns | 1.01 |
| shared array/accumulate/2d | 1377334 ns | 1383874.5 ns | 1.00 |
| shared array/iteration/findall/int | 1815166 ns | 1783167 ns | 1.02 |
| shared array/iteration/findall/bool | 1583292 ns | 1585458 ns | 1.00 |
| shared array/iteration/findfirst/int | 1391896 ns | 1392020.5 ns | 1.00 |
| shared array/iteration/findfirst/bool | 1367375 ns | 1354916.5 ns | 1.01 |
| shared array/iteration/scalar | 159834 ns | 162042 ns | 0.99 |
| shared array/iteration/logical | 2976229 ns | 2968958 ns | 1.00 |
| shared array/iteration/findmin/1d | 1471125 ns | 1461729.5 ns | 1.01 |
| shared array/iteration/findmin/2d | 1367542 ns | 1364792 ns | 1.00 |
| shared array/reductions/reduce/1d | 729729.5 ns | 730458 ns | 1.00 |
| shared array/reductions/reduce/2d | 677979 ns | 656958 ns | 1.03 |
| shared array/reductions/mapreduce/1d | 744062.5 ns | 746250 ns | 1.00 |
| shared array/reductions/mapreduce/2d | 675271 ns | 660562.5 ns | 1.02 |
| shared array/permutedims/4d | 2566270.5 ns | 2528541.5 ns | 1.01 |
| shared array/permutedims/2d | 1011000 ns | 1025917 ns | 0.99 |
| shared array/permutedims/3d | 1598583 ns | 1580896 ns | 1.01 |
| shared array/copy | 244645.5 ns | 242437.5 ns | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

christiangnrd changed the title from "Explore wrapper generation with Clang.jl" to "Generate MTL and MPS structs and enums with Clang.jl" (Dec 11, 2024)

christiangnrd force-pushed the wrapperexplore branch 8 times, most recently from 0d8492d to 2fa5f8a (December 16, 2024 17:08)

maleadt (Member) commented Dec 17, 2024

Please ping when this is ready for review; you've marked it as such, but more commits keep appearing 🙂

Also, were the necessary changes to Clang.jl merged?

christiangnrd (Contributor, Author) commented

> Please ping when this is ready for review; you've marked it as such, but more commits keep appearing 🙂

I do have a terrible habit of refining my PRs after marking ready. I'll be more mindful of that in the future.

> Also, were the necessary changes to Clang.jl merged?

Not yet. JuliaInterop/Clang.jl#519 and JuliaInterop/Clang.jl#522 are general improvements and bug fixes to handle the kinds of enums that Apple's Objective-C headers love so much (elaborated enums and enums with attributes). I just created JuliaInterop/Clang.jl#524, which shouldn't be too hard to review once the other two are merged.

I just rebased; any further changes, other than those addressing review feedback, will go in a future PR.


2 participants