Implement @fastmath #9406

Conversation
This should probably not be 36 commits. There's a lot of extra stuff here, a double-conversion tarball and help changes from your branch. Here is an example of what I mean; as you can see, the diff includes changes that are unrelated to this pull request.
@tkelman I'll be happy to rebase, but I'd really like feedback first. If you look at the diff then you see that there's nothing unrelated there, except for the unmerged pieces you mention.
This is great! With this, would something like the following work? Ex:

    baz(x::Int) = ....
    bar(x::Int) = ....
    foo(x::Int) = begin
        @fastmath begin
            bar(baz(...))
        end
    end
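For context: under the expression-rewriting design discussed later in this thread (and in the version of @fastmath that eventually shipped), the macro only rewrites operators that appear literally inside its argument, so the bodies of bar and baz are not affected unless they opt in themselves. A minimal sketch, with made-up function bodies:

```julia
baz(x::Float64) = x * x + 1.0          # not affected by a caller's @fastmath
bar(x::Float64) = @fastmath x / 3.0    # opts in explicitly

function foo(x::Float64)
    @fastmath begin
        # Only the operators written literally in this block are rewritten
        # to fast-math variants; bar and baz keep their own semantics.
        bar(baz(x)) + 2.0 * x
    end
end

foo(1.5)
```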
This is great. I would love to have the -ffast-math command line switch as well.

Please no. We have too many command-line switches. If it can be done as a macro, why does it need to be done as a command-line switch as well?

i agree that it would be bad to add this as a command line switch, and preferable to require a function to explicitly state that it is amenable to doing fast-math. for consideration, i think the implementation of
I rewrote the @fastmath macro. If this design looks good, then I'll clean up the patch to remove the other implementation and make it ready to merge. We can switch to the implementation in #8227 once it is available. The core of the patch here is how to talk to LLVM to generate fast math operators, and how to specify which math functions can/should be made fast. How the actual macro is implemented on top of that can be changed later.
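To illustrate the expression-rewriting idea, here is a deliberately simplified sketch, not the code from this PR: walk the parse tree and swap a few arithmetic operators for fast-math counterparts. The Base.FastMath.add_fast etc. names are those of the version that eventually landed; the names and operator coverage in the actual patch may differ.

```julia
# Simplified sketch of the expression-rewriting approach (illustrative only).
const FAST_OPS = Dict(:+ => :(Base.FastMath.add_fast),
                      :- => :(Base.FastMath.sub_fast),
                      :* => :(Base.FastMath.mul_fast),
                      :/ => :(Base.FastMath.div_fast))

rewrite(x) = x   # literals, symbols, etc. pass through unchanged
function rewrite(ex::Expr)
    args = map(rewrite, ex.args)
    if ex.head === :call && haskey(FAST_OPS, ex.args[1])
        return Expr(:call, FAST_OPS[ex.args[1]], args[2:end]...)
    end
    return Expr(ex.head, args...)
end

macro my_fastmath(ex)
    esc(rewrite(ex))
end

# @my_fastmath(a + b * c) expands to
# Base.FastMath.add_fast(a, Base.FastMath.mul_fast(b, c))
```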
Note that the build failures above are caused by an unrelated problem.
The Travis failure was due to trailing whitespace on comment lines. We should probably be doing that check with a separate service so it shows up as a different status item and doesn't preclude running the build and tests. The AppVeyor failure was due to an upgrade of the Git version on the build workers (considering the recent vulnerability) that did not match the previous configuration, so the build script could no longer find MSYS.

Please fix the whitespace so the Travis tests can actually run on this, and preferably do a rebase next time you update the PR so it's fewer commits and easier to review without looking at all 19 changed files at once.
A global switch to turn on fastmath, as provided in many C compilers, is not terribly useful. In this particular case, though, a command-line switch to turn off fastmath globally would be useful for two purposes: without changing the code, the effects of running with and without fastmath can be compared by just starting julia with different command-line options, and it is also useful for testing.
@tkelman I fixed the whitespace and rebased the branch. I don't think rebasing the branch helped. There are now 22 instead of 19 changed files, including a batch of changes to the autogenerated help file.

You seem quite keen on rebasing, and on having a small number of commits -- you asked me twice. I understand that that's good to have when pulling a change back into master. But during review -- does this actually help? I now have to go through the changes above manually and carefully review them, removing those changes that are unrelated and shouldn't be there. I don't think I produced anything that you would find useful. I consider this exercise a waste of time.

It's obvious that I didn't go through the commits above while rebasing, ensuring that all of them built and tested fine. That would have amounted to re-developing everything from scratch. (My original commits did, of course.) It's a fact of life that I make my changes to Julia over the course of time, and not instantaneously. My history will thus always include merges from master, and will look messy. There isn't much worth in rebasing things to make it look as if I made my changes instantaneously. I'm happy to squash my changes before the final pull, but for everything else you have two choices:
You had to resolve the same conflicts that you were earlier resolving via merge commits. Rebasing and merging are two different ways of updating a pull request against master; I'll stop asking if it's more time-consuming for you, but I greatly prefer looking at a pull request that has been updated via rebasing rather than via merging.

When history has a large number of tiny "correct typo" or "undo change" commits, it's more tedious to look at the individual commits and I'll tend not to look at any of them individually; rebasing also gives you an opportunity to squash those so reviewers don't have to look at the insignificant ones. Looking over the entire diff at once can be a bit much to review when the change is nontrivial. If the pull request is a smaller number of mostly-independent commits, then looking at the individual commits is an effective way to break down the change into smaller, individually reviewable pieces. The development history of how you work on the change is not necessarily the most effective way to present the change for others to look at.

But this is a workflow concern that not everyone agrees on, and it is less important than the actual substance here, which seems mostly fine, though in need of more tests to verify the functionality is actually working.
Writing tests for @fastmath is not straightforward.
I think a performance test would be tough. How about tests that verify known expressions, where fast math gives an expected different result? BTW, do you have a specific use case for fastmath? Just curious whether there is one driving this.
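One low-tech option, since constructing a case where reordering provably changes the result is fiddly: check that the fast-math results stay within rounding of the plain results, so at least the rewriting does not break anything. A sketch, assuming the modern Test stdlib (at the time this would have been Base.Test):

```julia
using Test

let x = 1.234, y = 5.678
    # Fast-math results should agree with the plain results up to rounding.
    @test @fastmath(x + y) ≈ x + y
    @test @fastmath(x / y - x * y) ≈ x / y - x * y
    @test @fastmath(sqrt(x^2) + sin(y)) ≈ sqrt(x^2) + sin(y)
end
```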
@tkelman I realize where our workflows differ. If I had started with a branch that contained only fast-math related work, then I could have rebased instead of merged, as you suggested. However, I usually use a master branch that contains several sets of changes, some of them from pull requests. When I implement a new feature, there is thus no "clean start" that would allow this rebasing. I then create a pull-request branch by cherry-picking and then fixing things up. This leads to the messy history.

@ViralBShah My use case is creating efficient code for numerical calculations, such as solving PDEs on a grid. When doing so, re-arranging floating-point operations can greatly improve performance. When I write down the original code, I do not choose any particular order of operations -- I just write code that "looks simple" or that "mimics the equations". Since there is no order that I prefer, I want to give the compiler all the freedom it wants in re-arranging the expressions. This is similar to

Thanks for the explanation. How much faster is fastmath in these cases?
@ViralBShah Here is an example, somewhat contrived, of course, to make the difference visible. Here is the output:

You see (1) that the output does not change much in this case (not at all, but that's just a coincidence), and (2) that the fast-math version is roughly four times faster. Truth be told: I don't understand this factor of four. This is on my laptop, with an Intel CPU with AVX instructions. The only difference I see in the disassembled output is that
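The example code and its output did not survive above; the kind of finite-difference kernel being discussed (and the deriv! function that shows up in the test diff below) looks roughly like the following. The array size and the @time harness are guesses, not the original benchmark:

```julia
# Hypothetical reconstruction of the kind of kernel being benchmarked.
function deriv!(u, du)
    n  = length(u)
    dx = 1.0 / (n - 1)
    @fastmath @inbounds du[1] = (u[2] - u[1]) / dx
    @fastmath @inbounds for i in 2:n-1
        du[i] = (u[i+1] - u[i-1]) / (2 * dx)
    end
    @fastmath @inbounds du[n] = (u[n] - u[n-1]) / dx
    return du
end

u  = rand(10^6)
du = similar(u)
deriv!(u, du)        # warm-up/compile
@time deriv!(u, du)  # compare against a build of deriv! without @fastmath
```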
    @@ -79,22 +80,25 @@ static const char *opts =
    void parse_opts(int *argcp, char ***argvp)
    {
        enum optids { check_bounds = 300,
4 space indent, please: https://github.com/JuliaLang/julia/blob/master/CONTRIBUTING.md#general-formatting-guidelines-for-c-code-contributions

The enum is a good idea though, despite my general dislike of proliferating command-line flags and the number of different places they need to be handled and documented.

I'd much prefer aiming towards being able to globally enable/disable compiler behaviors easily and concisely via Julia code, in addition to locally scoped operation like the current macros have been doing. We're proliferating too many command-line flags, and it's an unfriendly way to talk to the compiler, especially for people who are using Julia through IDEs (likely to grow to be a significant portion of Julia's users as those IDEs get better), who have to go to messy lengths to use command-line flags: http://www.reddit.com/r/Julia/comments/2pz8nj/optimizing_and_using_the_inbounds_macro/cn1l8o8

And it's messy code-wise, keeping global state in C for functionality that is very closely related to Julia code.
I don't care much about what the interface for switching Julia's global state regarding fast-math should be. Switching it at run time is easy to implement -- the compiler essentially has global variables for such state. However, getting good semantics is not trivial, as it may require recompiling all functions, or tagging functions with these flags. Command-line options at least make it obvious that this state is not mutable at run time. In any case, this can easily be added later.
On Tue, Dec 23, 2014 at 2:04 AM, Tony Kelman notifications@github.com wrote:

Can you give me a specific example for how to squash/merge commits, say for this branch? In this discussion, I think we both agree on the advantages of having a clean history. However, if there is a reasonably easy way to reduce the number of commits, I am happy to use it.

-erik
I am reasonably sure that we'll want to squash these commits before merging, but for now, let's have others look at it. Any cleanup can be done close to merge time.
regardless of how you develop, rebase is an incredibly powerful tool for allowing you to present your change-request story exactly how you want. when dealing with git's learning curve, it's also important to realize that every commit is a branch, and similarly, every branch is just a sequence of changes. some of the branches have names, and some of the names have additional metadata, such as which upstream branch the user wants them associated with. but that is all very mutable, and you can quickly name a commit.

github is remarkably forgiving about maintaining review comments across rebases and force pushes. and even if you delete or move the code entirely, it maintains it as a "comment on deleted code". i wouldn't worry about it.

if the file wasn't supposed to be part of the commit, you can simply remove it from that commit.
As much as I'd like to keep the discussion on @fastmath: when you have such a large change, maybe you want to create separate pull requests in the first place? In this case, there are about 200 lines of new code (and very few changes to existing code), plus test cases, plus changes to an auto-generated file. I would be very surprised if that level exceeds the review capability of the average Julia contributor. After all, if you know ...

And are there further comments on @fastmath itself?
I think this is wonderful. It lowers the barrier of entry into Julia, which is great. Loads of users are not going to want to spend a ton of time optimizing code, and the learning curve of mastering multiple macros can slow the adoption of the language. Having something like -O3 in gfortran would help with that.

A --fast-math command-line interface is, I think, an important option; I would like to see the command line be accepted as it traditionally is for compilers. IDEs can adapt to Julia like IPython did with success. As Julia gains in popularity, this should be inevitable.
Looks great, sorry about all the back and forth earlier. I guess merge this first, then #9482 can be rebased to take it into account?

I rebased again to re-trigger AppVeyor.

Would be great if @ArchRobison can take a look.
Force-pushed from 1f97e3f to 4e38b84.

Force-pushed from 7289f72 to 605f065.
ping?

Does this need a rebase after #9732? Otherwise I'll merge.

Looks good to me. Was Julia's domain-checking for
    function deriv!(u, du)
        n = length(u)
        dx = 1.0 / (n-1)
        @fastmath @inbounds du[1] = (u[2] - u[1]) / dx
I'm wondering if it's worth showing the trick of wrapping several statements in a begin ... end pair so that the @fastmath @inbounds only has to be written once.
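For reference, the wrapping being suggested would look something like this sketch (deriv2! is a made-up name, not code from the PR):

```julia
function deriv2!(u, du)
    n  = length(u)
    dx = 1.0 / (n - 1)
    # One annotation covers every statement in the block.
    @fastmath @inbounds begin
        du[1] = (u[2] - u[1]) / dx
        for i in 2:n-1
            du[i] = (u[i+1] - u[i-1]) / (2 * dx)
        end
        du[n] = (u[n] - u[n-1]) / dx
    end
    return du
end
```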
Thanks @ArchRobison. I think you are right that it would be good to update the docs, but we can do that later.

Thanks @eschnett!
Documentation note, for future reference: you want the first form of markup, not the second. The latter is an error in Sphinx.

Fixed in 3c0d8ed
This implements a macro @fastmath that works similar to @inbounds: expressions evaluated this way have LLVM's "fast-math" flag set, which is equivalent to calling clang with the -ffast-math option.

I experimented with two different ways to implement this. Both work, but obviously only one of them should be used; the other should be removed again.

Implementation 1: Math functions such as + and - have fast-math equivalents called add_fast, sub_fast, etc. These have their own intrinsics. A macro @fastmath (easy to implement, but not yet done) would convert math operations to their fast form. Advantage: this gives programmers full control over when to use fast math operations, and when not. Disadvantage: the macro needs to convert a parse tree, which may be slow. I believe @simd goes this way.

Implementation 2: The Julia compiler maintains a global state variable that describes whether fast-math is in effect or not. The @fastmath macro (this is how it is currently implemented) toggles this flag. Advantage: very fast. Disadvantage: uses global state; I can't tell whether the toggling works at the right times. This is equivalent to how @inbounds is working.

Please comment, both on the general idea of having @fastmath (I believe there is consensus that this is a good and necessary thing), as well as on the implementations. I will continue to update the branch with suggestions.
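For readers arriving at this thread later, basic usage of the macro as it was eventually merged looks like the following minimal sketch; the reduction and the names are mine, not from the PR:

```julia
# Sum of squares with fast-math enabled for the loop body only.
function sumsq(xs)
    s = zero(eltype(xs))
    @fastmath for x in xs
        s += x * x   # LLVM may reassociate or vectorize this reduction
    end
    return s
end

sumsq(rand(1000))
```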