Write down what we know about Base.@pure #39954
Is there a typo in the title? Side note: I don't personally like the syntax
Our current algorithm ("ask Jameson") has scaling limits, so this is a welcome improvement!
For reference: #27432 I still maintain my opinion that it should be
base/expr.jl
Outdated
1. A pure function must always return exactly (`===`) the same result for a given input.
If the return is a mutable struct this means it must always return the *same* object.
2. A pure function cannot be extended with new methods after it is called the first time.
3. A pure function cannot recurse (i.e., call itself).
typejoin recurses, and that seems to be okay
Removed the mention of recursion until we are sure under which conditions it is allowed or disallowed.
base/expr.jl
Outdated
This also means a `@pure` function cannot use any global mutable state, including
generic functions. Calls to generic functions depend on method tables which are
The criteria used by Julia to deem a function pure is stricter than the one
used by most other languages, and incorrect `@pure` annotation may introduce
"other languages" is a weasel word that means nothing. Our definition of pure is (currently) weaker than LLVM's, for example.
Removed the unnecessary comment. Now it is just "very strict".
I wonder if instead we should go a slightly different direction here and note that, unlike "other languages", we don't require the user to mark a constexpr to tell the Julia compiler something it already knows? So typically it should not be needed; instead, `@pure` is often used as a means of telling the compiler something it knows is false.
This wasn't true a number of years ago, but when we encounter new cases, we now tend to add them to the compiler instead.
Good thought. I will make that change.
* Removed (at least for now) the mention about recursion. If we are sure it is only allowed in some specific circumstances we should add it back.
* Stopped comparing Julia criteria with other languages.
I am re-editing the inline doc for `@pure`. Half of the changes are to promote clarity so faithful translations are more easily accomplished. Additional edits are for accuracy while keeping these inline docs sounding as other inline docs for macros do in Base. The proposed revision will be available here -- in my next review entry.
`@pure` is an annotation that asserts a simple function
is stateless, branchless, and entirely self-contained.
The compiler will rely on these implicit assertions
to best allow type inference to pass through the function,
and properly connect inferencing into whatever follows.
This macro is not exported. There are few situations
where it would be reasonable to declare one of your own
functions `@pure`. For example, this should not be used
before you have demonstrated that there is an optimization
or a type inference important to your design which is not
occurring, and using the annotation corrects the issue.
Sometimes `@pure` is the most appropriate way to prevent
the compiler from applying some misconstrued transformation.
That use is rarely needed. If you encounter something
similar, please raise an issue to inform the core developers.
A `@pure` function must be very simple and stateless,
and stateless implies the absence of errors
and no throwing of exceptions. Pure functions must be
"self-sufficient" with regard to states and state transitions.
Nothing from outside of the function may induce change
to its activity or influence any internal behavior --
regardless of how inconsequential the influence or how
unnoticeable the behavior.
Given the same argument values,
a pure function always returns the same result.
If the result is a mutable struct, then that very
same instance must be returned given the
same argument values.
Declaring a function `@pure` when it does not meet
the requirements and follow the restrictions below
may introduce compiler errors without warning.
- pure functions always return the same result given the same arguments
  - with the same arguments given
  - returning a mutable object means returning that identical object
- pure functions that use another function must use built-ins only, no generics
  - generic functions use external state (they dispatch through method tables)

    julia> tuple # ok to use inside an `@pure` function
    tuple (built-in function)
    julia> length # do not use inside an `@pure` function
    length (generic function)

- once a pure function has been called, it cannot be extended to support another signature
- ?open for additional pure informatives?
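
One way to explore the built-in versus generic distinction drawn in the list above is to test a function against `Core.Builtin`. This is only a sketch for experimenting at the REPL: the helper name `is_builtin` is hypothetical and not part of Base, and "built-ins only" may well be stricter than what Base itself does.

```julia
# Hypothetical helper (not part of Base): true when `f` is one of the
# compiler's built-in functions rather than a generic function.
is_builtin(f) = f isa Core.Builtin

println(is_builtin(tuple))        # built-in
println(is_builtin(Core.sizeof))  # built-in
println(is_builtin(length))       # generic function, dispatches through a method table
```

A check like this only answers "is this callee a built-in?"; it says nothing about whether a function that calls generic functions is safe to annotate.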
Some comments from a Julia application (not core) programmer:

I stumbled upon `@pure` in a situation with unsatisfactory performance: the compiler did not eliminate a constant subexpression as expected. The doc on `@pure` warned me that it is dangerous, but it did not guide me in correct use, and it did not help bridge the gap between my "common understanding" of a pure function and the semantics and pragmatics of `@pure` in Julia.

The current proposal of JeffreySarnoff addresses practical use questions very well. Concerning the difference from the common understanding, there are open questions. I agree that the formulation "stricter than in other languages" is too vague. What about comparing to the Wikipedia definition? It stresses two criteria: (1) well-definedness and (2) no side effects. From the former discussion I know (2) is the critical part, in particular concerning method tables.

The proposal states very strictly "must use built-ins only, no generics". This restriction assures safety of `@pure` use, but is too restrictive: all `@pure` annotated functions in Julia Base call generic functions (I verified that on the Julia 1.5.3 source). My impression is that it is very difficult to formulate the exact preconditions on calls of generic functions in `@pure` annotated functions.

Here is my draft for the chapter on generic functions (feel free to modify/correct):
@JeffreySarnoff I have no problem in changing the text to what you have suggested, but I really think we need to address the inconsistent usage of Another doubt of mine, cannot you use a
Given two hyperpure functions, one could be used within the other.
[note:
the function Base.fieldnames(Type{NamedTuple}) is not declared to be pure Here is a Base function annotated
- On line 2, the logic reaches outside of the edges of the function itself to bring inside its operation the current value of an external variable, a value that this function itself may alter as an application runs (see next).
- On lines 5 and 9, the logic pushes its internal state outside of itself to update an external precision.
- On line 7, there is a call to a nonpure external method
additional tidbits The most deeply woven pure functions are those that the compiler knows to be pure as soon as it wakes up.
(from ast.md): (a) any function known to be a pure function of its arguments (b) [struct MethodResultPure is used to] represent [that] a method result constant pure functions inside
I have read this old discussion on the purity of Rational but to me it does not seem to reach a clear conclusion. Is it a bug that Rational ignores future changes to
Well -- @vtjnash is this (and other) season's final arbiter of which uses of I have collected all appearances of @pure that I found in Julia's github latest. There could be some opportunity to prune and replant more cohesively -- otoh, what was declared pure was done purposefully. I will clusterify those uses so we have a place from which to take the long view.
I find it easier to start with constructive descriptions -- Are there specific uses or recognized situations in which use of these macros is encouraged? Looking around Base, it appears that functions using (gentle segue) The functions that are declared
What does "extended" mean here?
If the code above is a counterexample demonstrating invalid use of `@pure`, I think the consequence is that a `@pure` function must not be defined with free type parameters in its signature like N, T in the example. Maybe `@nospecialize` could cure that: my understanding is that `@nospecialize` tells the compiler not to generate methods with concrete types in its signature every time it encounters a new signature of concrete types, but to compile exactly one method which has exactly the signature with abstract types and restrictions as stated in the function definition, at the price of adding type information to actual parameters in every call of the function at runtime.
Strictly speaking, [almost] none of the "rules" is a rule. Until we have completed this foray to know what's what, our proposed guidelines for effective and safe assertions of functional purity are virtually unreadable. For the moment, we are better served by considering current guidance as best-practice to avoid bad practice. There is no requirement that breaking current guidance must cause havoc, nor that havoc caused be havoc revealed.
for the docs: Some strongly disrecommended uses of
In the first place, I think the (pre-)requirements for applying In other words, the documentation has two purposes: to prevent inadvertent misuse of Edit:
Thank you for cutting through the foliage. Your point is well taken -- we must do both, and we certainly may do one and then the other. Our interaction has led us to ask better questions. Respecting the process, let's ask them. @vtjnash we have better questions and an improved strategy.
(a) clear and direct guidance that protects Julia's runtime integrity from a developer's misunderstanding.
(b) tighter, more pithy hand-holding for developers who would use
(c) When the following occurs and is likely fixable with the use of
I am wondering what the practical utility of documenting
In other words, you should not be using it in a package unless you follow the development of the language very closely to catch changes, and/or are willing to pin to a specific version of Julia. It is documented to a sufficient level already for internal use.
I think this is provably false: basically all uses inside
I have to admit, however, that I feel like the text should start with a disclaimer saying if you use it outside of
I am not so sure about this: quite a few of them seem valid (there are around 20 in total in I am wondering if in fact
Well, you can point them to @rryi, who said above:
The emphasis on "all" is not mine.
To be fair, the current documentation was not added by one of the persons that regularly work on the compiler. Which kind of proves the point: adding a bunch of docs to this part of the compiler internals demonstrably causes more confusion than assistance. Personally, I think the current docs should just be removed since they are wrong, and wrong docs are worse than no docs. Edit: #40092
Simple grepping for
Base.jl:148:@pure sizeof(s::String) = Core.sizeof(s)
strings/string.jl:98:@pure ncodeunits(s::String) = Core.sizeof(s)
which I guess should be OK. In any case, if a usage of But I think that the sufficient conditions enumerated above are too strong: some code in (incidentally, please let's try to quote
If we deem that it is impracticable to give someone without years of experience in the compiler a short/clean guideline on how to use
@rryi probably has confused
If this is the case, this is,
It already says it is for internal compiler use. #40092 gets rid of all the guessing.
I guess I will wait to see what the core developers think is the best route before putting more effort into this PR: keeping it documented enough just for nobody outside the compiler development to use it, or trying to give a usage guideline.
I think the truth of the matter is that there are very few uses of
and for when the compiler is inferring (or deferring) incorrectly where the insertion of
For people arriving at this thread now, the decision leaned in favor of "Keeping it documented enough just for nobody outside the compiler development to use it" and it was done in PR #40092; some annotations of pure are recently being removed by #40097. And it seems clear that the decision of marking a function as
The discussion about the correct usage of `@pure` is not new. A search on Discourse brings up many such discussions over the years. Unfortunately, the knowledge of the exact criteria for (safely) using `@pure` is inside the minds of a few, and I believe it would be good if anyone could just point to the docs (or, even better, if the interested person arrived at the docs and did not open a thread on Discourse) instead of the current continuous overhead for these same members of the community. The last such discussion is this one, and it sparked this PR.

Note this PR does not propose a new name for `@pure`; this can be done afterwards by another PR. The focus is making the criteria for safely using `@pure` as clear as possible.

The initial commit already compiles some common advice found on Discourse. However, I think it would be important to answer the questions brought up by me here. I think it is important because the only criterion the current `@pure` documentation established (and insisted on) is no generic functions. However, a quick search of `Base` shows `@pure` functions using some generic functions like `+`, `max`, and so on. This leaves users confused: "Is generic being used here as the opposite of built-in, or does it mean something else? Are some generic functions basic enough that they can be used? If I am 100% sure of the method being called, is it ok? Is the problem not generic functions themselves but their being extended with new methods? Is any extension a problem, or just the ones that may change the selected method inside a `@pure` function?" As it is very hard to write a useful function without any generic functions, and `Base` examples already break the rule, users for whom `@pure` has shown a positive effect insist on knowing whether their use was safe or not; or, even if it is not safe, what conditions may trigger incorrect behaviour and what exactly such incorrect behaviour would consist of.
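
The question about method extension can be illustrated without `@pure` at all. The sketch below uses two hypothetical functions, `inner` and `outer` (neither appears in Base or in this PR): adding a more specific method to `inner` changes what `outer` returns. Ordinary functions are recompiled when this happens; the worry raised in this discussion is that an incorrectly annotated `@pure` function may not pick up such a change.

```julia
# Hypothetical functions, for illustration only.
inner(x) = 1              # generic fallback method
outer(x) = inner(x) + 1   # result depends on whichever `inner` method dispatch selects

before = outer(1)         # dispatches to inner(::Any)

inner(x::Int) = 10        # extending `inner` changes dispatch for Int arguments

after = outer(1)          # dispatches to inner(::Int)

println((before, after))  # (2, 11)
```

This is run as top-level statements, so each call sees the newest method table; it does not show which extensions, if any, are safe for a `@pure` function.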