Reviving proper tail calls? #23
Comments
Since code using proper tail calls to avoid blowing the stack isn’t web-compatible, and would only work in Safari - which would imply that nobody is shipping it - how would you expect to see complaints about the implementation? |
@ljharb Tail call optimization applies even if the programmer didn't intend for it to apply. So let's say somebody debugged their code in Safari, and their debugging experience was fine (despite the unintended tail call optimization). Obviously they would not complain. On the flip side, if somebody is debugging code in Safari and they experience weird behavior (such as call stacks being missing), then they would complain! So the lack of complaints is (a little bit of) evidence that tail call optimization does not harm the debugging experience, and thus is acceptable to implement in other browsers as well. |
I'd argue that, as far as debugging is concerned, accidentally optimized tail calls are the only ones that matter. People who use them deliberately would not complain, they know what they are doing. A third case would be someone who relies on a 3rd party library that uses tail calls internally. It may make it harder to debug said lib if it were buggy. If it is a real problem in practice, people will avoid such libs and give them a bad name (I'd be surprised if it happened though.) |
Another thing I see raised against proper tail calls is perf issues. They can be made fast in LuaJIT, where the equivalent of
function _while(condition, body) {
  if (condition() && !body()) { // return true in the body to break
    return _while(condition, body);
  }
}
let a = 0;
_while(() => a < 1000000000, () => { a++; });
console.log(a);
is as fast as a plain
let a = 0;
while (a < 1000000000) a++;
console.log(a);
(and that has been the case since 2010). What does JS have that makes the code above impossible to optimize? Edit: FWIW, Safari is almost as good as LuaJIT here.
Putting both benchmarks in the same tick has the plain while loop slower than the functional version.
Edit2: well LuaJIT has both versions at 150 ms, actually :-) But still, the overhead of the functional version over the plain loop is rather small in Safari. |
@ljharb as I've pointed out multiple times, including in #4, tail calls are not web-incompatible. I have code shipped in production (that I wrote years ago) using that "adaptive" pattern, which ostensibly is working fine (as far as I know anyway) in non-PTC browsers, and in Safari has "progressively enhanced" to using tail calls. Whether that pattern is common or not is a separate discussion. It is in fact web compatible and should therefore not be continually dismissed on such grounds. |
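For readers unfamiliar with the "adaptive" pattern mentioned above, here is a minimal sketch of the general idea. It is not the production code referred to in the comment. The names `sumRec`, `sumLoop` and `sumTo` are hypothetical, and the sketch assumes that the only exception the recursive version can raise is the engine's stack-exhaustion error (a RangeError in V8; other engines may use a different error type).

```javascript
"use strict"; // PTC, where implemented, only applies to strict-mode code

// Tail-recursive version: the recursive call is in tail position.
function sumRec(n, acc = 0) {
  if (n === 0) return acc;
  return sumRec(n - 1, acc + n);
}

// Loop-based fallback for engines that do not implement PTC.
function sumLoop(n) {
  let acc = 0;
  for (let i = 1; i <= n; i++) acc += i;
  return acc;
}

// Try the tail-recursive version first; fall back if the stack overflows.
function sumTo(n) {
  try {
    return sumRec(n);
  } catch (err) {
    // Assumption: the only thing that can throw here is stack exhaustion.
    return sumLoop(n);
  }
}

console.log(sumTo(10)); // prints 55
```

On an engine with PTC the recursive path completes for any `n`; elsewhere large inputs take the fallback path, so the result is the same either way.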
Also, @Pauan is correct here:
There must be tons of code (especially non-recursive) out in the wild that is unintentionally being tail-call'd by Safari. One would have to assume that if tail calls were such a problem (either for performance or debugging), then surely by now enough of a complaint would have been raised in the community as backlash against Safari, we'd certainly know it. I'm not aware of any such backlash against Safari, and they clearly haven't changed their minds to back it out. More than a year in production strongly suggests the resistance to PTC was at least partially FUD and not grounded in fact. |
@getify you phrased it better than I did, and I too think that the resistance is misguided, but I'd avoid throwing around terms like "FUD" (which I associate with wilful disinformation); it will not help our cause. |
@getify your code is not "web compatible PTC code". It's specifically web-incompatible PTC code, combined with a fallback mechanism to the true web-compatible code. By saying PTC is not web compatible, I'm not saying that you can't write code that utilizes PTC. I'm saying that PTC code without a fallback is web incompatible. The resistance to PTC is not solely based on debugging concerns; there's also the education aspect of implicit tail calls having magic behavior, but also, there's implementation concerns, where at least one of the major browsers is unable to implement PTC at all. |
@ljharb seems like nitpicking on words. you used "not web compatible" as an excuse to dismiss the possibility that anyone may have deployed PTC code (or something demonstrably like it) and thus that was the reason for "silence" in terms of complaints. Not only is my code pattern a counter-example that excepts that reasoning, but you also don't account for web code that's affirmatively using PTC (for recursion or otherwise) but is deployed in a safari-only way, such as in web views in iOS apps. I still maintain your reasoning is faulty, no matter what label we bikeshed on to label it. |
@ljharb I'm surprised that the architecture of one of the VMs is incompatible with PTC. I'm going to suppose it is Edge we're talking about, since Brendan Eich had started working on the Firefox implementation (I assume he had a general idea of the path forward when he wrote the parser patch), and V8 had it done before it was unshipped. |
TL;DR: +1, but with some more details:
|
Can it be implemented the other way around? Use proper TCO by default, and in case of greater need use some explicit keyword/expression/directive or whatever else. The "education aspect" just makes me laugh: in that case, we could never add anything to the standard because people would need to learn something, and we should never create any new tools, libraries, frameworks, etc., because it takes effort to learn them! In reality, anyone who is faced with a "broken" call stack will spend 30 seconds reading the first entry from a Google search and will find a solved question on StackOverflow with lots of links, and people who don't like to learn anything will still make mistakes like they do right now with any other language feature. "Some people won't like to learn something new" is not a valid reason to take into consideration. The only reason that can be valid is the complexity of implementation in browser engines. But if that's not an issue, all other issues can be solved without introducing an explicit "optimize me please" keyword. |
I've thought about a similar path, too. For example, for any libs that want to collect (error).stack in the background, we could have a static boolean property like Error.stack to opt-in to full stack traces. This isn't really a deopt like stopping tail calls, but just indicating to the engine that the shadow stackframes should be collected. If it's not set, it's up to the engine if they want to collect, but if it's set the engine knows the page's code wants that data if possible so it should collect if it can. For the open devtools use case, it could automatically set that same flag to trigger the same behavior. |
TCO is just an optimization, stacktraces which are smaller in size than the limit of a stack in browsers which is usually few thousand may not be optimized at all, the optimization should imho trigger when the stack reaches this limit. At this point stack would be overflown anyway with exception. Usecases where TCO is useful are when code is running recursion over huge number of function calls something like billions or possibly infinity. |
@Odrinwhite disagree. PTC (not necessarily TCO) is important even without recursion. Recursion is an obvious and acute use case but certainly not the only one. I want (in FP) to make multiple wrapping layers of functions on top of functions, preserving parameters at each level, via closure, and I don't want there to be a deeper call stack as a result. In that perspective, function wrapping calls being exposed in the call stack is only extra noise for debugging and is thus a leaky implementation. |
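A minimal sketch of the non-recursive case described above, using hypothetical helpers `wrap` and `wrapTimes`: each wrapper only forwards its arguments, so its call to the inner function is in tail position. On an engine with PTC, the wrapper frames do not need to stay on the stack while the innermost function runs; on other engines each layer adds a frame.

```javascript
"use strict";

// Hypothetical helper: returns a wrapper whose only job is to forward
// its arguments. The call to `fn` is in tail position.
function wrap(fn) {
  return function wrapped(...args) {
    return fn(...args);
  };
}

// Applies `wrap` the given number of times.
function wrapTimes(fn, layers) {
  let wrapped = fn;
  for (let i = 0; i < layers; i++) wrapped = wrap(wrapped);
  return wrapped;
}

const add = (a, b) => a + b;
const wrappedAdd = wrapTimes(add, 8); // one conceptual call, nine actual calls

console.log(wrappedAdd(2, 3)); // prints 5
```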
@Odrinwhite The strategy of "wait until the stack overflows and then activate TCO" is used in Chicken Scheme (and other compilers as well). It is a valid way of implementing TCO, so browsers can use that implementation strategy if they wish. But that has nothing to do with the spec: the spec does not specify how browsers achieve TCO, it merely mandates that they use some method of achieving TCO. @getify That can be easily handled by having the browser record which frames are tail-call frames, so it can hide them from the debugger. So there's nothing stopping browsers from implementing the Chicken Scheme style of TCO. I doubt that browsers will actually do that, since there are downsides to it, but they could if they wanted to. |
@Pauan of course, I don't only care about the debugger output... that was just one convenient illustration of a non-recursion use case. What I actually want is all those wrapped FP functions to not actually cost function calls. And BTW, "cost" is not as much wanting to save CPU cycles in those calls, but wanting to minimize all the memory/GC churn of all those extra stack frame allocations/deallocations. IMO the entire theory of FP rests on being able to assume there's no "penalty" (memory or otherwise) for an extra wrapping layer of function. In JS we just whistle along, merrily wrapping all these calls, and just winking and nodding to pretend it's OK... that what is conceptually one function call is actually 9 calls... because the cost of the extra 8 calls is generally still small so who cares? But as soon as you wire up such a wrapped abstraction as a transducer, and hook that up to a continual stream of data events, each one firing every few hundred milliseconds, and run that on a user's mobile device for hours on end, you start wondering if FP is actually worth all that memory burn. I'm just trying to ensure it's on record, because it seems all too easily lost/ignored/overlooked in these discussions, that the need for PTC goes well beyond recursion. You can "solve" the lack of tail call recursion with a trampoline and even a code transform or macro... but you can't prevent all my extra function wrappings in FP from churning memory... unless you give me PTC. Sorry I'm being so persistent/insistent in this thread. It's frankly quite frustrating to have spent basically 5 years campaigning for this vital feature and yet people still casually discount it as niche or suggest narrow work-arounds that aren't relevant to the broader perspective. |
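For reference, the trampoline work-around mentioned above can be sketched as follows. The names `trampoline`, `countDownStep` and `countDown` are hypothetical, and the sketch assumes the final result is never itself a function (otherwise the loop could not tell a result from a pending step).

```javascript
// Runs a function that returns either a final value or a zero-argument
// thunk describing the next step; loops until a non-function is returned.
function trampoline(fn) {
  return function trampolined(...args) {
    let result = fn(...args);
    while (typeof result === "function") {
      result = result();
    }
    return result;
  };
}

// Instead of calling itself directly, the step function returns a thunk.
function countDownStep(n) {
  if (n === 0) return "done";
  return () => countDownStep(n - 1);
}

const countDown = trampoline(countDownStep);
console.log(countDown(1000000)); // prints "done"
```

The stack stays shallow because each step returns to the loop before the next one starts, but every step allocates a closure, which is the kind of memory churn the comment above objects to.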
@getify Don't get me wrong I would like to see your version implemented but I think the argument that it breaks the web may be valid here for some people. And not breaking web was a big deal in designing ES6 so my proposition is to get the foot into the doors and allow people write TCO in some code which may not need to be very efficient but still fun to do. And with more people trying and using this style of programming a better or more aggressive solution can be established. |
@Odrinwhite proper TCO doesn't break the web itself. Only some ways of implementation break it, and only in some cases. So instead of rejecting standard, it's better to find other ways of implementation, which will be acceptable. One proposal was to give developer possibility to deoptimize code on demand and see a proper stack trace. We will have proper TCO, developers will be able to check stack trace. Why not to go this way? |
That is not guaranteed by the spec, the spec only guarantees amortized behavior. In general the spec doesn't say much about performance. Also, stack allocations/deallocations are not the same as heap allocations: stack allocations are not managed by the garbage collector, and they are generally free (zero-cost). There is no performance penalty to using the stack. Things get more complicated with JS, because browsers might have shadow stacks, or put additional information onto the stack/heap, etc. But the general principles of stack vs heap still apply. Also, trying to guarantee no-extra-stack-frames-ever can actually make the performance worse, not better. Performance is complicated.
That's not how Chicken Scheme works. It doesn't actually pay any performance penalty for the function calls. It doesn't say "oh we'll create extra stack frames because the cost is low". Instead it says "we do normal function calls without any performance penalty, but then once in a while (when the stack overflows) we have to do garbage collection of the stack (which is slow, but not as slow as heap garbage collection)". It's similar to how vectors/arrays work: when you insert elements into an array, most of the time it is extremely fast. But sometimes it has to resize the array, which is very slow. The idea is that because the resizing happens very rarely (usually exponentially rarely), it "averages out" to be very fast in practice. The same principle applies to hash tables. Amortized performance is extremely common.
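The array analogy above can be made concrete with a small sketch. It only illustrates amortized cost for a doubling array; it says nothing about how Chicken Scheme or any JS engine manages its stack. The class `GrowableArray` and its `copies` counter are hypothetical.

```javascript
// Hypothetical growable array that doubles its capacity when full and
// counts how many element copies all the resizes cost in total.
class GrowableArray {
  constructor() {
    this.capacity = 1;
    this.length = 0;
    this.storage = new Array(this.capacity);
    this.copies = 0;
  }

  push(value) {
    if (this.length === this.capacity) {
      // Rare, slow path: allocate twice the space and copy everything over.
      this.capacity *= 2;
      const bigger = new Array(this.capacity);
      for (let i = 0; i < this.length; i++) {
        bigger[i] = this.storage[i];
        this.copies++;
      }
      this.storage = bigger;
    }
    // Common, fast path: write into free space.
    this.storage[this.length++] = value;
  }
}

const arr = new GrowableArray();
for (let i = 0; i < 1024; i++) arr.push(i);
console.log(arr.copies); // prints 1023
```

1024 pushes cost 1023 copies in total (1 + 2 + 4 + ... + 512), so the average cost per push stays below one copy even though individual resizes are expensive.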
I agree with that: PTC is not merely an optimization, it is a feature of the language which enables new kinds of programs to be written (including, but not limited to: state machines, CPS transforming compilers, new flow control constructs, call-with-continuation, async code, etc.)
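As one small illustration of the state-machine case, here is a sketch in which each state is a function and each transition is a call in tail position. The machine and the names `evenState`, `oddState` and `hasEvenOnes` are hypothetical. With PTC the stack depth stays constant regardless of input length; without it, the depth grows with the input.

```javascript
"use strict";

// Two-state machine that checks whether a string of "0"/"1" characters
// contains an even number of "1"s. Both branches of each conditional
// expression are calls in tail position.
function evenState(input, i) {
  if (i === input.length) return true;
  return input[i] === "1" ? oddState(input, i + 1) : evenState(input, i + 1);
}

function oddState(input, i) {
  if (i === input.length) return false;
  return input[i] === "1" ? evenState(input, i + 1) : oddState(input, i + 1);
}

function hasEvenOnes(input) {
  return evenState(input, 0);
}

console.log(hasEvenOnes("1001")); // prints true
console.log(hasEvenOnes("1101")); // prints false
```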
As far as I can tell, TCO/PTC does not break the web, as indicated by Safari. The resistance to PTC is not about breaking the web, it is because PTC makes the JS engines more complicated, and it possibly has some performance penalties as well. |
It seems to me that it will be helpful to clarify the purpose of tail call elimination: the main goal is not to reduce runtime, but to reduce memory use --- and the actual stack frames are not the main concern there. When I write this code:
then having PTC means that when [This is the main thing that Guido missed (or more likely chose to miss) in his shutting down of tail calls --- the example he uses makes it a requirement to keep It's also interesting to see Guido's initial bottom line, "After all TRE only addresses recursion that can easily be replaced by a loop", is something that is clearly questionable to modern JS programmers (for example, trying to convert a tail-recursive promise loop into a |
Here is one possibility for developers to use fast tail calls today, without needing extra keywords in JavaScript: https://github.com/glathoud/fext |
recent jslint versions have 2 tail-call "bugs" where it raises RangeError in certain cases in v8/nodejs/chrome (but not safari) [1], [2]. these bugs were closed by the author as "wont fix" (under assumption bugs will go away when engines eventually fall in line with ptc-spec). will that assumption happen in the foreseeable future? yes/no/maybe? an authoritative answer would provide clarity/guidance to development-roadmap of jslint. [1] jslint-issue - RangeError when linting excessive leading-newlines [2] jslint-issue - RangeError when linting json with large string-values |
oh also, firefox doesn't seem to have jslint tail-call issues under normal use-cases - because it has a ridiculously large callstack-depth limit (~100,000 vs. chrome's measly ~12,000). you can get these numbers by running the script below in the respective browser's console. maybe a happy-medium/short-term solution is for v8/chakra to raise their callstack depth-limit to the same level as firefox?
function computeMaxCallStackSize() {
  try {
    return 1 + computeMaxCallStackSize();
  } catch (e) {
    // Call stack overflow
    return 1;
  }
}
computeMaxCallStackSize() |
Why would you use jslint, though. |
@slikts because eslint is too slow on the ultra-portable laptop i use while backpacking. javascript product-development should not follow java's model where the only reason you need powerful laptops (with poor battery-life) is because of bloated-tooling. also i don't have strong feelings about PTC either way. i'm just unhappy that jslint is currently in a state that's not ideal for production-use, and wish its author and nodejs/tc39 can come to some compromise to resolve that. raising v8's stackcall-limit to firefox's seems like something reasonable to me. |
This repo isn’t the place to complain about, or seek change in, any of those things. jslint’s author by their own admission doesn’t care about web reality, only what’s in the spec; file a bug on Firefox if you want them to implement something; use something other than eslint if you like, just be aware that eslint is the de facto standard for “product-development” in the entire JS community. While PTC is in the spec, STC (this proposal) is untenable, and there’s nothing of use to discuss here. |
@ljharb, my understanding is this thread is about PTC (not STC), and i'm bringing up a PTC issue in the wild. |
The thread is also not appropriate for this repo, since this entire repo is about removing PTC and replacing it with STC. You’ll have to take it up with each engine that has chosen not to ship PTC if you want PTC “revived”, or with the one engine that has if you want it removed. |
That is not actually what's required. Here's the relevant part of the spec:
First, this doesn't require elimination of a frame. It only talks about freeing up resources. That's a subtle but important difference, which implementations like Safari seem to have taken to allow things like the Shadow Stack. But further, I maintain that rather than being concerned about the letter of the law, the spirit of the law is the most important thing here. What the spec is getting at is less that there must strictly not be any extra resource allocation, but rather that the allocation growth must be O(1) instead of O(n). I would like TC39 to reword this section in this respect, as it would clear the way for lots of other creative ways of accomplishing the bigger goal of tail-calls, which is that I could make any depth of call stack in my program without fear of running my device out of memory. That's why people want tail calls, not that they explicitly want every single non-essential stack frame thrown away. |
python's creator [in]famously rejected TCO in a 2006 blog-post [1]: "This is also the reason why Python will never have continuations, and even why I'm uninterested in optimizing tail recursion." it caused some brouhaha back in the day (similar to what's going on in this thread) [2]. [1] original-quote from python-creator's 2006 blog-post "Language Design Is Not Just Solving Puzzles" [2] backlash to [1] on [Python-Dev] mailing-list [3] python-creator's response to backlash (with further community "feedback" in the comments-section) |
@kaizhu256: it might be that some promises are forced in async-ly in a different tick and therefore there is no problem with tail calls, but that might not always be the case. For example, this code (or some similar variant):
should be a loop even though there's no such deferring. But more generally, in the simple case of
I'm relying on @concavelenz: (1) yes, that was the point I made later: losing bindings from tail-calles is the feature of reducing resource use; (2) if there are "ways to record context", then wonderful -- tail calls could use the same tools. For (3), quite the opposite. The statistical sampling fits perfectly with tail calls, if Finally, in those @concavelenz / @kaizhu256: yes, it's that mess that I'm referring to, but IIRC the relevant post had some "final" in its title and that's when Guido dropped a concrete lid on it. |
@getify I fully understand why folks want tail recursion. The text requires that everything be reused or released, so that nothing remains of the stack frame. The implementation of the shadow stack has nothing to do with that text. @elibarzilay RE: (2) recording the context for every function that has a call in the tail position is impractical, as it involves recording the stack trace, which is expensive. Doing this for timers, which are far rarer, is sometimes reasonable. RE: (3) You imply that you only care about the runtime of a function, but in my experience how you reach the function (why it was invoked at all) is usually as important as the cost of the function proper. |
Well, you brought it up... It can be expensive, but you have mentioned the optional shadow stack thing... And in that context I obviously cared only about runtime, since I was talking about one common concern about the usability of flame graphs. (And further, in a statistical profiler context you generally don't care about precise information which is why the random stack polling works fine.) The "why it was invoked" can be addressed using tools that were mentioned many times in this thread; tools that can be used for the sake of skeptics who don't believe functional programmers who say that in practice this is not a problem. (Discussing why it's not a problem in practice will further derail this thread so I'll avoid doing that.) |
It is very unfortunate that there is nothing we (programmers) can do except wait. But I still want to share my thoughts about PTC vs STC.
Example:
function f() {
  if (x > max) return
  ++x
  return f()
}
This is valid PTC code. If we introduce the consistent-return rule, ESLint will complain. How to fix? Someone will just change |
'consistent-return' seems a little misguided here. Proper type checking reveals that the types remain consistent in your example. I regularly notice when my code would benefit from TCO, and it's disappointing to know that there's this misguided resistance to it. There's a long history to TCO, and its absence in JS is a serious deficiency. |
What's more, the technical reason for not adding them (the ChakraCore engine having the wrong calling conventions hardwired) is soon going away since MS is co-opting Chromium. The other technical problem isn't really one (cross-realm calls can't be eliminated in Firefox) since that can be spec'ed around. |
Last I heard, ChakraCore will continue to exist as a JS engine for use in any number of projects not including a browser, so that point doesn't quite work out that way. |
Once PTC starts to spread in practice due to its actual availability, and it eventually becomes a problem to identify TCs in code, IDEs will very quickly kick in with syntax highlight help - just as they do with dead code by fading it. |
@sarimarton Not everyone in a team uses the same IDE/editor. And lightweight editors and tools may not identify TCs easily without a fully parsed AST. |
i write code exclusively in vim (and use a customized version of jslint that allows ignoring foreign code-blocks) |
I have a feeling that we might be able to make more progress on explicit tail call syntax in JavaScript. If everyone is able to think through and agree on a syntax, I think it'd be worth bringing back to TC39 to see if we can get consensus on pursuing this proposal. |
@sarimarton, yes -- identifying tail calls is in most cases extremely simple, almost to the point of regexp-ing it. With a parser it becomes straightforward. @littledan, the problem is not in agreeing on a syntax -- the real problem is the fact that "opt-in" syntax is going to nullify much of the point of tail calls. See the last bullet in my original post above. |
Well, if we can't agree on a syntax, I imagine the current situation will remain as is. |
@littledan, please see that comment that I pointed to: the problem is not the syntax, it's the fact that you'll need to modify code to "opt into" a tail call, which would lead to situations like what I describe there. (IOW, this is an objection to any explicit syntax for tail calls.) |
I'm still not clear on why that's an obstacle - libraries needing to change their code is a temporary problem, as new code would theoretically be written with the explicit syntax where desired. |
I'll try to explain it with an actual (but very simple) example.
In a theoretical world where none of these reasons hold, everyone would eventually add The only "real" answer to that is to make it easier to verify that what you think is a tail call, is indeed one. IMO, this makes exactly the same sense that the "capture clause" in C++ lambdas does. And for all I know, you might like that feature (maybe because you're doing C++ at google). But like I said previously, in the current non-theoretical world, the much likelier result is that tail calls will not be widely used (if only because of the natural bias against longer code). |
You could also imagine the syntax enabling proper tail calls for the entire stack it generates - avoiding the need to add the keyword all the way down. |
That sounds interesting, but it's not what's suggested in this proposal... (Sidenote: I imagine that you're talking about something that happens at the beginning of the stack since going the other way is impossible. Something that I thought could work is some |
Enabling the entire stack it generates? Why not go other way and have PTC but provide unfolded stack trace when required? So my app will have PTC, but if i'll struggle with something, I will be able to enforce unfolded stack trace, debug my app, and disable it back again? But having ETC keyword which enables PTC for stack trace is fine enough if I will be able to do it once at the root of my app (e.g. in index.js) and work with PTC everywhere. Anyway, explicit TC, which I will be able to use only once at the top of app will be good thing, but have it explicit everywhere will be much much worse |
I prefer "implicit PTC" (current spec) but with revised wording that would widen the possibilities for various engine implementations, as discussed earlier in this thread. However, it seems almost impossible for that to ever happen from the current state. The years-long stalemate is bad for JS and for engines. As for whether TC39 could remove PTC without consensus, I have changed my mind from earlier positions: I now strongly feel that TC39 can, and should, remove PTC from the spec. This needs to happen first before any possible progress on tail calls can happen. The webkit folks don't have standing to object to the removal, as this doesn't betray or invalidate their implementation at all. Had PTC never been added, I believe webkit could have added this "feature" as an optimization without being in spec violation. So even if the spec part is removed, that doesn't harm webkit. I believe the editor of the spec should make an executive decision that any spec'd feature (or typo or bug or wording or ..) which is later openly defied by implementation(s) -- irreconcilably so -- is a post facto veto, and therefore can and must be removed, retroactively stricken from the standard. I imagine there may be some "legal" objection (ECMA, etc) to that, so in effect the next best thing is just to inline the change in the next spec revision without need for consensus. In any case, we need to do a reset on this topic so we can then have productive discussion about any possible path forward. |
That is not within the power of an editor - every delegate has standing to block anything for any reason, including removing PTC from the spec. Consensus is required for normative changes, which is why we’re in the stalemate we’re now in. I agree that no progress can be made until PTC is removed. |
I just do not understand why we cannot have an important programming feature like TCO (not the syntactic proposal, but natural, as it is supposed to be) due to claims that it has negative effects on some things, claims which do not have much ground, as we have seen in different threads so far.
Shortly, for debugging I can say; |
Gilad Bracha has shared some advice on how to have your cake and eat it too with PTC. Stack Compression, Heap Allocation, Ring Buffers, etc. https://gbracha.blogspot.com/2009/12/chased-by-ones-own-tail.html |
TC39 made a mistake in naming the feature 'Proper Tail Calls'. Lots of coders have no interest in propriety. It should have been called 'Tail Call Optimization'. Coders will jump through all sorts of hoops to optimize, even when that work has no observable benefit. The only change TC39 should make is to correct the name of the feature. Meanwhile, all JavaScript engines should implement it in order to be called Standards Compliant, and debuggers should be enhanced to mitigate the debugging experience. |
Standards document reality, they don’t dictate it, and the proper thing to do would be to remove PTC from the spec, and attempt to reintroduce it through the modern proposal process. |
I asserted years ago that TC39 could/should adopt the position that any feature in the spec which is willfully violated by the majority of implementations is de facto already voted down, and should thus be removed/demoted (back to stage 2 or 3) as an editorial change, requiring no further consensus vote. Even if webkit still wants to keep its PTC implementation, they should be allowed to do, because there's no current assertion being made that the spec would include a prohibition on PTC. It could/should just be an experimental feature that webkit implemented that distinguishes them from the other engines. Therefore, it doesn't require a consensus vote to remove the specification of the feature. If PTC/TCO was demoted to stage 2 (or even 1), it would (as @ljharb suggests) be given the time to go through a more rigorous exploration and specification cycle. If it ever reached stage 4 again, it would certainly have done so after convincing all of its detractors, and we'd be much more likely to get actual implementations of it (beyond webkit). Of course, webkit could still veto any substantial design changes (such as STC) while it was in those earlier stages. But that should be a healthy debate process that's allowed to happen, instead of the stalemate we currently have. Side note 1: I would feel differently if two or more implementations had a current shipping implementation... in that case, I would argue that TC39 must at a minimum move the feature specification into appendix B, in the spirit of documenting web reality even if the specification doesn't necessarily endorse some feature or behavior as a first-class citizen. Side note 2: I still absolutely strongly feel JS should have PTC (or TCO -- yes, I'm agreeing with Doug). My assertion above should not be construed as being against the feature, but rather against the (IMO) unacceptable indefinite standoff and willful violation by most engines. |
|
Hi all,
as much as some people love to hate Safari ("the new IE", yadda, yadda), I've yet to see a complaint about their support for (implicit) proper tail calls in the wild. The debugging problems mentioned in this article seem to be non-issues in practice.
Maybe it is time to revisit the situation?