Consider removing `--check-bounds=no`? #48245

Keno · 2023-01-11T23:47:20Z

I think we should consider removing the --check-bounds=no option.
I can't really think of any situation in which it would be safe or sensible to turn it on.
If you really must have something like it, a Cassette pass that disables bounds checking
in a particular region of code could achieve any residual benefits without being as massive a footgun.
Worse, turning on --check-bounds=no on current master can actually result in significantly
worse performance because it removes inference's ability to do concrete evaluation (constant folding).
Note that we can make the option a no-op in a minor release, since throwing a BoundsError
is allowable undefined behavior.

EDIT: note for future readers that the recommended replacement is to add @inbounds to code that @profile analysis shows to benefit from this flag.
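A minimal sketch of that workflow, assuming a hypothetical hot loop that profiling has singled out. The function names `sum_squares` and `sum_squares_inbounds` and the input data are illustrative and do not come from the issue; in a loop this simple the compiler may already elide the check on its own.

```julia
using Profile

# Hypothetical kernel: baseline version with bounds checks left on.
function sum_squares(xs::Vector{Float64})
    total = 0.0
    for i in eachindex(xs)
        total += xs[i]^2
    end
    return total
end

# Same kernel after profiling showed it to be hot. `eachindex(xs)` only yields
# valid indices, so `@inbounds` is justified by local reasoning alone.
function sum_squares_inbounds(xs::Vector{Float64})
    total = 0.0
    @inbounds for i in eachindex(xs)
        total += xs[i]^2
    end
    return total
end

xs = rand(10_000)
@profile sum_squares(xs)   # collect samples; inspect them with Profile.print()
result_checked = sum_squares(xs)
result_inbounds = sum_squares_inbounds(xs)
```

Both versions return the same value; only the annotated one tells the compiler it may skip the checks, and the annotation stays confined to the one loop that was audited.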

matthias314 · 2023-01-14T00:22:20Z

Here is a case where I regularly use --check-bounds=no: I write a program that does some computation, say based on some integer parameter n. The larger n is, the longer the program runs. I want to get the result for as large a value of n as possible. I run the program for small n with bounds checking turned on to make sure everything works. Then I turn bounds checking off to push n as high as possible. That's quick and easy and does the trick for me.
(This is of course for code that I use myself, not for packages released to the public.)

vtjnash · 2023-01-14T01:07:22Z

Just to be clear, the reason we want to remove it is that it requires us to compile the code more conservatively, leading to significant losses in inference accuracy, leading to significant losses of performance when running with check-bounds=no.

JeffBezanson · 2023-01-19T19:26:37Z

@matthias314 Out of curiosity, what kind of speedup do you get?

gbaraldi · 2023-01-19T21:15:29Z

Triage discussed this at length, and the conclusion was that the issues present with --check-bounds=no are similarly present with @inbounds, meaning that this is probably a band-aid.
One idea @oscardssmith proposed, without being sure of its feasibility, is to refine the inbounds effect so that it does not taint if we can prove that the accesses are always in bounds.

vtjnash · 2023-01-19T22:14:53Z

That sounds easy to prove: just remove --check-bounds=no and the required property is proved exactly when expected, and more often than it could be proven under --check-bounds=no in the current system (e.g. we can eliminate taint on bounds more easily with check-bounds=yes/auto than with check-bounds=no). That is the statement of purpose for this issue.

matthias314 · 2023-01-19T23:28:22Z

@JeffBezanson The speedup varies; often it is indeed not impressive.

My point was not so much about the effectiveness of the current implementation of --check-bounds=no, but about the general idea: Instead of sprinkling my code with @inbounds, @boundscheck and @propagate_inbounds, I find it easier not to worry about it when I write my code and then turn all checks off once I think my code is correct.

LilithHafner · 2023-01-20T18:47:37Z

> leading to significant losses of performance when running with check-bounds=no

Could we get a specific example of @inbounds and/or check-bounds=no decreasing performance? I feel like there should be an open issue for this but I'm having trouble finding either a gh issue or an example of this behavior.

oscardssmith · 2023-01-20T19:24:43Z

julia> Base.@assume_effects :terminates_locally function f(s)
           t = 0.
           for i in 1:2^20
               for m in 1:length(s)
                   t += 1/(i + s[m])
               end
           end
           t
       end
f (generic function with 1 method)

julia> Base.@assume_effects :terminates_locally function f_inbounds(s)
           t = 0.
           for i in 1:2^20
               for m in 1:length(s)
                   @inbounds t += 1/(i + s[m])
               end
           end
           t
       end
f_inbounds (generic function with 1 method)

julia> g() = f((2,3,4))
g (generic function with 1 method)

julia> g_inbounds() = f_inbounds((2,3,4))
g_inbounds (generic function with 1 method)

julia> @btime g()
  0.861 ns (0 allocations: 0 bytes)
37.903821175206865

julia> @btime g_inbounds()
  2.683 ms (0 allocations: 0 bytes)
37.903821175206865

So this is a nice example where @inbounds causes a roughly 3-million-fold regression.

LilithHafner · 2023-01-21T01:13:03Z

Thanks! I figured out the reason I couldn't reproduce this is that I wasn't using the latest master.

maleadt · 2023-01-23T10:14:09Z

Apparently there are GPU users who care about this, because GPUs are very sensitive to the branch-heavy code introduced by bounds checks (typically because it increases register pressure, which hurts occupancy), and they are using unoptimized kernels that don't have the necessary @inbounds annotations. At the same time, --check-bounds=no is now (on 1.9) unusable to them because the regression in const-prop breaks static compilation, so that's not great.

@Keno Can you elaborate on the Cassette-like solution? What's a reasonable way to implement this for GPUCompiler's abstract interpreter without the const-prop regressions?

omlins · 2023-01-23T13:05:57Z

The check-bounds=no feature is one of the most beautiful things about Julia in my opinion: it allows us to develop code with the benefit of bounds checking and, once everything works as it should, to remove the no-longer-necessary bounds checking with a single switch so that the code runs faster. I think there is a large community, in particular domain scientists (who in the end are probably the largest part of the target users), that will strongly appreciate it if this feature can be preserved. :)

vtjnash · 2023-01-23T13:26:55Z

The biggest security flaws in C come from the fact that it does not have bounds checking enabled in releases.

omlins · 2023-01-23T14:41:25Z

Bounds checking can naturally have a drastic impact on performance, as we can see by running the following example. The performance achieved without bounds checking is here over three times higher than with bounds checking activated (executed on a P100 GPU):

omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --check-bounds=no --math-mode=fast diffusion2D_shmem_novis.jl
time_s=1.1297500133514404 t_it=0.012552777926127115 T_eff=513.2291021090084
Benchmarktools (min): t_it=0.012316079 T_eff=523.0926940302998

omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --math-mode=fast diffusion2D_shmem_novis.jl
time_s=3.5698001384735107 t_it=0.03966444598303901 T_eff=162.42382275438484
Benchmarktools (min): t_it=0.039542728 T_eff=162.9237857337511

KristofferC · 2023-01-23T14:45:43Z

I think the argument is to use @inbounds where it matters, which avoids making your whole application vulnerable to index mistakes the way --check-bounds=no does.

omlins · 2023-01-23T15:01:25Z

Bounds checking can naturally have a drastic impact on performance, as we can see by running the following example. The performance achieved without bounds checking is here over three times higher than with bounds checking activated (executed on a P100 GPU):
omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --check-bounds=no --math-mode=fast diffusion2D_shmem_novis.jl
time_s=1.1297500133514404 t_it=0.012552777926127115 T_eff=513.2291021090084
Benchmarktools (min): t_it=0.012316079 T_eff=523.0926940302998
omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --math-mode=fast diffusion2D_shmem_novis.jl
time_s=3.5698001384735107 t_it=0.03966444598303901 T_eff=162.42382275438484
Benchmarktools (min): t_it=0.039542728 T_eff=162.9237857337511

This example shows that bounds checking drastically impacts performance for high performance (GPU) applications. As a result, when running scientific high performance code, we do not want a single bounds check to happen. At the same time, while developing scientific high performance code, we would like bounds checks everywhere.
Thus, we do not want to add @inbounds statements to our code, so that we can develop conveniently with bounds checking; at run time, however, we need to deactivate bounds checking everywhere. The global switch --check-bounds=no, which has been available until now, has been a perfect solution.

maleadt · 2023-01-23T15:06:26Z

> when running scientific high performance code, we do not want to have a single bounds check to happen

The point is that your entire session shouldn't be running under --check-bounds=no, or a single mistake can kill it. You apparently want GPU code to run without bounds checks (even though it would still be better to properly optimize your kernels using appropriate @inbounds annotations), so the option should be narrowed to GPU code generation, which, as I mentioned already, is something that could be added to e.g. @cuda calls, or as an env var to GPUCompiler.jl.

omlins · 2023-01-23T16:27:21Z

> The point is that your entire session shouldn't be running under --check-bounds=no, or a single mistake can kill it.

By the time we run the code without bounds checks (that is, for a production run), there will be no errors almost 100% of the time.

> so the option should be narrowed to GPU code generation

That could indeed be a good solution if bounds checking did not impact the performance of high performance CPU code. However, running the same code on the CPU, we can already see that this is not true (problem size: 1024^2):

omlins@nid02061:~/tmpwdir/ParallelStencil/examples> julia --threads=12 -O3 --check-bounds=no --math-mode=fast diffusion2D_shmem_novis.jl
time_s=0.032401084899902344 t_it=0.0003600120544433594 T_eff=69.90272600430198
Benchmarktools (min): t_it=0.000321402 T_eff=78.30014747885825

omlins@nid02061:~/tmpwdir/ParallelStencil/examples> julia --threads=12 -O3 --math-mode=fast diffusion2D_shmem_novis.jl
time_s=0.044291019439697266 t_it=0.0004921224382188586 T_eff=51.137322839988364
Benchmarktools (min): t_it=0.000453692 T_eff=55.46896132177777

Seelengrab · 2023-01-24T09:33:24Z

Any indiscriminate use of --check-bounds=no is just as dangerous as just throwing @inbounds on a loop "to make things go fast".

Any use of @inbounds should always be accompanied by a matching @boundscheck before the use, as well as a @propagate_inbounds to allow the removal of that @boundscheck. If any of those pieces are missing, you're going to run into either correctness issues or suboptimal performance. The argument that @inbounds and @boundscheck should not be necessary because the compiler should be able to figure it out is in principle good, but falls apart due to the fact that those annotations are precisely used when the compiler can't figure them out on its own (if we had such a perfect compiler, we'd do all computation at compile time!). Thus, what @inbounds communicates from the developer to the compiler is that you have a mathematical proof that all indexing operations in the block are valid (with the caveat that the code is too concise to communicate that proof to the compiler).
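The three annotations named above fit together roughly as in the following sketch. The wrapper type `Wrapped` and the helper `first_plus` are hypothetical names introduced only for illustration.

```julia
# Hypothetical wrapper type used only for illustration.
struct Wrapped{T} <: AbstractVector{T}
    data::Vector{T}
end

Base.size(w::Wrapped) = size(w.data)

# `@inline` lets a caller's `@inbounds` reach the `@boundscheck` block and elide it.
@inline function Base.getindex(w::Wrapped, i::Int)
    @boundscheck checkbounds(w, i)   # skipped only under a caller's @inbounds
    return @inbounds w.data[i]       # safe: the check above covers w.data
end

# `@propagate_inbounds` forwards the caller's @inbounds context through this helper.
Base.@propagate_inbounds first_plus(w::Wrapped, i::Int) = w[1] + w[i]

w = Wrapped([10, 20, 30])
```

Without a surrounding `@inbounds`, `w[4]` still throws a `BoundsError`; a caller that has verified its indices can write `@inbounds first_plus(w, i)` to skip the check.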

Personally, I'd prefer enforcing the use of @boundscheck when using @inbounds, but since that makes currently working code error, it could be a deprecation at best.

With regard to GPU code (or any code, really): it seems much safer to me to use @inbounds and @boundscheck and set --check-bounds=yes during development than to use --check-bounds=no in production. The former will still catch accidental out-of-bounds accesses in dependencies in production instead of introducing correctness (or even security) issues, while the latter will not.

In case it's not clear - I'm in favour of removing --check-bounds=no, while also making the compiler better at interacting with explicit @inbounds and (possibly in a future release, if at all) requiring the use of @boundscheck paired with @inbounds. To bring another point of contention over from triage - the observed behavior of constant folded vs. "interpreted" julia code should not differ at all.

sloede · 2023-01-24T12:14:43Z

Being able to do development with memory safety and then easily switch to faster performance using a simple command line option is literally one of the best and most used features for me as an HPC user. When talking to colleagues from the scientific computing community, maybe the single strongest argument in favor of Julia is that it takes away nothing of your performance while making it much easier to work and prototype with.

Another major point I bring up when advocating Julia is that it makes great strides in solving the two-language problem. By default, everything is memory safe and still reasonably fast. But when necessary, it's as simple as flipping a single switch and you get performance on par with Fortran/C++, and that's for real production code, not just academic setups.
IMHO, this argument would take a serious blow since in the future it means you will have the two languages inside Julia:
The regular, inexperienced-user-friendly and non-optimized code, and the super high performance code with @inbounds and @boundscheck etc. at all the right places. Maybe there will be an array type someone comes up with that will automatically allow switching to @inbounds? And hello numjl, your friendly in-Julia DSL modelled after numpy...

In addition, you now have to annotate (or better yet: first test, then annotate, since no premature optimization) each loop that might be remotely performance critical. I did a quick check; in our code base for Trixi.jl, we have around 2000 for loops. Given that some of them are not performance relevant and some of them are nested, we still need to check ~1000 places and verify if and how they could benefit from manual optimizations. Compare that with the ease of just writing --check-bounds=no and that's it. Yes, it is not a big issue to manually optimize everything for small to medium sized packages, but compare that to production codes with tens to hundreds of thousands of lines.

Finally, here are some numbers. For a production run with Trixi.jl, I see a strong influence of --check-bounds=no on the performance (5 hottest kernels; all numbers per iteration):

| Kernel | `--check-bounds=no` | Regular | Impact |
|---|---|---|---|
| `calc_volume_integral!` | 121 ms | 122 ms | 0% |
| `calc_interface_flux!` | 66.9 ms | 120 ms | +79% |
| `calc_surface_integral!` | 23.7 ms | 46.1 ms | +95% |
| `prolong2interfaces!` | 16.4 ms | 33.9 ms | +100% |
| `apply_jacobian!` | 6.66 ms | 7.78 ms | +17% |
| ... | ... | ... | |
| Overall | 240 ms | 336 ms | +40% |

Thus while the effect of disabling bounds checking varies, there is a significant impact on many core algorithms.

TL;DR At the current state, I think removing --check-bounds=no is not the best idea. It would cause many HPC codes written in Julia over the past years to immediately cease to be competitive with C++/Fortran codes, unless hand-optimizing all loops again, taking away one of the major benefits (and selling points) Julia currently has. Finally, at least in our case --check-bounds=no has a significant impact on performance (30% reduction in overall runtime) and is thus always used where performance is critical, e.g., production runs on a large number of processors.
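The "+40%" in the table and the roughly 30% runtime reduction quoted here describe the same gap measured against different baselines. A small worked check, using only the overall totals from the table (the variable names are illustrative):

```julia
# Per-iteration totals copied from the "Overall" row of the table (milliseconds).
t_no_checks = 240.0   # with --check-bounds=no
t_regular   = 336.0   # default bounds checking

# Cost of keeping the checks, relative to the unchecked run.
slowdown_pct = 100 * (t_regular - t_no_checks) / t_no_checks

# Saving from removing the checks, relative to the checked run.
reduction_pct = 100 * (t_regular - t_no_checks) / t_regular
```

The first quantity is the table's "Impact" column; the second is slightly under the rounded 30% figure.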

KristofferC · 2023-01-24T12:31:10Z

> Any indiscriminate use of --check-bounds=no is just as dangerous as just throwing @inbounds on a loop "to make things go fast".

Clearly not, since an @inbounds is restricted to the loop you apply it to while the --check-bounds=no is applied to the whole process.

vchuravy · 2023-01-24T12:34:46Z

I am sympathetic to the performance argument, but the actual issue here is that --check-bounds=no turns a correct program into an incorrect program.

Local annotations have the benefit that they only require local knowledge to reason about them, whereas global flags require global knowledge (in my opinion impossible to obtain).
As an example, we could add a mode to Julia that turns all exceptions off (a generalized check-bounds=no). One sometimes comes across code that uses exceptions for control flow (a style that is frowned upon, but nonetheless legal); we even had code like that in Base. That would mean that turning exceptions off "globally" would have led to deadlocking Julia programs.

Now out-of-bounds exceptions are hopefully not used like that, but --check-bounds=no can turn correct programs into incorrect programs.
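To make the concern concrete, here is a deliberately contrived sketch of a program that is correct under default bounds checking yet relies on `BoundsError` for control flow. The function `count_until_error` is hypothetical and not taken from any real package.

```julia
# Frowned-upon but legal style: use BoundsError to detect the end of the data.
function count_until_error(v::Vector{Int})
    n = 0
    try
        while true
            v[n + 1]      # throws BoundsError once n + 1 > length(v)
            n += 1
        end
    catch err
        err isa BoundsError || rethrow()
    end
    return n
end

counted = count_until_error([4, 5, 6])
```

With bounds checks on, the loop stops after the last element. Under `--check-bounds=no` the terminating exception is never raised, so the loop keeps reading past the end of the array, which is undefined behavior.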

In Julia we have exposed local options to control unsafe behaviours (like fast-math or opting out of bounds checking) and, as others have pointed out, @inbounds applies to blocks, for-loops and even functions and is not too onerous to use.

sloede · 2023-01-24T13:05:39Z

@vchuravy I see your point about fine-grained control and its benefits. On the other hand, you lose a big part of what makes Julia currently very flexible. Now the same code can be memory safe or fast, without the user having to make a conscious decision in each (potentially) performance critical section. Why would you give up this awesome feature, one of the strongest selling points of Julia, in a world where you compete with established, fast languages like C++/Fortran or slow, rapid-prototyping languages like Python? I feel like this would instead lead Julia more towards Python, where you can be just as fast as Julia if you restrict yourself to loops that are amenable to @njit/Numba (with @inbounds everywhere, it will even look similar 😬).

And again, adding @inbounds wherever it makes sense from a performance perspective is not a true solution to our misgivings (as I believe was pointed out by someone else above). It is exactly that we do need this ability to have bounds checking enabled by default (for rapid prototyping), with the option to quickly switch to "production mode".

If the main argument in favor is that users abuse the --check-bounds=no option and wreak havoc on the Julia support channels, I would classify this as an educational (or people-related) issue and not a technical issue. Therefore, imho the correct solution should be found in the instructional realm and not by applying a code patch. Otherwise, I strongly believe that the alternative will be a new package that exports a macro such as @dynamic_inbounds, with the ability to switch between the actual @inbounds and a no-op based on, e.g., Preferences.jl, just to get the old behavior back. This seems bound to cause a lot of new issues, i.e., just kicking the can of user confusion down the road 🤷
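A rough sketch of what such a macro might look like. `@dynamic_inbounds` is the hypothetical name from the comment above; the environment variable `MYPKG_INBOUNDS` and the function `scaled_sum` are assumptions made here to keep the example dependency-free, standing in for a Preferences.jl setting and real user code.

```julia
# Hypothetical switch, read once when the constant is defined.
# A real package might read this from Preferences.jl instead.
const DISABLE_BOUNDS_CHECKS = get(ENV, "MYPKG_INBOUNDS", "false") == "true"

# Expands to `@inbounds expr` when the switch is on, otherwise to `expr` unchanged.
macro dynamic_inbounds(expr)
    if DISABLE_BOUNDS_CHECKS
        return esc(:(@inbounds $expr))
    else
        return esc(expr)
    end
end

function scaled_sum(xs::Vector{Float64}, a::Float64)
    total = 0.0
    @dynamic_inbounds for i in eachindex(xs)
        total += a * xs[i]
    end
    return total
end

scaled = scaled_sum([1.0, 2.0, 3.0], 2.0)
```

Because the choice is made at macro-expansion time, flipping the switch requires recompiling the affected code, and every annotated site inherits the same all-or-nothing risk that the global flag carries.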

vchuravy · 2023-01-24T13:49:14Z

> Why would you give up this awesome feature, one of the strongest selling points of Julia, in a world where you compete with established, fast languages like C++/Fortran or slow, rapid-prototyping languages like Python?

For me --check-bounds=no has never been a selling point of Julia and I have never taught it in my performance engineering workshops. By contrast, the local control a programmer has is for me a major advantage of Julia over C/C++, especially w.r.t. fast-math. Over the years we have improved the compiler and we are now at a point where @inbounds is needed less and less, but I would expect an HPC application to use it, instead of relying on a global compiler/runtime flag.

The issue is that --check-bounds=no turns correct code into incorrect code. A second issue for me is the social one, where the reliance on --check-bounds=no for performance leads to a worse experience by default for users of your packages and of Julia. Julia is now "slow by default" and you have to tell your users to run Trixi.jl based applications with --check-bounds=no for best performance, instead of using the local mechanism and having a good user experience by default.

I could see a use for development where I would like to answer the question "would @inbounds help", but the current usage scenario you describe leads to a worse experience for everyone. Users don't get fast code by default, and if they use --check-bounds=no they can't be sure that an "untested code-path" doesn't lead to numerical corruption down the road, leaving their climate simulation wrong.

williamfgc · 2023-01-24T14:03:41Z

My two cents is to document the recommended way for developers to refactor existing code and to write code onwards no matter what the outcome is for the recommended change. Mostly to be sure I'm not giving advice on deprecated functionality as we provide tutorials for the HPC folks new to Julia. Thanks!

Seelengrab · 2023-01-24T16:03:58Z

> Clearly not, since an @inbounds is restricted to the loop you apply it to while the --check-bounds=no is applied to the whole process.

A single erroneous @inbounds can have just as disastrous consequences as whole program --check-bounds=no, the latter just creates vastly more opportunities for mistakes to become (in-)visible. Since all that is needed for an incorrect result is a single misuse, both are equally dangerous (just think of the OffsetArrays kerfuffle, spawned by incorrect use of @inbounds..).

> Another major point I bring up when advocating Julia is that it makes great strides in solving the two-language problem. By default, everything is memory safe and still reasonably fast. But when necessary, it's as simple as flipping a single switch and you get performance on par with Fortran/C++, and that's for real production code, not just academic setups.

> Now the same code can be memory safe or fast, without the user having to make a conscious decision in each (potentially) performance critical section.

In my experience, well written/idiomatic julia code is as fast (or sometimes beats) equivalent C/C++/Fortran code while retaining the safety features a high level language provides. Turning those safety features off and saying "look, without those safety features we are just as fast!" is, in my opinion, misrepresenting the strength of julia, of being able to be strictly better than the "old guard", further perpetuating the myth that the only way to go fast is to not have safe programs. This is not an either/or thing - we can, should (and ultimately MUST) have both at the same time.

> Therefore, imho the correct solution should be found in the instructional realm and not by applying a code patch.

The correct way to deal with a tool that you can do nothing but cut yourself with if not held in exactly the right way (which doesn't exist here, as this is a global flag and I'm pretty sure you're not auditing all your dependencies for correct behavior) is not to tell people to only hold it in the exactly right way, it is to fix the tool so you can't cut yourself in the first place.

PallHaraldsson · 2023-01-24T17:55:48Z

Could a compromise be made that you could disable --check-bounds=no at a module level (or alternatively modules disallow that by default, and you could opt into an exception, something you do until you know you've added enough @inbounds)? There is plenty of code that doesn't need to be fast (and/or isn't well tested), and such a module could still opt into @inbounds locally.

chriselrod · 2023-04-24T15:34:25Z

> Thus, we do not want to add @inbounds statements into our code in order to be able to develop conveniently with bounds checking; at run time however, we need to deactivate the bounds checking everywhere: the global switch --check-bounds=no which has been available until now has been a perfect solution.

What about the opposite: develop using --check-bounds=yes, and then run with auto?

Discussion above mostly focused on this being less pretty.
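For that workflow it can help to confirm which mode a session is actually running in. The sketch below reads `Base.JLOptions().check_bounds`; this is an internal, undocumented field, and the encoding used here (0 = auto, 1 = yes, 2 = no) is an assumption that may change between Julia versions. The helper name `check_bounds_mode` is hypothetical.

```julia
# Base.JLOptions() is internal; the field name and its integer encoding
# (0 = auto, 1 = yes, 2 = no) are assumptions, not a stable API.
function check_bounds_mode()
    code = Base.JLOptions().check_bounds
    return code == 0 ? "auto" : code == 1 ? "yes" : code == 2 ? "no" : "unknown"
end

mode = check_bounds_mode()
if mode == "no"
    @warn "Running with --check-bounds=no: out-of-bounds accesses are undefined behavior"
end
```

Started as `julia --check-bounds=yes script.jl` during development, every `@inbounds` is ignored and all accesses are checked; started without the flag, the `auto` mode honors the local annotations.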

vtjnash · 2023-04-24T16:17:56Z

That is recommended

(#50107) From version 1.9 onwards, when `--check-bounds=no` is used, concrete-eval is completely disabled. However, it appears `--check-bounds=no` is still being used within the community, causing issues like the one reported in JuliaArrays/StaticArrays.jl#1155. Although we should move toward eliminating the flag in the future (#48245), for the time being there are many requests to carry out a certain level of compiler optimization even when this flag is enabled. This commit aims to allow concrete-eval "safely" even under `--check-bounds=no`. Specifically, when the method call being analyzed is `:nothrow`, it should be predominantly safe to concrete-eval it under this flag. Technically, however, even `:nothrow` methods could trigger undefined behavior, since `:nothrow` isn't a strict constraint and it's possible for users to annotate potentially risky methods with `Base.@assume_effects :nothrow`. Nonetheless, since this possibility is acknowledged in the `Base.@assume_effects` documentation, I feel it's fair to relegate it to user responsibility.

@BoundsCheck

In the current state of the Julia compiler, bounds checking and its related optimization code, such as `@boundscheck` and `@inbounds`, pose a significant handicap for effect analysis. As a result, we're encountering an ironic situation where the application of `@inbounds` annotations, which are intended to optimize performance, instead obstruct the program's optimization, thereby preventing us from achieving optimal performance. This PR is designed to resolve this situation. It aims to enhance the relationship between bounds checking and effect analysis, thereby correctly improving the performance of programs that have `@inbounds` annotations. In the following, I'll first explain the reasons that have led to this situation for better understanding, and then I'll present potential improvements to address these issues. This commit is a collection of various improvement proposals. It's necessary that we incorporate all of them simultaneously to enhance the situation without introducing any regressions. \## Core of the Problem There are fundamentally two reasons why effect analysis of code containing bounds checking is difficult: 1. The evaluation value of `Expr(:boundscheck)` is influenced by the `@inbounds` macro and the `--check-bounds` flag. Hence, when performing a concrete evaluation of a method containing `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro context and the `--check-bounds` settings, ensuring the method's behavior is consistent across the compile time concrete evaluation and the runtime execution. 1. If the code, from which bounds checking has been removed due to `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to undefined behavior due to uncertain memory access. 
\## Current State The current Julia compiler handles these two problems as follows: \### Current State 1 Regarding the first problem, if a code or method call containing `Expr(:boundscheck)` is within an `@inbounds` context, a concrete evaluation is immediately prohibited. For instance, in the following case, when analyzing `bar()`, if you simply perform concrete evaluation of `foo()`, it wouldn't properly respect the `@inbounds` context present in `bar()`. However, since the concrete evaluation of `foo()` is prohibited, it doesn't pose an issue: ```julia foo() = (r = 0; @BoundsCheck r += 1; return r) bar() = @inbounds foo() ``` Conversely, in the following case, there is _no need_ to prohibit the concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`. This is because the execution of the `@boundscheck` block is determined by the presence of local `@inbounds`: ```julia function A1_inbounds() r = 0 @inbounds begin @BoundsCheck r += 1 end return r end ``` However, currently, we prohibit the concrete evaluation of such code as well. Moreover, we are not handling such local `@inbounds` contexts effectively, which results in incorrect execution of `A1_inbounds()` (even our test is incorrect for this example: <https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34>). Furthermore, there is room for improvement when the `--check-bounds` flag is specified. Specifically, when the `--check-bounds` flag is set, the evaluation value of `Expr(:boundscheck)` is determined irrespective of the `@inbounds` context. Hence, there is no need to prohibit concrete evaluations due to inconsistency in the evaluation value of `Expr(:boundscheck)`. \### Current State 2 Next, we've ensured that concrete evaluation isn't performed when there's potentially unsafe code that may have bounds checking removed, or when the `--check-bounds=no` flag is set, which could lead to bounds checking being removed always. 
For instance, if you perform concrete evaluation for the function call `baz((1,2,3), 4)` in the following example, it may return a value accessed from illicit memory and introduce undefined behaviors into the program: ```julia baz(t::Tuple, i::Int) = @inbounds t[i] baz((1,2,3), 4) ``` However, it's evident that the above code is incorrect and unsafe program and I believe undefined behavior in such programs is deemed, as explicitly stated in the `@inbounds` documentation: > │ Warning > │ > │ Using @inbounds may return incorrect results/crashes/corruption for > │ out-of-bounds indices. The user is responsible for checking it > │ manually. Only use @inbounds when it is certain from the information > │ locally available that all accesses are in bounds. Actually, the `@inbounds` macro is primarily an annotation to "improve performance by removing bounds checks from safe programs". Therefore, I opine that it would be more reasonable to utilize it to alleviate potential errors due to bounds checking within `@inbounds` contexts. To bring up another associated concern, in the current compiler implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset` is a bit risky, and `:nothrow`-ness is assumed when their bounds checking is turned off by call argument. If our intended direction aligns with the removal of bounds checking based on `@inbounds` as proposed in issue #48245, then assuming `:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming `:nothrow`-ness due to bounds checking argument or the `--check-bounds` flag appears to be risky, especially considering it's not documented. \## This Commit This commit implements all proposed improvements against the current issues as mentioned above. 
In summary, the enhancements include: - allowing concrete evaluation within a local `@inbounds` context - folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set (and allow concrete evaluation) - changing the `:nothrow` effect bit to `UInt8` type, and refining `:nothrow` information when in an `@inbounds` context - removing dangerous assumptions of `:nothrow`-ness for built-in functions when bounds checking is turned off - replacing the `@_safeindex` hack with `@inbounds`

@BoundsCheck

In the current state of the Julia compiler, bounds checking and its related optimization code, such as `@boundscheck` and `@inbounds`, pose a significant handicap for effect analysis. As a result, we're encountering an ironic situation where `@inbounds` annotations, which are intended to improve performance, instead obstruct the program's optimization.

This PR is designed to resolve this situation. It aims to improve the relationship between bounds checking and effect analysis, so that programs with `@inbounds` annotations actually get faster. In the following, I'll first explain the reasons that have led to this situation, and then present potential improvements to address these issues. This commit is a collection of various improvement proposals. We need to incorporate all of them simultaneously to improve the situation without introducing any regressions.

## Core of the Problem

There are fundamentally two reasons why effect analysis of code containing bounds checking is difficult:

1. The evaluation value of `Expr(:boundscheck)` is influenced by the `@inbounds` macro and the `--check-bounds` flag. Hence, when performing a concrete evaluation of a method containing `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro context and the `--check-bounds` settings, ensuring the method's behavior is consistent between compile-time concrete evaluation and runtime execution.
2. If code from which bounds checking has been removed due to `@inbounds` or `--check-bounds=no` is unsafe, it may lead to undefined behavior due to invalid memory access.

## Current State

The current Julia compiler handles these two problems as follows:

### Current State 1

Regarding the first problem, if code or a method call containing `Expr(:boundscheck)` is within an `@inbounds` context, concrete evaluation is immediately prohibited. For instance, in the following case, when analyzing `bar()`, simply performing concrete evaluation of `foo()` wouldn't properly respect the `@inbounds` context present in `bar()`. However, since the concrete evaluation of `foo()` is prohibited, it doesn't pose an issue:

```julia
foo() = (r = 0; @boundscheck r += 1; return r)
bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`. This is because ~~the execution of the `@boundscheck` block is determined by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a local `@inbounds` context does not need to block concrete evaluation:

```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @boundscheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as well. ~~Moreover, we are not handling such local `@inbounds` contexts effectively, which results in incorrect execution of `A1_inbounds()` (even our test is incorrect for this example: <https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34>)~~ EDIT: This is expected behavior, as pointed out by Jameson.

Furthermore, there is room for improvement when the `--check-bounds` flag is specified. Specifically, when the `--check-bounds` flag is set, the evaluation value of `Expr(:boundscheck)` is determined irrespective of the `@inbounds` context. Hence, there is no need to prohibit concrete evaluation on the grounds that the value of `Expr(:boundscheck)` might be inconsistent.

### Current State 2

Next, we've ensured that concrete evaluation isn't performed when there's potentially unsafe code that may have bounds checking removed, or when the `--check-bounds=no` flag is set, which always removes bounds checking. For instance, if you perform concrete evaluation for the function call `baz((1,2,3), 4)` in the following example, it may return a value read from invalid memory and introduce undefined behavior into the program:

```julia
baz(t::Tuple, i::Int) = @inbounds t[i]
baz((1,2,3), 4)
```

However, it's evident that the above code is an incorrect and unsafe program, and I believe undefined behavior in such programs is acceptable, as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │ Using @inbounds may return incorrect results/crashes/corruption for
> │ out-of-bounds indices. The user is responsible for checking it
> │ manually. Only use @inbounds when it is certain from the information
> │ locally available that all accesses are in bounds.

In fact, the `@inbounds` macro is primarily an annotation to "improve performance by removing bounds checks from safe programs". Therefore, I think it is more reasonable to use it to discount the potential errors that bounds checks can raise within `@inbounds` contexts.

To bring up another associated concern, in the current compiler implementation, the `:nothrow` modeling for `getfield`/`arrayref`/`arrayset` is a bit risky: `:nothrow`-ness is assumed when their bounds checking is turned off by a call argument. If our intended direction is the removal of bounds checking based on `@inbounds`, as proposed in issue #48245, then assuming `:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming `:nothrow`-ness due to the bounds checking argument or the `--check-bounds` flag appears to be risky, especially considering it's not documented.

## This Commit

This commit implements all of the proposed improvements for the issues described above. In summary, the enhancements include:

- making `Expr(:boundscheck)` within a local `@inbounds` context not block concrete evaluation
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set (and allow concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
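
To observe the `@boundscheck` behavior discussed under "Current State 1", a small probe along the following lines may help. This is only a sketch: the names `probe`, `probe_from_inbounds`, and `probe_local_inbounds` are made up for illustration, and the printed values are deliberately not stated here because they depend on inlining decisions and on the `--check-bounds` setting the session was started with.

```julia
# Returns 1 if the `@boundscheck` block executed, 0 if it was elided.
function probe()
    r = 0
    @boundscheck r += 1
    return r
end

# Caller-side `@inbounds`: the block inside `probe` can only be elided
# if `probe` is inlined into this caller.
probe_from_inbounds() = @inbounds probe()

# Local `@inbounds` wrapped around a `@boundscheck` block in the same function,
# mirroring the `A1_inbounds` example.
function probe_local_inbounds()
    r = 0
    @inbounds begin
        @boundscheck r += 1
    end
    return r
end

results = (probe(), probe_from_inbounds(), probe_local_inbounds())
println(results)
```

Running the same file with `--check-bounds=yes`, `--check-bounds=no`, and the default setting shows how the flag interacts with the two kinds of `@inbounds` context.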

In 1.9, `--check-bounds=no` has started causing significant performance regressions (e.g. #50110). This is because we switched a number of functions that used to be `@pure` to the new effects-based infrastructure, which very closely tracks the legality conditions for concrete evaluation. Unfortunately, disabling bounds checking completely invalidates all prior legality analysis, so the only realistic path we have is to disable concrete evaluation completely.

In general, we are learning that these kinds of global make-things-faster-but-unsafe flags are highly problematic for a language for several reasons:

- Code is written with the assumption of a particular mode being chosen, so it is in general either impossible or unsafe to compose libraries (which in a language like Julia is a huge problem).
- Unsafe semantics are often harder for the compiler to reason about, causing unexpected performance issues (although the 1.9 `--check-bounds=no` issues are worse than just disabling concrete eval for things that use bounds checking).

In general, I'd like to remove the `--check-bounds=` option entirely (#48245), but that proposal has encountered major opposition.

This PR implements an alternative proposal: We introduce a new function `Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is passed the result of `Expr(:boundscheck)` (which is now purely determined by the inliner based on `@inbounds`, without regard for the command line flag). In this proposal, what the command line flag does is simply redefine this function to either `true` or `false` (unconditionally) depending on the value of the flag.

Of course, this causes massive amounts of recompilation, but I think this can be addressed by adding logic to loading that loads a pkgimage with appropriate definitions to cure the invalidations. The special logic we have now to take account of the `--check-bounds` flag in `.ji` selection would be replaced by automatically injecting the special pkgimage as a dependency of every loaded image. This part isn't implemented in this PR, but I think it's reasonable to do.

I think with that, the `--check-bounds` flag remains functional, while having much more well-defined behavior, as it relies on the standard world age mechanisms.

A major benefit of this approach is that it can be scoped appropriately using overlay tables. For example:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
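
The "redefine one function and let world age do the rest" mechanism described above can be mimicked in plain Julia. The sketch below is an illustration only: `sketch_should_check_bounds` and `toy_getindex` are hypothetical names standing in for the proposed `Core.should_check_bounds` and for real indexing code, and the toy accessor returns a sentinel instead of performing an unchecked read so that it stays memory-safe.

```julia
# Hypothetical stand-in for the proposed hook; by default it passes its argument through.
sketch_should_check_bounds(boundscheck::Bool) = boundscheck

# Toy accessor that consults the hook before validating the index.
function toy_getindex(v::Vector{Int}, i::Int)
    inrange = 1 <= i <= length(v)
    if sketch_should_check_bounds(true) && !inrange
        throw(BoundsError(v, i))
    end
    # Return a sentinel rather than reading out of bounds.
    return inrange ? v[i] : -1
end

v = [10, 20, 30]

# With the default definition, the out-of-range access throws.
threw = try
    toy_getindex(v, 4)
    false
catch err
    err isa BoundsError
end

# Emulate what the flag would do under the proposal: redefine the hook unconditionally.
@eval sketch_should_check_bounds(::Bool) = false

# `invokelatest` runs the call in the newest world, where the redefinition is visible;
# the previously compiled `toy_getindex` is invalidated and recompiled.
unchecked = Base.invokelatest(toy_getindex, v, 4)

println((threw, unchecked))
```

Because the switch is an ordinary method redefinition, the usual invalidation machinery takes care of recompiling callers, which is the property the proposal relies on.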

Keno · 2023-06-20T21:25:36Z

See RFC for one possible approach in #50239

@overlay

In 1.9, `--check-bounds=no` has started causing significant performance regressions (e.g. #50110). This is because we switched a number of functions that used to be `@pure` to new effects-based infrastructure, which very closely tracks the the legality conditions for concrete evaluation. Unfortunately, disabling bounds checking completely invalidates all prior legality analysis, so the only realistic path we have is to completely disable it. In general, we are learning that these kinds of global make-things-faster-but-unsafe flags are highly problematic for a language for several reasons: - Code is written with the assumption of a particular mode being chosen, so it is in general not possible or unsafe to compose libraries (which in a language like julia is a huge problem). - Unsafe semantics are often harder for the compiler to reason about, causing unexpected performance issues (although the 1.9 --check-bounds=no issues are worse than just disabling concrete eval for things that use bounds checking) In general, I'd like to remove the `--check-bounds=` option entirely (#48245), but that proposal has encountered major opposition. This PR implements an alternative proposal: We introduce a new function `Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is passed the result of `Expr(:boundscheck)` (which is now purely determined by the inliner based on `@inbounds`, without regard for the command line flag). In this proposal, what the command line flag does is simply redefine this function to either `true` or `false` (unconditionally) depending on the value of the flag. Of course, this causes massive amounts of recompilation, but I think this can be addressed by adding logic to loading that loads a pkgimage with appropriate definitions to cure the invalidations. 
The special logic we have now now to take account of the --check-bounds flag in .ji selection, would be replaced by automatically injecting the special pkgimage as a dependency to every loaded image. This part isn't implemented in this PR, but I think it's reasonable to do. I think with that, the `--check-bounds` flag remains functional, while having much more well defined behavior, as it relies on the standard world age mechanisms. A major benefit of this approach is that it can be scoped appropriately using overlay tables. For exmaple: ``` julia> using CassetteOverlay julia> @MethodTable AssumeInboundsTable; julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false; julia> assume_inbounds = @overlaypass AssumeInboundsTable julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically ``` Similar logic applies to GPUCompiler, which already supports overlay tables.

@overlay

In 1.9, `--check-bounds=no` has started causing significant performance regressions (e.g. #50110). This is because we switched a number of functions that used to be `@pure` to new effects-based infrastructure, which very closely tracks the legality conditions for concrete evaluation. Unfortunately, disabling bounds checking completely invalidates all prior legality analysis, so the only realistic path we have is to disable concrete evaluation completely.

In general, we are learning that these kinds of global make-things-faster-but-unsafe flags are highly problematic for a language for several reasons:

- Code is written with the assumption of a particular mode being chosen, so it is in general either impossible or unsafe to compose libraries (which in a language like julia is a huge problem).
- Unsafe semantics are often harder for the compiler to reason about, causing unexpected performance issues (although the 1.9 `--check-bounds=no` issues are worse than just disabling concrete eval for things that use bounds checking).

In general, I'd like to remove the `--check-bounds=` option entirely (#48245), but that proposal has encountered major opposition. This PR implements an alternative proposal: we introduce a new function `Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is passed the result of `Expr(:boundscheck)` (which is now purely determined by the inliner based on `@inbounds`, without regard for the command line flag). In this proposal, what the command line flag does is simply redefine this function to either `true` or `false` (unconditionally) depending on the value of the flag. Of course, this causes massive amounts of recompilation, but I think this can be addressed by adding logic to loading that loads a pkgimage with appropriate definitions to cure the invalidations.

The special logic we have now to take account of the `--check-bounds` flag in .ji selection would be replaced by automatically injecting the special pkgimage as a dependency to every loaded image. This part isn't implemented in this PR, but I think it's reasonable to do. I think with that, the `--check-bounds` flag remains functional, while having much more well-defined behavior, as it relies on the standard world age mechanisms. A major benefit of this approach is that it can be scoped appropriately using overlay tables. For example:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
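The redefinition idea can be modelled in plain Julia without the PR branch. The sketch below is only a toy: `should_check_bounds_sketch` and `guarded_get` are hypothetical names defined in `Main`, standing in for the proposed `Core.should_check_bounds` hook, and the accessor falls back to a default value so the example never reads out of bounds.

```julia
# Hypothetical stand-in for the hook proposed in the PR; the real
# `Core.should_check_bounds` exists only on the PR branch.
should_check_bounds_sketch(boundscheck::Bool) = boundscheck

# Toy accessor that consults the hook before validating the index.
function guarded_get(xs::Vector{Int}, i::Int)
    if should_check_bounds_sketch(true)
        checkbounds(xs, i)      # throws BoundsError for an invalid index
    end
    return get(xs, i, 0)        # memory-safe fallback, so the sketch never reads out of bounds
end

xs = [10, 20, 30]

# Default definition: the check runs and an invalid index throws.
threw_before = try
    guarded_get(xs, 4)
    false
catch err
    err isa BoundsError
end

# What the flag would do under the proposal: redefine the hook unconditionally.
# Ordinary method invalidation makes `guarded_get` pick up the new definition.
should_check_bounds_sketch(::Bool) = false

value_after = guarded_get(xs, 4)   # the check is skipped, so the fallback is returned
```

The point of the design is visible even in the toy: nothing about `guarded_get` is special-cased, and the change in behavior comes from the same method-invalidation machinery that handles any other redefinition.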

johnomotani · 2025-01-31T12:41:11Z

This is still an important issue to resolve for HPC users of Julia. There was some good discussion on https://discourse.julialang.org/t/removing-bounds-checking-for-hpc-check-bounds-unsafe/124897.

I made a suggestion on that thread that I think goes in a different direction from the discussion here and on #50239 and #50641:

Is there an argument here for array-access bounds checks being an important special case?

For myself (and I think I’m representing a reasonably large class of users) the bounds checks on array accesses are the only ones I really care about, because they’re the ones that happen inside a loop over grid points. In a code doing time evolution on a grid, the grid doesn’t change, and if I’ve tested my indexing logic on several small grids, and a couple of timesteps on a large grid, either bounds checks caught my errors already, or they won’t catch the error in my production run anyway, so bounds checks after initial testing are just a waste of time, money, and CO2 emissions. However, due to complicated discretizations of PDEs, etc. it is unlikely that the bounds checks can be optimized away at compile time.

Proposal: have a special type of @boundscheck/@inbounds like @arrayboundscheck/@arrayinbounds that can be used by the Array interface (and other suitably similar types), and have --check-bounds=noarray that is intermediate between --check-bounds=no and --check-bounds=auto in that it only disables @arrayboundscheck checks. If that was a feature, at least for my use-case, I don’t care if --check-bounds=no was removed.
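For context, the existing mechanism that such a proposal would extend can be sketched as follows. `Grid1D` and `sum_interior` are hypothetical names introduced only for illustration; `@boundscheck`, `@inbounds`, and `checkbounds` are the standard Base tools. The proposed `@arrayboundscheck` macro and `--check-bounds=noarray` flag do not exist and are not used here.

```julia
# Hypothetical grid wrapper, used only to illustrate the existing mechanism.
struct Grid1D <: AbstractVector{Float64}
    data::Vector{Float64}
end

Base.size(g::Grid1D) = size(g.data)

# The `@boundscheck` block may be elided when a caller wraps the access in
# `@inbounds` and this method is inlined into that caller.
@inline function Base.getindex(g::Grid1D, i::Int)
    @boundscheck checkbounds(g, i)
    return @inbounds g.data[i]
end

# A kernel whose loop range is in bounds by construction, so it opts out of checks.
function sum_interior(g::Grid1D)
    total = 0.0
    @inbounds for i in 2:length(g)-1
        total += g[i]
    end
    return total
end

g = Grid1D([1.0, 2.0, 3.0, 4.0])
interior = sum_interior(g)          # adds g[2] and g[3]

# Outside `@inbounds`, an invalid index still throws.
threw = try
    g[5]
    false
catch err
    err isa BoundsError
end
```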

mbauman · 2025-01-31T17:15:53Z

That wouldn't really resolve the crux of your problem here. In short the status quo is that:

1. The compiler can reason about the effects that code can have.
2. This allows the compiler to know if it's safe to move some computation around (to compile time, outside a loop, etc.).
3. An @inbounds'ed access might have the side-effect of corrupting or crashing your program.
4. The compiler can't safely optimize that in the same manner. If it tried, it might corrupt your program in wild and unpredictable ways or just crash, even if it is never accessed out-of-bounds at runtime.
5. So therefore @inbounds can sometimes make code slower, whether it was written as @inbounds or unsafe_load or --check-bounds=no.

Having an option that just affects Array would still have the exact same problem, but just for Arrays.

Bounds checks are surprisingly cheap on many architectures unless they inhibit a compiler optimization like SIMD. So what you want isn't "just" no bounds checks. It's also those subsequent compiler optimizations to surrounding code that are only made possible without that branch. But it's ironically that very branch that can (sometimes) inform the compiler that such optimizations are valid.
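A small sketch of the kind of loop this is about, using indirect indexing where the compiler generally cannot prove the index valid. The function names are hypothetical, and whether the second version is actually faster depends on the Julia version and the hardware; it should be measured rather than assumed.

```julia
# Indirect (gather) access: the compiler generally cannot prove that `idx[k]` is a
# valid index into `xs`, so each `xs[idx[k]]` carries a runtime check.
function gather_sum_checked(xs::Vector{Float64}, idx::Vector{Int})
    total = 0.0
    for k in eachindex(idx)
        total += xs[idx[k]]
    end
    return total
end

# Same loop with the indices validated once up front, then per-access checks
# disabled and SIMD reordering of the reduction permitted.
function gather_sum_unchecked(xs::Vector{Float64}, idx::Vector{Int})
    checkbounds(xs, idx)            # throws BoundsError if any index is invalid
    total = 0.0
    @inbounds @simd for k in eachindex(idx)
        total += xs[idx[k]]
    end
    return total
end

xs = collect(1.0:100.0)
idx = collect(100:-1:1)

total_checked = gather_sum_checked(xs, idx)
total_unchecked = gather_sum_unchecked(xs, idx)
```

Validating the index vector once outside the loop keeps the safety guarantee while removing the branch from the loop body, which is the part that can block vectorization.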

johnomotani · 2025-01-31T17:32:13Z

I have a case where removing all bounds checks on arrays (or using --check-bounds=no on Julia-1.10) speeds up my code by 2x. Scientific HPC application developers seem to agree that being able to disable bounds checks is what they would expect (see the discourse thread, also for explanations of why this is normal and acceptable). I see 'the compiler might be able to optimize better with bounds checks' as a hypothetical. I have real-world examples where the opposite is true, by a factor of 2. If/when the compiler can optimize those situations, then 'remove --check-bounds=no' becomes an option, but at the moment it seems that some people want to remove the option of disabling safety in favour of performance (after verifying sufficiently that the code is correct). Just to repeat the point, we can provide examples where this has a 2x cost in performance, in exactly the place where this is a huge cost - HPC jobs that could cost $10000s (and associated time and carbon emissions) for a single run. That's why we're very, very keen on having equivalent functionality for --check-bounds=no.

Edited to add: I don't say that the solution I suggested is a good one. My main point is we (scientific HPC developers) have a need for some solution, in a way that wasn't being discussed in this thread.

As a slightly separate note: if the 'solution' is 'use @inbounds where you need it for performance', how are we supposed to know when we've done a good enough job? In the past, if we wanted to implement that solution, we would most likely compare to --check-bounds=no, and if the performance was within 1% or so, you'd know that you've done enough. Without --check-bounds=no being a viable option, we can't even do that (much less profile the code to find where exactly we could add @inbounds to increase performance).
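One way to approximate that comparison without the global flag is to time a checked and an `@inbounds` variant of a single kernel against each other. The following is a minimal sketch under stated assumptions: the stencil kernels and the `min_time` helper are hypothetical, and a dedicated benchmarking package would give more reliable numbers than `@elapsed`.

```julia
# Checked version of a 1D three-point stencil.
function laplacian_checked!(out::Vector{Float64}, u::Vector{Float64})
    for i in 2:length(u)-1
        out[i] = u[i-1] - 2 * u[i] + u[i+1]
    end
    return out
end

# Same kernel with sizes validated once, outside the hot loop.
function laplacian_inbounds!(out::Vector{Float64}, u::Vector{Float64})
    length(out) == length(u) || throw(DimensionMismatch("out and u must have equal length"))
    @inbounds for i in 2:length(u)-1
        out[i] = u[i-1] - 2 * u[i] + u[i+1]
    end
    return out
end

# Minimum wall time over `reps` runs, after one warm-up call.
function min_time(f, args...; reps::Int=5)
    f(args...)                      # warm-up call so compilation is not timed
    best = Inf
    for _ in 1:reps
        t = @elapsed f(args...)
        best = min(best, t)
    end
    return best
end

u = rand(10_000)
t_checked = min_time(laplacian_checked!, zeros(length(u)), u)
t_inbounds = min_time(laplacian_inbounds!, zeros(length(u)), u)

# Fraction of time attributable to the checks in this kernel (may be near zero or negative).
relative_gap = (t_checked - t_inbounds) / t_inbounds
```

This only covers kernels that have already been identified; it does not replace the whole-program view that the flag provides, which is the gap described above.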

mbauman · 2025-01-31T18:23:12Z

Right, I'm not saying that @inbounds can't improve performance — it is still often required for SIMD transformations which can easily net 2x or 4x or more-x perf. I'm just pointing out that the opposite can and does happen — and this has been the case for a long time (see, e.g., #39340 on v1.5)!

That's what I'm calling the crux of your problem: that @inbounds isn't guaranteed to monotonically improve performance. And it never was. It can and does throw speed-bumps, too, regardless of how it was applied. --check-bounds=no is just a great way to find all these spots.

What happened is that the effects system overhaul changed where some of these speed bumps landed, so there were more regressions in existing codebases as you upgraded from 1.10 to 1.11+.

johnomotani · 2025-01-31T18:32:04Z

OK, fair point. But --check-bounds=no is broken at the moment
JuliaArrays/StaticArrays.jl#1155
JuliaLang/PackageCompiler.jl#1021

and there seems to be no developer will to fix those at the moment. I have the impression that that lack of priority comes from threads like this one effectively meaning --check-bounds=no is unsupported and being allowed to break. It would be nice if it was either fixed, or a comparable replacement was suggested. As you say, being able to profile with --check-bounds=no is how to identify which parts of your code @inbounds would speed up - without it the equivalent profiling would be so much work that it might as well be impossible.

mbauman · 2025-01-31T22:37:26Z

The biggest problem is that --check-bounds=no affects all of Julia, including type promotion code itself, and some packages (especially those targeting compile-time optimizations) live right on the knife's edge of what the compiler can constant-propagate for type stability. And type instabilities are expensive. For example, in #50985 typejoin(Int, Static.StaticInt{2}) isn't constant-folding to Number without bounds checks because the typejoin algorithm itself happens to use indexing. Maybe it should additionally promise it won't throw or have UB, but I suspect that's not safe to do given that Keno has looked at it.
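For readers unfamiliar with `typejoin`, here is a minimal illustration using only Base types. The `Static.StaticInt` case from #50985 needs the Static.jl package and is not reproduced here.

```julia
# `typejoin` returns the closest common supertype of its arguments.
join_numeric = typejoin(Int, Float64)     # both are subtypes of Real
join_integer = typejoin(Int, UInt8)       # Signed and Unsigned meet at Integer
join_unrelated = typejoin(Int, String)    # nothing in common below Any
```

When a call like this cannot be folded at compile time, the compiler only knows that the result is some type, which is how a missed constant fold turns into a type instability downstream.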

The other problem, though, is that it's assumed to be a "go-faster" easy button — and I think that's underpinning a lot of the commentary around it being "broken" throughout this thread. That's not necessarily true and I actually hope that it'll continue to become less true as Julia's compiler gets more capable.

That segfault is interesting, though. It'd be great to minimize it if at all possible.

Keno added the `triage` label ("This should be discussed on a triage call") on Jan 11, 2023. LilithHafner removed it on Jan 3, 2024.

This issue was referenced by:

- JuliaGPU/GPUCompiler.jl#387: --check-bounds=no results in invalid IR with AMDGPU (Jan 23, 2023; closed)
- JuliaGPU/AMDGPU.jl#354: --check-bounds=no is broken on Julia 1.9.0-beta3 (Jan 23, 2023; closed)
- trixi-framework/Trixi.jl#210: Taal: Should we use @inbounds + @boundscheck to increase standard performance? (ranocha, Jan 24, 2023; open)
- #48501: Allow macros to apply to modules (non-Jedi, Feb 2, 2023; closed)
- #49472: Removal of @pure makes constant-prop less effective. (vtjnash, Apr 24, 2023; open)
- #50107: effects: allow concrete-eval when --check-bounds=no if proven "safe" (aviatesk, Jun 8, 2023; merged)
- #50110: performance regression: Multiple allocations with "--check-bounds=no" on map after 1.9 (oscardssmith, Jun 8, 2023; open)
- #50167: improve the interplay between bounds checking system and effect system (aviatesk, Jun 14, 2023; open)
- #50239: RFC: A path forward on --check-bounds (Keno, Jun 20, 2023; closed)
- JuliaGPU/KernelAbstractions.jl#429: adding attempt to force inbounds at the kernel level (leios, Nov 2, 2023; merged)
- gridap/Gridap.jl#1014: Performance mode (JordiManyer, Jul 12, 2024; merged)
- JuliaLang/PackageCompiler.jl#1021: Segfault when creating sysimage with --check-bounds=no under Julia-1.11 (sloede, Dec 13, 2024; open)
