Consider removing --check-bounds=no? #48245

Open
Keno opened this issue Jan 11, 2023 · 57 comments
Comments

@Keno
Member

Keno commented Jan 11, 2023

I think we should consider removing the --check-bounds=no option.
I can't really think of any situation in which it would be safe or sensible to turn it on.
If you really must have something like it, a Cassette pass that disables bounds checking
in a particular region of code could achieve any residual benefits without being as massive a footgun.
Worse, turning on --check-bounds=no on current master can actually result in significantly
worse performance because it removes inference's ability to do concrete evaluation (constant folding).
Note that we can make the option a no-op in a minor release, since throwing a BoundsError
is allowable undefined behavior.

EDIT: note for future readers that the recommended replacement is to add @inbounds to the code that @profile analysis shows would benefit from this flag.
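
As a minimal sketch of that replacement, assume profiling has pointed at a single hot loop. The function name `sum_pairs` and its inputs are hypothetical and not taken from the thread; the point is only that the `@inbounds` claim stays local and is backed by `eachindex`, which yields valid indices for both arrays.

```julia
# Hypothetical hot kernel; `sum_pairs` is an illustrative name, not from the thread.
function sum_pairs(a::Vector{Float64}, b::Vector{Float64})
    s = 0.0
    # eachindex(a, b) throws a DimensionMismatch if the index sets differ,
    # so every index produced below is valid for both arrays.
    @inbounds for i in eachindex(a, b)
        s += a[i] * b[i]
    end
    return s
end

result = sum_pairs([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Only this loop runs without checks; the rest of the process, including all dependencies, keeps its bounds checks.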

@Keno added the triage label ("This should be discussed on a triage call") on Jan 11, 2023
@matthias314
Contributor

Here is a case where I regularly use --check-bounds=no: I write a program that does some computation, say based on some integer parameter n. The larger n is, the longer the program runs. I want to get the result for as large a value of n as possible. I run the program for small n with bounds checking turned on to make sure everything works. Then I turn bounds checking off to push n as high as possible. That's quick and easy and does the trick for me.
(This is of course for code that I use myself, not for packages released to the public.)

@vtjnash
Member

vtjnash commented Jan 14, 2023

Just to be clear, the reason we want to remove it is that it requires us to compile the code more conservatively, leading to significant losses in inference accuracy, leading to significant losses of performance when running with check-bounds=no.

@JeffBezanson
Member

@matthias314 Out of curiosity, what kind of speedup do you get?

@gbaraldi
Member

Triage discussed this at length, and the conclusion was that the issues present with --check-bounds=no are similarly present with @inbounds, meaning that removing the flag is probably a band-aid.
One idea @oscardssmith proposed, without being sure of its feasibility, is to refine the inbounds effect so that it does not taint if we can prove that the accesses are always in bounds.

@vtjnash
Member

vtjnash commented Jan 19, 2023

That sounds easy to prove: just remove --check-bounds=no and the required property is proved exactly when expected, and more often than it could be proven under --check-bounds=no in the current system (e.g. we can eliminate taint on bounds more easily with check-bounds=yes/auto than with check-bounds=no. That is the statement of purpose for this issue.)

@matthias314
Contributor

@JeffBezanson The speedup varies; often it is indeed not impressive.

My point was not so much about the effectiveness of the current implementation of --check-bounds=no, but about the general idea: Instead of sprinkling my code with @inbounds, @boundscheck and @propagate_inbounds, I find it easier not to worry about it when I write my code and then turn all checks off once I think my code is correct.

@LilithHafner
Member

> leading to significant losses of performance when running with check-bounds=no

Could we get a specific example of @inbounds and/or check-bounds=no decreasing performance? I feel like there should be an open issue for this but I'm having trouble finding either a gh issue or an example of this behavior.

@oscardssmith
Member

oscardssmith commented Jan 20, 2023

julia> Base.@assume_effects :terminates_locally function f(s)
           t = 0.
           for i in 1:2^20
               for m in 1:length(s)
                   t += 1/(i + s[m])
               end
           end
           t
       end
f (generic function with 1 method)

julia> Base.@assume_effects :terminates_locally function f_inbounds(s)
           t = 0.
           for i in 1:2^20
               for m in 1:length(s)
                   @inbounds t += 1/(i + s[m])
               end
           end
           t
       end
f_inbounds (generic function with 1 method)

julia> g() = f((2,3,4))
g (generic function with 1 method)

julia> g_inbounds() = f_inbounds((2,3,4))
g_inbounds (generic function with 1 method)

julia> @btime g()
  0.861 ns (0 allocations: 0 bytes)
37.903821175206865

julia> @btime g_inbounds()
  2.683 ms (0 allocations: 0 bytes)
37.903821175206865

So this is a nice example where @inbounds causes a roughly 3-million-fold regression.

@LilithHafner
Member

Thanks! I figured out the reason I couldn't reproduce this is that I wasn't using the latest master.

@maleadt
Member

maleadt commented Jan 23, 2023

Apparently there are GPU users who care about this, because GPUs are very sensitive to the branch-heavy code introduced by bounds checks (typically because it increases register pressure, which hurts occupancy) and they are using unoptimized kernels that don't have the necessary @inbounds annotations. At the same time, --check-bounds=no is now (on 1.9) unusable to them because the regression in const-prop breaks static compilation, so that's not great.

@Keno Can you elaborate on the Cassette-like solution? What's a reasonable way to implement this for GPUCompiler's abstract interpreter without the const-prop regressions?

@omlins

omlins commented Jan 23, 2023

The feature check-bounds=no is one of the most beautiful things about Julia in my opinion: it lets us develop code with the benefit of bounds checking, and once everything works as it should, a single switch removes the bounds checking that is no longer necessary so the code runs faster. I think there is a large community, in particular domain scientists (who in the end are probably the largest part of the target users), that will strongly appreciate it if this feature can be preserved. :)

@vtjnash
Member

vtjnash commented Jan 23, 2023

The biggest security flaws in C come from the fact that it does not have bounds checking enabled in releases.

@omlins

omlins commented Jan 23, 2023

Bounds checking can naturally have a drastic impact on performance, as we can see by running the following example. The performance achieved without bounds checking is here over three times higher than with bounds checking activated (executed on a P100 GPU):

omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --check-bounds=no --math-mode=fast diffusion2D_shmem_novis.jl
time_s=1.1297500133514404 t_it=0.012552777926127115 T_eff=513.2291021090084
Benchmarktools (min): t_it=0.012316079 T_eff=523.0926940302998
omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --math-mode=fast diffusion2D_shmem_novis.jl
time_s=3.5698001384735107 t_it=0.03966444598303901 T_eff=162.42382275438484
Benchmarktools (min): t_it=0.039542728 T_eff=162.9237857337511

@KristofferC
Member

I think the argument is to use @inbounds where it matters, which avoids making your whole application vulnerable to index mistakes the way --check-bounds=no does.

@omlins

omlins commented Jan 23, 2023

This example shows that bounds checking drastically impacts performance for high performance (GPU) applications. As a result, when running scientific high performance code, we do not want a single bounds check to happen. At the same time, while developing scientific high performance code, we would like bounds checks everywhere.
Thus, we do not want to add @inbounds statements to our code, so that we can develop conveniently with bounds checking; at run time, however, we need to deactivate bounds checking everywhere. The global switch --check-bounds=no, which has been available until now, has been a perfect solution.

@maleadt
Member

maleadt commented Jan 23, 2023

> when running scientific high performance code, we do not want to have a single bounds check to happen

The point is that your entire session shouldn't be running under --check-bounds=no, or a single mistake can kill it. You apparently want GPU code to run without bounds checks (even though it would still be better to properly optimize your kernels using appropriate @inbounds annotations), so the option should be narrowed to GPU code generation, which as I mentioned already is something that could be added to e.g. @cuda calls, or as an env var to GPUCompiler.jl.

@omlins

omlins commented Jan 23, 2023

> The point is that your entire session shouldn't be running under --check-bounds=no, or a single mistake can kill it.

At the moment we run the code without bounds checks (meaning for a production run), there will be no errors almost 100% of the time.

> so the option should be narrowed to GPU code generation

That could indeed be a good solution if bounds checking did not impact the performance of high performance CPU code. However, running the same code on the CPU, we can already see that this is not the case (problem size: 1024^2):

omlins@nid02061:~/tmpwdir/ParallelStencil/examples> julia --threads=12 -O3 --check-bounds=no --math-mode=fast diffusion2D_shmem_novis.jl
time_s=0.032401084899902344 t_it=0.0003600120544433594 T_eff=69.90272600430198
Benchmarktools (min): t_it=0.000321402 T_eff=78.30014747885825
omlins@nid02061:~/tmpwdir/ParallelStencil/examples> julia --threads=12 -O3 --math-mode=fast diffusion2D_shmem_novis.jl
time_s=0.044291019439697266 t_it=0.0004921224382188586 T_eff=51.137322839988364
Benchmarktools (min): t_it=0.000453692 T_eff=55.46896132177777

@Seelengrab
Contributor

Seelengrab commented Jan 24, 2023

Any indiscriminate use of --check-bounds=no is just as dangerous as just throwing @inbounds on a loop "to make things go fast".

Any use of @inbounds should always be accompanied by a matching @boundscheck before the use, as well as a @propagate_inbounds to allow the removal of that @boundscheck. If any of those pieces are missing, you're going to run into either correctness issues or suboptimal performance. The argument that @inbounds and @boundscheck should not be necessary because the compiler should be able to figure it out is in principle good, but falls apart due to the fact that those annotations are precisely used when the compiler can't figure them out on its own (if we had such a perfect compiler, we'd do all computation at compile time!). Thus, what @inbounds communicates from the developer to the compiler is that you have a mathematical proof that all indexing operations in the block are valid (with the caveat that the code is too concise to communicate that proof to the compiler).
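
The pairing described above can be sketched as follows. The wrapper type `Doubled` and the helper `doubled_at` are hypothetical names chosen for illustration; this is one common way to combine the three annotations, not the only one.

```julia
# Hypothetical wrapper type, used only for illustration.
struct Doubled{T} <: AbstractVector{T}
    data::Vector{T}
end

Base.size(d::Doubled) = size(d.data)

# The @boundscheck block runs by default and may be elided when an inlined
# caller wraps the call in @inbounds. Only after that check is the inner
# @inbounds access justified.
@inline function Base.getindex(d::Doubled, i::Int)
    @boundscheck checkbounds(d, i)
    @inbounds x = d.data[i]
    return 2 * x
end

# Forwarding helper: @propagate_inbounds passes the caller's @inbounds context
# through to `getindex`, so the check can also be elided via this function.
Base.@propagate_inbounds doubled_at(d::Doubled, i::Int) = d[i]

d = Doubled([1, 2, 3])
d[2]              # checked access
doubled_at(d, 3)  # checked too, because no caller asked for @inbounds
```

Without the `@boundscheck` line, the inner `@inbounds` would be an unproven claim; without `@propagate_inbounds` on the forwarding helper, a caller's `@inbounds` would have no way to reach the check.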

Personally, I'd prefer enforcing the use of @boundscheck when using @inbounds, but since that makes currently working code error, it could be a deprecation at best.

With regard to GPU code (or any code, really): it seems much safer to me to use @inbounds and @boundscheck and set --check-bounds=yes during development than to use --check-bounds=no in production. The former will still catch accidental out-of-bounds accesses in dependencies in production instead of introducing correctness (or even security) issues, while the latter will not.


In case it's not clear - I'm in favour of removing --check-bounds=no, while also making the compiler better at interacting with explicit @inbounds and (possibly in a future release, if at all) requiring the use of @boundscheck paired with @inbounds. To bring another point of contention over from triage - the observed behavior of constant folded vs. "interpreted" julia code should not differ at all.

@sloede
Contributor

sloede commented Jan 24, 2023

Being able to do development with memory safety and then easily switch to faster performance using a simple command line option is literally one of the best and most used features for me as an HPC user. When talking to colleagues from the scientific computing community, maybe the single strongest argument in favor of Julia is that it takes away nothing of your performance while making it much easier to work and prototype with.

Another major point I bring up when advocating Julia is that it makes great strides in solving the two-language problem. By default, everything is memory safe and still reasonably fast. But when necessary, it's as simple as flipping a single switch and you get performance on par with Fortran/C++, and that's for real production code, not just academic setups.
IMHO, this argument would take a serious blow since in the future it means you will have the two languages inside Julia:
The regular, inexperienced-user-friendly and non-optimized code, and the super high performance code with @inbounds and @boundscheck etc. at all the right places. Maybe there will be an array type someone comes up with that will automatically allow switching to @inbounds? And hello numjl, your friendly in-Julia DSL modelled after numpy...

In addition, you now have to annotate (or better yet: first test, then annotate, since no premature optimization) each loop that might be remotely performance critical. I did a quick check; in our code base for Trixi.jl, we have around 2000 for loops. Given that some of them are not performance relevant and some of them are nested, we still need to check ~1000 places and verify if and how they could benefit from manual optimizations. Compare that with the ease of just writing --check-bounds=no and that's it. Yes, it is not a big issue to manually optimize everything for small to medium sized packages, but compare that to production codes with tens to hundreds of thousands of lines.

Finally, here are some numbers. For a production run with Trixi.jl, I see a strong influence of --check-bounds=no on the performance (5 hottest kernels; all numbers per iteration):

| Kernel | --check-bounds=no | Regular | Impact |
|---|---|---|---|
| calc_volume_integral! | 121ms | 122ms | 0% |
| calc_interface_flux! | 66.9ms | 120ms | +79% |
| calc_surface_integral! | 23.7ms | 46.1ms | +95% |
| prolong2interfaces! | 16.4ms | 33.9ms | +100% |
| apply_jacobian! | 6.66ms | 7.78ms | +17% |
| ... | ... | ... | |
| Overall | 240ms | 336ms | +40% |

Thus while the effect of disabling bounds checking varies, there is a significant impact on many core algorithms.


TL;DR At the current state, I think removing --check-bounds=no is not the best idea. It would cause many HPC codes written in Julia over the past years to immediately cease to be competitive with C++/Fortran codes, unless hand-optimizing all loops again, taking away one of the major benefits (and selling points) Julia currently has. Finally, at least in our case --check-bounds=no has a significant impact on performance (30% reduction in overall runtime) and is thus always used where performance is critical, e.g., production runs on a large number of processors.

@KristofferC
Member

> Any indiscriminate use of --check-bounds=no is just as dangerous as just throwing @inbounds on a loop "to make things go fast".

Clearly not, since an @inbounds is restricted to the loop you apply it to while the --check-bounds=no is applied to the whole process.

@vchuravy
Member

I am sympathetic to the performance argument, but the actual issue here is that --check-bounds=no turns a correct program into an incorrect program.

Local annotations have the benefit that they only require local knowledge to reason about them, whereas global flags require global knowledge (in my opinion impossible to obtain).
As an example, we could add a mode to Julia that turns all exceptions off (a generalized check-bounds=no). One sometimes comes across code that uses exceptions for control flow (a style that is frowned upon, but nonetheless legal); we even had code like that in Base... This would mean that turning exceptions off "globally" would have led to deadlocking Julia programs.

Now out-of-bounds exceptions are hopefully not used like that, but --check-bounds=no can turn correct programs into incorrect programs.
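
To make the concern concrete, here is a minimal sketch of a program that is correct under default bounds checking but depends on the `BoundsError` actually being thrown. The helper `get_or_default` is hypothetical and written only for illustration.

```julia
# Hypothetical helper that (questionably, but legally) uses BoundsError for control flow.
function get_or_default(v::Vector{Int}, i::Int, default::Int)
    try
        return v[i]
    catch err
        # Only swallow out-of-bounds errors; anything else is re-thrown.
        err isa BoundsError || rethrow()
        return default
    end
end

in_range  = get_or_default([10, 20, 30], 2, -1)  # ordinary in-bounds access
out_range = get_or_default([10, 20, 30], 7, -1)  # relies on the BoundsError being thrown

# A variant that does not depend on an exception: Base's `get` for arrays
# checks the index itself and returns the default when it is out of bounds.
safe = get([10, 20, 30], 7, -1)
```

With bounds checks on, the second call returns the default. Under `--check-bounds=no` the access `v[7]` is no longer guaranteed to throw, so the same source text no longer has a defined result, even though nothing in it mentions `@inbounds`.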

In Julia we have exposed local options to control unsafe behaviours (like fast-math or opting out of bounds checking), and as others have pointed out, @inbounds applies to blocks, for-loops and even functions and is not too onerous to use.

@sloede
Contributor

sloede commented Jan 24, 2023

@vchuravy I see your point about fine-grained control and its benefits. On the other hand, you lose a big part of what makes Julia currently very flexible. Now the same code can be memory safe or fast, without the user having to make a conscious decision in each (potentially) performance critical section. Why would you give up this awesome feature, one of the strongest selling points of Julia in a world where you compete with established, fast languages like C++/Fortran or slow, rapid-prototyping languages like Python? I feel like this would instead lead Julia more towards Python, where you can be just as fast as Julia if you restrict yourself to loops that are amenable to @njit/Numba (with @inbounds everywhere, it will even look similar 😬).

And again, adding @inbounds wherever it makes sense from a performance perspective is not a true solution to our misgivings (as I believe was pointed out by someone else above). What we need is exactly this ability to have bounds checking enabled by default (for rapid prototyping), with the option to quickly switch to "production mode".

If the main argument in favor is that users abuse the --check-bounds=no option and wreak havoc on the Julia support channels, I would classify this as an educational (or people-related) issue and not a technical issue. Therefore, imho the correct solution should be found in the instructional realm and not by applying a code patch. Otherwise, I strongly believe that the alternative will be a new package that exports a macro such as @dynamic_inbounds, with the ability to switch between the actual @inbounds and a no-op based on, e.g., Preferences.jl, just to get the old behavior back. This seems bound to cause a lot of new issues, i.e., just kicking the can of user confusion down the road 🤷
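
For readers wondering what such a macro might look like, here is a rough sketch. Everything in it is an assumption: no such package is named in the thread, the environment variable `MYPKG_INBOUNDS` is made up, and an environment variable stands in for Preferences.jl only to keep the example self-contained.

```julia
# Hypothetical switch; a real package might read this from Preferences.jl instead.
const USE_INBOUNDS = get(ENV, "MYPKG_INBOUNDS", "false") == "true"

# Expands to `@inbounds expr` when the switch is on, and to plain `expr` otherwise.
# The decision is made at macro-expansion time, i.e. when the enclosing code is
# compiled, not each time it runs.
macro dynamic_inbounds(expr)
    if USE_INBOUNDS
        return esc(:(Base.@inbounds $expr))
    else
        return esc(expr)
    end
end

function total(v)
    s = zero(eltype(v))
    @dynamic_inbounds for i in eachindex(v)
        s += v[i]
    end
    return s
end

total([1, 2, 3])
```

Note that this only affects code that opts in by using the macro, so unlike the global flag it does not strip checks from dependencies. It still reintroduces a global switch over correctness, which is the objection raised elsewhere in this thread.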

@vchuravy
Member

> Why would you give up this awesome feature, one of the strongest selling points of Julia in a world where you compete with established, fast languages like C++/Fortran or slow, rapid-prototyping languages like Python?

For me --check-bounds=no has never been a selling point of Julia and I have never taught it in my performance engineering workshops. In contrast, the local control a programmer has is for me a major advantage of Julia over C/C++, especially w.r.t. fast-math. Over the years we have improved the compiler and we are now at a point where @inbounds is needed less and less, but I would expect an HPC application to use it instead of relying on a global compiler/runtime flag.

The issue is that --check-bounds=no turns correct code into incorrect code. A second issue for me is the social one, where the reliance on --check-bounds=no for performance leads to a worse experience by default for users of your packages and of Julia. Julia is now "slow by default" and you have to tell your users to run Trixi.jl-based applications with --check-bounds=no for best performance, instead of using the local mechanism and having a good user experience by default.

I could see a use during development, where I would like to answer the question "would @inbounds help?", but the current usage scenario you describe leads to a worse experience for everyone. Users don't get fast code by default, and if they use --check-bounds=no they can't be sure that an "untested code path" doesn't lead to numerical corruption down the road, leaving their climate simulation wrong.

@williamfgc

My two cents is to document the recommended way for developers to refactor existing code and to write code onwards no matter what the outcome is for the recommended change. Mostly to be sure I'm not giving advice on deprecated functionality as we provide tutorials for the HPC folks new to Julia. Thanks!

@Seelengrab
Contributor

> Clearly not, since an @inbounds is restricted to the loop you apply it to while the --check-bounds=no is applied to the whole process.

A single erroneous @inbounds can have just as disastrous consequences as whole program --check-bounds=no, the latter just creates vastly more opportunities for mistakes to become (in-)visible. Since all that is needed for an incorrect result is a single misuse, both are equally dangerous (just think of the OffsetArrays kerfuffle, spawned by incorrect use of @inbounds..).

> Another major point I bring up when advocating Julia is that it makes great strides in solving the two-language problem. By default, everything is memory safe and still reasonably fast. But when necessary, it's as simple as flipping a single switch and you get performance on par with Fortran/C++, and that's for real production code, not just academic setups.

> Now the same code can be memory safe or fast, without the user having to make a conscious decision in each (potentially) performance critical section.

In my experience, well written/idiomatic Julia code is as fast as (or sometimes faster than) equivalent C/C++/Fortran code while retaining the safety features a high level language provides. Turning those safety features off and saying "look, without those safety features we are just as fast!" is, in my opinion, misrepresenting the strength of Julia, which is being able to be strictly better than the "old guard", and it further perpetuates the myth that the only way to go fast is to not have safe programs. This is not an either/or thing - we can, should (and ultimately MUST) have both at the same time.

> Therefore, imho the correct solution should be found in the instructional realm and not by applying a code patch.

The correct way to deal with a tool that you can do nothing but cut yourself with if not held in exactly the right way (which doesn't exist here, as this is a global flag and I'm pretty sure you're not auditing all your dependencies for correct behavior) is not to tell people to only hold it in the exactly right way, it is to fix the tool so you can't cut yourself in the first place.

@PallHaraldsson
Contributor

Could a compromise be made that you could disable --check-bounds=no at a module level (or alternatively modules disallow that by default, and you could opt into an exception, something you do until you know you've added enough @inbounds)? There is plenty of code that doesn't need to be fast (and/or isn't well tested), and such a module could still opt into @inbounds locally.

@chriselrod
Contributor

chriselrod commented Apr 24, 2023

> Thus, we do not want to add @inbounds statements into our code in order to be able to develop conveniently with bounds checking; at run time however, we need to deactivate the bounds checking everywhere: the global switch --check-bounds=no which has been available until now has been a perfect solution.

What about the opposite: develop using --check-bounds=yes, and then run with auto?

Discussion above mostly focused on this being less pretty.

@vtjnash
Member

vtjnash commented Apr 24, 2023

That is recommended
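
A script can report which of the three modes it was started under, which helps confirm that development runs use `--check-bounds=yes` and production runs use the default. The sketch below reads `Base.JLOptions().check_bounds`; this is an internal, undocumented struct, and the 0/1/2 encoding (auto/yes/no) is an assumption based on Julia's sources that may change between versions. The helper name `bounds_mode` is made up.

```julia
# Assumption: Base.JLOptions() is internal, and the encoding below
# (0 = auto, 1 = yes, 2 = no) is taken from Julia's sources, not from public docs.
function bounds_mode()
    flag = Base.JLOptions().check_bounds
    return flag == 1 ? "yes" : flag == 2 ? "no" : "auto"
end

println("check-bounds mode: ", bounds_mode())
```

Under `yes`, `@inbounds` annotations are ignored and every access is checked; under `auto`, only code explicitly marked `@inbounds` skips its checks.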

aviatesk added a commit that referenced this issue Jun 8, 2023
From version 1.9 onwards, when `--check-bounds=no` is used,
concrete-eval is completely disabled. However, it appears
`--check-bounds=no` is still being used within the community, causing
issues like the one reported in JuliaArrays/StaticArrays.jl#1155.
Although we should move forward to a direction of eliminating the flag
in the future (#48245), for the time being, there are many requests to
carry out a certain level of compiler optimization, even when this flag
is enabled.

This commit aims to allow concrete-eval "safely" even under
`--check-bounds=no`. Specifically, when the method call being analyzed
is `:nothrow`, it should be predominantly safe to concrete-eval it under
this flag. Technically, however, even `:nothrow` methods could trigger
undefined behavior, since `:nothrow` isn't a strict constraint and it's
possible for users to annotate potentially risky methods with
`Base.@assume_effects :nothrow`. Nonetheless, since this possibility is
acknowledged in `Base.@assume_effects` documentation, I feel it's fair
to relegate it to user responsibility.
vchuravy pushed a commit that referenced this issue Jun 12, 2023
#50107)

aviatesk added a commit that referenced this issue Jun 14, 2023
In the current state of the Julia compiler, bounds checking and its
related optimization code, such as `@boundscheck` and `@inbounds`, pose
a significant handicap for effect analysis. As a result, we're
encountering an ironic situation where the application of `@inbounds`
annotations, which are intended to optimize performance, instead
obstruct the program's optimization, thereby preventing us from
achieving optimal performance.

This PR is designed to resolve this situation. It aims to enhance the
relationship between bounds checking and effect analysis, thereby
correctly improving the performance of programs that have `@inbounds`
annotations.

In the following, I'll first explain the reasons that have led to this
situation for better understanding, and then I'll present potential
improvements to address these issues. This commit is a collection of
various improvement proposals. It's necessary that we incorporate all
of them simultaneously to enhance the situation without introducing
any regressions.

## Core of the Problem

There are fundamentally two reasons why effect analysis of code
containing bounds checking is difficult:
1. The evaluation value of `Expr(:boundscheck)` is influenced by the
  `@inbounds` macro and the `--check-bounds` flag. Hence, when
  performing a concrete evaluation of a method containing
  `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro
  context and the `--check-bounds` settings, ensuring the method's
  behavior is consistent across the compile time concrete evaluation
  and the runtime execution.
2. If the code, from which bounds checking has been removed due to
  `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to
  undefined behavior due to uncertain memory access.

## Current State

The current Julia compiler handles these two problems as follows:

### Current State 1

Regarding the first problem, if a code or method call containing
`Expr(:boundscheck)` is within an `@inbounds` context, a concrete
evaluation is immediately prohibited. For instance, in the following
case, when analyzing `bar()`, if you simply perform concrete evaluation
of `foo()`, it wouldn't properly respect the `@inbounds` context present
in `bar()`. However, since the concrete evaluation of `foo()` is
prohibited, it doesn't pose an issue:
```julia
foo() = (r = 0; @boundscheck r += 1; return r)

bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the
concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`.
This is because the execution of the `@boundscheck` block is determined
by the presence of local `@inbounds`:
```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @boundscheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as
well. Moreover, we are not handling such local `@inbounds` contexts
effectively, which results in incorrect execution of `A1_inbounds()`
(even our test is incorrect for this example:
<https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34>).

Furthermore, there is room for improvement when the `--check-bounds`
flag is specified. Specifically, when the `--check-bounds` flag is set,
the evaluation value of `Expr(:boundscheck)` is determined irrespective
of the `@inbounds` context. Hence, there is no need to prohibit concrete
evaluations due to inconsistency in the evaluation value of
`Expr(:boundscheck)`.

### Current State 2

Next, we've ensured that concrete evaluation isn't performed when
there's potentially unsafe code that may have bounds checking removed,
or when the `--check-bounds=no` flag is set, which could lead to bounds
checking being removed always.
For instance, if you perform concrete evaluation for the function call
`baz((1,2,3), 4)` in the following example, it may return a value
accessed from illicit memory and introduce undefined behaviors into the
program:
```julia
baz(t::Tuple, i::Int) = @inbounds t[i]

baz((1,2,3), 4)
```

However, it's evident that the above code is an incorrect and unsafe
program, and I believe undefined behavior in such programs is deemed
acceptable, as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │  Using @inbounds may return incorrect results/crashes/corruption for
> │  out-of-bounds indices. The user is responsible for checking it
> │  manually. Only use @inbounds when it is certain from the information
> │  locally available that all accesses are in bounds.

Actually, the `@inbounds` macro is primarily an annotation to
"improve performance by removing bounds checks from safe programs".
Therefore, I opine that it would be more reasonable to utilize it to
alleviate potential errors due to bounds checking within `@inbounds`
contexts.

To bring up another associated concern, in the current compiler
implementation, the `:nothrow` modeling for `getfield`/`arrayref`/`arrayset`
is a bit risky: `:nothrow`-ness is assumed when their bounds checking
is turned off by a call argument.
If our intended direction aligns with the removal of bounds checking
based on `@inbounds` as proposed in issue #48245, then assuming
`:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming
`:nothrow`-ness due to bounds checking argument or the `--check-bounds`
flag appears to be risky, especially considering it's not documented.

\## This Commit

This commit implements all of the improvements proposed above.
In summary, the enhancements include:
- allowing concrete evaluation within a local `@inbounds` context
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set
  (and allowing concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining
  `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in
  functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
aviatesk added a commit that referenced this issue Jun 14, 2023
In the current state of the Julia compiler, bounds checking and its
related optimization code, such as `@boundscheck` and `@inbounds`, pose
a significant handicap for effect analysis. As a result, we're
encountering an ironic situation where the application of `@inbounds`
annotations, which are intended to optimize performance, instead
obstruct the program's optimization, thereby preventing us from
achieving optimal performance.

This PR is designed to resolve this situation. It aims to enhance the
relationship between bounds checking and effect analysis, thereby
correctly improving the performance of programs that have `@inbounds`
annotations.

In the following, I'll first explain the reasons that have led to this
situation for better understanding, and then I'll present potential
improvements to address these issues. This commit is a collection of
various improvement proposals. It's necessary that we incorporate all
of them simultaneously to enhance the situation without introducing
any regressions.

\## Core of the Problem

There are fundamentally two reasons why effect analysis of code
containing bounds checking is difficult:
1. The value of `Expr(:boundscheck)` is influenced by the
  `@inbounds` macro and the `--check-bounds` flag. Hence, when
  performing concrete evaluation of a method containing
  `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro
  context and the `--check-bounds` setting, ensuring the method's
  behavior is consistent between compile-time concrete evaluation
  and runtime execution.
2. If code from which bounds checking has been removed due to
  `@inbounds` or `--check-bounds=no` is unsafe, it may lead to
  undefined behavior due to out-of-bounds memory access.
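
As a minimal sketch of the pattern both points refer to (the names `checked_get` and `unchecked_first` are hypothetical and not part of this commit): the check sits in a `@boundscheck` block, and a caller may ask for it to be elided by wrapping the call in `@inbounds`. Whether it is actually elided depends on inlining and on the `--check-bounds` flag.

```julia
# Hypothetical helper: the bounds check lives in a `@boundscheck` block so that
# a caller may elide it by wrapping the call in `@inbounds`.
@inline function checked_get(v::Vector{Int}, i::Int)
    @boundscheck checkbounds(v, i)   # lowered to a branch on Expr(:boundscheck)
    return @inbounds v[i]            # the raw access itself is never checked
end

# Caller that promises the index is valid; the check may be elided after inlining.
unchecked_first(v::Vector{Int}) = @inbounds checked_get(v, 1)

v = [10, 20, 30]
checked_get(v, 2)      # in bounds whether or not the check runs
unchecked_first(v)     # only valid because `v` is known to be non-empty
```

The same source therefore has two possible behaviors, which is what compile-time evaluation has to stay consistent with.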

\## Current State

The current Julia compiler handles these two problems as follows:

\### Current State 1

Regarding the first problem, if code or a method call containing
`Expr(:boundscheck)` is within an `@inbounds` context, concrete
evaluation is immediately prohibited. For instance, in the following
case, when analyzing `bar()`, simply performing concrete evaluation
of `foo()` wouldn't properly respect the `@inbounds` context present
in `bar()`. However, since the concrete evaluation of `foo()` is
prohibited, it doesn't pose an issue:
```julia
foo() = (r = 0; @boundscheck r += 1; return r)

bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the
concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`.
This is because ~~the execution of the `@boundscheck` block is determined
by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a
local `@inbounds` context does not need to block concrete evaluation:
```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @boundscheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as
well. ~~Moreover, we are not handling such local `@inbounds` contexts
effectively, which results in incorrect execution of `A1_inbounds()`
(even our test is incorrect for this example:
<https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34>)~~
EDIT: This was expected behavior, as pointed out by Jameson.

Furthermore, there is room for improvement when the `--check-bounds`
flag is specified. Specifically, when the `--check-bounds` flag is set,
the evaluation value of `Expr(:boundscheck)` is determined irrespective
of the `@inbounds` context. Hence, there is no need to prohibit concrete
evaluations due to inconsistency in the evaluation value of
`Expr(:boundscheck)`.
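
For reference, a small sketch of how the flag's setting can be inspected from a running session. It relies on `Base.JLOptions()`, which is an internal, unexported interface, and the helper name `check_bounds_mode` is hypothetical. The mapping assumed here is 0 for the default (`auto`), 1 for `yes`, and 2 for `no`.

```julia
# Hypothetical helper that reports the `--check-bounds` setting of this session.
# `Base.JLOptions()` is internal; the integer encoding below is an assumption
# about the current implementation (0 = auto, 1 = yes, 2 = no).
function check_bounds_mode()
    flag = Base.JLOptions().check_bounds
    return flag == 1 ? :yes : flag == 2 ? :no : :auto
end

mode = check_bounds_mode()
```

When `mode` is `:yes` or `:no`, the value of `Expr(:boundscheck)` is fixed for the whole session, which is why no inconsistency between compile time and runtime can arise in that case.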

\### Current State 2

Next, we've ensured that concrete evaluation isn't performed when
there's potentially unsafe code that may have bounds checking removed,
or when the `--check-bounds=no` flag is set, which always removes
bounds checking.
For instance, if you perform concrete evaluation for the function call
`baz((1,2,3), 4)` in the following example, it may return a value
read from out-of-bounds memory and introduce undefined behavior into the
program:
```julia
baz(t::Tuple, i::Int) = @inbounds t[i]

baz((1,2,3), 4)
```

However, the above code is evidently an incorrect and unsafe
program, and I believe undefined behavior in such programs is acceptable,
as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │  Using @inbounds may return incorrect results/crashes/corruption for
> │  out-of-bounds indices. The user is responsible for checking it
> │  manually. Only use @inbounds when it is certain from the information
> │  locally available that all accesses are in bounds.
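
A minimal sketch of what that requirement looks like when applied to the `baz` example, under the assumption that the caller cannot be trusted to pass a valid index (`safe_baz` is a hypothetical name): the index is validated locally, so the `@inbounds` access that follows is certain to be in bounds.

```julia
# Hypothetical safe variant of `baz`: validate the index first, so that the
# `@inbounds` access is provably in bounds from locally available information.
function safe_baz(t::Tuple, i::Int)
    1 <= i <= length(t) || throw(BoundsError(t, i))
    return @inbounds t[i]
end

ok = safe_baz((1, 2, 3), 3)

# An invalid index raises an error instead of reading out-of-bounds memory.
threw = try
    safe_baz((1, 2, 3), 4)
    false
catch err
    err isa BoundsError
end
```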

In fact, the `@inbounds` macro is primarily an annotation to
"improve performance by removing bounds checks from safe programs".
Therefore, I think it would be more reasonable to use it to
rule out potential bounds-checking errors within `@inbounds`
contexts.

A related concern: in the current compiler
implementation, the `:nothrow` modeling for `getfield`/`arrayref`/`arrayset`
is a bit risky, because `:nothrow`-ness is assumed when their bounds checking
is turned off by a call argument.
If our intended direction aligns with the removal of bounds checking
based on `@inbounds` as proposed in issue #48245, then assuming
`:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming
`:nothrow`-ness due to the bounds checking argument or the `--check-bounds`
flag appears to be risky, especially considering it's not documented.
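
One way to observe how `:nothrow` is inferred is sketched below. It uses `Base.infer_effects` and `Core.Compiler.is_nothrow`, which are internal, unexported helpers, so their availability and results are an assumption that depends on the Julia version and on the `--check-bounds` setting. The two indexing functions are hypothetical and exist only for illustration.

```julia
# Hypothetical functions used only for illustration.
checked_index(t::NTuple{3,Int}, i::Int) = t[i]
inbounds_index(t::NTuple{3,Int}, i::Int) = @inbounds t[i]

# Internal helpers: ask inference for the effects of each call signature and
# extract the `:nothrow` bit. No particular result is guaranteed across versions.
argtypes = (NTuple{3,Int}, Int)
nothrow_checked  = Core.Compiler.is_nothrow(Base.infer_effects(checked_index, argtypes))
nothrow_inbounds = Core.Compiler.is_nothrow(Base.infer_effects(inbounds_index, argtypes))
```

Comparing the two values on a given build shows whether the `@inbounds` annotation changes the inferred `:nothrow`-ness there.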

\## This Commit

This commit implements all of the improvements proposed above.
In summary, the enhancements include:
- making `Expr(:boundscheck)` within a local `@inbounds` context not
  block concrete evaluation
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set
  (and allowing concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining
  `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in
  functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
aviatesk added a commit that referenced this issue Jun 15, 2023
aviatesk added a commit that referenced this issue Jun 15, 2023
aviatesk added a commit that referenced this issue Jun 15, 2023
Keno added a commit that referenced this issue Jun 20, 2023
In 1.9, `--check-bounds=no` has started causing significant performance
regressions (e.g. #50110). This is because we switched a number of functions that
used to be `@pure` to the new effects-based infrastructure, which very closely tracks
the legality conditions for concrete evaluation. Unfortunately, disabling
bounds checking completely invalidates all prior legality analysis, so the only
realistic path we have is to disable concrete evaluation completely.

In general, we are learning that these kinds of global make-things-faster-but-unsafe
flags are highly problematic for a language for several reasons:

- Code is written with the assumption of a particular mode being chosen, so
  it is in general impossible, or at least unsafe, to compose libraries (which in a
  language like Julia is a huge problem).

- Unsafe semantics are often harder for the compiler to reason about, causing
  unexpected performance issues (although the 1.9 `--check-bounds=no` issues are
  worse than just disabling concrete eval for things that use bounds checking).

In general, I'd like to remove the `--check-bounds=` option entirely (#48245),
but that proposal has encountered major opposition.

This PR implements an alternative proposal: We introduce a new function
`Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is
passed the result of `Expr(:boundscheck)` (which is now purely determined by
the inliner based on `@inbounds`, without regard for the command line flag).

In this proposal, what the command line flag does is simply redefine this
function to either `true` or `false` (unconditionally) depending on the
value of the flag.
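
The mechanism can be sketched outside of `Core` as follows. This is only an illustration under stated assumptions: the module `BoundsSketch`, the function `getelem`, and its explicit `boundscheck` argument are hypothetical stand-ins, with the argument playing the role of the value of `Expr(:boundscheck)`; only the name `should_check_bounds` is taken from the proposal.

```julia
module BoundsSketch

# Stand-in for the proposed `Core.should_check_bounds`: by default it returns
# whatever the `@inbounds` context decided.
should_check_bounds(boundscheck::Bool) = boundscheck

# `boundscheck` stands in for the value of `Expr(:boundscheck)`.
function getelem(v::Vector{Int}, i::Int, boundscheck::Bool)
    if should_check_bounds(boundscheck)
        checkbounds(v, i)
    end
    return @inbounds v[i]
end

end # module

v = [10, 20, 30]
BoundsSketch.getelem(v, 2, false)   # check skipped; the access is in bounds

# Emulate `--check-bounds=yes`: redefine the hook to ignore its argument.
@eval BoundsSketch should_check_bounds(::Bool) = true

# The redefinition invalidates `getelem`, so the check now always runs.
err = try
    Base.invokelatest(BoundsSketch.getelem, v, 4, false)
    nothing
catch e
    e
end
```

Because the switch is an ordinary method redefinition, the recompilation it triggers goes through the usual invalidation and world age machinery rather than through a special compiler mode.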

Of course, this causes massive amounts of recompilation, but I think this can
be addressed by adding logic to loading that loads a pkgimage with appropriate
definitions to cure the invalidations. The special logic we have now
to take account of the `--check-bounds` flag in .ji selection would be replaced
by automatically injecting the special pkgimage as a dependency to every
loaded image. This part isn't implemented in this PR, but I think it's reasonable
to do.

I think with that, the `--check-bounds` flag remains functional while having
much better-defined behavior, as it relies on the standard world age
mechanisms.

A major benefit of this approach is that it can be scoped appropriately using
overlay tables. For example:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
Keno added a commit that referenced this issue Jun 20, 2023
Member Author

Keno commented Jun 20, 2023

See RFC for one possible approach in #50239

Keno added a commit that referenced this issue Jul 18, 2023
In 1.9, `--check-bounds=no` has started causing significant performance
regressions (e.g. #50110). This is because we switched a number of functions that
used to be `@pure` to the new effects-based infrastructure, which very closely tracks
the legality conditions for concrete evaluation. Unfortunately, disabling
bounds checking completely invalidates all prior legality analysis, so the only
realistic path we have is to disable concrete evaluation completely.

In general, we are learning that these kinds of global make-things-faster-but-unsafe
flags are highly problematic for a language for several reasons:

- Code is written with the assumption of a particular mode being chosen, so
  it is in general either impossible or unsafe to compose libraries (which in a
  language like julia is a huge problem).

- Unsafe semantics are often harder for the compiler to reason about, causing
  unexpected performance issues (although the 1.9 --check-bounds=no issues are
  worse than just disabling concrete eval for things that use bounds checking)

In general, I'd like to remove the `--check-bounds=` option entirely (#48245),
but that proposal has encountered major opposition.

This PR implements an alternative proposal: We introduce a new function
`Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is
passed the result of `Expr(:boundscheck)` (which is now purely determined by
the inliner based on `@inbounds`, without regard for the command line flag).

In this proposal, what the command line flag does is simply redefine this
function to either `true` or `false` (unconditionally) depending on the
value of the flag.

Of course, this causes massive amounts of recompilation, but I think this can
be addressed by adding logic to loading that loads a pkgimage with appropriate
definitions to cure the invalidations. The special logic we have now
to take account of the --check-bounds flag in .ji selection would be replaced
by automatically injecting the special pkgimage as a dependency to every
loaded image. This part isn't implemented in this PR, but I think it's reasonable
to do.

I think with that, the `--check-bounds` flag remains functional, while having
much more well defined behavior, as it relies on the standard world age
mechanisms.

A major benefit of this approach is that it can be scoped appropriately using
overlay tables. For example:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
Keno added a commit that referenced this issue Jul 19, 2023
@LilithHafner LilithHafner removed the triage label Jan 3, 2024
@johnomotani

johnomotani commented Jan 31, 2025

This is still an important issue to resolve for HPC users of Julia. There was some good discussion on https://discourse.julialang.org/t/removing-bounds-checking-for-hpc-check-bounds-unsafe/124897.

I made a suggestion on that thread that I think goes in a different direction from the discussion here and in #50239 and #50641:

Is there an argument here for array-access bounds checks being an important special case?

For myself (and I think I’m representing a reasonably large class of users) the bounds checks on array accesses are the only ones I really care about, because they’re the ones that happen inside a loop over grid points. In a code doing time evolution on a grid, the grid doesn’t change, and if I’ve tested my indexing logic on several small grids, and a couple of timesteps on a large grid, either bounds checks caught my errors already, or they won’t catch the error in my production run anyway, so bounds checks after initial testing are just a waste of time, money, and CO2 emissions. However, due to complicated discretizations of PDEs, etc. it is unlikely that the bounds checks can be optimized away at compile time.

Proposal: have a special type of @boundscheck/@inbounds like @arrayboundscheck/@arrayinbounds that can be used by the Array interface (and other suitably similar types), and have --check-bounds=noarray that is intermediate between --check-bounds=no and --check-bounds=auto in that it only disables @arrayboundscheck checks. If that were a feature, then at least for my use case I wouldn’t mind if --check-bounds=no were removed.

@mbauman
Member

mbauman commented Jan 31, 2025

That wouldn't really resolve the crux of your problem here. In short the status quo is that:

  • The compiler can reason about the effects that code can have
  • This allows the compiler to know if it's safe to move some computation around — to compile time, outside a loop, etc.
  • An @inbounds'ed access might have the side-effect of corrupting or crashing your program
  • The compiler can't safely optimize that in the same manner. If it tried, it might corrupt your program in wild and unpredictable ways or just crash, even if it is never accessed out-of-bounds at runtime.
  • So therefore @inbounds can sometimes make code slower, whether it was written as @inbounds or unsafe_load or --check-bounds=no.

Having an option that just affects Array would still have the exact same problem, but just for Arrays.

Bounds checks are surprisingly cheap on many architectures unless they inhibit a compiler optimization like SIMD. So what you want isn't "just" no bounds checks. It's also those subsequent compiler optimizations to surrounding code that are only made possible without that branch. But it's ironically that very branch that can (sometimes) inform the compiler that such optimizations are valid.

@johnomotani

johnomotani commented Jan 31, 2025

I have a case where removing all bounds checks on arrays (or using --check-bounds=no on Julia-1.10) speeds up my code by 2x. Scientific HPC application developers seem to agree that being able to disable bounds checks is what they would expect (see the discourse thread, also for explanations of why this is normal and acceptable). I see 'the compiler might be able to optimize better with bounds checks' as a hypothetical. I have real-world examples where the opposite is true, by a factor of 2. If/when the compiler can optimize those situations, then 'remove --check-bounds=no' becomes an option, but at the moment it seems that some people want to remove the option of disabling safety in favour of performance (after verifying sufficiently that the code is correct). Just to repeat the point, we can provide examples where this has a 2x cost in performance, in exactly the place where this is a huge cost - HPC jobs that could cost $10000s (and associated time and carbon emissions) for a single run. That's why we're very, very keen on having equivalent functionality for --check-bounds=no.

Edited to add: I don't say that the solution I suggested is a good one. My main point is we (scientific HPC developers) have a need for some solution, in a way that wasn't being discussed in this thread.

As a slightly separate note: if the 'solution' is 'use @inbounds where you need it for performance', how are we supposed to know when we've done a good enough job? In the past, if we wanted to implement that solution, we would most likely compare to --check-bounds=no, and if the performance was within 1% or so, you'd know that you've done enough. Without --check-bounds=no being a viable option, we can't even do that (much less profile the code to find where exactly we could add @inbounds to increase performance).
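
One way to approximate that comparison for a single kernel, without the global flag, is to time a checked and an `@inbounds` variant side by side. The following is only a sketch under stated assumptions: the stencil functions are hypothetical, only Base is used, and the measured ratio will depend on the machine, the Julia version, and whether the compiler already elides the checks.

```julia
# Hypothetical 1-D stencil kernels; names and sizes are illustrative only.
function stencil_checked!(out::Vector{Float64}, u::Vector{Float64})
    length(out) == length(u) || throw(DimensionMismatch("out and u differ in length"))
    for i in 2:length(u)-1
        out[i] = u[i-1] - 2u[i] + u[i+1]
    end
    return out
end

function stencil_inbounds!(out::Vector{Float64}, u::Vector{Float64})
    length(out) == length(u) || throw(DimensionMismatch("out and u differ in length"))
    # @inbounds is only justified here because the length check above and the
    # loop range 2:length(u)-1 keep i-1, i and i+1 inside both arrays.
    @inbounds for i in 2:length(u)-1
        out[i] = u[i-1] - 2u[i] + u[i+1]
    end
    return out
end

u = rand(10^6)
a = zeros(length(u))
b = zeros(length(u))

# Warm up so that compilation time is not included in the measurements.
stencil_checked!(a, u)
stencil_inbounds!(b, u)

# Take the minimum over several runs to reduce timing noise.
t_checked  = minimum(@elapsed(stencil_checked!(a, u)) for _ in 1:20)
t_inbounds = minimum(@elapsed(stencil_inbounds!(b, u)) for _ in 1:20)

println("checked:   ", t_checked, " s")
println("@inbounds: ", t_inbounds, " s")
println("ratio:     ", t_checked / t_inbounds)
```

This only measures one loop at a time, so it does not replace a whole-program comparison against `--check-bounds=no`; it is a way to confirm whether a particular hot spot found by profiling actually benefits from `@inbounds`.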

@mbauman
Member

mbauman commented Jan 31, 2025

Right, I'm not saying that @inbounds can't improve performance — it is still often required for SIMD transformations which can easily net 2x or 4x or more-x perf. I'm just pointing out that the opposite can and does happen — and this has been the case for a long time (see, e.g., #39340 on v1.5)!

That's what I'm calling the crux of your problem: that @inbounds isn't guaranteed to monotonically improve performance. And it never was. It can and does throw speed-bumps, too, regardless of how it was applied. --check-bounds=no is just a great way to find all these spots.

What happened is that the effects system overhaul changed where some of these speed bumps landed, so there were more regressions in existing codebases as you upgraded from 1.10 to 1.11+.

@johnomotani

johnomotani commented Jan 31, 2025

OK, fair point. But --check-bounds=no is broken at the moment:
JuliaArrays/StaticArrays.jl#1155
JuliaLang/PackageCompiler.jl#1021

and there seems to be no developer will to fix those. I have the impression that this lack of priority comes from threads like this one, which effectively mean that --check-bounds=no is unsupported and allowed to break. It would be nice if it were either fixed or a comparable replacement were suggested. As you say, being able to profile with --check-bounds=no is how to identify which parts of your code @inbounds would speed up; without it, the equivalent profiling would be so much work that it might as well be impossible.

@mbauman
Member

mbauman commented Jan 31, 2025

The biggest problem is that --check-bounds=no affects all of Julia, including type promotion code itself, and some packages (especially those targeting compile-time optimizations) live right on the knife's edge of what the compiler can constant-propagate for type stability. And type instabilities are expensive. For example, in #50985 typejoin(Int, Static.StaticInt{2}) isn't constant-folding to Number without bounds checks because the typejoin algorithm itself happens to use indexing. Maybe it should additionally promise it won't throw or have UB, but I suspect that's not safe to do given that Keno has looked at it.

The other problem, though, is that it's assumed to be a "go-faster" easy button — and I think that's underpinning a lot of the commentary around it being "broken" throughout this thread. That's not necessarily true and I actually hope that it'll continue to become less true as Julia's compiler gets more capable.

That segfault is interesting, though. It'd be great to minimize it if at all possible.
