
Make @big_str allocate at runtime by default #41792

Closed
jakobnissen opened this issue Aug 5, 2021 · 22 comments
Labels
bignums BigInt and BigFloat

Comments

@jakobnissen
Contributor

We fairly often see newcomers to Julia post something like this:

Why doesn't julia> 10^100 work in Julia?

Where the response is

Just do big"10"^100

And so people get the impression that big"X" simply creates a BigInt from X. But it doesn't, not really. Instead, the value is created as a constant at compile time. This is correct behaviour, as documented in the manual section on string literals, but it can lead to nasty bugs:

julia> function as_big_sum(x)
           Base.GMP.MPZ.add_ui!(big"0", x) # in-place add, for efficiency
       end
as_big_sum (generic function with 1 method)

julia> as_big_sum(2) # works!
2

Spot the problem?

julia> as_big_sum(2) # uhh
4

julia> as_big_sum(2) # what?
6

Again, this is not a bug, it's correct behaviour, but it would be nicer if the default behaviour was safe (a new allocation each time the macro result is evaluated) and the opt-in behaviour was fast (re-use of the same object), instead of the other way around. I realize this cannot be solved in general for @X_str, but perhaps it can just for @big_str?
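
To make the proposal concrete, here is a minimal sketch of an expansion that defers parsing to runtime (the name @big2_str and the integer-only handling are purely for illustration, not a proposed implementation):

julia> macro big2_str(s)
           :(parse(BigInt, $s))  # parse at runtime, so each evaluation allocates a fresh BigInt
       end;

julia> f() = big2"0";

julia> f() === f()  # false: each call returns a distinct object
false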

For what it's worth, BioSequences uses a flag on its @dna_str macro which controls "dynamic" vs "static" allocation for DNA strings.

@KristofferC
Member

Since Julia's numbers are considered immutable, the current behavior is already "safe". Mucking around with internals can almost always turn something safe into unsafe, so using this specific example doesn't look super compelling to me.

@rfourquet rfourquet added the bignums BigInt and BigFloat label Aug 5, 2021
@DNF2

DNF2 commented Aug 5, 2021

It's very common to use mutating operations on BigInts, for performance reasons. And I don't think there is anything in the docstring of big"", at least, that lets on that it doesn't return a new object, especially since (as you said) numbers in Julia are considered immutable.

I've never read anywhere that big"" had any purpose except for avoiding truncation at construction, so finding that it has a somewhat dangerous optimization comes as a bit of a surprise.

@KristofferC
Member

The world that the Julia documentation presents is that of immutable big numbers, so the manual will not mention anything that has to do with their mutability, since that is outside that "world". It is on the people using the internals to make sure that this is done safely, for example by using parse(..., BigInt). As has been said, the documentation also says that the "object is actually created when the code is compiled".
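
For instance (a small sketch of that pattern, reusing the unexported MPZ call from the original example), parsing at runtime hands you a fresh object that is safe to mutate:

julia> function as_big_sum_safe(x)
           Base.GMP.MPZ.add_ui!(parse(BigInt, "0"), x)  # a fresh BigInt each call, so the in-place add is contained
       end;

julia> (as_big_sum_safe(2), as_big_sum_safe(2))  # (2, 2) on every call
(2, 2)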

Why should everyone get worse performance in the common use case instead of the people doing the mutation just swapping over to the correct thing to use?

@DNF2

DNF2 commented Aug 5, 2021

Personally, I would be happy with a warning in the docstring of the big"" macro, not sure how BioSequences handles this.

As has been said, the documentation also says that the "object is actually created when the code is compiled".

That seems to refer to the regex macro (and I don't think it's common to mutate those); it's not mentioned near the big macro, I think. I would never have guessed that this referred to both, let alone been able to predict the consequences of this somewhat obscure remark.

I think a mention of this in the docstring of big"" would be warranted, but I'm not sure if it would be welcome.

(As for performance, I would guess most users concerned with the performance of BigInts are using the mutating operations.)

@KristofferC
Member

KristofferC commented Aug 5, 2021

The problem with adding "a warning in the docstring of the big"" macro" is that it raises the question, for someone reading it, of when this would matter. And there is nothing in the manual that explains this, so the manual becomes "inconsistent" in the sense that it contains unexplained statements. If something is added, it should be made clear that this is outside the Julia "world", so that it is not confusing.

@DNF2

DNF2 commented Aug 5, 2021

I understand your argument, and it's compelling.

It just seems like having 'semantically immutable' numbers that are mutable under the hood, and then 'semi-exposing' it through this macro (and obliquely documenting it in the macro chapter) is unfortunate.

Is this optimization used by the big macro important when using bignums in 'immutable mode', do you think?

@oscardssmith
Member

I think we should change this because users who aren't mutating big nums won't see a major performance impact, since every operation they do will construct a new BigNum anyway.

@Seelengrab
Contributor

Seelengrab commented Aug 6, 2021

To me it's already inconsistent. If big"1" is supposed to be equivalent to big(1) (or parse(..., BigInt) for that matter) from an immutability point of view, having it return something different when the returned object is forcibly mutated feels bad semantically (it feels too much like "let's change the value of our immutables", Fortran style). What's the compelling case for exposing the behavior here? Surely the common cases are

  • only use big"1" once, in which case it doesn't matter whether the returned object is forcibly mutated or not
  • use big"1" more than once, but immediately deepcopy anyway to get a mutable copy (or use big(1), in which case it's weird that the two behave differently, since they're supposed to be semantically immutable/the same anyway)
  • use big"1" more than once, but use the expensive copying interface and thus don't care either way since you won't notice the difference
  • explicitly modify big"1" to use big"1" as a shorthand for your own custom global constant, in which case.. please don't? (This use case seems very cursed)

so changing this really shouldn't have an impact on naive uses of big"1", right? Or rather, what current use case would be impacted by having big"1" return a distinct object? As far as I can tell, it's pretty much limited to "I want a fast direct comparison with one(BigInt) that doesn't have to construct the object" (though I'd argue that isone should be used there instead, since that also doesn't allocate and is imo cleaner for that purpose).
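
For that last point, a tiny illustration:

julia> x = big(1);

julia> isone(x)  # no BigInt has to be constructed for the comparison
true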

@vtjnash
Member

vtjnash commented Aug 6, 2021

big(1), one(BigInt), big"1" and deepcopy(big"1") are all permitted to return the same object, since we consider <: Number to mean constant.
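
For context, today's behaviour for two of those (a small REPL check, reflecting what this issue describes):

julia> f() = big"1";

julia> f() === f()  # true: the literal is a single object baked in at compile time
true

julia> big(1) === big(1)  # false: big(1) allocates a new BigInt on each call
false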

@oscardssmith
Member

@vtjnash, do we have any examples where making a copy would be a noticeable slowdown? If not, I don't see a good reason not to do this. Sure, it only helps people that access internals, but those internals are necessary for writing high-performance code.

@yurivish
Contributor

yurivish commented Aug 6, 2021

big(1), one(BigInt), big"1" and deepcopy(big"1") are all permitted to return the same object, since we consider <: Number to mean constant.

Out of curiosity, is this documented anywhere? I know that the iteration interface, abstract array interface, and several others are documented in the section on interfaces but I haven't encountered any documentation on what it means to be a Number.

@StefanKarpinski
Member

Agree with @KristofferC and @vtjnash here: the problem in this example is mutating a semantically immutable number, not that big"1" returns the same BigInt instance every time. We have an unofficial API for mutating BigInts, but anyone who uses it needs to understand the dangers of what they're doing.

@StefanKarpinski
Member

I would add that this might seem like a harmless safety precaution, but suppose people start relying on this, and later we improve compiler support for BigInts so that the compiler can recognize that a BigInt value doesn't escape and can be modified in place. That is much harder to do if people have started to rely on the ability to mutate BigInts. If they're genuinely immutable, it's much easier to optimize them. Consider this code:

function f(n::Integer)
    s = n
    while n > 0
        n ÷= 2
        s += n
    end
    return s
end

If we can count on n being immutable, then the compiler can optimize it to the equivalent of this:

function f(n::Integer)
    s = copy(n)
    n = copy(n)
    while n > 0
        div!(n, 2)
        add!(s, n)
    end
    return s
end

However, if we have to worry about someone mutating n, then this optimization cannot be done: n needs to remain a reference to the same object that was passed in at the top, since it could be changed from somewhere else. It also becomes harder to be sure that s can't be changed from somewhere else, even though it is allocated in the function. What if it escapes via div! or add!? That would be weird, but the compiler doesn't know that.

@simeonschaub
Member

Wouldn't the proposed change here be strictly better if we ever wanted to do some escape analysis though? Right now the allocation is hidden somewhere in macro expansion, so it can't really ever be elided. If it happened at runtime, the compiler would theoretically at least have a lot more leeway.

Since BigInt wraps a C library which inherently relies on mutation, I am sceptical that our compiler can ever do such immutability-dependent optimizations in any safe manner. Any function calling into GMP would have to become a compiler primitive, since otherwise compiler optimizations would wreak complete havoc on your program, and enough code in the wild probably already relies on such internals that this likely can't be done in 1.x.

Has there been previous discussion about just embracing the mutability of BigInt? The current situation of basically treating it as semantically immutable makes writing performant code with them almost impossible.

@KristofferC
Member

#29168 (comment) is one comment about it.

@DNF2

DNF2 commented Aug 6, 2021

but anyone who uses that needs to understand the dangers of what they're doing.

But this is difficult to understand for those who use the mutating interface, because documenting it is discouraged, and there's a bit of a mental leap required to realize it.

@Seelengrab
Contributor

Seelengrab commented Aug 7, 2021

I don't mean to take over this discussion about big"1" returning a distinct object or not with a discussion about how to implement BigInt - I think that question is orthogonal to how we think BigInt should work, because as a matter of fact, right now they are most definitely not immutable (even if we document that they are) if you want any kind of performance from them. I think having big"1" return a distinct object for now would solve the immediate problem when using the unexported mutable API, while ensuring it behaves the same as big(1) (i.e., making it consistent with the belief that all BigInts are immutable and that no matter how we write them, the behavior is the same).


I see the argument for having BigInt be immutable, but I just don't see a way to do that in the current (or future) environment without either running into out of memory or performance problems due to lack of mutation of memory. By its very nature, you're going to have some Vector{BigDigit} that's wrapped by a BigInt type (immutable or not, doesn't really matter), which you want to avoid copying around as much as possible. An alternative to copying would be that the compiler is smart enough to know when it doesn't have to allocate a new Vector{BigDigit}, which seems like a much more gargantuan task than embracing the mutable API. Additionally, at least internally, a BigInt library has to mutate, else out of memory errors are thrown left and right.

Maybe I'm missing something here though and having a ReadOnlyArray or better array intrinsics would solve that 🤷‍♂️ (I don't think it would, because we'd have to guarantee that all elements would stay valid until we're done with it, meaning we can't reuse the memory, leading to OOM yet again..)


Regarding the transform mentioned by @StefanKarpinski:

The reason this can't be done by a library author today is that s += n is the same as s = s + n. The fact that = is transparent to the operation means we can't specialize on this, and without compiler support we can't distinguish a user writing s += n from s = s + n. If BigInt is immutable for the compiler, from what I can tell we'd either have to be able to eliminate potentially huge allocations and do rigorous escape analysis on arrays together with lifetimes to allow for very efficient memory reuse (potentially also while those arrays are being used? That would require knowing that we won't access a certain region of memory after a certain point in our program), or be able to tell the compiler that only code in our module is allowed to write to memory allocated via BigInt, to allow for a mutating API internally (and have + mutate one of its arguments by default) but prevent users from meddling with our internals. Both seem very hard to achieve, compared to embracing a mutable API (and a copying one for the default behavior of +, to at least give a best-effort try).
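
A quick REPL illustration of that lowering:

julia> s = big"10"; t = s;  # t and s refer to the same object

julia> s += 1;  # lowers to s = s + 1, which allocates a new BigInt and rebinds s

julia> s === t  # the original object is left untouched
false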

I've thought about using s .+= n for indicating mutation in one iteration of trying to implement Julia-native BigInts, since it makes the intent of mutation clear and keeps the assignment semantics. I haven't yet explored how something like s .+= n .+ 2 or s .+= s would work, and more problems arise when we get to Vector{BigInt} - do we still mutate the BigInts in the vector on the LHS, or are we not allowed to? Do we need two dots..? Or do we need a different operator, so as not to conflate the meaning with broadcasting (which would be a big disadvantage, since + on numbers is extremely well established)? There is a lot of design space left to explore (how does Haskell do it, for example? They've got immutable BigInts, don't they? Do they just take advantage of Haskell's laziness?).

@rfourquet
Member

rfourquet commented Aug 7, 2021

big(1), one(BigInt), big"1" and deepcopy(big"1") are all permitted to return the same object, since we consider <: Number to mean constant.

FWIW, there was a PR trying to make deepcopy(x::BigInt) return the same object x, but this was quite controversial!

@jakobnissen
Contributor Author

jakobnissen commented Jun 21, 2023

From Slack, here is a MWE of a recent problem caused by this issue:

function foo(prec)
    setprecision(BigFloat, prec)
    return big"0.23"
end

foo(50) === foo(100) # returns true, unexpectedly

The issue here is that setprecision changes what big"0.23" returns, so computing it at compile time is invalid.
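
For comparison, a sketch of the same function with the value constructed at runtime (via parse, which is essentially what this issue asks the macro to do by default) behaves as expected:

function bar(prec)
    setprecision(BigFloat, prec)
    return parse(BigFloat, "0.23")  # parsed at runtime, so the current precision is respected
end

precision(bar(50)), precision(bar(100))  # (50, 100), and the two results are distinct objects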

@nsajko
Contributor

nsajko commented Aug 4, 2023

I'd say the MutableArithmetics package is the solution here. It's an established package, with JuliaHub counting about twenty direct dependents and about a thousand packages that depend on MutableArithmetics transitively.

While the Julia standard library pretends that all Number values are immutable, the MutableArithmetics API in principle provides all the necessary control, including:

  1. operations that avoid allocating by exploiting mutability

  2. explicit copying using the MA.mutable_copy (unconditional) and MA.copy_if_mutable (conditional) functions.

MutableArithmetics should really be mentioned in the Julia manual, in the Performance tips section at the very least.
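
A rough sketch of what that can look like for BigInt (my choice of entry points for illustration; see the MutableArithmetics documentation for the full API):

using MutableArithmetics
const MA = MutableArithmetics

function accumulate_demo(n)
    acc = MA.mutable_copy(big"0")      # unconditional copy, so mutating it cannot touch the cached literal
    for i in 1:n
        acc = MA.operate!!(+, acc, i)  # may mutate acc in place when the type supports it, allocates otherwise
    end
    return acc
end

accumulate_demo(1000) == sum(big, 1:1000)  # true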

@vtjnash
Member

vtjnash commented Oct 31, 2023

This seems to be a performance footgun, for something that is defined as being non-mutated in Base.

@Seelengrab
Contributor

But to work efficiently with BigInt, you more or less need to use the mutating API!
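
For instance, a summation written against the unexported internals used earlier in this thread (a sketch assuming the summands are non-negative and fit in a Culong) avoids a BigInt allocation per iteration:

function sum_mutating(xs)
    acc = big(0)                     # freshly allocated accumulator, safe to mutate
    for x in xs
        Base.GMP.MPZ.add_ui!(acc, x) # unexported internal: in-place acc += x, no per-iteration allocation
    end
    return acc
end

sum_mutating(1:10_000) == sum(big, 1:10_000)  # true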
