Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.math.log_int: implement integer logarithm without using float math #17143

Merged
merged 5 commits into from
Sep 14, 2023

Conversation

FedericoStra
Copy link
Contributor

@FedericoStra FedericoStra commented Sep 13, 2023

This addresses #17142.

Important

This patch leaves the comptime_int as is, i.e. broken. I haven't had the time to figure out where should I put the implementation for that.

Given that the code for log_int is extremely simple, it should be straightforward for someone with knowledge of the compiler to port the implementation to cover that case too.

Disclaimer

Unfortunately, I'm not able to run the tests locally directly from the source tree. I tested the code as a separate module, not integrated in the rest of the standard library.

Edit: I managed to run the tests directly in the standard library source tree.

Copy link
Contributor

@matu3ba matu3ba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick review without looking too much into the algorithm. Test cases for signed integers are missing.

lib/std/math/log_int.zig Show resolved Hide resolved
lib/std/math/log_int.zig Outdated Show resolved Hide resolved
@@ -23,14 +23,17 @@ pub fn log(comptime T: type, base: T, x: T) T {
.ComptimeFloat => {
return @as(comptime_float, @log(@as(f64, x)) / @log(float_base));
},

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add, that we should special case comptime-known divisions of power of two with shifts and/or make an issue.

lib/std/math/log_int.zig Outdated Show resolved Hide resolved
lib/std/math/log_int.zig Outdated Show resolved Hide resolved
if (@typeInfo(T) != .Int or @typeInfo(T).Int.signedness != .unsigned)
@compileError("log_int requires an unsigned integer, found " ++ @typeName(T));

assert(base > 1 and x != 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiousity: Did you test, if the assertion fires, if either base or x is known at comptime and satisfies the subexpressions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the question. Do you mean whether an expression such as log_int(u8, 3, 82) panics? In this case, it does not: it returns @as(u3, 4).

@matu3ba
Copy link
Contributor

matu3ba commented Sep 13, 2023

Unfortunately, I'm not able to run the tests locally directly from the source tree. I tested the code as a separate module, not integrated in the rest of the standard library.

Did the methods explained here https://github.com/ziglang/zig/wiki/Contributing#directly-testing-the-standard-library-with-zig-test not work for you? Please link the errors and/or explain the behavior.

@FedericoStra
Copy link
Contributor Author

Test cases for signed integers are missing.

They are missing because signed integers are unsupported:

pub fn log_int(comptime T: type, base: T, x: T) Log2Int(T) {
    if (@typeInfo(T) != .Int or @typeInfo(T).Int.signedness != .unsigned)
        @compileError("log_int requires an unsigned integer, found " ++ @typeName(T));
    // ...
}

This is coherent with log2_int and log10_int:

zig/lib/std/math.zig

Lines 1245 to 1247 in 4a44b79

pub fn log2_int(comptime T: type, x: T) Log2Int(T) {
if (@typeInfo(T) != .Int or @typeInfo(T).Int.signedness != .unsigned)
@compileError("log2_int requires an unsigned integer, found " ++ @typeName(T));

pub fn log10_int(x: anytype) Log2Int(@TypeOf(x)) {
const T = @TypeOf(x);
const OutT = Log2Int(T);
if (@typeInfo(T) != .Int or @typeInfo(T).Int.signedness != .unsigned)
@compileError("log10_int requires an unsigned integer, found " ++ @typeName(T));

@FedericoStra
Copy link
Contributor Author

Unfortunately, I'm not able to run the tests locally directly from the source tree. I tested the code as a separate module, not integrated in the rest of the standard library.

Did the methods explained here https://github.com/ziglang/zig/wiki/Contributing#directly-testing-the-standard-library-with-zig-test not work for you? Please link the errors and/or explain the behavior.

Yep, it worked! I was missing the --zig-lib-dir lib flag... 😬

@FedericoStra
Copy link
Contributor Author

FedericoStra commented Sep 13, 2023

What I forgot to mention is that, apart from all the tests which are always nice to have, the algorithm is provably correct. The idea is the following.

  • During the iteration the invariant $power = base^{exponent}$ is preserved.

  • We never overflow in the computations inside the loop because when we enter the loop we have $power \leq \lfloor x/base\rfloor$ so:

    • $power \times base \leq x \leq maxInt(T)$ is a valid multiplication for the type T;
    • $base^{exponent+1} = power \times base \leq maxInt(T)$, so $exponent+1 \leq log(base, maxInt(T)) \leq log2(maxInt(T)) \leq maxInt(Log2Int(T))$ is a valid addition for the type Log2Int(T).
  • As a consequence we have that the algorithm terminates because $power$ is strictly increasing, hence it must eventually surpass $\lfloor x/base\rfloor < maxInt(T)$ and we then exit the loop.

  • Correctness of the returned value:

    • If we never entered the loop we must have $\lfloor x / base\rfloor < 1$, hence $x \leq \lfloor x / base\rfloor \times base < base$, thus the result is $0$. We can then return exponent, which is still 0.
    • Otherwise, if we entered the loop at least once, when we finally exit the loop we have that $power$ is exactly divisible by $base$ and $power / base \leq \lfloor x / base\rfloor < power$, hence $power \leq \lfloor x / base\rfloor \times base \leq x < power \times base$. This means that $base^{exponent} \leq x < base^{exponent+1}$, hence the result is exponent.

@IntegratedQuantum
Copy link
Contributor

IntegratedQuantum commented Sep 13, 2023

There got to be faster algorithms than this.

By repeatedly squaring the base for example, we can reduce the complexity to 𝒪 (log_b(log_b(x))):

Quick and dirty example implementation
pub fn log_int2(comptime T: type, base: T, x: T) Log2Int(T) {
	if (@typeInfo(T) != .Int or @typeInfo(T).Int.signedness != .unsigned)
		@compileError("log_int requires an unsigned integer, found " ++ @typeName(T));

	assert(base > 1 and x != 0);

	var precomputedBases: [std.math.log2_int_ceil(usize, @bitSizeOf(T))]T = undefined;
	precomputedBases[0] = base;
	var i: usize = 0;
	
	while (precomputedBases[i] <= x / precomputedBases[i]) {
		precomputedBases[i+1] = precomputedBases[i]*precomputedBases[i];
		i += 1;
	}
	i += 1;
	
	var exponent: Log2Int(T) = 0;
	var power: T = 1;
	while(i > 0) {
		i -= 1;
		if(power <= x / precomputedBases[i]) {
			power *= precomputedBases[i];
			exponent += @as(Log2Int(T), 1) << @intCast(i);
		}
	}

	return exponent;
}

However this will likely cause slowdowns for small log_b(x). Maybe someone else has an idea for a better algorithm or implementation?

@squeek502
Copy link
Collaborator

squeek502 commented Sep 14, 2023

CI failure is #17134, should hopefully be properly mitigated by #17147

const Log2Int = math.Log2Int;

/// Returns the logarithm of `x` for the provided `base`, rounding down to the nearest integer.
/// Asserts that `base > 1` and `x != 0`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it x > 0 then, if log_int is intended to only support unsigned integers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed: 8a29847

@matu3ba
Copy link
Contributor

matu3ba commented Sep 14, 2023

What I forgot to mention is that, apart from all the tests which are always nice to have, the algorithm is provably correct. The idea is the following.

I do usually provide a link to the original idea etc in a brief manner and/or very brief reasoning. See

inline fn addoXi4_generic(comptime ST: type, a: ST, b: ST, overflow: *c_int) ST {
.
Not sure, if this is also needed in this case, but at least I find it usually nicer to have something on readign+reviewing the algorithms.

In this case tests are mainly to uncover regressions and miscompilations (in Zig and LLVM or another backend).

@FedericoStra
Copy link
Contributor Author

@IntegratedQuantum

There got to be faster algorithms than this. [...] Maybe someone else has an idea for a better algorithm or implementation?

I agree, however I thought that as a starting point the priority was to replace the incorrect implementation with a correct one.

@Vexu Vexu enabled auto-merge (squash) September 14, 2023 15:07
auto-merge was automatically disabled September 14, 2023 15:59

Head branch was pushed to by a user without write access

@FedericoStra
Copy link
Contributor Author

@matu3ba

I do usually provide a link to the original idea etc in a brief manner and/or very brief reasoning. See

inline fn addoXi4_generic(comptime ST: type, a: ST, b: ST, overflow: *c_int) ST {

.
Not sure, if this is also needed in this case, but at least I find it usually nicer to have something on readign+reviewing the algorithms.

Added comments in 55573d8.

@Vexu Vexu enabled auto-merge (squash) September 14, 2023 17:24
@Vexu Vexu merged commit 4f952c7 into ziglang:master Sep 14, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants