Skip to content

Conversation

@nordlow
Copy link
Contributor

@nordlow nordlow commented Jan 27, 2015

The goal here is to avoid boundschecking for slice expression whose lower or/and upper bounds can never be outside of [0 .. $]. Further, if a slice index can be proven to be outside of bounds an error is issued.

Step 1:

x[0 .. $]
x[0 .. 0]
x[$ .. $]

Step 2:

x[0 .. $/n]
x[$/n .. $]

where n is a compile-time constant >= 1

Step 3:

x[$/m .. $/n]

where m and n are compile-time constants both >= 1 and m >= n.

Step 4:

x[$*p/q .. $*r/s]

where p, q, r and s are compile-time constants both >= 1 and p/q <= r/s, p <= q, r <= s.

Step 5 (error reporting only):

x[$+o .. $+s]

I put this out in the open in a very early stage to get feedback as soon as possible. This so I don't diverge in a direction that you dislike.

I use a warning for now for debug purpose.

My prel tests seems to show that the current logic holds water and at least doesn't crash DMD :)

Most of my questions are given as TODO comments.

Specifically I'm curious

  • Why isn't there a specific class DollarExp instead of VarExp. That would have made isOpDollar trivial.
  • If (x->op == TOKdiv) is it guaranteed that x is of type DivExp? This seems like an insecure pattern. I see that DMD has isVarDeclaration so why isn't there corresponding is.*Exp?
  • Could any explain what resolveOpDollar is used for. Have any use of it here?

Note that the naming of the variable lowerIsLessThanUpper is misleading as it's value is true iff lwr <= upr. lowerIsLessThanOrEqualToUpper or lowerIsLTEToUpper would be a better name.

Update: Another key issue is whether this should include analysis of immutable slice initializations such as, for instance, line 37 to to 46 in std.ascii.

Destroy and thanks in advance!

src/expression.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like

return decl && decl->ident->len == 8 && memcmp(decl->ident->string, "__dollar", decl->ident->len) == 0;

to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I'll wait and see if any more logic has to be added :) If not I'll change it.

@nordlow nordlow changed the title Avoid boundscheck in Slice Expressions that use opDollar in some specific cases Avoid Boundscheck in Slice Expressions Jan 27, 2015
src/expression.c Outdated
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decl->ident == Id::dollar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh! Great.

@nordlow nordlow changed the title Avoid Boundscheck in Slice Expressions Avoid Boundschecking in Slice Expressions Jan 27, 2015
src/expression.c Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a = a || b -> a |= b

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I get it. I misread -> for a pointer member operator :)

According to

https://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Compound_assignment_operators

this does bitwise logic. But for zeros and ones I guess that's ok right?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use if(lwrRange.imax <= uprRange.imin) this->lowerIsLessThanUpper = true; here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does bitwise logic. But for zeros and ones I guess that's ok right?

Bitwise logic on a bool is boolean logic. It makes sense if you don't think about it too much.

@yebblies
Copy link
Contributor

SliceExp has a member lengthVar which I'm guessing is the __dollar variable, you should probably be checking against that rather than checking the name.

src/expression.c Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comments inside expressions please, move them outside the condition.

@nordlow
Copy link
Contributor Author

nordlow commented Jan 29, 2015

For better reuse I broke out logic into SliceExp::analyzeRelativeBound() that returns a Boundness status. There is now an extra member lowerIsInBounds that is currently not used where corresponding upperIsInBounds is. Do we have any use of SliceExp::lowerIsInBounds? I guess there can be cases where upper cannot be proven to be in bounds when lower is. The current states SliceExp::upperIsInBounds and SliceExp::lowerIsLessThanUpper cannot describe this fact.

AFAIK Step 1, 2, 3 and 4 should now be covered.

You are free to express desire in my namings. They are quite arbitrary.

BTW: What's the policy on using C++ function out parameter references instead of pointers in DMD? I can't find any usage of it anywhere in expression.c at least.

src/expression.c Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to be a member function, and if it's private to this file you can also move Boundness out of the header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about lengthVar? How do I get that Without this pointer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pass it to the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is lengthVar a member of OpSlice? Shouldn't it be better to have one global instance of it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because each slice has its own length variable. eg x[$-{ return a[$-5..b[$-2]+$/2]; }()].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha.

@yebblies
Copy link
Contributor

BTW: What's the policy on using C++ function out parameter references instead of pointers in DMD? I can't find any usage of it anywhere in expression.c at least.

The policy is don't.

@9rnsr
Copy link
Contributor

9rnsr commented Jan 29, 2015

I'm really not sure that kind of those heuristic checks will be accepted. In #4209, I'm doing similar implementation, but it's currently pending because it's looks dirty.

@nordlow
Copy link
Contributor Author

nordlow commented Jan 29, 2015

Are there any relevant dollar expressions for Step 5 (more complex than Step 4)?

@nordlow
Copy link
Contributor Author

nordlow commented Jan 29, 2015

BTW: How do I handle signed integer literateral offsets, in my case for example [$-1 .. $] when toInteger() returns an unsigned integer (dinteger_t)?

@nordlow
Copy link
Contributor Author

nordlow commented Jan 29, 2015

BTW: I'm missing Expression try-cast functions such as

  • AddExp* isAddExp(Expression* e) { return e->op == TOKadd ? (AddExp*)e : NULL; }
  • etc

for simplifying code for pattern matching on subclasses to Expression.

Why hasn't these been added?

Is it ok to add these on demand in this pull?

@nordlow nordlow changed the title Avoid Boundschecking in Slice Expressions Add Compile-Time Error-Checking and Avoid Boundschecking in Slice Expressions Jan 29, 2015
@yebblies
Copy link
Contributor

BTW: How do I handle signed integer literateral offsets, in my case for example [$-1 .. $] when toInteger() returns an unsigned integer (dinteger_t)?

(sinteger_t)e->toInteger()

@nordlow
Copy link
Contributor Author

nordlow commented Feb 3, 2015

My only conculsion is that something is using a slice bound out of range perhaps in a traits compiles. Infinite range?

@yebblies
Copy link
Contributor

yebblies commented Feb 3, 2015

From the type in the error it doesn't look a like an infinite range. I don't know why it's happening.

@nordlow
Copy link
Contributor Author

nordlow commented Feb 3, 2015

@andralex @WalterBright @9rnsr : Do you have any clues to why all 64-bit builds fail but not the 32-bit ones?

@9rnsr
Copy link
Contributor

9rnsr commented Feb 3, 2015

std.algorithm.map is testing its input by slicing with ulong.max. With this PR, it would fail always. Finally map would have opSlice(uint, uint) even in 64bit platforms, and it would not work with size_t bounds.

Part of MapResult code in std/algorithm/iteration.d:

    static if (hasSlicing!R)
    {
        static if (is(typeof(_input[ulong.max .. ulong.max])))   // <--- 
            private alias opSlice_t = ulong;
        else
            private alias opSlice_t = uint;

        static if (hasLength!R)
        {
            auto opSlice(opSlice_t low, opSlice_t high)
            {
                return typeof(this)(_input[low .. high]);
            }
        }
        else static if (is(typeof(_input[opSlice_t.max .. $])))
        {

@nordlow
Copy link
Contributor Author

nordlow commented Feb 3, 2015

So the natural question now becomes...should we drop bounds checking for these "negative" integer literals or fix Phobos? @andralex ?

@9rnsr
Copy link
Contributor

9rnsr commented Feb 3, 2015

No. In that case (is(typeof(_input[ulong.max .. ulong.max]))), it's just validating an expression by using is(typeof(exp)) idiom. I can say that the extended bounds check should not work inside typeof.

@yebblies
Copy link
Contributor

yebblies commented Feb 3, 2015

Don't we have functions in phobos explicitly for use in is(typeof()) expressions? lvalueOf!T() or something?

@9rnsr
Copy link
Contributor

9rnsr commented Feb 3, 2015

@yebblies With is(typeof(exp)), exp may be an invalid expression. But, an invalid expression cannot appear in template parameters. As the conclusion, wrapping the idiom by a library template is impossible.

@yebblies
Copy link
Contributor

yebblies commented Feb 3, 2015

I mean writing is(typeof(_input[rvalueOf!ulong..rvalueOf!ulong])) instead of using ulong.max, which would prevent these errors from occurring as the actual values wouldn't be known. I assume the expression is really meant to be checking "can _input be sliced with ulong indices" rather than explicitly checking that the value ulong.max works.

@9rnsr
Copy link
Contributor

9rnsr commented Feb 3, 2015

ulong.max is a valid expression for a slice expression bounds in 64-bit platforms. I argue that we should not break any existing compile-time check code by the enhanced bounds check. It's only useful for actual runtime slicing, so breaking any compile-time checks is definitely unreasonable.

And, I think the std.algorithm.map breaking is definitely a problem of this PR. In the expression is(typeof(_input[ulong.max .. ulong.max])), _input is an instance field of MapResult, so its bounds are clearly unknown at compile time. So _input.length might be ulong.max in 64bit platform. Therefore it's equivalent with is(typeof(_input[$ .. $])) _input[$ .. $] and this PR should not cause errors for the slicing. No?

@nordlow
Copy link
Contributor Author

nordlow commented Feb 3, 2015

What about index bound expressions such as $+1? In what cases would the user want to do that? I always prefer compile-time before run-time checking when possible.

@nordlow
Copy link
Contributor Author

nordlow commented Feb 3, 2015

Both x[0..0] and x[$..$] are allowed by this pull.

Maybe I should add unittests :)

@9rnsr
Copy link
Contributor

9rnsr commented Feb 3, 2015

If a slice arr[0 .. $+1] is in runtime code, the error detection would be useful. Otherwise it would be useless breaking change. Please don't do it.

@9rnsr
Copy link
Contributor

9rnsr commented Feb 3, 2015

Also add following case:

if (0)
{
    int[] arr = [1,2,3];
    auto a = arr[size_t.max .. size_t.max];  // should compile
}

@nordlow
Copy link
Contributor Author

nordlow commented Feb 3, 2015

So do you all agree on no compile-time error checking here?

@justanotheranonymoususer

If a slice arr[0 .. $+1] is in runtime code, the error detection would be useful. Otherwise it would be useless breaking change. Please don't do it.

When is arr[0 .. $+1] going to be a legit code?

@9rnsr
Copy link
Contributor

9rnsr commented Feb 3, 2015

When is arr[0 .. $+1] going to be a legit code?

If it's inside is(typeof()), yes. But __traits(compiles, arr[0 .. $+1]) can be changed to return false.

@nordlow
Copy link
Contributor Author

nordlow commented Feb 3, 2015

If we follow the convention of being backwards compatible with __traits(compiles) then hardly any additions to the error diagnostics in DMD can be added. This hasn't quite been the case in recent years has it?

@justanotheranonymoususer

If it's inside is(typeof()), yes. But __traits(compiles, arr[0 .. $+1]) can be changed to return false.

Why would it be valid inside typeof if it doesn't compile?

@9rnsr
Copy link
Contributor

9rnsr commented Feb 3, 2015

Inside typeof, only the expression type is validated. Bounds check doesn't modify the expression type. Therefore it should not work in typeof.

@ibuclaw
Copy link
Member

ibuclaw commented Dec 24, 2017

@nordlow - Status update on this?

@ibuclaw
Copy link
Member

ibuclaw commented Jan 11, 2018

I'm going to say 👎 to this. Though I have taken the inspiration for the idea and put it in a new PR #7677.

@ibuclaw ibuclaw closed this Jan 11, 2018
ibuclaw added a commit to ibuclaw/dmd that referenced this pull request Jan 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants