Conversation

@hvasbath
Contributor

Cherry-picked the commits into a new clean commit. Originally here:
#2233

@hvasbath hvasbath closed this Jun 12, 2017
@twiecki
Member

twiecki commented Jun 12, 2017

@hvasbath If this is identical I'd rather keep this one than the other one (despite what I said earlier).

@hvasbath
Contributor Author

As you wish

@hvasbath hvasbath reopened this Jun 12, 2017
@junpenglao
Member

@twiecki should we set it to 3.2? That way we'd have more time to test it.

@twiecki
Member

twiecki commented Jun 12, 2017

@junpenglao Good call.

@twiecki twiecki added this to the 3.2 milestone Jun 12, 2017
@aseyboldt
Member

I don't think we should just cast values provided by a user to a different type. If users want to work with float32/int16, they should be required to cast their values themselves. Otherwise this could lead to very nasty bugs. Say you're modelling counts of some kind, and some values don't fit in an int16. If we cast silently, we end up with different values. The results will be wrong, maybe only for a few values, and this might be very hard to figure out or even notice.
tt.as_tensor_variable seems to do the right thing: it converts whatever value we get to a theano value of the same type and optionally prints a warning if it is a float64. We should only care about floatX/intX when we ourselves create new variables.
Sorry, I guess I'm a bit late to the discussion, I somehow missed this PR before :-(
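A hedged illustration of the difference (the count value is made up; it assumes numpy and theano are importable): a plain astype silently wraps a value that does not fit into int16, while tt.as_tensor_variable keeps the original dtype and value.

>>> import numpy as np
>>> import theano.tensor as tt
>>> counts = np.array([70000], dtype='int64')   # 70000 does not fit in int16
>>> counts.astype('int16')                      # default casting silently wraps the value
array([4464], dtype=int16)
>>> tt.as_tensor_variable(counts).dtype         # the original dtype is preserved
'int64'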

@hvasbath
Contributor Author

hvasbath commented Jun 13, 2017

The point of this is that having theano tensors there makes the random method horribly slow. They were made theano tensors in the first place to get GPU support running. That's why we agreed on the casting: to get both the fast random method and still the correct casting.
I can totally understand your point! Maybe we should issue a warning, or even test during the cast whether the number can be represented in the target datatype - mainly necessary for the intX case, I guess...

Edit: There is a casting argument in numpy's astype, which we should set to 'safe'; it then throws an error if the number cannot be represented in the new datatype ...

>>> import numpy as num
>>> b = num.array(num.power(2, 15))
>>> b.astype('int16', casting='safe')
TypeError: Cannot cast array from dtype('int64') to dtype('int16') according to the rule 'safe'

>>> b = num.array(num.power(2, 15) - 1)
>>> b.astype('int16', casting='safe')
array(32767, dtype=int16)

@aseyboldt
Member

Why would it be slow if we have a theano constant instead of a numpy array? We can always access the numpy array with val.value, which is fast. But I think this is a separate issue from the question of whether we want to cast user input.
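As a quick illustration of that access (a made-up array; it assumes theano is importable and follows the val.value access described above):

>>> import numpy as np
>>> import theano.tensor as tt
>>> c = tt.constant(np.arange(3))
>>> c.value            # the wrapped numpy array, no compiled function needed
array([0, 1, 2])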
Safe casting is definitely better than the default, but casting to a smaller dtype still changes the behaviour of the array fundamentally:

>>> a = np.array(127, dtype=np.int16)
>>> a + np.int8(1)
128
>>> a.astype(np.int8, casting='safe') + np.int8(1)
-128

If users want this, they should cast the values themselves (in which case it's their fault if it gives wrong answers).
int16 is small, so this is going to happen...

@hvasbath
Contributor Author

Then I suggest you please have a look at that issue; that's where all this effort is coming from.
#1824

If you know how to fix it in a different way we are more than happy to read your suggestions ... :)

@aseyboldt
Member

The problem is in distribution.draw_value. TheanoConstants have a name attribute, so _compile_theano_function is called for each random draw. We could simply check first if it is a constant, and use the value if so. I'll look into it tomorrow and try to clean up draw_value and draw_values a bit. It is somewhat hard to figure out what is happening in there right now.
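A minimal sketch of that check (a hypothetical helper, not the actual draw_value code; it relies on a theano constant exposing its numpy array as .value, as mentioned above):

import theano.tensor as tt

def _draw_if_constant(param):
    # Hypothetical fast path: a theano constant already wraps its numpy
    # value, so return it directly instead of compiling a theano function.
    if isinstance(param, tt.TensorConstant):
        return param.value
    # Everything else would fall back to the existing draw_value logic
    # (shared variables, _compile_theano_function, plain numpy values).
    return None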

@hvasbath
Contributor Author

It's sad that you saw this so late; I already spent like 2 full days on that ...

@aseyboldt
Member

I feared as much when I saw the issue. I'm sorry about that, this must have been a lot of work.

This was referenced Jun 13, 2017
@twiecki
Member

twiecki commented Jun 17, 2017

Closing in favor of #2309.

@twiecki twiecki closed this Jun 17, 2017
@hvasbath hvasbath deleted the random_slowdown_clean branch July 2, 2017 09:50