Use random permutation of points when selecting base in centralizer #2878

hulpke · 2018-09-27T15:29:38Z

This gets (generically) around the problem observed in #2783.

The assumption is that there are few badly behaved cases, and the random
permutation makes it unlikely that any specific setup will out of habit fall
into bad cases. (The prior strategy clearly only relied on lexicography of
points and so is not specific).

codecov · 2018-09-27T21:25:52Z

Codecov Report

❗ No coverage uploaded for pull request base (master@482d81f). Click here to learn what that means.
The diff coverage is 85.71%.

@@            Coverage Diff            @@
##             master    #2878   +/-   ##
=========================================
  Coverage          ?   78.53%           
=========================================
  Files             ?      680           
  Lines             ?   346260           
  Branches          ?        0           
=========================================
  Hits              ?   271924           
  Misses            ?    74336           
  Partials          ?        0

Impacted Files	Coverage Δ
lib/ctbl.gd	`97.35% <ø> (ø)`
lib/grp.gd	`100% <ø> (ø)`
lib/ctblfuns.gd	`100% <ø> (ø)`
lib/stbcbckt.gi	`84.15% <85.71%> (ø)`

stevelinton · 2018-09-28T08:55:11Z

This worries me a little bit, because it makes things less repeatable, although I admit that this is also, to some extent, the point. If we were to fix the seed of random source used here, then we would avoid that problem -- the same input would always produce the same result, but there would be (hopefully rare and unnatural) cases which performed badly every time you ran them.

I'm not sure what is better, or if there is some way to provide user control.

hulpke · 2018-09-28T15:09:35Z

@stevelinton I agree this makes reproduction awkward, but the whole point is to have any selection that could deterministically bad.making the random seed depend on the input again risks having certain examples fall consistently in the bad case. (Of course the best solution would be to find out why the choice is bad and use this to make consistently a good choice.)

What about an option (or flag) to turn this kind of randomization off (or set its seed)?

ChrisJefferson · 2018-09-28T15:12:05Z

Just to say, this problem is (at some level) unfixable. Maybe we can do better for this instance, but in general if we could solve the problem of "choosing the right value order for finding the conjugate" I'm fairly sure we would have solved graph isomorphism in polytime :)

stevelinton · 2018-09-29T17:24:34Z

@hulpke an option would be fine. My thinking was that (like quicksort) the problem was that the bad case was somehow "natural" so using a fixed seed we could instead make it unnatural and therefore much rarer in practice, but this might be wrong.

fingolfin · 2018-10-02T11:18:21Z

If you use a fixed seed, doesn't that just shift the issue, i.e., now it's other cases that might misbehave? Of course if one is lucky, then no cases misbehaves, but I don't see why that should be the case.

There are also many other algorithms in GAP which depend on the RNG state. They are still useful; one can record the RNG state before any computation, if one is concerned about reproducibility; in recog I have to constantly do that in order to create bug reports that can be reliably reproduced.

So I don't quite follow why randomization is suddenly a big concern in this case? What am I missing?

Independent of that: I am conflicted about the "backport to 4.10" label. While this PR fixes an issue, it also has the potential to break other, unforeseen things.

fingolfin

I have concerns about a comment, but other than that, I'd approve this.

lib/stbcbckt.gi

ChrisJefferson · 2018-10-02T13:02:12Z

So, there l always be cases where randomisation doesn't help -- in fact it would be fairly easy to prove that over the space of all problems, any fixed randomisation strategy does nothing.

However, in general people tend to specify groups in a "to them" sensible way, which will tend to lead to the natural order of the integers having some meaning to them. This means the natural order of the integers has a strong chance of being either an extremely good, or extremely bad, ordering for search.

While both old and new Sort are not stable, they do tend, when given mostly sorted sequences, to produce certain obvious patterns, which can lead to us hitting these worst-case behaviours.

There is past experience of this. McKay discusses this type of problem in Nauty, where he orders canonical images by their hash value exactly to avoid the problem of going through a sorted range and hitting a worst case behaviour.

So the question is, is this random shuffle moving us out of a specific degenerate case (which would be good), or just bouncing us from some random place to some other random place (which would mean we weren't really improving things).

As a general life rule, I tend to randomly shuffle anything where I don't have a good ordering, to avoid these kinds of degenerate behaviours anyway.

As a separate issue, I would very much like to extend/change this PR with something like #2882, which would get rid of the problem that you could get different behaviour every time you ran the function (which would be annoying).

hulpke · 2018-10-02T21:03:00Z

@ChrisJefferson

As a separate issue, I would very much like to extend/change this PR with something like #2882, which would get rid of the problem that you could get different behaviour every time you ran the function (which would be annoying).

What is the easiest way to do so? Would you be OK to merge this and then change the random generator once #2882 (with the GPL license issue resolved) has been merged? Or would you like me to merge your PR into this one?

hulpke · 2018-10-02T21:10:38Z

@fingolfin
I set the backport to 4.10 as a suggestion, indicating a change that is localized and (at least in principle) should not affect other routines.

Feel free to remove (or frankly ignore) the flag if this use seems to be hopelessly naive.

ChrisJefferson · 2018-10-03T10:50:59Z

@hulpke : If you were happy I would ignore my PR, and do:

i:=FLOYDS_ALGORITHM(RandomSource(IsMersenneTwister),Length(cycles.firsts),false);

Running RandomSource(IsMersenneTwister) takes about .1ms, so I don't think it's worth worrying about -- having now benchmarked I'm not totally convinced it's work adding PCG32 (although if the licence can get sorted out, it might still be nice for this kind of situation).

hulpke · 2018-10-07T09:48:10Z

@ChrisJefferson I have changed to RandomSource(IsMersenneTwister) as you suggested.

This gets (generically) around the problem observed in gap-system#2783. The assumption is that there are few badly behaved cases, and the random permutation makes it unlikely that any specific setup will out of habit fall into bad cases. (The prior strategy clearly only relied on lexicography of points and so is not specific).

Also expanded comment.

ChrisJefferson · 2018-10-09T08:17:30Z

I'm now happy with this, except (obivously) for the manual example at grp.gd:1880 needs changing, as it is failing the manual tests.

fingolfin · 2018-10-18T09:08:13Z

Backported to 4.10 via 502780e 214ed65 87022cb 61b1c5a

This issue was labelled as not needing release notes. I removed that label, as I think it does warrant a release notes entry: this fixes an issue present in GAP 4.8 and 4.9; and in fact, with the fix, the sample code posted by @frankluebeck in issue #2783 gets much faster for me (from 110 seconds in GAP 4.7 down to 6 seconds in master).

hulpke added release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes backport-to-4.10 labels Sep 27, 2018

ChrisJefferson mentioned this pull request Sep 28, 2018

Add PCG32 random number generator #2882

Closed

fingolfin reviewed Oct 2, 2018

View reviewed changes

lib/stbcbckt.gi Outdated Show resolved Hide resolved

hulpke force-pushed the ah/permgp branch from 1c4d3be to c8925a6 Compare October 2, 2018 20:56

hulpke force-pushed the ah/permgp branch from c8925a6 to 1386e53 Compare October 7, 2018 09:47

hulpke added 3 commits October 9, 2018 09:21

Added option norandom to turn off randomization.

fd55a9a

Also expanded comment.

Create fresh random source each time.

8128ddb

hulpke force-pushed the ah/permgp branch from 1386e53 to 2c480be Compare October 9, 2018 07:22

hulpke requested a review from ChrisJefferson October 9, 2018 07:22

ChrisJefferson approved these changes Oct 9, 2018

View reviewed changes

Subsequent manual changes.

41e8394

hulpke force-pushed the ah/permgp branch from 2c480be to 41e8394 Compare October 9, 2018 09:19

hulpke mentioned this pull request Oct 9, 2018

Regression in conjugation tests in permutation groups #2783

Closed

hulpke merged commit 36ca67e into gap-system:master Oct 10, 2018

fingolfin removed the release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes label Oct 18, 2018

fingolfin changed the title ~~Random permutation of points when selecting base in centralizer~~ Use random permutation of points when selecting base in centralizer Oct 25, 2018

fingolfin added release notes: added PRs introducing changes that have since been mentioned in the release notes and removed release notes: to be added PRs introducing changes that should be (but have not yet been) mentioned in the release notes labels Oct 25, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use random permutation of points when selecting base in centralizer #2878

Use random permutation of points when selecting base in centralizer #2878

hulpke commented Sep 27, 2018

codecov bot commented Sep 27, 2018 •

edited

Loading

stevelinton commented Sep 28, 2018

hulpke commented Sep 28, 2018

ChrisJefferson commented Sep 28, 2018

stevelinton commented Sep 29, 2018

fingolfin commented Oct 2, 2018

fingolfin left a comment

ChrisJefferson commented Oct 2, 2018

hulpke commented Oct 2, 2018

hulpke commented Oct 2, 2018

ChrisJefferson commented Oct 3, 2018

hulpke commented Oct 7, 2018

ChrisJefferson commented Oct 9, 2018

fingolfin commented Oct 18, 2018

Use random permutation of points when selecting base in centralizer #2878

Use random permutation of points when selecting base in centralizer #2878

Conversation

hulpke commented Sep 27, 2018

codecov bot commented Sep 27, 2018 • edited Loading

Codecov Report

stevelinton commented Sep 28, 2018

hulpke commented Sep 28, 2018

ChrisJefferson commented Sep 28, 2018

stevelinton commented Sep 29, 2018

fingolfin commented Oct 2, 2018

fingolfin left a comment

Choose a reason for hiding this comment

ChrisJefferson commented Oct 2, 2018

hulpke commented Oct 2, 2018

hulpke commented Oct 2, 2018

ChrisJefferson commented Oct 3, 2018

hulpke commented Oct 7, 2018

ChrisJefferson commented Oct 9, 2018

fingolfin commented Oct 18, 2018

codecov bot commented Sep 27, 2018 •

edited

Loading