Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed-up for __contains__ in linear codes #18607

Closed
sagetrac-dlucas mannequin opened this issue Jun 4, 2015 · 12 comments
Closed

Speed-up for __contains__ in linear codes #18607

sagetrac-dlucas mannequin opened this issue Jun 4, 2015 · 12 comments

Comments

@sagetrac-dlucas
Copy link
Mannequin

sagetrac-dlucas mannequin commented Jun 4, 2015

The actual implementation of `__contains__` for linear codes is quite slow.
It can be improved using the syndrome computation instead of checking if the vector belongs to a specific subspace of the ambient space.

Test:

```
F = GF(1009)
n, k = 1000, 500
C = codes.RandomLinearCode(n, k, F)

subspace = []
syndrome = []

for i in range(20):
    c = C.random_element()
    start = time.clock()
    A = C.ambient_space()
    S = A.subspace(C.gens())
    res = S.__contains__(c)
    elapsed = (time.clock() - start)
    assert res == True
    subspace.append(elapsed)

    start = time.clock()
    if not c in C.ambient_space() or len(c) != C.length():
        res = False
    else:
        res = (C.syndrome(c) == 0)
    elapsed = (time.clock() - start)
    assert res == True
    syndrome.append(elapsed)
```


Results:

```
sage: median(subspace)
1.526604500000019
sage: median(syndrome)
0.00408399999997755
```

CC: @johanrosenkilde

Component: coding theory

Author: David Lucas

Branch/Commit: 6319903

Reviewer: Johan Sebastian Rosenkilde Nielsen

Issue created by migration from https://trac.sagemath.org/ticket/18607

@sagetrac-dlucas sagetrac-dlucas mannequin added this to the sage-6.8 milestone Jun 4, 2015
@sagetrac-dlucas
Copy link
Mannequin Author

sagetrac-dlucas mannequin commented Jun 4, 2015

Branch: u/dlucas/speedup_in_contains

@sagetrac-dlucas
Copy link
Mannequin Author

sagetrac-dlucas mannequin commented Jun 4, 2015

Commit: 6319903

@sagetrac-dlucas
Copy link
Mannequin Author

sagetrac-dlucas mannequin commented Jun 4, 2015

Author: David Lucas

@sagetrac-dlucas
Copy link
Mannequin Author

sagetrac-dlucas mannequin commented Jun 4, 2015

New commits:

6319903Changed code of __contains__

@johanrosenkilde
Copy link
Contributor

comment:3

An obvious alternative is to use the existing implementation but cache the space S for later use. Let's call that solution "sug" (for suggestion), and yours "new".

Now, your tests are not very convincing: you are testing only one set of parameters over one field. And furthermore, you are testing only for the speed of contains on codeword - not on non-codewords.

I quickly wrote a more comprehensive set of tests. Here are some timings (element=True means that we are testing on codewords, otherwise we test on random elements of the ambient space):

    Results for N=3 and C=Linear code of length 1200, dimension 300 over Finite Field of size 2 and elements=True
    new:  [0.26615000000003874, 0.003176999999993768, 0.0032519999999749416]
    sug:  [0.16259099999996351, 2.3999999996249244e-05, 2.2999999998774e-05]

    Results for N=3 and C=Linear code of length 1200, dimension 300 over Finite Field of size 2 and elements=False
    new:  [0.26317399999999225, 0.0001400000000444379, 0.00011000000000649379]
    sug:  [0.5222530000000347, 0.0015950000000088949, 0.0015670000000227446]



    Results for N=3 and C=Linear code of length 12, dimension 3 over Finite Field of size 2 and elements=True
    new:  [0.001967999999976655, 0.0005400000000008731, 0.0005019999999831271]
    sug:  [0.0005810000000110449, 1.3999999964653398e-05, 1.8000000011397788e-05]

    Results for N=3 and C=Linear code of length 12, dimension 3 over Finite Field of size 2 and elements=False
    new:  [0.0007109999999670435, 5.1000000041767635e-05, 5.9000000021569576e-05]
    sug:  [0.0015510000000062973, 0.0002230000000054133, 0.00023799999996754195]



    Results for N=5 and C=Linear code of length 1000, dimension 5 over Finite Field of size 1009 and elements=True
    new:  [1.7585720000000151, 0.0033129999999914617, 0.0033280000000104337, 0.0032279999999786924, 0.0033119999999939864]
    sug:  [0.007593999999983225, 6.600000000389628e-05, 4.999999998744897e-05, 5.7999999967250915e-05, 6.799999999884676e-05]

    Results for N=5 and C=Linear code of length 1000, dimension 5 over Finite Field of size 1009 and elements=False
    new:  [1.5777009999999905, 0.002039000000024771, 0.0020959999999945467, 0.0020149999999716783, 0.0018599999999651118]
    sug:  [0.06748300000003837, 0.0026629999999840948, 0.0029540000000451982, 0.0026410000000396394, 0.002763000000015836]



    Results for N=5 and C=Linear code of length 1000, dimension 950 over Finite Field of size 1009 and elements=True
    new:  [0.2903909999999996, 0.11106599999999389, 0.1123590000000263, 0.11180000000001655, 0.11880000000002156]
    sug:  [2.45943699999998, 0.0011589999999728207, 0.0011220000000093933, 0.0009630000000129257, 0.0010689999999726751]

    Results for N=5 and C=Linear code of length 1000, dimension 950 over Finite Field of size 1009 and elements=False
    new:  [0.18305399999997007, 0.00016400000004068715, 0.00014800000002423985, 0.00017199999996364568, 0.00015700000000151704]
    sug:  [4.64434399999999, 0.010827000000006137, 0.01069799999999077, 0.010796000000027561, 0.010704999999973097]



    Results for N=5 and C=Linear code of length 300, dimension 150 over Finite Field in a of size 2^8 and elements=True
    new:  [0.1043430000000285, 0.025561999999979435, 0.025893999999993866, 0.025636000000019976, 0.02579400000001897]
    sug:  [0.04122899999998708, 3.000000003794412e-05, 3.100000003541936e-05, 3.1999999976051186e-05, 3.500000002532033e-05]
    
    Results for N=5 and C=Linear code of length 300, dimension 150 over Finite Field in a of size 2^8 and elements=False
    new:  [0.07121999999998252, 0.002735000000029686, 0.0026799999999980173, 0.0026340000000004693, 0.002746999999999389]
    sug:  [0.11262400000003936, 0.0019409999999879801, 0.0019139999999993051, 0.0019269999999664833, 0.0019049999999651845]



    Results for N=5 and C=Linear code of length 300, dimension 280 over Finite Field in a of size 2^8 and elements=True
    new:  [0.056963000000052944, 0.040061000000036984, 0.04052200000000994, 0.039447999999993044, 0.04015300000003208]
    sug:  [0.061540999999976975, 4.0000000012696546e-05, 3.90000000152213e-05, 4.199999995080361e-05, 3.69999999634274e-05]

    Results for N=5 and C=Linear code of length 300, dimension 280 over Finite Field in a of size 2^8 and elements=False
    new:  [0.01576399999999012, 0.0006789999999909924, 0.0006390000000351392, 0.0006480000000124164, 0.0007170000000087384]
    sug:  [0.19535200000001396, 0.002287000000023909, 0.0021719999999731954, 0.0021679999999832944, 0.0024019999999609354]

As can be seen, "sug" does much better than "new" when we are testing codewords, except that the first call is sometimes exorbitantly expensive (for high-rate codes). On non-codewords "new" seems to beat "sug" more or less always, but less so on low-rate codes.

@johanrosenkilde
Copy link
Contributor

comment:4

All in all, I think I vote for "new", i.e. the current ticket's solution. The use case for codes would be to often check codeword membership that fails. Also, I'm quite concerned about the exorbitant price for the first call that "sug" has.

@johanrosenkilde
Copy link
Contributor

comment:5

It think it's better to write self.syndrome(v).is_zero than do the explicit comparison with 0.

@johanrosenkilde
Copy link
Contributor

comment:6

So if the author has not changed his mind as a consequence of my comments, I give this the green light. All tests pass and documentation builds.

(I fixed the bug in the test code)

@johanrosenkilde

This comment has been minimized.

@vbraun
Copy link
Member

vbraun commented Jun 4, 2015

comment:7

Reviewer name missing

@johanrosenkilde
Copy link
Contributor

Reviewer: Johan Sebastian Rosenkilde Nielsen

@vbraun
Copy link
Member

vbraun commented Jun 6, 2015

Changed branch from u/dlucas/speedup_in_contains to 6319903

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants