This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

strange failure on linux and mac #122

Closed
cpcloud opened this issue Jan 16, 2015 · 15 comments

@cpcloud
Contributor

cpcloud commented Jan 16, 2015

I'm unable to use bcolz with multiprocessing when my data is a particular size:

here's a minimal failing example

import shutil
import numpy as np
from math import ceil
import bcolz
from multiprocessing import Pool

n = 2 ** 14
rootdir = 'foo'

try:
    shutil.rmtree(rootdir)
except OSError:
    pass

ct = bcolz.carray(np.random.rand(n) * 1000, rootdir=rootdir)
cs = 1000

slices = [slice(cs * i, cs * (i + 1)) for i in range(int(ceil(n / cs)))]


class Sum(object):
    def __init__(self, data):
        self.data = data

    def __call__(self, index):
        return self.data[index].sum()


if __name__ == '__main__':
    p = Pool()
    caller = Sum(ct)
    result = p.map(caller, slices)
    print(result)
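
A side note on the slicing line in the script above: under Python 2 (the environments reported later in this thread use 2.7.9), `n / cs` is integer division, so `ceil` has nothing left to round up and the final partial chunk (elements 16000 to 16383) is never summed. This does not appear to be related to the hang itself. A minimal sketch of an integer-only alternative is below; `chunk_slices` is a hypothetical helper name, not part of bcolz or the original script.

```python
def chunk_slices(n, chunk_size):
    """Return slices covering range(n) in pieces of at most chunk_size.

    Hypothetical helper for illustration; not part of bcolz.
    """
    # -(-n // chunk_size) is ceiling division using only integers,
    # so it gives the same answer on Python 2 and Python 3.
    n_chunks = -(-n // chunk_size)
    return [slice(chunk_size * i, min(chunk_size * (i + 1), n))
            for i in range(n_chunks)]


# Same sizes as the failing example: n = 2 ** 14, cs = 1000.
slices = chunk_slices(2 ** 14, 1000)
```

With these sizes the list has 17 slices, and the last one covers only the remaining 384 elements.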
@cpcloud
Contributor Author

cpcloud commented Jan 16, 2015

for some reason, the failure happens when n = 2 ** 14, but not when n = 2 ** 14 - 1 or n = 2 ** 20.

@cpcloud
Contributor Author

cpcloud commented Jan 16, 2015

i would imagine that, since ctables can be pickled, this wouldn't be an issue, and i've been able to use ctable with multiprocessing in the past

@cpcloud cpcloud changed the title from "stange failure on linux and mac" to "strange failure on linux and mac" Jan 16, 2015
@cpcloud
Contributor Author

cpcloud commented Jan 16, 2015

I'm also able to reproduce this on both linux and mac osx yosemite

mac environment:

$ conda list | grep -e 'numpy\|bcolz\|python'
bcolz                     0.7.3                np19py27_0
numpy                     1.9.1                    py27_0
python                    2.7.9                         1

linux environment:

$ conda list | grep -e 'numpy\|bcolz\|python'
bcolz                     0.8.0                np19py27_0
numpy                     1.9.1                    py27_0
python                    2.7.9                         1

@cpcloud
Contributor Author

cpcloud commented Jan 16, 2015

i've found a working revision, so i'm going to bisect

@cpcloud
Contributor Author

cpcloud commented Jan 16, 2015

looks like 89253f1 is the culprit

guessing that blosc introduced something

don't have the bandwidth to delve into blosc right now

@esc
Member

esc commented Jan 16, 2015

Check #121, which also seems to be related to blosc v1.5.2.

@esc
Member

esc commented Jan 16, 2015

What is the failure btw? For me, it just hangs w/o doing anything at all.

@cpcloud
Contributor Author

cpcloud commented Jan 16, 2015

Yep forgot to say what was actually happening. It hangs for me without doing anything.
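
Since the symptom is a silent hang rather than an exception, one way to make it visible (for example while bisecting) is to put a timeout on the map call. The sketch below is only an illustration: `map_with_timeout` is a hypothetical helper name, and the demo uses `multiprocessing.pool.ThreadPool`, which shares the `map_async` interface with `Pool`, purely so the snippet runs anywhere without a `__main__` guard.

```python
from multiprocessing import TimeoutError
from multiprocessing.pool import ThreadPool
import time


def map_with_timeout(pool, func, items, timeout):
    """Like pool.map, but raise instead of blocking forever.

    Hypothetical helper for illustration; `pool` can be any object with
    a map_async method (multiprocessing.Pool or ThreadPool).
    """
    async_result = pool.map_async(func, items)
    try:
        # get() raises multiprocessing.TimeoutError if the results
        # are not ready within `timeout` seconds.
        return async_result.get(timeout=timeout)
    except TimeoutError:
        raise RuntimeError(
            'pool.map did not finish within %s seconds' % timeout)


# Small demo with a thread pool and a function that returns promptly.
demo_pool = ThreadPool(2)
quick = map_with_timeout(demo_pool, abs, [-1, 2, -3], timeout=5)
demo_pool.close()
demo_pool.join()
```

In the failing script, `p.map(caller, slices)` could be swapped for `map_with_timeout(p, caller, slices, 30)` so that a hang turns into an error; the 30 second limit is an arbitrary choice.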

@cpcloud
Contributor Author

cpcloud commented Feb 5, 2015

Is there any way we can revert the version of blosc that introduced this bug?

@esc
Member

esc commented Feb 6, 2015

We are currently discussing what action to take on this on the mailing list(s).

If it's urgent, you can always link bcolz against an older blosc; I would suggest using 1.4.1:

http://bcolz.blosc.org/install.html#installing-from-tarball-sources

@FrancescAlted
Member

After thinking a bit about this, my hunch is that this problem has something to do with the new multithreading mode implemented in c-blosc 1.5. To make this run with the c-blosc 1.5 series, just add:

bcolz.blosc_set_nthreads(1)

at the beginning of your script.

Hmm, the new threading implementation in 1.5 is having more implications than anticipated :/

@FrancescAlted
Member

Holding the GIL during compression/decompression (PR #166) fixes this issue.

@mrocklin
Contributor

Hooray for the GIL! (part of me just died saying that ;))

@FrancescAlted
Member

In fact, Stefan Behnel (from the Cython team) always said that the GIL was actually a good thing. I think I understand him more now :)

@esc
Member

esc commented Mar 26, 2015

Does that mean we can bump to Blosc 1.5.x again?


4 participants