xarray groupby monthly mean fail case #99
Comments
Here is the error that shows up in the notebook
|
@rabernat et al. I just wanted to add that I've also been seeing this error, plus the one you mentioned in dask/distributed#1736, on Cheyenne as well. |
@jhamman do you run into this in the context of workers dying? |
Yes, I had generally assumed the dying workers were a byproduct of these two sets of errors. I should say, I've been spinning up a new workflow over the past few days and hadn't yet gotten to the point of pinning down why it wasn't succeeding. |
Glad to hear you're trying new things, and my apologies for the frustration. I think I've identified the source of the KeyErrors above; I would hope they wouldn't halt execution, though we did see some odd behavior on the pangeo.pydata.org setup. |
Actually no, I haven't identified the source. If anyone can check the scheduler for the transition story of such a key I would be grateful:

```python
cluster.scheduler.story(key_name)
# like
cluster.scheduler.story("('zarr-vgosa-getitem-6be14088263b14471b4e8ffb060a8929', 10, 0, 0)")
```
|
Okay, so my workers do die, but I'm not convinced it's for exactly the same reasons as @rabernat's. My notebook gets a traceback like this one:
with a reraise of:
Looking at a worker log, I see:
Finally, here is the scheduler story call for one of the keys:

```python
cluster.scheduler.story(('open_dataset-9fd2c1a157b50dbacfd6eb5462d93a4ftotal runoff-106687d7775cb8f7409fcc2c9aeb8122'))
```
Sorry for the length of this post but I wanted to be clear about what I was seeing. |
I agree that this seems to be different. Your workers seem to be dying. The most common reason we've seen for this recently is over-use of memory. Is this happening in your situation? |
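As a hedged aside (my sketch, not something posted in the thread): one rough way to check whether workers are running short on memory is to ask each worker process for its resident memory, assuming a `dask.distributed` Client connected to the cluster:

```python
# Sketch only: `client` is assumed to be connected to the same cluster.
from dask.distributed import Client

client = Client("tcp://scheduler-address:8786")  # hypothetical address

def rss_bytes():
    import psutil  # psutil is a dependency of distributed's workers
    return psutil.Process().memory_info().rss

# Returns a dict mapping each worker address to its resident memory in bytes
print(client.run(rss_bytes))
```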
Also, long posts are fine in my book. If you really wanted to make things clean you could optionally use the
Rendered below
|
I'm just reporting some things here on that notebook as I see them. This line oddly causes some computation:

```python
ds_mm = ds.groupby('time.month').mean(dim='time')
ds_mm
```
|
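For reference, a minimal sketch with synthetic data (not the pangeo dataset) of the behaviour one would expect here: the groupby/mean line should only build a lazy dask graph, and real computation should only start when explicitly requested:

```python
# Sketch only: synthetic stand-in for the zarr dataset used in the notebook.
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2000-01-01", periods=365)
ds = xr.Dataset(
    {"sla": (("time", "lat", "lon"), np.random.rand(365, 4, 8))},
    coords={"time": time},
).chunk({"time": 30})

ds_mm = ds.groupby("time.month").mean(dim="time")  # expected: lazy, graph only
monthly = ds_mm.compute()                          # computation happens here
```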
It currently takes 35 seconds to open this zarr dataset:

```python
%%prun
import gcsfs
#gcsmap = gcsfs.mapping.GCSMap('pangeo-data/dataset-duacs-rep-global-merged-allsat-phy-l4-v3')
gcsmap = gcsfs.mapping.GCSMap('pangeo-data/dataset-duacs-rep-global-merged-allsat-phy-l4-v3-alt')
ds = xr.open_zarr(gcsmap)
ds
```
|
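A small sketch (my addition, not from the thread) for checking where that time goes by timing just the open step, assuming the same bucket path as above:

```python
# Sketch only: times the metadata/open step separately from any data reads.
import time
import gcsfs
import xarray as xr

gcsmap = gcsfs.mapping.GCSMap(
    'pangeo-data/dataset-duacs-rep-global-merged-allsat-phy-l4-v3-alt'
)

t0 = time.perf_counter()
ds = xr.open_zarr(gcsmap)
print(f"open_zarr took {time.perf_counter() - t0:.1f} s")
```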
Yeah, other small operations are triggering computation
My first guess is that some object doesn't have a dtype or something and so we're computing the dtype on a small task? |
It looks like even after finishing |
I was surprised to find this thing in a task graph:

```python
>>> dep
<Task 'zarr-adt-0f90b3f56f247f966e5ef01277f31374' memory>
>>> Future(dep.key).result()
ImplicitToExplicitIndexingAdapter(array=LazilyIndexedArray(array=<xarray.backends.zarr.ZarrArrayWrapper object at 0x7fa921fec278>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))
>>> len(dep.dependents)
1781
```

Ideally we wouldn't need results like this; we could instead just have our leaf tasks generate numpy arrays directly.
|
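A hedged sketch of how one might look for such objects from the client side, given an already opened `ds` as above ('adt' is taken from the task key in the comment; the non-tuple filter is only a heuristic, and the attribute names are xarray internals that may differ between versions):

```python
# Sketch only: find graph entries that are plain objects rather than
# (function, args) task tuples, then peel back their wrapper layers.
graph = dict(ds['adt'].data.__dask_graph__())
leaves = [v for v in graph.values() if not isinstance(v, tuple)]
for leaf in leaves[:1]:
    layer = leaf
    while hasattr(layer, 'array'):       # xarray's lazy wrappers nest via .array
        print(type(layer).__name__)
        layer = layer.array
    print(type(layer).__name__)          # innermost backend array wrapper
```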
For whatever reason, moving these things around is taking tens of seconds. |
Yes! I am convinced that this is the root of many of our problems. Most xarray backends (zarr and netCDF) wrap the raw numpy arrays in layers of "adapter" classes. We are trying to factor this out of xarray. |
I'm not at all confident that these are at fault, but there are some difficult-to-explain things happening around them. |
@mrocklin - I'm sorry if I missed it, but was there a solution to the following work-stealing issues?
|
Not yet? Are these actually affecting your work or are they just noisy? |
Yes and no. Computations continue to run, but for long-running notebook applications they tend to pile up and make the notebook pretty clumsy to navigate. |
OK, I'll add it back onto my todo list. Last time I checked it out I wasn't able to determine what was triggering that issue. I'll give it another shot, hopefully sometime today.
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date. |
I made a notebook for pangeo.pydata.org that reproduces part of the dask / xarray groupby fail case discussed in pydata/xarray#1832.
https://gist.github.com/rabernat/7fe92f2a41dbfe651493d6864e46031a
Would be great if anyone (e.g. @mrocklin) wants to have a look at this and try to debug...
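For readers who don't open the gist, a rough sketch of the core steps such a notebook contains (the cluster address is a placeholder; the real notebook is the gist linked above):

```python
# Sketch only: approximate outline of the fail case, not the actual notebook.
import gcsfs
import xarray as xr
from dask.distributed import Client

client = Client("tcp://scheduler-address:8786")  # hypothetical pangeo cluster

gcsmap = gcsfs.mapping.GCSMap(
    'pangeo-data/dataset-duacs-rep-global-merged-allsat-phy-l4-v3-alt'
)
ds = xr.open_zarr(gcsmap)

# Monthly climatology over the full record; this is the step that fails
ds_mm = ds.groupby('time.month').mean(dim='time')
ds_mm.compute()
```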