Dask version #252

sjperkins · 2018-03-19T09:58:36Z

PR for the new dask version

RIME Pipeline Configuration
- Tensorflow Op Configurability. The CreateAntennaJones and SumCoherencies ops can now be given flags defining which terms exist and are multiplied into the RIME in their respective locations.
- Pipeline Configuration. Need to come up with dictionary definitions for the different terms. These should describe their
  - locations in the pipeline.
  - inputs for memory budgeting. e.g. { 'lm': (nsrc, 2) }
  - outputs for memory budgeting. e.g. { 'complex_phase': (nsrc, ntime, na, nchan) }
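One possible shape for such a term definition is sketched below. This is an illustration only: the key names (`location`, `inputs`, `outputs`), the `antenna_jones` location label, the use of strings for named dimensions and the helper functions are all assumptions, not something this PR defines.

```python
# Hypothetical term definition. Key names and the location label are
# illustrative assumptions, not part of the PR.
COMPLEX_PHASE_TERM = {
    "name": "complex_phase",
    "location": "antenna_jones",
    "inputs": {"lm": ("nsrc", 2)},
    "outputs": {"complex_phase": ("nsrc", "ntime", "na", "nchan")},
}


def shape_elements(shape, dims):
    """Count elements in a schema shape, resolving named dimensions via dims."""
    n = 1
    for d in shape:
        n *= dims[d] if isinstance(d, str) else d
    return n


def term_elements(term, dims):
    """Element counts for every input and output array of a term."""
    arrays = {**term["inputs"], **term["outputs"]}
    return {name: shape_elements(shape, dims) for name, shape in arrays.items()}


# Example dimension sizes, chosen arbitrarily for the sketch.
dims = {"nsrc": 100, "ntime": 10, "na": 7, "nchan": 16}
counts = term_elements(COMPLEX_PHASE_TERM, dims)
print(counts)
```

Multiplying each count by the item size of the array's dtype would give a byte estimate that a memory budget could be checked against.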
xarray dataset from Measurement Set (available in https://github.com/ska-sa/xarray-ms/).
xarray dataset from FITS file (available in https://github.com/ska-sa/xarray-fits/).
Input xarray dataset. Massage MS and FITS datasets into Montblanc dataset.
Use tensorflow Dataset API for input into the tensorflow graph. The StagingArea/Queues are complicated and unwieldy, and the Dataset API now has a GPU prefetch.
- Similarly to the master branch, the Dataset API uses a pull mechanism whereby it requests parts
  of the data, to be consumed internally via Iterators, from Datasets. There's no trivial fit
  with dask's worker mechanism. Will likely need to create a C++ op, cut and pasted from Dataset.from_tensor, which consumes an infinite series of tensors.
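The mismatch between a pull-based consumer and push-based workers can be illustrated in plain Python. The sketch below uses neither tensorflow nor dask. The bounded queue, the sentinel and the function names are assumptions made for illustration, and a real series of tensors would be unbounded rather than terminated.

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the stream in this finite sketch


def producer(q, chunks):
    """Push side: stands in for worker tasks handing over finished chunks."""
    for chunk in chunks:
        q.put(chunk)  # blocks while the queue is full, giving back-pressure
    q.put(_SENTINEL)


def pull_iterator(q):
    """Pull side: stands in for the iterator a dataset would be consumed through."""
    while True:
        item = q.get()  # blocks until the push side supplies data
        if item is _SENTINEL:
            return
        yield item


q = queue.Queue(maxsize=2)
worker = threading.Thread(target=producer, args=(q, [[0, 1], [2, 3], [4, 5]]))
worker.start()
received = list(pull_iterator(q))
worker.join()
print(received)
```

The queue is what lets the two sides run at their own pace: the consumer only ever asks for the next item, while the producers decide when items become available.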

This commit breaks benchmark.py.

A dimension containing consecutive antenna values for multiple timesteps. This is related to the unique 'utime' dimension and visibility row dimension 'vrow'. For example, there are 3 unique timesteps below. Timestep 0 has 4 visibility and 5 antenna rows associated with it, while timestep 1 has 3 visibility and 6 antenna rows associated with it.

    0            1              2      unique time
    1 1 2 5      3 7 2          4      visibility row
    2 3 3 4      5 1 4          3
    1 2 3 4 5    1 2 3 4 5 7    3 4    antenna row
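The relationship between visibility rows and antenna rows in the example can be sketched in a few lines of Python. This assumes the sixteen visibility-row values are eight (antenna1, antenna2) pairs, which matches the stated counts but is an interpretation of the example. The function and variable names are hypothetical and not taken from the code in this PR.

```python
# (unique time index, antenna1, antenna2) for each visibility row.
# Reading the example values as antenna pairs is an assumption.
vis_rows = [
    (0, 1, 2), (0, 1, 3), (0, 2, 3), (0, 5, 4),
    (1, 3, 5), (1, 7, 1), (1, 2, 4),
    (2, 4, 3),
]


def antenna_rows(rows):
    """Sorted unique antennas seen in each timestep's visibility rows."""
    per_time = {}
    for t, a1, a2 in rows:
        per_time.setdefault(t, set()).update((a1, a2))
    return {t: sorted(ants) for t, ants in sorted(per_time.items())}


arows = antenna_rows(vis_rows)
vrow_counts = {t: sum(1 for r in vis_rows if r[0] == t) for t in arows}

for t in arows:
    print(t, vrow_counts[t], "visibility rows,", len(arows[t]), "antenna rows")
```

Under that reading, timestep 0 yields 4 visibility rows and 5 antenna rows, and timestep 1 yields 3 and 6, as described above.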

ratt-priv-ci · 2022-01-03T19:42:06Z

Can one of the admins verify this patch?

sjperkins added 30 commits October 25, 2017 09:05

Make antenna2 tiling the same as antenna1 tiling

b9a1cab

Create time + time_index arrays with dask

7c0e87a

Previously, these were created in numpy and then converted to dask. They are now created from parts of dask arrays instead. Hopefully this reduces data movement.

Make beam_extents a dask array too

a9296f5

Use time.time() instead of time.clock()

24c9b47

In the distributed case, very little processor time is spent as it's mostly waiting on a future. time.time(), though technically less accurate, will give the waiting time, which is correct.
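The difference is easy to see with a call that only waits. In this sketch `time.sleep` stands in for blocking on a future, and `time.process_time()` is used as the CPU-time counter because `time.clock()` was removed in Python 3.8. The function name is hypothetical.

```python
import time


def wait_on_result(seconds):
    """Stand-in for blocking on a future: the process waits and does no work."""
    time.sleep(seconds)


wall_start = time.time()
cpu_start = time.process_time()

wait_on_result(0.2)

wall_elapsed = time.time() - wall_start         # includes time spent waiting
cpu_elapsed = time.process_time() - cpu_start   # only CPU time used by this process

print("wall clock: %.3fs, processor: %.3fs" % (wall_elapsed, cpu_elapsed))
```

The wall-clock figure covers the whole wait, while the processor figure stays close to zero, which is why a CPU-time counter under-reports work that is mostly waiting on remote results.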

Use recent tensorflow linking mechanism

c72ec25

Current code needs master to run properly.

Upgrade to tensorflow 1.4.0 (#227)

4da4d2c

Support distributed scheduler file

9acf3d9

Pin python-casacore == 2.1.2

97952b1

Default device should be CPU

b5359e5

Strip out old cruft

ac0c8d4

Remove unused time_offsets array

cb1cf67

Rename row dimension to vrow

ec67b9e

To stand for visibility row, as opposed to antenna row, to be introduced later.

Rename time_chunks to time_vrow_chunks

1b42247

Use da.zeros instead of da.full to create flags

0c45271

WIP

e498ed5

Use versioneer for versioning (#229)

c1017c8

Fix various setup py issues (#231)

834df57

* Add RIME op tensorflow source to MANIFEST.in
* Deprecate ez_setup.py script. See pypa/setuptools#581.
* Include missing setup.cfg
* Pin python-casacore==2.1.2 for now

Relax hard requirement on python-casacore 2.1.2 (#235)

42fda29

Allow more recent versions.

Convert ebeam kernel to antenna row

890fdc7

Allow configurable rime tensorflow library path

3e410f7

Convert feed rotation kernels to antenna row

7082d3d

Convert create_antenna_jones to antenna row

39dda58

Convert parallactic angle sincos to antenna row

c6b45bc

Convert complex phase to antenna row

85bf2d7

Fix e beam kernel block dimensions

16479e6

Rewrite test_phase.py as a unittest

fa10e07

Remove unnecessary test case

0394ff3

Update to latest dask and distributed

afb65fe

Depend on tensorflow 1.6.0rc1

2134a6b

sjperkins added 27 commits November 6, 2018 12:29

Re-use tensorflow install message

34ed6fe

Support naming FakeMapDataset

d3aee26

Create separate EvaluationThread class

3f245b9

Commit cuda kernel error checks

89308bf

Rework GaussShapeOp to work on row-based UVW

7f994e1

Pin dataset creation to CPU

0095cbe

Fix Zernike test cases

f5a860c

Rework zernike polynomial test case tolerances

9c8f4aa

Rework SersicShapeOp to row-based UVW

ceb5986

Add CUDA check to ParallacticAngleSinCos

04ec54f

cuda checks for PhaseOp

d6f00dc

Disable RIME prefetch

54a9692

Fix buffer_size in dde.py

4ca7aa6

Fixup Zernike tests again

9d453d3

Introduce tolerances to zernike random input test

024b0f8

Zernike test formatting

0a573ba

Mark test_zernike.py::test_random_inputs as xfail

1fee29b

Dataset updates for tensorflow 1.12.0

6c82cec

Submit op_kernel_utils.h

8784e86

Session cache

c9c2842

Rework package structure

c879f52

Revert debugging code

711a637

Remove probably unnecessary mutex_locks

488889d

Sersic shape schema stuff

a4ad38d

Correctly name sersic_shape

1fabda5

Allow maps to be named for easier debugging

d403eac

Allow queues to be named for easier debugging

4cd1967

sjperkins force-pushed the dask-tf-1.4 branch from ffbde2a to 4cd1967 on November 28, 2018 08:53

Last WIP

f816542
