use HDF5 files for large arrays in spa and irradiance #236

Closed
mikofski opened this issue Aug 11, 2016 · 2 comments

Comments

@mikofski
Member

The very large arrays in the SPA and irradiance modules take a lot of space, which IMO makes those modules hard to navigate. See #235. Having the arrays in code also presents other issues, for example if those coefficients ever need to be changed or expanded, e.g. if new sets of Perez coefficients are released.

Some proposals:

  1. Move the data to the bottom of the module as constants. Python resolves module-level names inside a function or method body at call time, not at definition time, so as long as MYDATA is referenced at call time (rather than in the class body itself), defining it at the bottom of the module won't raise a NameError for an unresolved reference.
import numpy as np

class ClsUsingData(object):

    def __init__(self, *args):
        # MYDATA is looked up here at call time, after the whole
        # module has been executed, so this is safe
        self.mydata = MYDATA
        # do stuff with data

# other stuff

# all constants with very large arrays at the bottom of the module
MYDATA = np.array([
    # lots of data
])
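A tiny, self-contained sketch of the late-binding behavior this proposal relies on (the names are illustrative, not from pvlib): the function is defined before the constant, but since the lookup happens when the function is called, everything works once the module has finished executing.

```python
import numpy as np

def mean_of_data():
    # MYDATA is looked up when this function is called, not when it
    # is defined, so the constant can live at the bottom of the module
    return MYDATA.mean()

# large array constant defined at the bottom of the module
MYDATA = np.array([1.0, 2.0, 3.0])

print(mean_of_data())  # 2.0
```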
  2. Use HDF5 files via h5py. These files are highly optimized for speed, and h5py datasets act much like NumPy arrays. It's okay to keep the file open; it will be closed when Python exits. HDF5 loads data lazily, only when sliced (and in parallel if h5py is built against MPI/parallel HDF5), so memory usage is lower and access is more efficient. Alternately, copying all of the data out of the file into a NumPy array lets you close the file immediately, at the cost of loading everything up front.
import os

import h5py
import numpy as np

DIRNAME = os.path.dirname(__file__)
MYDATA = os.path.join(DIRNAME, 'mydata.h5')

class ClsUsingData(object):
    mydata = h5py.File(MYDATA, 'r')  # leave it open

    def __init__(self, *args):
        # do stuff with data
        pass

# alternately copy the data to a numpy array and close the file:
# h5_data_path = '/group/dataset'
# with h5py.File(MYDATA, 'r') as f:
#     mydata = np.array(f[h5_data_path])
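For completeness, a hedged sketch of how a file like mydata.h5 could be generated in the first place; COEFFS and the '/group/dataset' path are made-up placeholders standing in for the real SPA/Perez arrays, and the file is written to a temporary directory:

```python
import os
import tempfile

import h5py
import numpy as np

# hypothetical coefficient table standing in for the real module arrays
COEFFS = np.arange(48.0).reshape(8, 6)

path = os.path.join(tempfile.mkdtemp(), 'mydata.h5')

# write the array out once, with compression
with h5py.File(path, 'w') as f:
    f.create_dataset('/group/dataset', data=COEFFS, compression='gzip')

# read it back the way the module would
with h5py.File(path, 'r') as f:
    coeffs = np.array(f['/group/dataset'])

print(coeffs.shape)  # (8, 6)
```

This keeps the data out of the source file entirely, at the cost of a runtime dependency on h5py and a data file that has to ship with the package.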
@wholmgren
Member

My vote is for moving the spa data to the end of the file. The h5 file sounds like overkill and may cause problems for the numba solarposition code or anything else that people do to multithread/process things. The coefficients are closely related to the code, so I don't see a problem having them in the module so long as they're not too long. You and I may differ on our definition of too long, though.

@KonstantinTr
Contributor

I used to store data in h5 files, and every time something went wrong while writing changes, the file became unreadable. The more data you store in the file, the more space you need for a backup copy before opening it.
