Merge pull request #185 from dcs4cop/forman-181-no_metadata_written
forman authored Sep 25, 2019
2 parents 80ecb8c + 8387e82 commit 4ab769b
Showing 13 changed files with 430 additions and 270 deletions.
6 changes: 3 additions & 3 deletions CHANGES.md
@@ -1,14 +1,13 @@
## Changes in 0.2.0.dev2 (in dev)

* Reorganisation of the Documentation and Examples Section (partly addressing #106)
* Loosened python conda environment to satisfy conda-forge requirements

### New

* Added first version of the [xcube documentation](https://xcube.readthedocs.io/) generated from `./docs` folder.

### Enhancements

* Reorganisation of the Documentation and Examples Section (partly addressing #106)
* Loosened python conda environment to satisfy conda-forge requirements
* Making CLI parameters consistent and removing or changing parameter abbreviations where they were used twice for different parameters. (partly addressing #91)
For every CLI command that generates an output, a path must be provided by the option `-o`, `--output`. If not provided by the user, a default output path is generated.
The following CLI parameters have changed and their abbreviations are no longer enabled:
@@ -39,6 +38,7 @@

### Fixes

* `xcube gen` CLI now updates metadata correctly. (#181)
* It was no longer possible to use the `xcube gen` CLI with `--proc` option. (#120)
* `totalCount` attribute of time series returned by Web API `ts/{dataset}/{variable}/{geom-type}` now
contains the correct number of possible observations. Was always `1` before.
34 changes: 18 additions & 16 deletions docs/source/cli/xcube_gen.rst
@@ -19,14 +19,16 @@ Generate xcube dataset.
::

Usage: xcube gen [OPTIONS] [INPUT]...
Generate xcube dataset. Data cubes may be created in one go or successively in
append mode, input by input. The input paths may be one or more input
files or a pattern that may contain wildcards '?', '*', and '**'. The
input paths can also be passed as lines of a text file. To do so, provide
exactly one input file with ".txt" extension which contains the actual
input paths to be used.

Generate xcube dataset. Data cubes may be created in one go or
successively for all given inputs. Each input is expected to provide a
single time slice which may be appended, inserted or which may replace an
existing time slice in the output dataset. The input paths may be one or
more input files or a pattern that may contain wildcards '?', '*', and
'**'. The input paths can also be passed as lines of a text file. To do
so, provide exactly one input file with ".txt" extension which contains
the actual input paths to be used.

Options:
-P, --proc INPUT-PROCESSOR Input processor name. The available input
processor names and additional information
@@ -36,8 +38,8 @@ Generate xcube dataset.
with simple datasets whose variables have
dimensions ("lat", "lon") and conform with
the CF conventions.
-c, --config CONFIG xcube dataset configuration file in YAML format.
More than one config input file is
-c, --config CONFIG xcube dataset configuration file in YAML
format. More than one config input file is
allowed.When passing several config files,
they are merged considering the order passed
via command line.
@@ -50,8 +52,7 @@ Generate xcube dataset.
"<width>,<height>".
-R, --region REGION Output region using format "<lon-min>,<lat-
min>,<lon-max>,<lat-max>"
--variables, --vars VARIABLES
Variables to be included in output. Comma-
--variables, --vars VARIABLES Variables to be included in output. Comma-
separated list of names which may contain
wildcard characters "*" and "?".
--resampling [Average|Bilinear|Cubic|CubicSpline|Lanczos|Max|Median|Min|Mode|Nearest|Q1|Q3]
@@ -68,16 +69,17 @@ Generate xcube dataset.
--prof Collect profiling information and dump
results after processing.
--sort The input file list will be sorted before
creating the xcube dataset. If --sort parameter
is not passed, order of input list will be
kept.
creating the xcube dataset. If --sort
parameter is not passed, order of input list
will be kept.
-I, --info Displays additional information about format
options or about input processors.
--dry_run Just read and process inputs, but don't
produce any outputs.
--help Show this message and exit.



Below is the output of an ``xcube gen --info`` call showing five input processors installed via plugins.

::
@@ -108,7 +110,7 @@ Configuration File
==================

Configuration files passed to ``xcube gen`` via the ``-c, --config`` option use `YAML format`_.
Multiple configuration files may be given. In this case all configuration are merged into a single one.
Multiple configuration files may be given. In this case all configurations are merged into a single one.
Parameter values will be overwritten by subsequent configurations if they are scalars. If
they are objects / mappings, their values will be deeply merged.
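The merge rule described above (scalar values overwritten by later config files, nested mappings merged recursively) can be sketched in plain Python. This is an illustrative sketch of the behaviour, not xcube's actual implementation, and the example configuration values are hypothetical:

```python
def deep_merge(base: dict, update: dict) -> dict:
    """Merge ``update`` into ``base``: scalar (and list) values from
    ``update`` overwrite those in ``base``, while nested mappings
    are merged recursively, as described for -c/--config above."""
    merged = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Later config files win for scalars; nested mappings are combined.
config_1 = {'output_size': [512, 512], 'output_metadata': {'title': 'A'}}
config_2 = {'output_size': [1024, 512], 'output_metadata': {'project': 'xcube'}}
merged = deep_merge(config_1, config_2)
# merged == {'output_size': [1024, 512],
#            'output_metadata': {'title': 'A', 'project': 'xcube'}}
```

Passing the config files in a different order would flip which scalar value survives, which is why the order given on the command line matters.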

35 changes: 35 additions & 0 deletions docs/source/examples/xcube_gen.rst
@@ -195,6 +195,41 @@ The metadata of the xcube dataset can be viewed with :doc:`cli/xcube dump` as we
Dimensions without coordinates: bnds
Data variables:
analysed_sst (time, lat, lon) float64 dask.array<shape=(3, 5632, 10240), chunksize=(1, 704, 640)>
Attributes:
acknowledgment: Data Cube produced based on data provided by ...
comment:
contributor_name:
contributor_role:
creator_email: info@brockmann-consult.de
creator_name: Brockmann Consult GmbH
creator_url: https://www.brockmann-consult.de
date_modified: 2019-09-25T08:50:32.169031
geospatial_lat_max: 62.666666666666664
geospatial_lat_min: 48.0
geospatial_lat_resolution: 0.002604166666666666
geospatial_lat_units: degrees_north
geospatial_lon_max: 10.666666666666664
geospatial_lon_min: -16.0
geospatial_lon_resolution: 0.0026041666666666665
geospatial_lon_units: degrees_east
history: xcube/reproj-snap-nc
id: demo-bc-sst-sns-l2c-v1
institution: Brockmann Consult GmbH
keywords:
license: terms and conditions of the DCS4COP data dist...
naming_authority: bc
processing_level: L2C
project: xcube
publisher_email: info@brockmann-consult.de
publisher_name: Brockmann Consult GmbH
publisher_url: https://www.brockmann-consult.de
references: https://dcs4cop.eu/
source: CMEMS Global SST & Sea Ice Anomaly Data Cube
standard_name_vocabulary:
summary:
time_coverage_end: 2017-06-08T00:00:00.000000000
time_coverage_start: 2017-06-05T00:00:00.000000000
title: CMEMS Global SST Anomaly Data Cube

The metadata for the variable ``analysed_sst`` can be viewed:

91 changes: 76 additions & 15 deletions test/api/gen/default/test_gen.py
@@ -1,6 +1,6 @@
import os
import unittest
from typing import Tuple, Optional
from typing import Tuple, Optional, Dict, Any

import numpy as np
import xarray as xr
@@ -12,7 +12,7 @@


def clean_up():
files = ['l2c-single.nc', 'l2c.nc', 'l2c.zarr', 'l2c-single.zarr']
files = ['l2c-single.nc', 'l2c-single.zarr', 'l2c.nc', 'l2c.zarr']
for file in files:
rimraf(file)
rimraf(file + '.temp.nc') # May remain from Netcdf4DatasetIO.append()
@@ -27,11 +27,15 @@ def setUp(self):
def tearDown(self):
clean_up()

def test_process_inputs_single(self):
def test_process_inputs_single_nc(self):
status, output = gen_cube_wrapper(
[get_inputdata_path('20170101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-NWE_002-v2.0-fv1.0.nc')], 'l2c-single.nc')
self.assertEqual(True, status)
self.assertTrue('\nstep 8 of 8: creating input slice in l2c-single.nc...\n' in output)
self.assert_cube_ok(xr.open_dataset('l2c-single.nc', autoclose=True), 1,
dict(date_modified=None,
time_coverage_start='2016-12-31T12:00:00.000000000',
time_coverage_end='2017-01-01T12:00:00.000000000'))

def test_process_inputs_append_multiple_nc(self):
status, output = gen_cube_wrapper(
@@ -40,6 +44,20 @@ def test_process_inputs_append_multiple_nc(self):
self.assertEqual(True, status)
self.assertTrue('\nstep 8 of 8: creating input slice in l2c.nc...\n' in output)
self.assertTrue('\nstep 8 of 8: appending input slice to l2c.nc...\n' in output)
self.assert_cube_ok(xr.open_dataset('l2c.nc', autoclose=True), 3,
dict(date_modified=None,
time_coverage_start='2016-12-31T12:00:00.000000000',
time_coverage_end='2017-01-03T12:00:00.000000000'))

def test_process_inputs_single_zarr(self):
status, output = gen_cube_wrapper(
[get_inputdata_path('20170101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-NWE_002-v2.0-fv1.0.nc')], 'l2c-single.zarr')
self.assertEqual(True, status)
self.assertTrue('\nstep 8 of 8: creating input slice in l2c-single.zarr...\n' in output)
self.assert_cube_ok(xr.open_zarr('l2c-single.zarr'), 1,
dict(date_modified=None,
time_coverage_start='2016-12-31T12:00:00.000000000',
time_coverage_end='2017-01-01T12:00:00.000000000'))

def test_process_inputs_append_multiple_zarr(self):
status, output = gen_cube_wrapper(
@@ -48,6 +66,10 @@ def test_process_inputs_append_multiple_zarr(self):
self.assertEqual(True, status)
self.assertTrue('\nstep 8 of 8: creating input slice in l2c.zarr...\n' in output)
self.assertTrue('\nstep 8 of 8: appending input slice to l2c.zarr...\n' in output)
self.assert_cube_ok(xr.open_zarr('l2c.zarr'), 3,
dict(date_modified=None,
time_coverage_start='2016-12-31T12:00:00.000000000',
time_coverage_end='2017-01-03T12:00:00.000000000'))

def test_process_inputs_insert_multiple_zarr(self):
status, output = gen_cube_wrapper(
@@ -59,6 +81,10 @@ def test_process_inputs_insert_multiple_zarr(self):
self.assertTrue('\nstep 8 of 8: creating input slice in l2c.zarr...\n' in output)
self.assertTrue('\nstep 8 of 8: appending input slice to l2c.zarr...\n' in output)
self.assertTrue('\nstep 8 of 8: inserting input slice before index 0 in l2c.zarr...\n' in output)
self.assert_cube_ok(xr.open_zarr('l2c.zarr'), 3,
dict(date_modified=None,
time_coverage_start='2016-12-31T12:00:00.000000000',
time_coverage_end='2017-01-03T12:00:00.000000000'))

def test_process_inputs_replace_multiple_zarr(self):
status, output = gen_cube_wrapper(
@@ -71,6 +97,10 @@ def test_process_inputs_replace_multiple_zarr(self):
self.assertTrue('\nstep 8 of 8: creating input slice in l2c.zarr...\n' in output)
self.assertTrue('\nstep 8 of 8: appending input slice to l2c.zarr...\n' in output)
self.assertTrue('\nstep 8 of 8: replacing input slice at index 1 in l2c.zarr...\n' in output)
self.assert_cube_ok(xr.open_zarr('l2c.zarr'), 3,
dict(date_modified=None,
time_coverage_start='2016-12-31T12:00:00.000000000',
time_coverage_end='2017-01-03T12:00:00.000000000'))

def test_input_txt(self):
f = open((os.path.join(os.path.dirname(__file__), 'inputdata', "input.txt")), "w+")
@@ -81,6 +111,30 @@ def test_input_txt(self):
f.close()
status, output = gen_cube_wrapper([get_inputdata_path('input.txt')], 'l2c.zarr', sort_mode=True)
self.assertEqual(True, status)
self.assert_cube_ok(xr.open_zarr('l2c.zarr'), 3,
dict(time_coverage_start='2016-12-31T12:00:00.000000000',
time_coverage_end='2017-01-03T12:00:00.000000000'))

def assert_cube_ok(self, cube: xr.Dataset, expected_time_dim: int, expected_extra_attrs: Dict[str, Any]):
self.assertEqual({'lat': 180, 'lon': 320, 'bnds': 2, 'time': expected_time_dim}, cube.dims)
self.assertEqual({'lon', 'lat', 'time', 'lon_bnds', 'lat_bnds', 'time_bnds'}, set(cube.coords))
self.assertEqual({'analysed_sst'}, set(cube.data_vars))
expected_attrs = dict(title='Test Cube',
project='xcube',
date_modified=None,
geospatial_lon_min=-4.0,
geospatial_lon_max=12.0,
geospatial_lon_resolution=0.05,
geospatial_lon_units='degrees_east',
geospatial_lat_min=47.0,
geospatial_lat_max=56.0,
geospatial_lat_resolution=0.05,
geospatial_lat_units='degrees_north')
expected_attrs.update(expected_extra_attrs)
for k, v in expected_attrs.items():
self.assertIn(k, cube.attrs)
if v is not None:
self.assertEqual(v, cube.attrs[k], msg=f'key {k!r}')

def test_handle_360_lon(self):
status, output = gen_cube_wrapper(
@@ -96,7 +150,7 @@ def test_illegal_proc(self):
gen_cube_wrapper(
[get_inputdata_path('20170101120000-UKMO-L4_GHRSST-SSTfnd-OSTIAanom-GLOB-v02.0-fv02.0.nc')],
'l2c-single.zarr', sort_mode=True, input_processor_name="")
self.assertEqual('Missing input_processor_name', f'{e.exception}')
self.assertEqual('input_processor_name must not be empty', f'{e.exception}')

with self.assertRaises(ValueError) as e:
gen_cube_wrapper(
@@ -106,7 +160,7 @@ def test_illegal_proc(self):


# noinspection PyShadowingBuiltins
def gen_cube_wrapper(input_paths, output_path, sort_mode=False, input_processor_name='default') \
def gen_cube_wrapper(input_paths, output_path, sort_mode=False, input_processor_name=None) \
-> Tuple[bool, Optional[str]]:
output = None

@@ -117,13 +171,20 @@ def output_monitor(msg):
else:
output += msg + '\n'

config = get_config_dict(dict(input_paths=input_paths, output_path=output_path))
return gen_cube(input_processor_name=input_processor_name,
output_size=(320, 180),
output_region=(-4., 47., 12., 56.),
output_resampling='Nearest',
output_variables=[('analysed_sst', dict(name='SST'))],
sort_mode=sort_mode,
dry_run=False,
monitor=output_monitor,
**config), output
config = get_config_dict(
input_paths=input_paths,
input_processor_name=input_processor_name,
output_path=output_path,
output_size='320,180',
output_region='-4,47,12,56',
output_resampling='Nearest',
output_variables='analysed_sst',
sort_mode=sort_mode,
)

output_metadata = dict(
title='Test Cube',
project='xcube',
)

return gen_cube(dry_run=False, monitor=output_monitor, output_metadata=output_metadata, **config), output
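The ``assert_cube_ok`` helper added in this diff checks each expected attribute for presence, but compares its value only when the expected value is not ``None`` (``date_modified=None`` means "must exist, value not checked"). A minimal standalone sketch of that pattern, using plain ``assert`` instead of ``unittest`` and hypothetical attribute values:

```python
from typing import Any, Dict


def assert_attrs_ok(attrs: Dict[str, Any], expected: Dict[str, Any]) -> None:
    """Check that every expected key is present in ``attrs``.
    An expected value of None means the attribute must exist but
    its value is not checked (e.g. a generated timestamp)."""
    for key, value in expected.items():
        assert key in attrs, f'missing attribute {key!r}'
        if value is not None:
            assert attrs[key] == value, \
                f'attribute {key!r}: {attrs[key]!r} != {value!r}'


# A timestamp-like attribute is only checked for presence:
cube_attrs = {'title': 'Test Cube',
              'project': 'xcube',
              'date_modified': '2019-09-25T08:50:32'}
assert_attrs_ok(cube_attrs, {'title': 'Test Cube', 'date_modified': None})
```

This keeps the test robust against values that legitimately change on every run, such as ``date_modified``, while still failing if the metadata is not written at all, which was the bug (#181) this commit fixes.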
