Velocity HOURLY #99

Merged
merged 40 commits into from
Apr 2, 2020
b04b008
initial commit
diodon Feb 20, 2020
c9b5632
add cell_index variable
diodon Feb 24, 2020
1f4c769
change alias of the package
diodon Feb 24, 2020
f0e6fa3
Merge remote-tracking branch 'origin/master' into velocity_hourly
diodon Feb 24, 2020
ac83ec4
replace functions by aodn package call
diodon Feb 24, 2020
f7c71cd
rename variable
diodon Feb 24, 2020
3cfc227
Merge branch 'master' into velocity_hourly
diodon Feb 25, 2020
103d9c9
adapt to NETCDF4_CLASSIC. Fix cell_index variable name in template
diodon Mar 2, 2020
2df71ef
integer type to numpy int16
diodon Mar 2, 2020
fbe0f89
Merge branch 'master' into velocity_hourly
diodon Mar 2, 2020
83d0aff
Merge branch 'master' into velocity_hourly
diodon Mar 5, 2020
dac77e1
remove check for seconds_to_middle and don't shift TIME. Add seconds_…
diodon Mar 5, 2020
38296bb
implement process by chunks to reduce memory use
diodon Mar 10, 2020
beb4018
add CELL_INDEX to documentation
diodon Mar 10, 2020
e21db79
documentation for the hourly product
diodon Mar 10, 2020
f6121b3
documentation for the hourly product README
diodon Mar 10, 2020
186521a
fix instrument dimension error
diodon Mar 10, 2020
1a3b2e2
fix dtype in binning function
diodon Mar 11, 2020
3a2094f
chunk size as variable
diodon Mar 11, 2020
72cd57f
squeeze the dataset to remove extra dimensions in variables
diodon Mar 12, 2020
6131301
Update aodntools/timeseries_products/velocity_hourly_timeseries.py
diodon Mar 13, 2020
ba68960
Update aodntools/timeseries_products/velocity_hourly_timeseries.py
diodon Mar 13, 2020
3974960
Update aodntools/timeseries_products/velocity_hourly_timeseries.py
diodon Mar 13, 2020
535256c
Update aodntools/timeseries_products/velocity_hourly_timeseries.py
diodon Mar 13, 2020
1ce1508
changes according to review
diodon Mar 13, 2020
f8ca463
replace variables with QC>2 with NaNs
diodon Mar 13, 2020
5831076
Merge branch 'velocity_hourly' of github.com:aodn/python-aodntools in…
diodon Mar 13, 2020
8e99d5f
Bump version to 1.3.0
mhidas Mar 18, 2020
0e067eb
Merge branch 'master' into velocity_hourly
mhidas Mar 19, 2020
bd151bc
Update global attributes and code comments for no time shift
mhidas Mar 31, 2020
2bdf8c2
import check_file from velocity_aggregated code
mhidas Mar 31, 2020
76432e4
various attribute fixes
mhidas Mar 31, 2020
d2cda9f
make bad_files a dict like in other products
mhidas Apr 1, 2020
b9de287
a bit of code clean-up (no functional changes)
mhidas Apr 1, 2020
786d8ae
rename DataFrame object to distinguish it from xarray Dataset
mhidas Mar 19, 2020
b2e8f4f
simplify 30min time shift for resample
mhidas Mar 19, 2020
1d72e12
update documentation to reflect code changes
mhidas Apr 1, 2020
0de304e
fixup: remove unused -path command-line argument
mhidas Apr 1, 2020
2332594
rename and refactor get_resampled_values in response to review
mhidas Apr 2, 2020
fef2508
Merge branch 'master' into velocity_hourly
mhidas Apr 2, 2020
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 1.2.9
+current_version = 1.3.0
commit = False
tag = False
tag_name = {new_version}
2 changes: 1 addition & 1 deletion aodntools/__init__.py
@@ -1 +1 @@
-__version__ = '1.2.9'
+__version__ = '1.3.0'
@@ -90,6 +90,8 @@ In order to keep track of the provenance of the aggregated file, accessory varia
- `LONGITUDE(INSTRUMENT)`: LONGITUDE per instrument.
- `NOMINAL_DEPTH(INSTRUMENT)`: nominal depth per instrument, from the input file’s variable `NOMINAL_DEPTH` or global attribute instrument_nominal_depth.
- `SECONDS_TO_MIDDLE(INSTRUMENT)`: offset from the timestamp to the middle of the measurement window for each deployment
- `CELL_INDEX(OBSERVATION)`: index of the corresponding measuring cell.



### Attributes
@@ -0,0 +1,127 @@
# Velocity Hourly Time Series Product

- [Objective](#objective)
- [Input](#input)
- [Method](#method)
- [Output](#output)




## Objective

This product provides aggregated, quality-controlled U, V, and W velocity time-series files for each mooring site, binned into 1-hour intervals, including only in-water data flagged as "good" or "probably good" in the input files. QC flags are not included. Statistics related to the averaging process are stored as variables (standard deviation, minimum and maximum values, number of records binned). For profiling (ADCP) instruments, the absolute depth of each measuring cell is calculated from the `DEPTH` measured at the instrument and the `HEIGHT_ABOVE_SENSOR` coordinate.

The output from a single run of the code will be an aggregated file of all available measurements of the velocity components UCUR, VCUR and (where available) WCUR at one mooring site, binned into 1-hour intervals.

## Input

The aggregation function will accept a list of input files, and the code of the mooring site (`site_code`), in addition to arguments that identify the path of input and output files.

The code aggregates variables and files that meet the following requirements:

- File contains data from only one deployment of one instrument;
- File is a delayed-mode, quality-controlled product (file version label “FV01”);
- File is compliant with CF-1.6 and IMOS-1.4 conventions;
- File contains, at a minimum, the components of current velocity (`UCUR`, `VCUR`), and variables `TIME`, `DEPTH`, `LATITUDE`, `LONGITUDE`, and `HEIGHT_ABOVE_SENSOR` in the case of ADCPs;
- All files to be aggregated are from the same site, and have the same `site_code` attribute;
- Variables to be aggregated have `TIME` and (optionally) `HEIGHT_ABOVE_SENSOR` as their only dimensions (or if `LATITUDE` and `LONGITUDE` are included as dimensions, they have size 1);
- The in-water data are bounded by the global attributes `time_deployment_start` and `time_deployment_end`.

> **Review comment (Contributor):** Might need to add this if it is so decided:
> The `TIME` variable has an attribute `seconds_to_middle_of_measurement` to indicate the offset from each recorded timestamp to the centre of the averaging period.



The code is able to access the input files either locally, or remotely via the OPeNDAP protocol.

## Method

Generating function:

```
usage: velocity_aggregated_timeseries.py [-h] -site SITE_CODE -files FILENAMES
                                         [-indir INPUT_DIR]
                                         [-outdir OUTPUT_DIR]
                                         [-download_url DOWNLOAD_URL]
                                         [-opendap_url OPENDAP_URL]

Concatenate X,Y,Z velocity variables from ALL instruments from ALL deployments
from ONE site

optional arguments:
  -h, --help                  show this help message and exit
  -site SITE_CODE             site code, like NRMMAI
  -files FILENAMES            name of the file that contains the source URLs
  -indir INPUT_DIR            base path of input files
  -outdir OUTPUT_DIR          path where the result file will be written. Default ./
  -download_url DOWNLOAD_URL  path to the download_url_prefix
  -opendap_url OPENDAP_URL    path to the opendap_url_prefix
```



### Input file validation

Before proceeding to the aggregation, each input file is checked to ensure it meets the requirements (as specified above under [Input](#input)). Any input files that fail to meet the requirements are excluded from the aggregation, and their URLs are listed in a global attribute `rejected_files`.
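As a rough illustration of this step, the sketch below checks a few of the listed requirements and collects the rejected files with their reasons. It is not the package's actual validation code: each file is represented by a plain dict rather than an opened netCDF dataset, and the names `check_requirements` and `split_accepted_rejected` are hypothetical.

```python
# Hypothetical, simplified stand-in for the real validation: every input file
# is described by a plain dict instead of an opened netCDF dataset.
REQUIRED_VARIABLES = {"TIME", "DEPTH", "LATITUDE", "LONGITUDE", "UCUR", "VCUR"}
REQUIRED_ATTRIBUTES = ("time_deployment_start", "time_deployment_end")


def check_requirements(metadata, site_code):
    """Return a list of reasons why a file fails the input requirements."""
    errors = []
    missing = REQUIRED_VARIABLES - set(metadata.get("variables", []))
    if missing:
        errors.append("missing variables: " + ", ".join(sorted(missing)))
    attributes = metadata.get("attributes", {})
    if attributes.get("site_code") != site_code:
        errors.append("site_code does not match " + site_code)
    for name in REQUIRED_ATTRIBUTES:
        if name not in attributes:
            errors.append("missing global attribute: " + name)
    return errors


def split_accepted_rejected(files, site_code):
    """Split {url: metadata} into a list of accepted URLs and {url: errors}."""
    accepted, rejected = [], {}
    for url, metadata in files.items():
        errors = check_requirements(metadata, site_code)
        if errors:
            rejected[url] = errors
        else:
            accepted.append(url)
    return accepted, rejected


# Made-up example input: one file that passes and one that does not.
deployment = {"time_deployment_start": "2019-01-01T00:00:00Z",
              "time_deployment_end": "2019-06-01T00:00:00Z"}
files = {
    "good.nc": {
        "variables": ["TIME", "DEPTH", "LATITUDE", "LONGITUDE", "UCUR", "VCUR"],
        "attributes": dict(deployment, site_code="NRMMAI"),
    },
    "bad.nc": {
        "variables": ["TIME", "DEPTH", "LATITUDE", "LONGITUDE", "UCUR"],
        "attributes": dict(deployment, site_code="OTHER"),
    },
}
accepted, rejected = split_accepted_rejected(files, "NRMMAI")
```

Keeping the reasons alongside each rejected URL makes it easier to report why a file was excluded; only the URLs would need to go into `rejected_files`.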

### Dimensions

The dimensions of the resulting file are determined as follows:

- `OBSERVATION`: the total number of observation records, excluding out-of-the-water data, in all input files;
- `INSTRUMENT`: the number of instruments (i.e. number of files);
- `strlen`: a fixed dimension of length 256 for character array variables.

### Variables

Only in-water velocity measurements flagged as “good” or “probably good” in the input files are included. These values are averaged into one-hour time bins (independently within each depth cell for ADCPs). Timestamps in the input files indicate the start of each measurement interval, and these _have not been shifted to the centre of the interval before binning_. This could lead to an artificial shift of up to half an hour in the output data. The size of this shift, where known, has been recorded in the `SECONDS_TO_MIDDLE` variable.
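The selection described above can be sketched as follows. This is a minimal illustration in plain Python, not the product code: it assumes the IMOS flag values 1 ("good") and 2 ("probably good"), and the function name `select_good_in_water` is hypothetical.

```python
import math
from datetime import datetime

GOOD_FLAGS = {1, 2}  # assumed flag values: 1 = good, 2 = probably good


def select_good_in_water(times, values, flags, deployment_start, deployment_end):
    """Drop out-of-water records and replace values not flagged good with NaN."""
    selected = []
    for time, value, flag in zip(times, values, flags):
        if not (deployment_start <= time <= deployment_end):
            continue  # recorded outside the deployment period: dropped entirely
        selected.append((time, value if flag in GOOD_FLAGS else math.nan))
    return selected


# Made-up example: four records, the first and last taken out of the water.
times = [datetime(2020, 1, 1, hour) for hour in range(4)]
values = [0.1, 0.2, 0.3, 0.4]
flags = [1, 4, 2, 1]
selected = select_good_in_water(
    times, values, flags,
    deployment_start=datetime(2020, 1, 1, 0, 30),
    deployment_end=datetime(2020, 1, 1, 2, 30),
)
```

Masking with NaN rather than deleting the record keeps the time axis intact, so the later binning step can simply skip the missing values.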

After this averaging, the velocity variables are flattened into one-dimensional arrays, and the arrays from each input file are concatenated into the output file. The resulting variables have dimension `OBSERVATION`.
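A minimal sketch of the flattening for one ADCP file is shown below, assuming the binned values are held as one list of cell values per time step and are flattened time step by time step. The ordering and the function name `flatten_cells` are assumptions made for illustration only.

```python
def flatten_cells(times, profiles):
    """Flatten (time, cell) values into three parallel one-dimensional lists."""
    flat_time, flat_cell, flat_value = [], [], []
    for time, profile in zip(times, profiles):
        for cell_index, value in enumerate(profile):
            flat_time.append(time)        # timestamp repeated once per cell
            flat_cell.append(cell_index)  # which measuring cell the value is from
            flat_value.append(value)
    return flat_time, flat_cell, flat_value


# Made-up example: two hourly time steps, three depth cells each.
flat_time, flat_cell, flat_value = flatten_cells(
    ["10:00", "11:00"],
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
)
```

For a single-point current meter each profile would hold one value, so the cell index is always 0.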

The binning intervals will be one hour long, centred on the hour (i.e. HH:00:00). Each timestamp will be repeated once for each ADCP depth cell, in order to match the shape of the velocity variables. The `TIME` coordinate variable in the output file also has dimension `OBSERVATION`.
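To make the "centred on the hour" convention concrete, here is a small standard-library sketch of the binning and the accompanying statistics. It assumes half-open bins running from 30 minutes before the hour to 30 minutes after it, uses the population standard deviation, and does not handle NaN values; these choices and the function names are illustrative assumptions, not a description of the product code.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pstdev


def hour_bin(timestamp):
    """Label (HH:00:00) of the hour-centred bin that contains the timestamp."""
    shifted = timestamp + timedelta(minutes=30)
    return shifted.replace(minute=0, second=0, microsecond=0)


def bin_hourly(times, values):
    """Return {bin label: statistics} for values grouped into hourly bins."""
    groups = defaultdict(list)
    for time, value in zip(times, values):
        groups[hour_bin(time)].append(value)
    return {
        label: {
            "mean": mean(group),
            "std": pstdev(group),  # assumption: population standard deviation
            "min": min(group),
            "max": max(group),
            "count": len(group),
        }
        for label, group in sorted(groups.items())
    }


# Made-up example: three records fall in the 10:00 bin, one in the 11:00 bin.
times = [
    datetime(2020, 1, 1, 9, 40),
    datetime(2020, 1, 1, 10, 10),
    datetime(2020, 1, 1, 10, 29),
    datetime(2020, 1, 1, 10, 30),
]
values = [0.1, 0.3, 0.2, 0.5]
binned = bin_hourly(times, values)
```

Adding 30 minutes before truncating to the hour is what turns ordinary "floor to the hour" bins into bins centred on the hour.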

The `DEPTH` variables from input files are averaged into the same one-hour bins, and concatenated into a variable `DEPTH(OBSERVATION)`. In the case of ADCP instruments, the `HEIGHT_ABOVE_SENSOR` is converted to absolute depth by subtracting each of the height values from the depth measurements at the instrument.
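The depth conversion amounts to a single subtraction per cell. The sketch below shows it for one time step; the function name `cell_depths` and the numbers are made up for illustration.

```python
def cell_depths(instrument_depth, heights_above_sensor):
    """Absolute depth of each cell: instrument depth minus the cell's height."""
    return [instrument_depth - height for height in heights_above_sensor]


# Made-up example: instrument at 100 m depth, cells 5, 10 and 15 m above it.
depths = cell_depths(100.0, [5.0, 10.0, 15.0])
```

Assuming heights are positive upwards, a negative `HEIGHT_ABOVE_SENSOR` value gives a cell depth greater than the depth of the instrument.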

All output variables with the `INSTRUMENT` dimension are sorted in chronological order, and the input files aggregated chronologically, according to the global attribute `time_deployment_start`.

In order to keep track of the provenance of the aggregated file, accessory variables are created:


- `instrument_index(OBSERVATION)`: index [0:number of files] of the instrument used, referencing the `INSTRUMENT` dimension.
- `source_file(INSTRUMENT, strlen)`: URLs of the files used.
- `instrument_id(INSTRUMENT, strlen)`: concatenated `deployment_code`, `instrument` and `instrument_serial_number` from the global attributes of each file.
- `LATITUDE(INSTRUMENT)`: LATITUDE per instrument.
- `LONGITUDE(INSTRUMENT)`: LONGITUDE per instrument.
- `NOMINAL_DEPTH(INSTRUMENT)`: nominal depth per instrument, from the input file’s variable `NOMINAL_DEPTH` or global attribute `instrument_nominal_depth`.
- `CELL_INDEX(OBSERVATION)`: index of the corresponding measuring cell.
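The relationship between `instrument_index` and the `INSTRUMENT`-dimensioned variables can be sketched as below. This is a simplified illustration with made-up inputs; the function name `build_provenance` is hypothetical, and deployment start times are assumed to be ISO 8601 strings so that they sort chronologically as text.

```python
def build_provenance(files):
    """Build source_file and instrument_index from per-file summaries.

    files: list of (source_url, time_deployment_start, n_observations).
    """
    # Sort the instruments chronologically by deployment start.
    ordered = sorted(files, key=lambda item: item[1])
    source_file = [url for url, _, _ in ordered]
    instrument_index = []
    for index, (_, _, n_observations) in enumerate(ordered):
        # Every observation points back to the instrument it came from.
        instrument_index.extend([index] * n_observations)
    return source_file, instrument_index


# Made-up example: two deployments given out of chronological order.
source_file, instrument_index = build_provenance([
    ("b.nc", "2019-06-01T00:00:00Z", 2),
    ("a.nc", "2018-01-01T00:00:00Z", 3),
])

# Look up the source file of the fourth observation.
origin = source_file[instrument_index[3]]
```

The same lookup works for any variable with the `INSTRUMENT` dimension, such as `NOMINAL_DEPTH` or `LATITUDE`.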



### Attributes

The variable attributes will comply with the IMOS metadata standards.

The global metadata will be a set of IMOS standard attributes. Fixed attributes are read from a [JSON file](../velocity_hourly_timeseries_template.json) that contains the {key:value} pairs for each of them.

Attributes specific to each aggregated product are added as follows:

- `site_code`: obtained from the input files (should be the same in all of them);
- `time_coverage_start`, `time_coverage_end`: set to the full range of TIME values in the aggregated file;
- `geospatial_vertical_min`, `geospatial_vertical_max`: set to the full range of DEPTH values in the aggregated file;
- `geospatial_lat_min`, `geospatial_lat_max`: set to the full range of LATITUDE values in the aggregated file;
- `geospatial_lon_min`, `geospatial_lon_max`: set to the full range of LONGITUDE values in the aggregated file;
- `date_created`: set to the date/time the product file is created;
- `history`: set to “<date_created>: Aggregated file created.”;
- `keywords`: set to a comma-separated list of the main variable names (“UCUR, VCUR, WCUR, DEPTH, AGGREGATED”);
- `lineage`: a statement about how the file was created, including a link to the code used;
- `title`: "Long Timeseries Velocity Hourly Aggregated product: UCUR, VCUR, WCUR, DEPTH at <site_code> between <time_coverage_start> and <time_coverage_end>";
- `rejected_files`: a list of URLs for files that were in the input files list, but did not meet the input requirements.
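Several of these attributes are simple functions of the aggregated data. The sketch below derives the coverage attributes and the title from plain Python lists; the timestamp format and the function name `product_attributes` are assumptions made for illustration, not taken from the product code.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed timestamp format


def product_attributes(site_code, times, depths, latitudes, longitudes):
    """Derive coverage-related global attributes from the aggregated values."""
    start = min(times).strftime(TIME_FORMAT)
    end = max(times).strftime(TIME_FORMAT)
    return {
        "site_code": site_code,
        "time_coverage_start": start,
        "time_coverage_end": end,
        "geospatial_vertical_min": min(depths),
        "geospatial_vertical_max": max(depths),
        "geospatial_lat_min": min(latitudes),
        "geospatial_lat_max": max(latitudes),
        "geospatial_lon_min": min(longitudes),
        "geospatial_lon_max": max(longitudes),
        "title": (
            "Long Timeseries Velocity Hourly Aggregated product: "
            "UCUR, VCUR, WCUR, DEPTH at {} between {} and {}"
        ).format(site_code, start, end),
    }


# Made-up example values for a single site.
attributes = product_attributes(
    "NRMMAI",
    times=[datetime(2019, 1, 1, 10), datetime(2019, 3, 1, 12)],
    depths=[12.5, 30.0, 21.0],
    latitudes=[-19.12, -19.11],
    longitudes=[147.06, 147.05],
)
```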


## Output

The output from a single run of the code will be an aggregated file of all available current velocity measurements at one mooring site.

The product will be delivered, in netCDF4 classic format, compliant with the CF-1.6 and IMOS-1.4 conventions, and structured according to the [indexed ragged array representation](http://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#_indexed_ragged_array_representation).


1 change: 1 addition & 0 deletions aodntools/timeseries_products/README.md
@@ -7,6 +7,7 @@ Documentation:
- [Hourly time series (non-velocity)](Documentation/Hourly_timeseries.md)
- [Gridded time series (Temperature)](Documentation/Gridded_timeseries.md)
- [Velocity aggregated time series](Documentation/Velocity_agrregated_timeseries.md)
- [Velocity hourly time series](Documentation/velocity_hourly_timeseries.md)


Please use the [issue tracker](https://github.com/aodn/python-aodntools/issues) for feedback and suggestions related to these products.