Velocity HOURLY #99
Merged
40 commits
- b04b008 initial commit (diodon)
- c9b5632 add cell_index variable (diodon)
- 1f4c769 change alias of the package (diodon)
- f0e6fa3 Merge remote-tracking branch 'origin/master' into velocity_hourly (diodon)
- ac83ec4 replace functions by aodn package call (diodon)
- f7c71cd rename variable (diodon)
- 3cfc227 Merge branch 'master' into velocity_hourly (diodon)
- 103d9c9 adapt to NETCDF4_CLASSIC. Fix cell_index variable name in template (diodon)
- 2df71ef integer type to numpy int16 (diodon)
- fbe0f89 Merge branch 'master' into velocity_hourly (diodon)
- 83d0aff Merge branch 'master' into velocity_hourly (diodon)
- dac77e1 remove check fo rseconds_to_middle and don't shift TIME. Add seconds_… (diodon)
- 38296bb implement process by chunks to reduce memory use (diodon)
- beb4018 add CELL_INDEX to documentation (diodon)
- e21db79 documentation for the hourly product (diodon)
- f6121b3 documentation for the hourly product README (diodon)
- 186521a fix instrument dimension error (diodon)
- 1a3b2e2 fix dtype in binning function (diodon)
- 3a2094f chunk size as variable (diodon)
- 72cd57f squeeze the dataset to remove extra dimensions in variables (diodon)
- 6131301 Update aodntools/timeseries_products/velocity_hourly_timeseries.py (diodon)
- ba68960 Update aodntools/timeseries_products/velocity_hourly_timeseries.py (diodon)
- 3974960 Update aodntools/timeseries_products/velocity_hourly_timeseries.py (diodon)
- 535256c Update aodntools/timeseries_products/velocity_hourly_timeseries.py (diodon)
- 1ce1508 changes according to review (diodon)
- f8ca463 replace variables with QC>2 with NaNs (diodon)
- 5831076 Merge branch 'velocity_hourly' of github.com:aodn/python-aodntools in… (diodon)
- 8e99d5f Bump version to 1.3.0 (mhidas)
- 0e067eb Merge branch 'master' into velocity_hourly (mhidas)
- bd151bc Update global attributes and code comments for no time shift (mhidas)
- 2bdf8c2 import check_file from velocity_aggregated code (mhidas)
- 76432e4 various attribute fixes (mhidas)
- d2cda9f make bad_files a dict like in other products (mhidas)
- b9de287 a bit of code clean-up (no functional changes) (mhidas)
- 786d8ae rename DataFrame object to distinguish it from xarray Dataset (mhidas)
- b2e8f4f simplify 30min time shift for resample (mhidas)
- 1d72e12 update documentation to reflect code changes (mhidas)
- 0de304e fixup: remove unused -path command-line argument (mhidas)
- 2332594 rename and refactor get_resampled_values in response to review (mhidas)
- fef2508 Merge branch 'master' into velocity_hourly (mhidas)
```diff
@@ -1 +1 @@
-__version__ = '1.2.9'
+__version__ = '1.3.0'
```
`aodntools/timeseries_products/Documentation/velocity_hourly_timeseries.md` (127 additions, 0 deletions)
# Velocity Hourly Time Series Product

- [Objective](#objective)
- [Input](#input)
- [Method](#method)
- [Output](#output)

## Objective

This product provides aggregated, quality-controlled U, V and W velocity time series files for each mooring site, binned into 1-hour intervals and including only in-water data flagged as "good" or "probably good" in the input files. QC flags are not included. Statistics related to the averaging process (standard deviation, minimum and maximum values, number of records binned) are stored as variables. For profiling (ADCP) instruments, the absolute depth of each measuring cell is calculated from the `DEPTH` measured at the instrument and the `HEIGHT_ABOVE_SENSOR` coordinate.
The output from a single run of the code will be an aggregated file of all available measurements of the velocity components `UCUR`, `VCUR` and (where available) `WCUR` at one mooring site, binned into 1-hour intervals.
## Input

The aggregation function accepts a list of input files and the code of the mooring site (`site_code`), in addition to arguments that identify the paths of input and output files.

The code aggregates variables and files that meet the following requirements:

- File contains data from only one deployment of one instrument;
- File is a delayed-mode, quality-controlled product (file version label "FV01");
- File is compliant with the CF-1.6 and IMOS-1.4 conventions;
- File contains, at a minimum, the components of current velocity (`UCUR`, `VCUR`) and the variables `TIME`, `DEPTH`, `LATITUDE`, `LONGITUDE`, and, in the case of ADCPs, `HEIGHT_ABOVE_SENSOR`;
- All files to be aggregated are from the same site and have the same `site_code` attribute;
- Variables to be aggregated have `TIME` and (optionally) `HEIGHT_ABOVE_SENSOR` as their only dimensions (or, if `LATITUDE` and `LONGITUDE` are included as dimensions, they have size 1);
- The in-water data are bounded by the global attributes `time_deployment_start` and `time_deployment_end`.

The code is able to access the input files either locally or remotely via the OPeNDAP protocol.
## Method

Generating function:

```
usage: velocity_aggregated_timeseries.py [-h] -site SITE_CODE -files FILENAMES
                                         [-indir INPUT_DIR]
                                         [-outdir OUTPUT_DIR]
                                         [-download_url DOWNLOAD_URL]
                                         [-opendap_url OPENDAP_URL]

Concatenate X,Y,Z velocity variables from ALL instruments from ALL deployments
from ONE site

optional arguments:
  -h, --help            show this help message and exit
  -site SITE_CODE       site code, like NRMMAI
  -files FILENAMES      name of the file that contains the source URLs
  -indir INPUT_DIR      base path of input files
  -outdir OUTPUT_DIR    path where the result file will be written. Default ./
  -download_url DOWNLOAD_URL
                        path to the download_url_prefix
  -opendap_url OPENDAP_URL
                        path to the opendap_url_prefix
```
### Input file validation

Before proceeding to the aggregation, each input file is checked to ensure it meets the requirements specified above under Input. Any input file that fails to meet the requirements is excluded from the aggregation, and its URL is listed in the global attribute `rejected_files`.
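As an illustration only (this is not the package's actual validation code; the function name, arguments and exact checks shown are assumptions based on the requirements listed above), the validation logic can be sketched like this:

```python
# Hypothetical sketch of the input-file checks, not the aodntools
# implementation. `attrs` is a dict of a file's global attributes and
# `variables` a set of its variable names.
REQUIRED_VARIABLES = ('TIME', 'DEPTH', 'LATITUDE', 'LONGITUDE', 'UCUR', 'VCUR')
REQUIRED_ATTRIBUTES = ('time_deployment_start', 'time_deployment_end')

def check_file(attrs, variables, site_code):
    """Return a list of reasons the file fails the requirements
    (an empty list means the file is accepted into the aggregation)."""
    errors = []
    if attrs.get('site_code') != site_code:
        errors.append('site_code mismatch')
    if 'FV01' not in attrs.get('file_version', ''):
        errors.append('not a quality-controlled (FV01) product')
    errors.extend('missing variable ' + v
                  for v in REQUIRED_VARIABLES if v not in variables)
    errors.extend('missing attribute ' + a
                  for a in REQUIRED_ATTRIBUTES if a not in attrs)
    return errors
```

A file failing any check would then have its URL added to `rejected_files` rather than aborting the run.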
### Dimensions

The dimensions of the resulting file are determined as follows:

- `OBSERVATION`: the total number of observation records, excluding out-of-the-water data, in all input files;
- `INSTRUMENT`: the number of instruments (i.e. the number of files);
- `strlen`: a fixed dimension of length 256 for character array variables.
### Variables

Only in-water velocity measurements flagged as "good" or "probably good" in the input files are included. These values are averaged into one-hour time bins (independently within each depth cell for ADCPs). Timestamps in the input files indicate the start of each measurement interval, and they _have not been shifted to the centre of the interval before binning_. This could lead to an artificial shift of up to half an hour in the output data. The size of this shift, where known, is recorded in the `SECONDS_TO_MIDDLE` variable.
After this averaging, the velocity variables are flattened into one-dimensional arrays, and the arrays from each input file are concatenated into the output file. The resulting variables have dimension `OBSERVATION`.
The binning intervals are one hour long, centred on the hour (i.e. HH:00:00). Each timestamp is repeated once for each ADCP depth cell, in order to match the shape of the velocity variables. The `TIME` coordinate variable in the output file also has dimension `OBSERVATION`.
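A common way to obtain hourly bins centred on the hour is to shift the timestamps by 30 minutes before an ordinary hourly resample. The following pandas sketch uses made-up sample values, and the exact implementation in this product is an assumption here:

```python
import numpy as np
import pandas as pd

# Made-up 10-minute velocity samples (values flagged QC > 2 would already
# have been replaced by NaN before this step).
times = pd.date_range('2019-01-01 00:05', periods=18, freq='10min')
ucur = pd.Series(np.arange(18, dtype=float), index=times)

# Shifting the timestamps forward by 30 minutes makes each hourly bin
# cover HH-0:30 .. HH+0:30 of the original times, i.e. centred on HH:00:00.
shifted = ucur.copy()
shifted.index = shifted.index + pd.Timedelta(minutes=30)

# One row per hourly bin, with the same statistics the product stores.
binned = shifted.resample('1h').agg(['mean', 'std', 'min', 'max', 'count'])
```

Each resulting bin label (e.g. `01:00`) is the centre of the original one-hour window it summarises.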
The `DEPTH` variables from the input files are averaged into the same one-hour bins and concatenated into a variable `DEPTH(OBSERVATION)`. In the case of ADCP instruments, `HEIGHT_ABOVE_SENSOR` is converted to absolute depth by subtracting each of the height values from the depth measured at the instrument.
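The height-to-depth conversion is a simple broadcast subtraction. A sketch with made-up numbers (depths positive down, cell heights positive above the sensor):

```python
import numpy as np

# Instrument depth time series (m, positive down) and the fixed
# HEIGHT_ABOVE_SENSOR coordinate of the ADCP cells (m, above the sensor).
depth = np.array([100.0, 101.0, 99.5])            # shape (TIME,)
height_above_sensor = np.array([4.0, 8.0, 12.0])  # shape (CELL,)

# Subtract each cell height from the instrument depth, broadcasting
# to a (TIME, CELL) array of absolute cell depths.
cell_depth = depth[:, np.newaxis] - height_above_sensor
```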
All output variables with the `INSTRUMENT` dimension are sorted in chronological order according to the global attribute `time_deployment_start`, and the input files are aggregated in the same order.
In order to keep track of the provenance of the aggregated file, the following accessory variables are created:

- `instrument_index(OBSERVATION)`: index [0:number of files] of the instrument used, referencing the `INSTRUMENT` dimension;
- `source_file(INSTRUMENT, strlen)`: URLs of the files used;
- `instrument_id(INSTRUMENT, strlen)`: concatenated `deployment_code`, `instrument` and `instrument_serial_number` from the global attributes of each file;
- `LATITUDE(INSTRUMENT)`: latitude per instrument;
- `LONGITUDE(INSTRUMENT)`: longitude per instrument;
- `NOMINAL_DEPTH(INSTRUMENT)`: nominal depth per instrument, from the input file's `NOMINAL_DEPTH` variable or `instrument_nominal_depth` global attribute;
- `CELL_INDEX(OBSERVATION)`: index of the corresponding measuring cell.
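To illustrate how `instrument_index` ties the two dimensions together, here is a toy sketch with hypothetical observation counts (not code from the package):

```python
import numpy as np

# Hypothetical number of hourly observations contributed by each input file
obs_per_file = [4, 2, 3]

# instrument_index maps every element of the OBSERVATION dimension back
# to the position of its source file along the INSTRUMENT dimension.
instrument_index = np.concatenate(
    [np.full(n, i, dtype=np.int64) for i, n in enumerate(obs_per_file)]
)

# Selecting all observations that came from the second instrument:
from_second = instrument_index == 1
```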
### Attributes

The variable attributes comply with the IMOS metadata standards.

The global metadata are a set of IMOS standard attributes. Fixed attributes are read from a [JSON file](../velocity_hourly_timeseries_template.json) that contains the {key: value} pairs for each of them.

Attributes specific to each aggregated product are added as follows:

- `site_code`: obtained from the input files (should be the same in all of them);
- `time_coverage_start`, `time_coverage_end`: set to the full range of `TIME` values in the aggregated file;
- `geospatial_vertical_min`, `geospatial_vertical_max`: set to the full range of `DEPTH` values in the aggregated file;
- `geospatial_lat_min`, `geospatial_lat_max`: set to the full range of `LATITUDE` values in the aggregated file;
- `geospatial_lon_min`, `geospatial_lon_max`: set to the full range of `LONGITUDE` values in the aggregated file;
- `date_created`: set to the date/time the product file is created;
- `history`: set to "<date_created>: Aggregated file created.";
- `keywords`: set to a comma-separated list of the main variable names ("UCUR, VCUR, WCUR, DEPTH, AGGREGATED");
- `lineage`: a statement about how the file was created, including a link to the code used;
- `title`: "Long Timeseries Velocity Hourly Aggregated product: UCUR, VCUR, WCUR, DEPTH at <site_code> between <time_coverage_start> and <time_coverage_end>";
- `rejected_files`: a list of URLs for files that were in the input file list but did not meet the input requirements.
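The range attributes can be derived directly from the aggregated arrays. A sketch with made-up values (using `nanmin`/`nanmax` for `DEPTH` on the assumption that it may contain NaN fill values):

```python
import numpy as np

depth = np.array([15.2, 16.1, np.nan, 14.8])   # aggregated DEPTH values
latitude = np.array([-42.6, -42.6])            # one value per instrument
longitude = np.array([148.2, 148.23])

coverage_attributes = {
    'geospatial_vertical_min': float(np.nanmin(depth)),
    'geospatial_vertical_max': float(np.nanmax(depth)),
    'geospatial_lat_min': float(latitude.min()),
    'geospatial_lat_max': float(latitude.max()),
    'geospatial_lon_min': float(longitude.min()),
    'geospatial_lon_max': float(longitude.max()),
}
```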
## Output

The output from a single run of the code will be an aggregated file of all available current velocity measurements at one mooring site.

The product is delivered in netCDF4 classic format, compliant with the CF-1.6 and IMOS-1.4 conventions, and structured according to the [indexed ragged array representation](http://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#_indexed_ragged_array_representation).
Might need to add this if it is so decided:

The `TIME` variable has an attribute `seconds_to_middle_of_measurement` to indicate the offset from each recorded timestamp to the centre of the averaging period.