echometrics improvements #12
One thing I hadn't quite figured out was how to run things within the test harness. I am familiar with pytest. To produce those results, I was manually running gutils_binary_to_ascii_watch and gutils_ascii_to_netcdf_watch. From email:
The dbdreader has its own quirks.
You can run the existing EcoMetrics tests with
I pushed a branch
It seems to be working as is. With tearDown() disabled...
The test produces three netCDF files. The last one has the desired information. The first two will need empty variables.
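As an aside, a quick way to confirm which of the produced files actually carries the data is to scan them with xarray. A minimal sketch, assuming the files land in an output/ directory and use the pseudogram_sv variable name seen later in this thread:

```python
# Hedged sketch: check which produced netCDF files carry pseudogram data.
# The output/ path and the pseudogram_sv variable name are assumptions.
import glob
import xarray as xr

for path in sorted(glob.glob("output/*.nc")):
    with xr.open_dataset(path) as ds:
        if "pseudogram_sv" in ds:
            # Files holding only empty placeholders would show zero valid values
            n_valid = int(ds["pseudogram_sv"].notnull().sum())
            print(f"{path}: {n_valid} valid pseudogram_sv values")
        else:
            print(f"{path}: no pseudogram variables")
```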
Continuing with other tasks... adding information seems straightforward. Do non-standard attributes cause problems? Fiddling with deployment.json and instrument.json a bit:
The acoustics package has two components with separate serial numbers.
If this is ok, I can look at removing the hard-coded options.
Moved the config options to the instrument since they impact all the eco* variables.
Think about grouping these so other features can be added later and don't get mixed up with other provided keywords. Replace:
with?
Grouping the
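The actual keyword names were elided above, so the following is purely illustrative: it sketches the idea of nesting the eco* options under one group in instrument.json so they stay separate from the standard keywords. All key names here (echometrics, enable, serial_numbers) are hypothetical.

```python
# Illustrative only: hypothetical grouped layout for instrument.json.
import json

instrument = json.loads("""
{
    "instrument": "echosounder",
    "echometrics": {
        "enable": true,
        "serial_numbers": ["tx-001", "rx-002"]
    }
}
""")

# Feature code reads its own group and ignores everything else
eco_opts = instrument.get("echometrics", {})
if eco_opts.get("enable"):
    print("echometrics options:", eco_opts)
```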
Current tasks:
Interim testing:
@jr3cermak I played around with hosting the datasets as-is (with the
I also experimented with storing the pseudogram with the time dimension. Since the pseudogram time coordinates are different from the CTD profile, the resultant netCDF files became very large. So, I would say writing the pseudogram data out to a separate file sounds like the best option at the moment.
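For reference, a minimal sketch of that separate-file option, assuming a combined dataset with the pseudogram variable names that appear later in this thread (the file paths are made up):

```python
# Sketch: split the pseudogram variables into their own netCDF file so the
# CTD profile file is not inflated by the denser pseudogram time coordinate.
# File names are placeholders; variable names follow the tabledap listing.
import xarray as xr

ds = xr.open_dataset("profile_with_pseudogram.nc")
pg = ds[["pseudogram_time", "pseudogram_depth", "pseudogram_sv"]]
pg.to_netcdf("pseudogram_only.nc")
ds.close()
```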
😒 Here is an ERDDAP Dataset that just serves the pseudogram data. I'm playing with some ideas to get this into AOOS, stay tuned.
I am just rounding the corner where I can almost get the latest glider deployment loaded under ERDDAP. I can see it is complaining about something. This is the combined case. It does not seem happy at all with the
That test looks suspicious... On to the separated case...
Resync branch after PR cf-convention/vocabularies#99 and carry on.
Resync with master to take a look at the new pathway.
Running the latest deployment through the current code produces a single netCDF file now. Are the profiles combined? This is quite different from what was shown in an earlier email with the tabledap link: https://gliders.ioos.us/erddap/tabledap/extras_test-20220329T0000.htmlTable?trajectory%2Cwmo_id%2Cprofile_id%2Ctime%2Clatitude%2Clongitude%2Cdepth%2Cpseudogram_depth%2Cpseudogram_sv%2Cpseudogram_time%2Csci_echodroid_aggindex%2Csci_echodroid_ctrmass%2Csci_echodroid_eqarea%2Csci_echodroid_inertia%2Csci_echodroid_propocc%2Csci_echodroid_sa%2Csci_echodroid_sv&time%3E=2021-12-02T00%3A00%3A00Z&time%3C=2021-12-09T17%3A33%3A35Z which references: https://gliders.ioos.us/erddap/files/extras_test-20220329T0000/ On the DAC for unit_507, there are two separate sets of files. It looks like the pseudogram is folded back into the profiles as a single file now.
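In case it helps with checking, the same tabledap endpoint can also be queried programmatically; a sketch, assuming the dataset is still online (ERDDAP's CSV response includes a units row after the header, hence skiprows=[1]):

```python
# Pull a few of the listed variables from the tabledap endpoint as CSV.
import pandas as pd

url = (
    "https://gliders.ioos.us/erddap/tabledap/extras_test-20220329T0000.csv"
    "?time%2Cdepth%2Cpseudogram_time%2Cpseudogram_depth%2Cpseudogram_sv"
    "&time%3E=2021-12-02T00%3A00%3A00Z&time%3C=2021-12-09T17%3A33%3A35Z"
)
df = pd.read_csv(url, skiprows=[1], parse_dates=["time"])
print(df.describe())
```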
The latest master of GUTILS is great for backend storage of echodroid/pseudogram data. Tossing the
Python code to pull from the aggregation, just for reference:

```python
#$ cat plotGretelDepl2.py
import datetime

import numpy as np
import pandas as pd
from matplotlib.figure import Figure
from matplotlib.colors import LinearSegmentedColormap
import matplotlib.dates as dates
import xarray as xr


def newFigure(figsize=(10, 8), dpi=100):
    # Build a Figure directly (no pyplot) so this can run headless
    fig = Figure(figsize=figsize, dpi=dpi)
    return fig


# Fetch Sv data
def fetchSv(start_time, end_time, ds):
    # Copy data into a numpy array and resort Sv(dB) values for plotting.
    # Convert TS to string:
    #   datetime.datetime.strftime(datetime.datetime.utcfromtimestamp(dt), "%Y-%m-%d %H:%M:%S.%f")
    # Convert string to TS:
    #   datetime.datetime.strptime(dtSTR, "%Y-%m-%d %H:%M:%S.%f").timestamp()
    time_dim = 'time'
    sv_ts = np.unique(ds[time_dim])
    startVal = np.datetime64(datetime.datetime.strptime(start_time, "%Y-%m-%d %H:%M:%S.%f"))
    endVal = np.datetime64(datetime.datetime.strptime(end_time, "%Y-%m-%d %H:%M:%S.%f"))
    # This obtains time indices for the unique time values
    a = np.abs(sv_ts - startVal).argmin()
    b = np.abs(sv_ts - endVal).argmin()
    # https://xarray.pydata.org/en/v0.11.0/time-series.html
    time_slice = slice(pd.Timestamp(sv_ts[a]), pd.Timestamp(sv_ts[b]))
    sv_data = ds['pseudogram_sv'].sel(time=time_slice)
    sv_time = [pd.Timestamp(t.values).timestamp() for t in ds[time_dim].sel(time=time_slice)]
    sv_depth = ds['depth'].sel(time=time_slice)
    return (sv_time, sv_depth, sv_data)


# Make plots from intermediate deployment data
def makePlot(sv_time, sv_depth, sv_data):
    # Set the default SIMRAD EK500 color table plus grey for NoData.
    simrad_color_table = [(1, 1, 1),
                          (0.6235, 0.6235, 0.6235),
                          (0.3725, 0.3725, 0.3725),
                          (0, 0, 1),
                          (0, 0, 0.5),
                          (0, 0.7490, 0),
                          (0, 0.5, 0),
                          (1, 1, 0),
                          (1, 0.5, 0),
                          (1, 0, 0.7490),
                          (1, 0, 0),
                          (0.6509, 0.3255, 0.2353),
                          (0.4705, 0.2353, 0.1568)]
    simrad_cmap = LinearSegmentedColormap.from_list('Simrad', simrad_color_table)
    simrad_cmap.set_bad(color='lightgrey')
    # Stack time, depth and Sv into a single (N, 3) array
    svData = np.column_stack((sv_time, sv_depth, sv_data))
    # Filter out the noisy -5.0 and -15.0 data
    svData = np.where(svData == -5.0, -60.0, svData)
    svData = np.where(svData == -15.0, -60.0, svData)
    # Sort Sv(dB) from lowest to highest so higher values are plotted last
    svData = svData[np.argsort(svData[:, 2])]
    # Plot simple x, y, z data (time, depth, dB)
    fig = newFigure()
    ax = fig.subplots()
    ax.xaxis.set_major_locator(dates.DayLocator(interval=2))  # every other day
    ax.xaxis.set_major_formatter(dates.DateFormatter('%m/%d'))
    ax.tick_params(which='major', labelrotation=45)
    ax.set_facecolor('white')
    dateData = [datetime.datetime.fromtimestamp(ts) for ts in svData[:, 0]]
    im = ax.scatter(dateData, svData[:, 1], c=svData[:, 2], cmap=simrad_cmap, s=30.0)
    fig.colorbar(im, orientation='vertical', label='Sv (dB)', shrink=0.40)
    ax.set(ylim=[0, sv_depth.max()], xlabel='Date (UTC)', ylabel='Depth (m)')
    im.set_clim(0, -55)
    # Invert axis after limits are set
    im.axes.invert_yaxis()
    ax.set_title("Acoustic Scattering Volume (dB) Pseudogram")
    return fig, ax


ds = xr.open_dataset('http://mom6node0:8080/thredds/dodsC/GretelExtra.nc')

# Find the timespan of the dataset
ts_min = ds['time'].min()
ts_max = ds['time'].max()

# Use the entire deployment
start_dt_string = str(ts_min.dt.strftime("%Y-%m-%d %H:%M:%S.%f").values)
end_dt_string = str(ts_max.dt.strftime("%Y-%m-%d %H:%M:%S.%f").values)

(sv_time, sv_depth, sv_data) = fetchSv(start_dt_string, end_dt_string, ds)

if len(sv_data) > 100:
    (fig, ax) = makePlot(sv_time, sv_depth, sv_data)
    imageOut = "Sv_%s_all.png" % (str(ts_min.dt.strftime("%Y%m%d").values))
    fig.savefig(imageOut, bbox_inches='tight', dpi=100)

ds.close()
```
Nice, an added benefit I didn't even think about!
Cycling back around to provide an update to support future deployments. Will resync with master and move forward. Please let me know what you need in support of echometrics and low-resolution echograms (formerly pseudograms). This update will provide:
Because of the two-stage processing of GUTILS, to provide a data frame, the 2nd-pass script would have to be given the DBD files and the cache file directory to decode and provide a direct data frame object. Otherwise, continue to use the 1st pass to produce the CSV file and then read the CSV file in the 2nd pass to recover the data frame (kind of what happens now). The intermediate output file can be anything -- a pickled object with the data frame needed in the 2nd stage, etc. We need to know what target(s) to hit for you so we can get them built into the CI testing. Once it all passes again, we can move ahead with other fun things. It looks like Python 3.7 is EOL. Is there a particular version of Python we should use? We are at the stage of reworking the tests and updating code. I am anticipating at least two to four weeks of additional effort on our side before a reasonable PR is ready. This could change based on the requirements/targets provided.
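To make the two hand-off options described above concrete, a rough sketch follows; the file names are placeholders, not GUTILS API:

```python
# Sketch of the intermediate hand-off between the two GUTILS passes.
# Option A mirrors the current flow (CSV); option B pickles the DataFrame
# so the 2nd pass can skip re-parsing. Paths are placeholders.
import pandas as pd

def first_pass(df: pd.DataFrame, use_pickle: bool = False) -> str:
    if use_pickle:
        df.to_pickle("intermediate.pkl")   # option B
        return "intermediate.pkl"
    df.to_csv("intermediate.csv", index=False)  # option A (current behavior)
    return "intermediate.csv"

def second_pass(path: str) -> pd.DataFrame:
    # Recover the data frame produced by the 1st pass
    return pd.read_pickle(path) if path.endswith(".pkl") else pd.read_csv(path)
```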
Main branch README => Python 3.9 :)
Unfortunately, our work has snowballed a bit. So, we will need to submit at least three PRs in total as of this writing. The first is ready to go when CI tests pass.
Latest checks have passed. I have refreshed the documentation in the README.pdf and have it out on a website (which may be down at some point for an OS update): https://nasfish.fish.washington.edu/echotools/docs/html/echotools/html/echotools/README.html The important bit is walking from the produced netCDF files (*_extra.nc) to a time-series plot of the echogram profiles given any time range. So, I think that is the target product desired on the data portal: https://nasfish.fish.washington.edu/echotools/docs/html/echotools/html/echotools/README.html#product That should give us the pivot point to start heading down the
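A rough sketch of that walk, assuming the *_extra.nc files share coordinates that open_mfdataset can align (the paths and variable names are assumptions based on earlier comments):

```python
# Sketch: combine the per-segment *_extra.nc files and plot the echogram
# for an arbitrary time range. Paths and variable names are assumptions.
import xarray as xr
from matplotlib import pyplot as plt

ds = xr.open_mfdataset("deployment/*_extra.nc", combine="by_coords")
sub = ds.sel(time=slice("2021-12-02", "2021-12-09"))

fig, ax = plt.subplots(figsize=(10, 8))
im = ax.scatter(sub["time"], sub["depth"], c=sub["pseudogram_sv"], s=10.0)
fig.colorbar(im, label="Sv (dB)")
ax.invert_yaxis()
ax.set(xlabel="Date (UTC)", ylabel="Depth (m)")
fig.savefig("echogram_timeseries.png", bbox_inches="tight")
```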
Just a little more work on some additional "profile" products for echometrics. We stood up a prototype that will be used internally once implemented in some fashion on the data portal. https://nasfish.fish.washington.edu/echotools/dppp/egramBrowser/portal.html
Ready for me to take a look?
There is at least one more pending update with additional "profile" products to be sent. I will post another note when things settle.
You can move ahead with the current code in the PR. This other new part needs some more R&D before it can be implemented. I originally thought it was going to be an easy drop-in addition. That is not the case.
A proposal for additional CF standard names has been submitted to improve standards compliance for proposed acoustic datasets, for future use in deployment.json and other configuration files.
Continued from PR cf-convention/vocabularies#186
test_slocum.py to exercise echometrics/pseudogram code to produce desired netCDF results
Deferred: