Incorrect extreme values for BERGSDSWT and others #43
Comments
The incorrect low water for BERGSDSWT 05-07-1996 01:09 NAP -207cm has been removed from DONAR. |
Yes, 4 hours might be too high. But it gives an indication for most stations. Of course, this can easily be adjusted. Thank you for correcting that single value, but there were more timesteps provided for BERGSDSWT. The time
I have expanded my code a bit to generate figures of all tooclose timesteps (this time with >0 and <2 hours as threshold):

import os
import xarray as xr
import pandas as pd
import ddlpy
import matplotlib.pyplot as plt
plt.close("all")

dir_base = r"p:\11210325-005-kenmerkende-waarden\work"
dir_meas = os.path.join(dir_base, "measurements_wl_18700101_20240101")

station_list = ['A12','AWGPFM','BAALHK','BATH','BERGSDSWT','BROUWHVSGT02','BROUWHVSGT08','GATVBSLE','BRESKVHVN','CADZD','D15','DELFZL','DENHDR','EEMSHVN','EURPFM','F16','F3PFM','HARVT10','HANSWT','HARLGN','HOEKVHLD','HOLWD','HUIBGT','IJMDBTHVN','IJMDSMPL','J6','K13APFM','K14PFM','KATSBTN','KORNWDZBTN','KRAMMSZWT','L9PFM','LAUWOG','LICHTELGRE','MARLGT','NES','NIEUWSTZL','NORTHCMRT','DENOVBTN','OOSTSDE04','OOSTSDE11','OOSTSDE14','OUDSD','OVLVHWT','Q1','ROOMPBNN','ROOMPBTN','SCHAARVDND','SCHEVNGN','SCHIERMNOG','SINTANLHVSGR','STAVNSE','STELLDBTN','TERNZN','TERSLNZE','TEXNZE','VLAKTVDRN','VLIELHVN','VLISSGN','WALSODN','WESTKPLE','WESTTSLG','WIERMGDN','YERSKE']

dict_tdiff_toosmall_ntimes = {}
dict_tdiff_toosmall_times = {}
dict_has_aggers = {}
print(f'counting small ext differences for {len(station_list)} stations: ', end='')
for istat, current_station in enumerate(station_list):
    print(f"{istat+1} ", end="")

    #load measext data
    file_ext_nc = os.path.join(dir_meas,f"{current_station}_measext.nc")
    if not os.path.exists(file_ext_nc):
        continue
    ds_ext_meas = xr.open_dataset(file_ext_nc)

    # find (number of) times that are too close (but no exact time duplicates)
    times_pd = ds_ext_meas.time.to_pandas().index.tz_localize("UTC").tz_convert("UTC+01:00")
    bool_tdiff_toosmall = (times_pd.diff() < pd.Timedelta(hours=2)) & (times_pd.diff() > pd.Timedelta(hours=0))

    # find number of times that have waterlevel values close to each other
    # vdiff = ds_ext_meas["Meetwaarde.Waarde_Numeriek"].to_pandas().diff().abs()
    # bool_vdiff_toosmall = (vdiff < 100) # max 1m difference, to avoid hit on HW and LW that are too close
    bool_diff_toosmall = bool_tdiff_toosmall #& bool_vdiff_toosmall
    if bool_diff_toosmall.sum() == 0:
        continue
    dict_tdiff_toosmall_ntimes[current_station] = bool_diff_toosmall.sum()
    dict_tdiff_toosmall_times[current_station] = times_pd[bool_diff_toosmall]

    # find unique hwlwcode (aggers or not)
    hwlwcode_uniq = ds_ext_meas["HWLWcode"].to_pandas().drop_duplicates().tolist()
    if len(hwlwcode_uniq) > 2:
        dict_has_aggers[current_station] = True
    else:
        dict_has_aggers[current_station] = False
print()

stats_pd = pd.DataFrame({"ntimes tooclose":dict_tdiff_toosmall_ntimes,
                         "has_aggers":dict_has_aggers})
print(stats_pd.sort_values("has_aggers"))

# plot timeseries/ext per tooclose timestep
locations = ddlpy.locations()
station_list_ext = stats_pd.index.tolist()
# station_list_ext = ["NIEUWSTZL"]
for station_code in station_list_ext:
    plt.close("all")
    print(station_code)
    if stats_pd.loc[station_code,"has_aggers"]:
        print(f"skipping {station_code} because of aggers")
        continue

    bool_hoedanigheid = locations['Hoedanigheid.Code'].isin(['NAP'])
    bool_stations = locations.index.isin([station_code])
    bool_grootheid = locations['Grootheid.Code'].isin(['WATHTE'])
    bool_groepering_ts = locations['Groepering.Code'].isin(['NVT'])
    bool_groepering_ext = locations['Groepering.Code'].isin(['GETETBRKDMSL2', 'GETETBRKD2', 'GETETMSL2', 'GETETM2'])
    selected_ts = locations.loc[bool_grootheid & bool_hoedanigheid & bool_groepering_ts & bool_stations]
    selected_ext = locations.loc[bool_grootheid & bool_hoedanigheid & bool_groepering_ext & bool_stations]
    if len(selected_ts) == 0:
        print(f"skipping {station_code} since locations dataframe is empty (check Hoedanigheid)")
        continue
    if len(selected_ts) > 1:
        raise ValueError(f"too much selected location rows for {station_code}")

    date_list = dict_tdiff_toosmall_times[station_code]
    if len(date_list) > 30:
        print(f"skipping {station_code} for now since it has too much tooclose values")
        continue

    for issue_date in date_list:
        # issue_date = date_list[0]
        start_date = issue_date - pd.Timedelta(days=1)
        end_date = issue_date + pd.Timedelta(days=1)

        # pass a single row of the locations dataframe to the measurements function to get the measurements for that location
        measurements_ts = ddlpy.measurements(selected_ts.iloc[0], start_date, end_date)
        measurements_ext = ddlpy.measurements(selected_ext.iloc[0], start_date, end_date)
        if measurements_ts.empty:
            print("not ts for issue_date")
            continue

        fig, ax = plt.subplots(figsize=(13,6))
        ax.plot(measurements_ts['Meetwaarde.Waarde_Numeriek'], label="waterlevel")
        ax.plot(measurements_ext['Meetwaarde.Waarde_Numeriek'], "xr", markersize=10, label="extremes")
        ax.grid()
        ax.legend(loc=1)
        ax.set_title(f"{station_code} {issue_date}")
        fig.tight_layout()
        fig.savefig(f"{station_code}_{issue_date.strftime('%Y%m%d_%H%M%S')}")

For BERGSDSWT it results in these figures:
A zipfile with all resulting figures is attached:
Some issues that can be resolved later:
|
Delfzijl 11-11-1891 20:50 (MET) HW +13 and 22:30 LW +9 is still not an error, but an extremely small fall following a negative surge. |
The incorrect high water for BERGSDSWT Bergse Diepsluis west 31-12-1991 23:55 (MET) NAP + 109cm has been removed from DONAR. |
The incorrect low water for BROUWHVSGT08 Brouwershavense Gat 08 01-01-2015 05:33 (MET) NAP -84 cm has been removed from DONAR. |
The incorrect high water for BROUWHVSGT08 Brouwershavense Gat 08 01-01-2015 10:51 (MET) NAP + 105 cm has been removed from DONAR. |
The incorrect low water for BROUWHVSGT08 Brouwershavense Gat 08 01-01-2015 17:30 (MET) NAP -103 cm has been removed from DONAR. |
The incorrect low water for CADZD Cadzand 31-12-1992 23:50 (MET) NAP -123 cm has been removed from DONAR. |
Low water for DELFZL Delfzijl 09-08-1898 15:55 (MET) -113: the time has been corrected in DONAR to 10:55. |
High water for DELFZL Delfzijl 23-09-1901 01:10 (MET) + 86: the time has been corrected in DONAR to 06:30. |
High water for DELFZL Delfzijl 17-02-1902 01:05 (MET) + 72: the time has been corrected in DONAR to 06:05. |
High water for DELFZL Delfzijl 04-05-1902 03:40 (MET) + 109: the time has been corrected in DONAR to 08:40. |
High water for DELFZL Delfzijl 18-05-1902 04:35 (MET) + 109: the time has been corrected in DONAR to 09:15. |
Low water for DELFZL Delfzijl 25-06-1902 04:05 (MET) -192: the time has been corrected in DONAR to 09:05. |
High water for DELFZL Delfzijl 18-05-1906 16:00 (MET) + 87: the time has been corrected in DONAR to 20:45. |
Low water for DELFZL Delfzijl 29-03-1907 13:00 (MET) -211: the time has been corrected in DONAR to 19:00. |
High water for DELFZL Delfzijl 29-12-1907 00:45 (MET) + 66: the time has been corrected in DONAR to 05:45. |
High water for DELFZL Delfzijl 14-03-1909 08:40 (MET) + 112: the time has been corrected in DONAR to 03:40. |
Low water for DELFZL Delfzijl 04-09-1909 04:15 (MET) -189: the time has been corrected in DONAR to 09:15. |
Low water for DELFZL Delfzijl 12-01-1914 01:10 (MET) -230: the time has been corrected in DONAR to 06:30. |
High water for DELFZL Delfzijl 13-03-1916 08:40 (MET) + 65: the time has been corrected in DONAR to 05:40. |
High water for DELFZL Delfzijl 24-08-1918 19:25 (MET) + 138: the time has been corrected in DONAR to 13:25. |
High water for DELFZL Delfzijl 17-10-1918 04:15 (MET) + 117: the time has been corrected in DONAR to 09:15. |
DELFZL Delfzijl 06-11-1921 18:05 (MET) is not an error, but an extremely small fall due to a combination of a very weak neap tide and transition from negative to positive surge. |
DENOVBTN Den Oever buiten 14-02-1989 07:05 (MET) is not an error, but an extremely small fall during a very steep storm surge (along the coast of Holland there was even no fall at all that morning). |
DENOVBTN Den Oever buiten 27-10-2002 16:50 (MET) is not an error, but an extremely small fall during a very steep storm surge. |
The superfluous high water for HANSWT Hansweert 01-04-2020 07:32 (MET) NAP + 196 cm has been removed from DONAR. |
High water for HARLGN Harlingen 30-11-1945 16:00 (MET) + 67: the time has been corrected in DONAR to 19:00. |
HARLGN Harlingen 27-10-2002 17:10 (MET) is not an error, but an extremely small fall during a very steep storm surge. |
The superfluous low water for HOLWD Holwerd 15-02-1994 07:10 (MET) NAP -247 cm has been removed from DONAR. |
IJMDBTHVN IJmuiden buitenhaven 16-09-1997 15:03 (MET) is not an error, but an extremely swift rise after a very late low water in combination with a large daily inequality. |
IJMDBTHVN IJmuiden buitenhaven 09-11-2001 09:20 (MET) is not an error, but a very short rise after a small positive surge. |
The other 7 cases for IJMDBTHVN IJmuiden buitenhaven are not errors either. |
KORNWDZBTN Kornwerderzand buiten 27-10-2002 17:00 (MET) is not an error, but an extremely small fall during a very steep storm surge. |
The superfluous high water for KRAMMSZWT Krammersluizen west 31-12-1991 23:55 (MET) NAP + 101 cm has been removed from DONAR. |
MARLGT Marollegat 15-04-1987 03:45 (MET), 16-04-1987 04:16 (MET) and 16:44 (MET): no errors. The anomalous curves were caused by the partial closure of the Eastern Scheldt barrier during the closure of the Philipsdam. |
OUDSD Oudeschild 14-02-1989 06:15 (MET) is not an error, but an extremely small fall during a very steep storm surge. |
OUDSD Oudeschild 01-03-2008 05:33 (MET) is not an error, but an extremely small fall during a very steep storm surge. |
The superfluous high water for STAVNSE Stavenisse 31-12-1991 23:55 (MET) NAP + 99 cm has been removed from DONAR. |
Low water for TERNZN Terneuzen 30-03-1952 05:46 (MET) -230: the time has been corrected in DONAR to 11:46. |
TERSLNZE Terschelling Noordzee 31-01-2008 18:25 (MET) is not an error, but a very short fall due to a combination of a weak neap tide and transition from negative to positive surge. |
TEXNZE Texel Noordzee 31-01-2008 17:20 (MET) is not an error, but a very short fall due to a combination of a weak neap tide and transition from negative to positive surge. |
VLAKTVDRN Vlakte van de Raan 08-05-1985 09:46 (MET) LW -227 and 10:04 HW + 229 have been corrected in DONAR to 10:00 LW -183 and 16:00 HW + 210, respectively. |
VLAKTVDRN Vlakte van de Raan 20-11-1985 09:46 (MET) HW + 96 has been corrected in DONAR to 07:30 HW + 86. |
VLIELHVN Vlieland haven 09-02-1948 19:20 (MET) LW -12 has been corrected in DONAR to 14:20. |
The superfluous low water for WESTKPLE Westkapelle 31-12-1992 23:50 (MET) NAP -106 cm has been removed from DONAR. |
WESTTSLG West-Terschelling 19-10-1935 18:45 (MET) is not an error, but an extremely small fall during a storm surge. |
WESTTSLG West-Terschelling 21-12-1940 12:05 (MET) LW -160 has been corrected in DONAR to 08:05. |
WIERMGDN Wierumergronden 29-05-1985 12:18 (MET) LW -113 and 14:00 HW + 95 have been corrected in DONAR to 11:00 LW -101 and 16:30 HW + 77, respectively. |
WIERMGDN Wierumergronden 01-06-2013 23:40 (MET) HW + 63 has been corrected in DONAR to 02-06-2013 04:00 + 60. |
As to NIEUWSTZL Nieuwe Statenzijl 09-06-2011: the water level record during some months in 2011 shows signs of extreme stoppage of the float gauge. The incorrect 10 minute water level data will be replaced by interpolated values, and then the HW/LW-data will be computed from the 10 minute data again. |
With a closed Oosterscheldekering, no high waters will be registered in the Oosterschelde. That is the reason for the missing four high waters for BERGSDSWT in 1993. |
When assessing the extremes for BERGSDSWT, some extremes are too close to each other. Further inspection of one of the instances shows that there are incorrect values present in the timeseries. In this particular example the timestamps 1996-07-05 00:20:00+01:00 and 1996-07-05 01:09:00+01:00 are too close together and have different values (-189 vs -207). All metadata is exactly equal.
Results in this dataframe:
And the figure clearly shows the additional low water at a distance from the waterlevel timeseries:
This additional extreme also does not overlap with grootheid WATHTBRKD or WATHTASTRO (latter is empty).
The above can be filtered out by checking for small time differences between extremes. Time differences smaller than 4 hours also occur for the following timestamps for this station:
Also happens for different stations
This also happens for other stations (the variable dict_tdiff_toosmall_times is a dictionary with all times that are too close for each station). This is also included in kenmerkendewaarden.derive_statistics() for datasets with extremes:
Prints:
The stations HOEKVHLD, ROOMPBNN, STELLDBTN, HARVT10 and SCHEVNGN contain aggers, so those are the only stations where the tooclose-times might be valid. In all other stations these should likely be fixed.
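The check described above can be sketched on a small standalone example. This is only a minimal illustration, not the code in kenmerkendewaarden: the function names find_too_close and has_aggers and the max_gap parameter are hypothetical, and the first and last timestamps in the example are made-up neighbours added to give the two BERGSDSWT timestamps from this issue some context.

```python
import pandas as pd


def find_too_close(times, max_gap="2h"):
    """Return timestamps that follow their predecessor by more than zero
    but less than max_gap (exact time duplicates are not flagged)."""
    times = pd.Series(pd.DatetimeIndex(times).sort_values())
    gap = times.diff()  # first element becomes NaT and is never flagged
    mask = (gap > pd.Timedelta(0)) & (gap < pd.Timedelta(max_gap))
    return pd.DatetimeIndex(times[mask])


def has_aggers(hwlw_codes):
    """Same heuristic as in the script above: more than two distinct
    HWLW codes is taken to mean the station contains aggers."""
    return pd.Series(hwlw_codes).nunique() > 2


extremes = [
    "1996-07-04 18:05:00+01:00",  # hypothetical neighbour, not from DONAR
    "1996-07-05 00:20:00+01:00",  # from this issue (-189)
    "1996-07-05 01:09:00+01:00",  # from this issue (-207)
    "1996-07-05 06:30:00+01:00",  # hypothetical neighbour, not from DONAR
]
flagged = find_too_close(extremes, max_gap="2h")
print(flagged)
```

The threshold is a plain argument, so the 4 hours from the original post and the 2 hours used later in the thread can both be tried. For stations where has_aggers returns True, flagged times need manual inspection rather than automatic removal.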
TODO: redo analysis after removal of aggers