Empty response instead of error if request is too large #96

Closed
5 tasks done
veenstrajelmer opened this issue Apr 25, 2024 · 0 comments · Fixed by #97


veenstrajelmer commented Apr 25, 2024

  • ddlpy version: 0.4.0
  • Python version: 3.11
  • Operating System: Windows

When requesting five years of water levels at 10-minute frequency in a single request, we get an empty response from the DDL. It makes sense that the request fails, since the number of measurements (6*24*365*5=262800) exceeds the maximum number returned by the DDL (157681). However, no error is raised; the result is just an empty response.
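The arithmetic above can be checked with a short sketch. It assumes a fixed 10-minute sampling interval and that both endpoints of the period are returned. The helper name `expected_observations` is hypothetical and not part of ddlpy, and the limit is taken from the DDL error message quoted in this issue.

```python
from datetime import datetime, timedelta

# Limit quoted in the DDL error message in this issue.
MAX_OBSERVATIONS = 157681


def expected_observations(start, end, interval=timedelta(minutes=10)):
    """Estimate the number of samples in [start, end], endpoints included.

    Hypothetical helper: assumes a gap-free series at a fixed interval.
    """
    return (end - start) // interval + 1


# Five years of 10-minute values, as computed in the issue text.
five_years = 6 * 24 * 365 * 5
print(five_years)  # 262800

# The failing query from the script in this issue (2017-01-01 to 2020-02-01).
issue_query = expected_observations(datetime(2017, 1, 1), datetime(2020, 2, 1))
print(issue_query)  # 162145
print(issue_query > MAX_OBSERVATIONS)  # True
```

Under these assumptions, the failing query in the script (about three years and one month) is already above the limit, which is consistent with the error message in the log.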

import ddlpy
import logging
logging.basicConfig()
# show log messages of ddlpy
ddlpy.ddlpy.logger.setLevel(logging.DEBUG)

locations = ddlpy.locations()
bool_hoedanigheid = locations['Hoedanigheid.Code'].isin(['NAP'])
bool_stations = locations.index.isin(['HOEKVHLD', 'IJMDBTHVN','SCHEVNGN'])
bool_grootheid = locations['Grootheid.Code'].isin(['WATHTE'])
bool_groepering = locations['Groepering.Code'].isin(['NVT'])
selected = locations.loc[bool_grootheid & bool_hoedanigheid & bool_groepering & bool_stations]

# successful query
start_date = "2020-01-01"
end_date = "2020-02-01"
print(f"retrieving {start_date} to {end_date}")
measurements = ddlpy.measurements(selected.iloc[0], start_date, end_date)
print(measurements) # filled dataframe


# query for a period without data: the DDL reports 'Geen gegevens gevonden!' (no data found)
# DEBUG:ddlpy.ddlpy:Got  invalid response: {'Succesvol': False, 'Foutmelding': 'Geen gegevens gevonden!'}
start_date = "2080-01-01"
end_date = "2080-01-02"
print(f"retrieving {start_date} to {end_date}")
measurements = ddlpy.measurements(selected.iloc[0], start_date, end_date)
print(measurements) # empty dataframe

# query that is too large: the DDL reports that the maximum number of observations is exceeded
# Foutmelding: 'Het max aantal waarnemingen (157681) is overschreven, beperk uw request.'
start_date = "2017-01-01"
end_date = "2020-02-01"
print(f"retrieving {start_date} to {end_date}")
measurements = ddlpy.measurements(selected.iloc[0], start_date, end_date, freq=None)
print(measurements) # empty dataframe

Selection of prints:

retrieving 2020-01-01 to 2020-02-01
DEBUG:ddlpy.ddlpy:0 duplicated values dropped
                          WaarnemingMetadata.StatuswaardeLijst  ...             Y
time                                                            ...              
2020-01-01 01:00:00+01:00                        Gecontroleerd  ...  5.759136e+06
2020-01-01 01:10:00+01:00                        Gecontroleerd  ...  5.759136e+06
2020-01-01 01:20:00+01:00                        Gecontroleerd  ...  5.759136e+06
2020-01-01 01:30:00+01:00                        Gecontroleerd  ...  5.759136e+06
2020-01-01 01:40:00+01:00                        Gecontroleerd  ...  5.759136e+06
                                                       ...  ...           ...
2020-02-01 00:20:00+01:00                        Gecontroleerd  ...  5.759136e+06
2020-02-01 00:30:00+01:00                        Gecontroleerd  ...  5.759136e+06
2020-02-01 00:40:00+01:00                        Gecontroleerd  ...  5.759136e+06
2020-02-01 00:50:00+01:00                        Gecontroleerd  ...  5.759136e+06
2020-02-01 01:00:00+01:00                        Gecontroleerd  ...  5.759136e+06
[4465 rows x 53 columns]


retrieving 2080-01-01 to 2080-01-02
DEBUG:ddlpy.ddlpy:Got  invalid response: {'Succesvol': False, 'Foutmelding': 'Geen gegevens gevonden!'}
DEBUG:ddlpy.ddlpy:No data availble for 2080-01-01 00:00:00 2080-01-02 00:00:00
100%|██████████| 1/1 [00:00<00:00,  1.77it/s]
DEBUG:ddlpy.ddlpy:no data found for this station and time extent
Empty DataFrame
Columns: []
Index: []


retrieving 2017-01-01 to 2020-02-01
DEBUG:ddlpy.ddlpy:Got  invalid response: {'Succesvol': False, 'Foutmelding': 'Het max aantal waarnemingen (157681) is overschreven, beperk uw request.'}
DEBUG:ddlpy.ddlpy:No data availble for 2017-01-01 00:00:00 2020-02-01 00:00:00
100%|██████████| 1/1 [01:39<00:00, 99.30s/it]
DEBUG:ddlpy.ddlpy:no data found for this station and time extent
Empty DataFrame
Columns: []
Index: []

The last query complains about the 'max aantal waarnemingen' (maximum number of observations), but this exception is caught rather than raised, which results in an empty dataframe instead of an exception. Since ddlpy allows larger chunks to be retrieved as of #94, it is important to raise this exception properly so the user is aware that the request is too large.
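Until the exception is raised properly, a caller can keep each request below the limit by splitting the period into smaller windows. The sketch below uses only the standard library. `split_period` is a hypothetical helper, not a ddlpy function, and the 365-day window is an assumption chosen so that a 10-minute series stays well under 157681 observations per window.

```python
from datetime import datetime, timedelta


def split_period(start, end, max_days=365):
    """Yield consecutive (chunk_start, chunk_end) windows covering [start, end].

    Hypothetical helper: adjacent windows share their boundary timestamp,
    so duplicate rows on the boundaries have to be dropped after merging.
    """
    step = timedelta(days=max_days)
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end


# The period of the failing query in this issue.
chunks = list(split_period(datetime(2017, 1, 1), datetime(2020, 2, 1)))
for chunk_start, chunk_end in chunks:
    print(chunk_start.date(), "to", chunk_end.date())
```

Each window could then be passed to `ddlpy.measurements` as in the script above and the results concatenated.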

Todo:

  • add RequestTooLargeException
  • simplify (unindent) try/except logging code
  • make handling of unsuccessful request modular
  • add testcase for NoDataException for all functions that can raise it
  • add testcase for RequestTooLargeException (commented out since it is too slow)
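A minimal sketch of what modular handling of an unsuccessful response could look like. This is not the code merged in #97. The exception names `NoDataException` and `RequestTooLargeException` come from the todo list, but their definitions here are assumptions. `UnsuccessfulRequestError` and `check_response` are hypothetical names. The response keys and messages are copied from the debug log above.

```python
class UnsuccessfulRequestError(ValueError):
    """Hypothetical base class: the DDL reported 'Succesvol': False."""


class NoDataException(UnsuccessfulRequestError):
    """The DDL found no data for the requested station and period."""


class RequestTooLargeException(UnsuccessfulRequestError):
    """The request exceeds the maximum number of observations."""


def check_response(result):
    """Raise a specific exception if the DDL response is unsuccessful.

    Assumes `result` is the decoded JSON response with the keys
    'Succesvol' and 'Foutmelding', as shown in the debug log.
    """
    if result.get("Succesvol"):
        return
    message = result.get("Foutmelding", "")
    if message == "Geen gegevens gevonden!":
        raise NoDataException(message)
    if message.startswith("Het max aantal waarnemingen"):
        raise RequestTooLargeException(message)
    raise UnsuccessfulRequestError(message)


too_large = {
    "Succesvol": False,
    "Foutmelding": "Het max aantal waarnemingen (157681) is overschreven, beperk uw request.",
}
try:
    check_response(too_large)
except NoDataException:
    print("no data; an empty result is acceptable here")
except RequestTooLargeException as error:
    print(f"request too large: {error}")
```

With this split, calling code can catch only `NoDataException` where an empty dataframe is a valid outcome and let `RequestTooLargeException` propagate to the user.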