Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preparation for future Python exceptions enabled by default #7475

Merged
merged 51 commits into from
Mar 24, 2023

Conversation

rouault
Copy link
Member

@rouault rouault commented Mar 17, 2023

refs #7452
While doing this, I had to change how we interacted with the CPL error handler code. Previously we pushed a global error handler with CPLSetErrorHandlerEx(). It turns out that doing that from the gdal, ogr and osr modules, and enabling to temporarily disable exceptions with gdal.ExceptionMgr(useExceptions=False) is impossible. So I switched to pushing the python error handler before running a GDAL function and restoring it afterwards. This should help making the GDAL Python bindings work more nicely with other Python code using GDAL such as rasterio.
PR not ready as a number of changes have to be done in autotest

@rouault rouault force-pushed the python_enable_exceptions branch 2 times, most recently from 688ba19 to 1ef2258 Compare March 18, 2023 00:17
@rouault
Copy link
Member Author

rouault commented Mar 18, 2023

CC @rcoup This PR will change a bit how the gdal.ConfigurePythonLogging() function you added some time ago behaves when GDAL exceptions are enabled (which will be the default). Previously gdal.UseExceptions() and gdal.ConfigurePythonLogging() would compete against each other as both used gdal.SetErrorHandler() to install their GDAL error handler. Now, gdal.UseExceptions() no longer install a permanent error handler, but just pushes / pops one around each GDAL call. The effect of this is that CPLE_Failure will be caught by the PythonBindingErrorHandler(), and not by the one of gdal.ConfigurePythonLogging(). Warnings, debug and info messages will still be forwarded by PythonBindingErrorHandler() to the previously active error handler, that is the one installed by gdal.ConfigurePythonLogging()

@rouault rouault force-pushed the python_enable_exceptions branch 13 times, most recently from 4abbcdb to 23b62dc Compare March 19, 2023 00:00
@rouault
Copy link
Member Author

rouault commented Mar 19, 2023

Now that the PR is "green," it's time to share my thoughts. I expected this PR
to be more than just a one-liner that switches bUseExceptions from false to true,
but I didn't anticipate it would raise so many topics.

First, this PR actually starts at commit "python_exceptions.i: revise logic to
use Push/PopErrorHandler instead of setting a global error handler". All previous
commits, that were needed to make tests pass once exceptions enabled, could be
applied independently on the following commits.

Exception handling changes

Regarding exception handling, the need to disable it in some places in autotest
(since it is not reasonable to expect 350 kLOC of autotest Python code to be
converted to accept exceptions overnight), while keeping the global default of
them being enabled, exposed how fragile and dangerous the
UseExceptions()/DontUseExceptions() functions were.

Before this PR, when UseExceptions() was called, it installed a global error
handler with CPLSetErrorHandler() that silenced all error classes except
CE_Failure (and forwarded warnings and debug messages to the previously
installed handler). This was invasive for other Python code that dealt with GDAL.
Furthermore, it was a problem when combined with the fact that the bUseExceptions
state was per gdal/ogr/osr Python module and not global when threads were
considered, especially when threads are considered.
For example, if one module called UseExceptions() and another called
DontUseExceptions(), the PythonBindingsErrorHandler was not installed. There was
some ugly hack before this PR to avoid this issue, but it prevented temporary
disabling of exceptions from working.

This PR changes the way exceptions are handled by installing the PythonBindingsErrorHandler
with (thread-local) CPLPushErrorHandler() just before the call to the GDAL/OGR
C API and uninstalling it afterward with CPLPopErrorHandler(). This should play
in a much friendlier way with other GDAL-using Python libraries like Fiona or Rasterio.

The gdal/ogr/osr.ExceptionMgr() context manager (introduced earlier in the 3.7-dev cycle)
that can be used to temporarily change the exception-enabled state is also modified
to not touch the global bUseExceptions flag but instead set a thread-local
bUseExceptionsLocal flag. This enables the use of ExceptionMgr() in Python threads
without affecting other threads. When set, GetUseExceptions() will use that
thread-local flag in priority over the global bUseExceptions.

Those changes affect the working of the gdal.ConfigurePythonLogging() function
(cf above comment #7475 (comment))

To sum up, these are non-trivial changes, but I believe they result in a saner
exception handling. Moreover, the fact that autotest still passes in this PR
demonstrates that they are hopefully reasonable. I recommend using the
ExceptionMgr() context manager rather than the UseExceptions()/DontUseExceptions()
pair, which should only be reserved at beginning of scripts, but not in the
middle of them (as mentione above, the global error state will be ignored by the
ExceptionMgr() context manager).

Separate boolean flag for exception enabled state?

One interesting question is if the 15-year tradition of having the exception-enabled
state per gdal, ogr and osr module is a reasonable one? It would seem to me to
be more natural to have a unified exception-enabled state for the 3 modules.
There might be some breakage potential in doing so though for code that would
explicitly depend on exceptions being enabled in one module and not another one
(I suspect this might occur accidentally for people not realizing that each
module has its own boolean, and thus people would just do gdal.UseExceptions(),
expecting that to apply to ogr and osr as well)
This is a discussion not strictly related to the subject of this PR which is
to enable exceptions by default, but if we do enable them by default, this could
be a time to change that.

Consistency of exception throwing?

Is our code base ready for turning on exceptions by default? This is really hard
to assess. autotest now tests this (but with a lot of places with exception
disabled given the massive changes required as mentioned above). Just a few
changes where found to be needed in swig/python/gdal-utils/*.py (utility scripts)
to make tests happy. I've some doubts that all the code is necessarily ready,
and there might be further changes in untested code branches that would be needed.

Is the gdal-utils library code added by @idanmiara in
swig/python/gdal-utils/osgeo_utils/auxiliary/ ready for exceptions enabled? I don't know.

Are the Python bindings consistent regarding exception throwing?

I discovered that the (deprecated) ogr.Open() didn't throw exceptions (now
fixed per this PR).

What should be the criteria to throw exceptions rather than returning None?
Not always obvious to me. Some overview of the current situation with this PR
(beware: this is far from being an exhaustive review of the API!):

gdal.Open/gdal.OpenEx/ogr.Open("not_valid") throw an exception when exceptions enabled, return None when disabled

Situations where no exception is thrown when exceptions are enabled (same behaviour
when they are disabled):

  • Dataset.GetSpatialRef() returns None when no SRS
  • Dataset.GetGeoTransform() returns (0.0, 1.0, 0.0, 0.0, 0.0, 1.0) when no geotransform
  • Dataset.GetGeoTransform(can_return_null=True) returns None when no geotransform
  • Dataset.GetFieldDomain("not_a_field_domain") returns None
  • Dataset.GetLayerByName("not_existing") returns None
  • Dataset.GetLayer(-1) returns None
  • Layer.GetLayerDefn().GetFieldIndex("non_existing") return -1

Dataset.ExecuteSQL("invalid SQL") throws an exception when exceptions enabled, return None when disabled

Multidimensional API (explicit code was added couple months ago for that. cf 173ad00):

  • Group.OpenGroup("not_existing") throws an exception when exceptions enabled, return None when disabled
  • Group.OpenMDArray("not_existing") throws an exception when exceptions enabled, return None when disabled
  • Array.GetAttribute("not_existing") throws an exception when exceptions enabled, return None when disabled

It seems to me that Dataset.GetFieldDomain("not_a_field_domain"), GetLayerByName("not_existing")
and GetLayer(-1) should rather throw when exceptions are enabled, shouldn't they?

Methods that returns OGRErr (mostly in OGR and OSR API) throw an exception
when != OGRERR_NONE when exceptions enabled
Some CPL methods that return a POSIX type code of error (0=success) throw an
exception when != 0 (for example gdal.Unlink(), numerous occurences of that
in the autotest changes in this PR)

Final thoughts and questions

Enabling exceptions by default will break external code. I see it for example
breaks some tests in the QGIS regression test suite. Each project will have to
decide if they make their code exception-ready or not. For example for QGIS,
given that it must work with a different range of GDAL versions, and if we
decide to expand in new GDAL versions the scope of functions where we throw, it
might be simpler to explicitly turn off exceptions with gdal/ogr/osr.DontUseExceptions()
in scripts using GDAL Python.

To summarize, here are the questions to answer:

  • Should we unify gdal/ogr/osr.UseExceptions()/DontUseExceptions() to turn the
    same flag instead of 3 separate ones ?
  • Should we revisit situtions in which exceptions are thrown or not ?
  • Do we believe that GDAL (the bindings, script and other Python library code)
    is really ready (for upcoming 3.7) to enable exceptions by default?
    If not, I would hate throwing this PR away given all the tedious work that
    went into it. A possibility that would keep most of this work could be to
    just revert enabling exceptions by default, and turn them on in autotest/conftest.py
    setup code.
  • If GDAL is ready, is the rest of the world ready for that change? This is going to
    break folks. The obvious solution for users to disable exceptions with
    gdal/ogr/osr.DontUseExceptions() in every script could be tedious for large
    code bases.

@rouault rouault changed the title [WIP] Turn on Python exceptions by default Turn on Python exceptions by default Mar 20, 2023
@rouault rouault changed the title Turn on Python exceptions by default Preparation for future Python exceptions enabled by default Mar 21, 2023
@rouault
Copy link
Member Author

rouault commented Mar 21, 2023

Given the feedback given on the mailing list about turning on exceptions by default not being appropriate before GDAL 4.0, I've added 2 extra commits:

@rouault rouault force-pushed the python_enable_exceptions branch from ad10cfd to e4b7530 Compare March 21, 2023 18:16
@rcoup
Copy link
Member

rcoup commented Mar 21, 2023

This PR will change a bit how the gdal.ConfigurePythonLogging() function you added some time ago behaves when GDAL exceptions are enabled (which will be the default). Previously gdal.UseExceptions() and gdal.ConfigurePythonLogging() would compete against each other as both used gdal.SetErrorHandler() to install their GDAL error handler.

This is probably an oversight on my part, my boilerplate has always been from osgeo import gdal; gdal.UseExceptions(). Your rework of it sounds good.

Overall +1 on the change; as per gdal-dev@ thread the open question for me is just the timing/rollout path.

@rouault rouault force-pushed the python_enable_exceptions branch 2 times, most recently from 2c9912a to e9b0ae1 Compare March 21, 2023 23:43
@snowman2
Copy link
Contributor

Should we unify gdal/ogr/osr.UseExceptions()/DontUseExceptions() to turn the
same flag instead of 3 separate ones ?

I only use the bindings for debugging and in GDAL tests presently, so take my opinion as such. I didn't know that you had to enable/disable separately, and that could be a gotcha. I like the idea of having only one place to set it. This will likely simply the process of disabling the exceptions where necessary and help transition to the new default when it is changed.

rouault added 24 commits March 22, 2023 19:20
…nd scripts. And emit a FutureWarning if exceptions have not been explicitly enabled or disabled

This commit can be just reverted for GDAL 4.0
…) and make existing [Dont]UseExceptions() affect all modules
@rouault rouault force-pushed the python_enable_exceptions branch from e9b0ae1 to a1df6dc Compare March 22, 2023 18:55
@rouault
Copy link
Member Author

rouault commented Mar 22, 2023

This will likely simply the process of disabling the exceptions where necessary and help transition to the new default when it is changed.

I've added an extra commit "Python bindings: remove recently added [Dont]UseExceptionsAllModules() and make existing [Dont]UseExceptions() affect all modules" that does that.

Demo:

>>> from osgeo import gdal
>>> gdal.Open("non_existing")
/home/even/gdal/gdal/build_cmake/swig/python/osgeo/gdal.py:291: FutureWarning: Neither gdal.UseExceptions() nor gdal.DontUseExceptions() has been explicitly called. In GDAL 4.0, exceptions will be enabled by default.
  warnings.warn(
ERROR 4: non_existing: No such file or directory
>>> gdal.UseExceptions()
>>> gdal.Open("non_existing")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/even/gdal/gdal/build_cmake/swig/python/osgeo/gdal.py", line 5258, in Open
    return _gdal.Open(*args)
RuntimeError: non_existing: No such file or directory
>>> from osgeo import ogr
>>> ogr.Open("non_existing")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/even/gdal/gdal/build_cmake/swig/python/osgeo/ogr.py", line 7199, in Open
    return _ogr.Open(*args, **kwargs)
RuntimeError: non_existing: No such file or directory

and conversely

>>> from osgeo import ogr
>>> ogr.Open("non_existing")
/home/even/gdal/gdal/build_cmake/swig/python/osgeo/ogr.py:559: FutureWarning: Neither ogr.UseExceptions() nor ogr.DontUseExceptions() has been explicitly called. In GDAL 4.0, exceptions will be enabled by default.
  warnings.warn(
>>> ogr.UseExceptions()
>>> ogr.Open("non_existing")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/even/gdal/gdal/build_cmake/swig/python/osgeo/ogr.py", line 7199, in Open
    return _ogr.Open(*args, **kwargs)
RuntimeError: non_existing: No such file or directory
>>> from osgeo import gdal
>>> gdal.Open("non_existing")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/even/gdal/gdal/build_cmake/swig/python/osgeo/gdal.py", line 5258, in Open
    return _gdal.Open(*args)
RuntimeError: non_existing: No such file or directory

The deprecation warning is emitted just once, from the first call to the module that triggers it

@rouault rouault merged commit 35a139e into OSGeo:master Mar 24, 2023
@rouault rouault added this to the 3.7.0 milestone Mar 24, 2023
@dbaston dbaston mentioned this pull request Aug 30, 2023
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants