Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GDAL errors are also "printed" to stdout in addition to raised as Python exception #91

Closed
jorisvandenbossche opened this issue Apr 29, 2022 · 7 comments · Fixed by #236

Comments

@jorisvandenbossche
Copy link
Member

A small example:

import pyogrio
import geopandas
from shapely.geometry import Point, LineString, Polygon, box, MultiPolygon

df = geopandas.GeoDataFrame(
    {"col": [1, 2, 3]},
    geometry=[Point(0, 0), LineString([(0, 0), (1, 1)]), box(0, 0, 1, 1)],
    crs="EPSG:4326",
)


In [3]: pyogrio.write_dataframe(df, "test_pyogrio_mixed.shp", driver="ESRI Shapefile")
ERROR 1: Attempt to write non-point (LINESTRING) geometry to point shapefile.
---------------------------------------------------------------------------
FeatureError                              Traceback (most recent call last)
<ipython-input-3-f50fe8b379fe> in <module>
----> 1 pyogrio.write_dataframe(df, "test_pyogrio_mixed_truly.shp", driver="ESRI Shapefile")

~/scipy/repos/pyogrio/pyogrio/geopandas.py in write_dataframe(df, path, layer, driver, encoding, geometry_type, **kwargs)
    248             crs = geometry.crs.to_wkt(WktVersion.WKT1_GDAL)
    249 
--> 250     write(
    251         path,
    252         layer=layer,

~/scipy/repos/pyogrio/pyogrio/raw.py in write(path, geometry, field_data, fields, layer, driver, geometry_type, crs, encoding, **kwargs)
    186         )
    187 
--> 188     ogr_write(
    189         str(path),
    190         layer=layer,

~/scipy/repos/pyogrio/pyogrio/_io.pyx in pyogrio._io.ogr_write()

FeatureError: Could not add feature to layer at index 1: Attempt to write non-point (LINESTRING) geometry to point shapefile.

In the above case, we detect that GDAL errors, get the error message and include that in the Python exception we raise. But, you can also see that the original GDAL error is still printed as well (the first line before the traceback, ERROR 1: ...).

This is not that a big deal, but I think ideally we would only have the exception (for example, if you catch the exception to handle it in some way, you still get the printed message). And, when using fiona, this doesn't happen (df.to_file("test_fiona_mixed.shp", driver="ESRI Shapefile") is the equivalent of the above pyogrio write, and this only gives a python exception).

I have been wondering why this is (we do use CPLErrorReset after getting the error type and message). Comparing to Fiona, the error handling itself is very similar (since copied from Fiona), but in Fiona there is in addition also a error handler registered (CPLPushErrorHandler):

https://github.com/Toblerity/Fiona/blob/9c8fe736d4381526905c6976c399a33d10e3ecbf/fiona/_env.pyx#L137-L155
https://github.com/Toblerity/Fiona/blob/9c8fe736d4381526905c6976c399a33d10e3ecbf/fiona/_env.pyx#L387-L395

@jorisvandenbossche
Copy link
Member Author

I noticed that the GDAL python SWIG bindings have this handler defined, which I suppose is basically what we want as well (hide errors logs because those are already handled as python exception):

static CPLErrorHandler pfnPreviousHandler = CPLDefaultErrorHandler;

static void CPL_STDCALL
PythonBindingErrorHandler(CPLErr eclass, int code, const char *msg )
{
  /*
  ** Generally we want to suppress error reporting if we have exceptions
  ** enabled as the error message will be in the exception thrown in
  ** Python.
  */

  /* If the error class is CE_Fatal, we want to have a message issued
     because the CPL support code does an abort() before any exception
     can be generated */
  if (eclass == CE_Fatal ) {
    pfnPreviousHandler(eclass, code, msg );
  }

  /*
  ** We do not want to interfere with non-failure messages since
  ** they won't be translated into exceptions.
  */
  else if (eclass != CE_Failure ) {
    pfnPreviousHandler(eclass, code, msg );
  }
  else {
    CPLSetThreadLocalConfigOption("__last_error_message", msg);
    CPLSetThreadLocalConfigOption("__last_error_code", CPLSPrintf("%d", code));
  }
}

(copying the snippet here, because doesn't allow to link to the line: https://github.com/OSGeo/gdal/blob/master/swig/python/extensions/ogr_wrap.cpp)

There is also a builtin CPLQuietErrorHandler (https://gdal.org/api/cpl.html#_CPPv420CPLQuietErrorHandler6CPLErr11CPLErrorNumPKc), but that also hides warning messages.

@Oreilles
Copy link

This also affects warnings. Ideally, you'd be able to supress warnings using the stdlib, but since those are simply logged from gdal there is no way to do so. The annoying one I'm facing:

Warning 1: Value 113034553.399706051 of field AREA of feature 3746 not successfully written. Possibly due to too larger number with respect to field width

@codeananda
Copy link

codeananda commented Jan 19, 2024

@Oreilles did you figure out a way to solve that? I write the same file with fiona and it doesn't complain. But pyogrio does. Would love to write at the speed of pyogrio.

I assume this negatively impacts reading the files afterwards as well? Are the values NULL? Or are they truncated floats? Would be good to know what happens.

@brendan-ward any ideas?

@Oreilles
Copy link

Oreilles commented Jan 19, 2024

I could not fix the bug, however, I found it was possible to redirect the GDAL output elsewhere so as to not pollute my program output. Here is the code to just throw GDAL's output away (Change dev/null to any file path if you want to preserve it somehow):

from pyogrio import set_gdal_config_options

set_gdal_config_options({"CPL_LOG": "/dev/null"})

I assume this negatively impacts reading the files afterwards as well? Are the values NULL? Or are they truncated floats? Would be good to know what happens.

All values were already correctly written for me, the warning was unexpected which is why I decided to suppress it. Of course this comes with the downside that if a valid warning were to be emitted I would miss it.

@jorisvandenbossche
Copy link
Member Author

@codeananda which version of pyogrio are you using? Because my understanding is that the warnings issue should also be solved by #242 (and I know in our CI we use pytest.mark.filterwarnings to ignore some of them)

@jorisvandenbossche
Copy link
Member Author

We had this specific warning in the geopandas tests as well, and added a filter for this (so filtering specifically this warning with python warnings.filterwarnings is certainly possible). See geopandas/geopandas#3040 (comment) for some discussion.

I don't really understand why GDAL raises this warning, but our conclusion was also that it is probably harmless (you could open an issue about it upstream in osgeo/gdal)

@codeananda
Copy link

Wow thanks for the super speedy responses!

I'm using pyogrio 0.7.2.

Good to know that the values are actually written correctly.

@jorisvandenbossche maybe I'm missing something but it seems like #242 is related to suppressing warnings for tests... but are these warnings actually suppressed in the code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants