Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid lost work/credits when an exception happens #314

Closed
rafatower opened this issue Dec 2, 2016 · 7 comments
Closed

Avoid lost work/credits when an exception happens #314

rafatower opened this issue Dec 2, 2016 · 7 comments
Assignees

Comments

@rafatower
Copy link
Contributor

The scenario we want to improve is:

  1. An analysis is added to a dashboard
  2. That triggers a query on all the rows of a dataset through the Batch SQL API. E.g: https://github.com/CartoDB/camshaft/blob/master/lib/node/nodes/georeference-street-address.js#L59-L64
  3. Calls are executing one by one on the Dataservices API server side, because of the way pl/proxy works
  4. Some of those calls can fail. There are a few causes for that: quota exhausted, programming error, network issues.
  5. That produces an exception/error that rollbacks the whole transaction in the user's DB, wasting any work done and potentially consuming credits.

Possible solutions discussed internally:

a. Wrap all public functions in an _exception_safe version that returns NULL instead of bubbling the exception to the client.
b. Write table-level functions for services consuming quota: isolines, hires_geocoder, routing, observatory.

cc'ing @rochoa for discussion.

@rochoa
Copy link

rochoa commented Dec 5, 2016

I think we already talk about scenario b. a while ago, however, we didn't dig deeper into it.

About option a., I guess it depends if you, as an end user, want to avoid null values, and don't want a do an analysis with half of your dataset as null entries.

@rafatower
Copy link
Contributor Author

(cc'ing @javisantana to let him know about your concerns, Raul)

Another option is to return warnings instead of errors, which can later be checked.

Anyway, any option different from just returning null, such as adding an extra column status or something like that, would put the responsibility of managing that information in the client (i.e. do some "error handling" with the status or the warnings).

IMHO we should seek for a cost-effective solution right now and discuss what we'd like to have in the interfaces in the near future.

@rafatower
Copy link
Contributor Author

We're going for (a) and make all the necessary changes, as it is the fastest path. If that doesn't work or is not enough, and since we're keeping existing functions and adding new ones, we can always rollback or implement (b).

@rafatower rafatower self-assigned this Dec 9, 2016
@javisantana
Copy link
Contributor

There were some corncerns from @ethervoid about being about to implement the "exception safe" version of geocoder (and other DS) so, did we already research if that's possible?

@rafatower
Copy link
Contributor Author

I'll write a PoC (which shall be quick) and ask Mario about his concerns.

In pseudo-code, it should be something like this:

BEGIN
    RETURN original_function(params);
EXCEPTION
    WHEN OTHERS THEN
        RAISE WARNING 'whatever';
        RETURN NULL;
END;

and we already got the template engine to produce those functions out of the description of current API:
https://github.com/CartoDB/dataservices-api/blob/development/client/renderer/interface.yaml
https://github.com/CartoDB/dataservices-api/tree/development/client/renderer/templates

For the reference: https://www.postgresql.org/docs/9.5/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

@rafatower
Copy link
Contributor Author

rafatower commented Dec 9, 2016

I wrote a small PoC based on cdb_geocode_street_point_exception_safe (which is actually generated out of this template:

CREATE OR REPLACE FUNCTION cdb_dataservices_client.cdb_geocode_street_point_exception_safe(searchtext text, city text DEFAULT NULL::text, state_province text DEFAULT NULL::text, country text DEFAULT NULL::text)
RETURNS geometry
AS $$
DECLARE
  ret Geometry;
  username text;
  orgname text;
BEGIN
  IF session_user = 'publicuser' OR session_user ~ 'cartodb_publicuser_*' THEN
    RAISE EXCEPTION 'The api_key must be provided';
  END IF;
  SELECT u, o INTO username, orgname FROM cdb_dataservices_client._cdb_entity_config() AS (u text, o text);
  -- JSON value stored "" is taken as literal
  IF username IS NULL OR username = '' OR username = '""' THEN
    RAISE EXCEPTION 'Username is a mandatory argument, check it out';
  END IF;

  BEGIN
    SELECT cdb_dataservices_client._cdb_geocode_street_point(username, orgname, searchtext, city, state_province, country) INTO ret;
    RETURN ret;
  EXCEPTION
    WHEN OTHERS THEN
      RAISE WARNING 'an exception happened';
      RETURN NULL;
  END;

END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

and it seems to work ok:

SELECT * from  cdb_dataservices_client.cdb_service_quota_info() WHERE service = 'hires_geocoder';
    service     | monthly_quota | used_quota | soft_limit | provider 
----------------+---------------+------------+------------+----------
 hires_geocoder |             5 |          5 | f          | mapzen
(1 row)

-- An exception is to be expected here
SELECT cdb_geocode_street_point('651 Lombard Street', 'San Francisco', 'California', 'United States');
ERROR:  cdb_dataservices_client._cdb_geocode_street_point(6): [server] REMOTE ERROR: spiexceptions.ExternalRoutineException: Exception: You have reached the limit of your quota
CONTEXT:  Remote context: Traceback (most recent call last):
  PL/Python function "cdb_geocode_street_point", line 21, in <module>
    return plpy.execute(mapzen_plan, [username, orgname, searchtext, city, state_province, country], 1)[0]['point']
PL/Python function "cdb_geocode_street_point"
SQL statement "SELECT cdb_dataservices_client._cdb_geocode_street_point(username, orgname, searchtext, city, state_province, country)"
PL/pgSQL function cdb_geocode_street_point(text,text,text,text) line 16 at SQL statement

-- But here we get a NULL in it's _exception_safe counterpart
SELECT cdb_geocode_street_point_exception_safe('651 Lombard Street', 'San Francisco', 'California', 'United States');
WARNING:  an exception happened
 cdb_geocode_street_point_exception_safe 
-----------------------------------------
 
(1 row)

What I'm going to do:

  • run a few more tests to make sure the approach works
  • try to gather a bit more information for the warning (GET STACKED DIAGNOSTICS) and add it if it makes sense.
  • write the corresponding templates to generate these functions automatically.

@rafatower
Copy link
Contributor Author

I made some more tests and found out that returning NULL can be troublesome in some cases (particularly when returning SETOF type):

WARNING:  whatever
ERROR:  structure of query does not match function result type
DETAIL:  Returned type record does not match expected type geometry(Geometry,4326) in column 1.
CONTEXT:  PL/pgSQL function cdb_isochrone(geometry,text,integer[],text[]) line 22 at RETURN QUERY

In that case I found it more practical to just not return anything explicitly. E.g: for isochrones:

CREATE OR REPLACE FUNCTION cdb_dataservices_client.cdb_isochrone(source geometry, mode text, range integer[], options text[] DEFAULT ARRAY[]::text[])
 RETURNS SETOF cdb_dataservices_client.isoline

--...

  BEGIN
    RETURN QUERY
    SELECT * FROM cdb_dataservices_client._cdb_isochrone(username, orgname, source, mode, range, options);
  EXCEPTION
    WHEN OTHERS THEN
      RAISE WARNING 'whatever';
  END;

rafatower pushed a commit that referenced this issue Dec 13, 2016
Fix for the `ERROR:  control reached end of function without RETURN` but
now need to implement for the two other cases.
rafatower pushed a commit that referenced this issue Dec 13, 2016
rafatower pushed a commit that referenced this issue Dec 13, 2016
rafatower pushed a commit that referenced this issue Dec 13, 2016
rafatower pushed a commit that referenced this issue Dec 14, 2016
rafatower pushed a commit that referenced this issue Dec 14, 2016
rafatower pushed a commit that referenced this issue Dec 14, 2016
rafatower pushed a commit that referenced this issue Dec 14, 2016
rafatower pushed a commit that referenced this issue Dec 14, 2016
rafatower pushed a commit that referenced this issue Dec 14, 2016
rafatower pushed a commit that referenced this issue Dec 14, 2016
Make sure we return an empty record and that the mentioned code is never
reached.
rafatower pushed a commit to CartoDB/Windshaft-cartodb that referenced this issue Dec 19, 2016
Use exception safe Dataservices API functions. See
CartoDB/dataservices-api#314 and
CartoDB/camshaft#242
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants