Split add_any_resource into two distributed functions for Citus #3713

Closed
punktilious opened this issue Jun 15, 2022 · 1 comment

@punktilious
Collaborator

punktilious commented Jun 15, 2022

Is your feature request related to a problem? Please describe.
Citus supports distributing functions based on one of the passed parameter values.

Describe the solution you'd like
It may be more efficient to split the add_any_resource logic into two separate functions:

  1. add_logical_resource_ident - distributed by the logical_id parameter
  2. add_any_resource - distributed by the logical_resource_id parameter

The add_logical_resource_ident function should manage the insert (or select for update) of the logical_resource_ident record and return the logical_resource_id. When this function returns, the logical_resource_ident record must exist and its row must be locked for update.

The add_any_resource function now takes an additional parameter (logical_resource_id) and no longer needs to return the logical_resource_id.
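The two-step flow described above can be pictured with a toy in-memory model. This is only a sketch of the calling pattern, not the stored procedures themselves: the ToyStore class, its attributes, and the simplified add_any_resource signature are hypothetical, and the Python lock only guards each call, whereas in the database the row lock would be held until the transaction ends.

```python
import threading
from itertools import count


class ToyStore:
    """In-memory stand-in for the database (hypothetical, not the real schema)."""

    def __init__(self):
        self.idents = {}      # (resource_type_id, logical_id) -> logical_resource_id
        self.resources = {}   # logical_resource_id -> current version
        self._seq = count(1)  # stands in for a database sequence
        self._lock = threading.Lock()

    def add_logical_resource_ident(self, resource_type_id, logical_id):
        """Step 1: create the ident record if it is missing, then return its id."""
        with self._lock:  # loosely models the insert / select for update
            key = (resource_type_id, logical_id)
            if key not in self.idents:
                self.idents[key] = next(self._seq)
            return self.idents[key]

    def add_any_resource(self, logical_resource_id, version):
        """Step 2: the id is passed in rather than generated or returned here."""
        with self._lock:
            if logical_resource_id not in self.idents.values():
                raise KeyError("ident record must exist before step 2")
            self.resources[logical_resource_id] = version


store = ToyStore()
lr_id = store.add_logical_resource_ident(1, "patient-1")  # step 1
store.add_any_resource(lr_id, version=1)                  # step 2
```

In this model an ident record can exist without a resource (as when a reference parameter from another resource created it), and the later add_any_resource call simply reuses the id returned by step 1.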

Also delete_resource_parameters should be distributed by p_logical_resource_id.

Describe alternatives you've considered
Use the current single-procedure solution.

Acceptance Criteria

  1. a new resource is correctly created
  2. an existing resource is correctly updated
  3. a new resource with an existing logical_resource_ident record (created by a reference parameter from another resource) is correctly created.
  4. the system correctly handles concurrent updates of the same resource

fhir-bucket can load Synthea data with better performance than the single-procedure solution when using a multi-node Citus database.

To check that Citus is using two functions, describe fhirdata.add_any_resource and note that the first parameter is p_logical_resource_id (the value is now passed into this function instead of being generated within it):

postgres=# \df fhirdata.add_any_resource;
                                             List of functions
  Schema  |       Name       | Result data type | Argument data types                             | Type
----------+------------------+------------------+-------------------------------------------------+------
 fhirdata | add_any_resource | record           | p_logical_resource_id bigint,                   | func
          |                  |                  | p_resource_type_id integer,                     |
          |                  |                  | p_resource_type character varying,              |
          |                  |                  | p_logical_id character varying,                 |
          |                  |                  | p_payload bytea,                                |
          |                  |                  | p_last_updated timestamp without time zone,     |
          |                  |                  | p_is_deleted character,                         |
          |                  |                  | p_source_key character varying,                 |
          |                  |                  | p_version integer,                              |
          |                  |                  | p_parameter_hash_b64 character varying,         |
          |                  |                  | p_if_none_match integer,                        |
          |                  |                  | p_resource_payload_key character varying,       |
          |                  |                  | OUT o_current_parameter_hash character varying, |
          |                  |                  | OUT o_interaction_status integer,               |
          |                  |                  | OUT o_if_none_match_version integer             |

Further, show that the add_logical_resource_ident function exists:

postgres=# \df fhirdata.add_logical_resource_ident;
                                                                       List of functions
  Schema  |            Name            | Result data type |                                     Argument data types                                      | Type 
----------+----------------------------+------------------+----------------------------------------------------------------------------------------------+------
 fhirdata | add_logical_resource_ident | bigint           | p_resource_type_id integer, p_logical_id character varying, OUT o_logical_resource_id bigint | func
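
The signature checks above can also be scripted. The following is a minimal sketch that parses the "Argument data types" text copied from the two \df listings; the helper name input_parameter_names is hypothetical, and the simple comma split assumes no type contains a comma (true for the types shown here, but not for something like numeric(10,2)).

```python
MODES = {"IN", "INOUT", "VARIADIC"}


def input_parameter_names(argument_data_types: str) -> list[str]:
    """Return input parameter names in order, skipping OUT parameters."""
    names = []
    for arg in argument_data_types.split(","):
        tokens = arg.split()
        if not tokens or tokens[0].upper() == "OUT":
            continue
        if tokens[0].upper() in MODES:
            tokens = tokens[1:]  # drop an explicit argument mode
        names.append(tokens[0])
    return names


# Argument lists copied from the \df output in this issue.
ADD_ANY_RESOURCE_ARGS = (
    "p_logical_resource_id bigint, p_resource_type_id integer, "
    "p_resource_type character varying, p_logical_id character varying, "
    "p_payload bytea, p_last_updated timestamp without time zone, "
    "p_is_deleted character, p_source_key character varying, "
    "p_version integer, p_parameter_hash_b64 character varying, "
    "p_if_none_match integer, p_resource_payload_key character varying, "
    "OUT o_current_parameter_hash character varying, "
    "OUT o_interaction_status integer, OUT o_if_none_match_version integer"
)
ADD_LOGICAL_RESOURCE_IDENT_ARGS = (
    "p_resource_type_id integer, p_logical_id character varying, "
    "OUT o_logical_resource_id bigint"
)

# The new add_any_resource must take the logical_resource_id first.
first_param = input_parameter_names(ADD_ANY_RESOURCE_ARGS)[0]
print(first_param)
```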

You can check that both functions are distributed by inspecting the log output when running the schema tool with the --update-proc option:

...
2022-08-12 13:17:44.366 00000001    INFO abase.utils.citus.CitusAdapter Distributing function: fhirdata.add_logical_resource_ident(integer,character varying)
...
2022-08-12 13:17:44.563 00000001    INFO abase.utils.citus.CitusAdapter Distributing function: fhirdata.add_any_resource(bigint,integer,character varying,character varying,bytea,timestamp without time zone,character,character varying,integer,character varying,integer,character varying)
...
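
The same log check can be automated. This is a small sketch that scans captured log text for the "Distributing function:" message shown above; the function name distributed_functions and the idea of capturing the log into a string are assumptions, and the two sample lines are shortened copies of the log excerpt in this issue.

```python
import re

# Matches e.g. "Distributing function: fhirdata.add_any_resource(bigint,...)"
DISTRIBUTING = re.compile(
    r"Distributing function: (?P<schema>\w+)\.(?P<name>\w+)\("
)


def distributed_functions(log_text: str) -> set[str]:
    """Return the names of all functions the log reports as distributed."""
    return {m.group("name") for m in DISTRIBUTING.finditer(log_text)}


SAMPLE_LOG = """\
2022-08-12 13:17:44.366 00000001    INFO abase.utils.citus.CitusAdapter Distributing function: fhirdata.add_logical_resource_ident(integer,character varying)
2022-08-12 13:17:44.563 00000001    INFO abase.utils.citus.CitusAdapter Distributing function: fhirdata.add_any_resource(bigint,integer,character varying)
"""

expected = {"add_logical_resource_ident", "add_any_resource"}
missing = expected - distributed_functions(SAMPLE_LOG)
print("missing:", sorted(missing))
```

An empty "missing" list means both functions were reported as distributed in the captured output.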

Additional context
None.

@punktilious punktilious added the enhancement New feature or request label Jun 15, 2022
@punktilious punktilious self-assigned this Jun 16, 2022
punktilious added a commit that referenced this issue Jun 16, 2022
Signed-off-by: Robin Arnold <robin.arnold@ibm.com>
punktilious added a commit that referenced this issue Jun 21, 2022
* issue #3713 use distributed add_any_resource function for citus

Signed-off-by: Robin Arnold <robin.arnold@ibm.com>

* issue #3437 fix usage of logicalResourceId and resourceId for remote index

Signed-off-by: Robin Arnold <robin.arnold@ibm.com>
@lmsurpre
Member

The stored procedure has been split. Additional testing is needed with multi-node Citus, but that will be performed separately.
