Representing a complex update/insert/delete CTE #62

lithdew · 2022-12-06T17:52:51Z

Hi,

I wanted to ask if the following complex CTE query which comprises of inserts, deletions, and updates could be modeled with this library.

Suppose we have a post system with attachments. Post attachments are identified by ETag's. ETag's can be associated with many posts, and posts can be associated with several ETag's at once. A many-to-many relations table is used to define this relationship.

CREATE TABLE public.posts (
    id text NOT NULL,
    author_id text NOT NULL,
    content text NOT NULL,
    created_at timestamp(3) without time zone DEFAULT CURRENT_TIMESTAMP NOT NULL,
    updated_at timestamp(3) without time zone DEFAULT CURRENT_TIMESTAMP NOT NULL
);

CREATE TABLE public.etags_on_posts (
    post_id text NOT NULL,
    storage_id public."Storage" NOT NULL,
    etag_id text NOT NULL
);

CREATE TABLE public.etags (
    storage_id public."Storage" NOT NULL,
    id text NOT NULL,
    type text NOT NULL,
    num_posts_attached integer DEFAULT 0 NOT NULL,
    num_avatars_attached integer DEFAULT 0 NOT NULL
);

CREATE TYPE public."Storage" AS ENUM (
    'r2'
);

Suppose we want to update a post's content and attachment given {postId: string, content: string, etags: { storageId: "r2", id: string, type: string }}[].

Old entries to "etags_on_posts" should be removed, and their corresponding entries in "etags" should have "num_posts_attached" decremented by 1.
New entries to "etags_on_posts" should be added, and their corresponding entries in "etags" should have "num_posts_attached" incremented by 1. If no corresponding entries in "etags" exist, they are created with the etag's type inserted as well.
The entry of the post in "posts" should have its content updated.

The following raw porsager/postgres code represents the query:

async updatePostOrComment(sql: Sql, id: string, content: string, etags: { type: string, etag: string }[]) {
    const input = etags.map(e => (["r2", e.etag, e.type]));

    return sql`
        with updated_post as (update posts set ${sql({ content, updatedAt: new Date() })} where id = ${id} returning *),
        new_posts as (select * from posts union select * from updated_post)
        ${etags.length > 0 ?
            sql`
            , input as (select storage_id::"Storage", id::text, type::text from (values ${sql(input)}) as input(storage_id, id, type))
            , removed_etags_on_post as (delete from etags_on_posts where post_id = (select id from updated_post) and (storage_id, etag_id) not in (select storage_id, id as etag_id from input) returning *)
            , etags_on_post_to_add as (select (select id from updated_post) as post_id, storage_id, id as etag_id from input where (storage_id, id) not in (select storage_id, etag_id from removed_etags_on_post))
            , added_etags_on_post as (insert into etags_on_posts select * from etags_on_post_to_add returning *)
            , update_added_etags as (
                insert into etags (storage_id, id, type) select * from input where (storage_id, id) in (select storage_id, etag_id as id from added_etags_on_post)
                on conflict (storage_id, id) do update set num_posts_attached = excluded.num_posts_attached + 1 returning *
            )
            , update_removed_etags as (
                update etags set num_posts_attached = num_posts_attached - 1
                where (storage_id, id) in (select storage_id, etag_id as id from removed_etags_on_post) returning *
            )
            , new_etags_on_posts as (select * from etags_on_posts union select * from added_etags_on_post except select * from removed_etags_on_post)
            , new_etags as (select * from etags union select * from update_added_etags union select * from update_removed_etags)`
            :
            sql``}
        
        ${this.submissions(sql, sql`updated_post`, { posts: sql`new_posts`, etagsOnPosts: etags.length > 0 ? sql`new_etags_on_posts` : sql`etags_on_posts`, etags: etags.length > 0 ? sql`new_etags` : sql`etags` })}`.then(([post]) => post);
},

... where this.submissions joins some additional fields and performs some extra transformations to updated_post.

If perhaps the following query might be too complicated, could it be significantly simplified with this library then?

The text was updated successfully, but these errors were encountered:

juanluispaz · 2022-12-06T19:05:49Z

OMG, you put all sql features in a single query. Let me decompose the query a little bit:

with					(#1)

	updated_post as (
		update posts		(#2)
		set SOME_COL = SOME_VAL 
		where id = ${id} 
		returning *
	),
        new_posts as (
		select * 
		from posts 
		union 			(#3)
		select * 
		from updated_post
	),
	input as (
		select storage_id::"Storage", id::text, type::text 
		from (
			values ((VALUE_1, VALUE_2, VALUE_3))) 
			as input(storage_id, id, type)			(#4)
	), 
	removed_etags_on_post as (
		delete from etags_on_posts 
		where post_id = (
			select id from updated_post			(#5)
		) and (storage_id, etag_id) not in (			(#6)
			select storage_id, id as etag_id from input	(#7)
		) 
		returning *						(#8)
	), 
	etags_on_post_to_add as (
		select 
			(select id from updated_post) as post_id, 	(#4)
			storage_id, id as etag_id 
		from input 
		where (storage_id, id) not in (				(#6)
				select storage_id, etag_id 		(#7)
				from removed_etags_on_post
			)
	), 
	added_etags_on_post as (
		insert into etags_on_posts 				(#2)
		select * from etags_on_post_to_add 			(#9)
		returning *						(#10)
	),
	update_added_etags as (
		insert into etags (storage_id, id, type) 
		select * 						(#9)
		from input 
		where (storage_id, id) in (				(#6)
			select storage_id, etag_id as id 		(#7)
			from added_etags_on_post
		)
		on conflict (storage_id, id) 				(#11)
		do update 
		set num_posts_attached = excluded.num_posts_attached + 1 
		returning *						(#10)
	),
	update_removed_etags as (
		update etags 
		set num_posts_attached = num_posts_attached - 1
		where (storage_id, id) in (				(#6)
			select storage_id, etag_id as id		(#7)
			from removed_etags_on_post
		)
		returning *						(#12)
	),
	new_etags_on_posts as (
		select * 
		from etags_on_posts
		union							(#2)
		select * 
		from added_etags_on_post
		except							(#2)
		select * from removed_etags_on_post
	),
	new_etags as (
		select *
		from etags
		union 							(#2)
		select * 
		from update_added_etags
		union							(#2)
		select *
		from update_removed_etags
	)
select SOMETHING
from SOME_TABLE
where SOME_CONDITION

My notes:

(#1) With clauses are supported. Documentation
(#2) Use insert, update and delete queries in a with clause is a very PostgreSQL specific not supported yet (there are other specific postgres already supported)
(#3) Compound queries (union, intersect, except) are supported. Documentation
(#4) Inline values as table/view are supported. Documentation
(#5) Inline select values are supported. Documentation
(#6) Comparison of multiple values (aka (V1, V2) operator SOMETHING) are not supported yet, but you can use a custom sql fragment to do it (with regular values, not selects). Documentation
(#7) You can transform this subquery in another with and use it as a regular table in your front, then you don't need point (#6)
(#8) Delete returning are supported. Documentation
(#9) Insert from a select are supported. Documentation
(#10) Insert returning are supported. Documentation
(#11) Insert on conflicto to update supported, Included specify the conflict columns. Documentation. List of all methods available on insert
(#12) Update returning are supported. Documentation
Usually you don't want to use * in your queries in your backend, because it make fragile your queries, specially when the database schema changes and generate a mismatch between code and database; ts-sql-query doesn't support it, but instead you can specify all the columns expected by your code, there is some utilities function to do it. Documentation

Summary: The only missing feature to be able to perform this query in ts-sql-query is the one mention in (#1)

Regarding the readability of the code that generate the query, ts-sql-query can help you a lot on this. In our systems we have very complex queries, but we don't assemble it in a single function, instead of that, we decompose it in several functions that returns the object built by ts-sql-query (just omit the return type of the function declaration). Each query in the with can be created in a function.

Additionally you can take advantage of the select clause order to create a function that create the base select, and then add or complement the specific where clauses. Additionally, you can use optional joins when the join is not required all the time (there is a current limitation, if it refer a query in the with, the with will be included, but not used).

I recommend you to have a look on all convenient features to dial with dynamic queries and to the sql fragments that will allow you to use sql in parts not covered yet by the library (most of the case that is enough, but it doesn't cover point (#1)).

I'm going to see how can I add support to the point one, but it will take me a little bit.

juanluispaz · 2022-12-06T19:30:48Z

I'm going to point in my list to improve ts-sql-query:

Insert as with (only for postgres)
Update as with (only for postgres)
Delete as with (only for postgres)
(value1, value2, ...) in select query (all supported databases)
(value1, value2, ...) in insert query (only for postgres)
(value1, value2, ...) in update query (only for postgres)
(value1, value2, ...) in delete query (only for postgres)
Avoid add unused withs in dynamic queries

Be aware, ts-sql-query generate a compilation error when you try to use a feature not available in your database.

Good, now I have requirements, my list was empty 😃

lithdew · 2022-12-06T21:14:45Z

Really appreciate the swift response and detailed writeup. For the time being, I'm happy to isolate separate queries and have them independently executed one by one as a transaction rather than as a large CTE :).

juanluispaz · 2022-12-06T22:29:26Z

Oh, you can do it in several queries if you want, but that is not what I tried to tell you.

In our case, we have very complex queries in a special part of the system that manage the accountability of a project, In that case what we did is split the query in several functions that build that part of the system, and the put together in another one. It had a very interesting effect, some subqueries where reused in several places, that made even easer to understand and think in the system.

Let me put an example with the case you put in #63:

function buildFollowersCountSubquery(connection: DBConnection) { // I intentionally omit the return type, allowing TS infer it
    return connection
        .subSelectUsing(accounts)
        .from(follows)
        .where(follows.followingId.equals(a.id)
        .selectOneColumn(connection.countAll())
        .forUseAsInlineQueryValue()  // At this point is a value that you can use in other query
        .valueWhenNull(0);
}

function buildFollowingCountSubquery(connection: DBConnection) { // I intentionally omit the return type, allowing TS infer it
    return connection
        .subSelectUsing(accounts)
        .from(follows)
        .where(follows.followerId.equals(a.id)
        .selectOneColumn(connection.countAll())
        .forUseAsInlineQueryValue()  // At this point is a value that you can use in other query
        .valueWhenNull(0);
}

async function getAccountsInformation(connection: DBConnection) {
    const result = await connection
        .from(accounts)
        .select({
            id: a.id,
            numFollowers: buildFollowersCountSubquery(connection),
            numFollowing: buildFollowingCountSubquery(connection),
            // the others counts you need
        })
        .orderBy('id', 'asc')
        .executeSelectMany();

    return result;
}

I can go eve further:

function buildAccountsInformationQuery(connection: DBConnection) { // I intentionally omit the return type, allowing TS infer it
    return connection
        .from(accounts)
        .select({
            id: a.id,
            numFollowers: buildFollowersCountSubquery(connection),
            numFollowing: buildFollowingCountSubquery(connection),
            // the others counts you need
        })
        .orderBy('id', 'asc')
}

and the use it as:

async function getAccountsInformationForVerifiedUsers(connection: DBConnection) {
    const result = await  buildAccountsInformationQuery()
        .where(accounts.isVerified)
        .executeSelectMany();

    return result;
}

This strategy allowed us to deal with very complex queries in an easy and understandable way without scarify execute everything in a single call to the database.

lithdew · 2022-12-07T09:20:48Z

For points (#2), (#8), and (#9), would you then suggest splitting the query into independent statements and having them execute separately in a transaction rather than as a single CTE statement?

juanluispaz · 2022-12-07T09:48:45Z

For now yes, due I don't support that construction; or, if it is important for you to be in a single query, just keep as is, in raw sql if for you it is important to be in a single query due a very important performance reason (you must be able to measure and confirm you need it).

If you want to access to the sql object used by PostgresQueryRunner you can do this (I didn't include in PostgresQueryRunner documentation, but I mention in PrismaQueryRunner documentation):

Accessing to the porsager/postgres sql object from the connection

If you want to access the underlying porsager/postgres sql object (to the transaction one when it is in one) from your connection object you can define an accesor method in your connection class like:

import type { Sql } from 'postgres'

class DBConnection extends PostgreSqlConnection<'DBConnection'> {
    getSqlClient(): Sql {
        const client = this.queryRunner.getCurrentNativeTransaction() || this.queryRunner.getNativeRunner();
        // ideally I will do something like if (client instanceof Sql) { return client; } else { throw new Error('...'); }
        // but I don't know how to perform this validation in porsager/postgres, then I just cast to any without control allowing the function return the proper type.
        return client as any;
    }
}

In this way you will be able to keep raw sql in some places.

Note: TransactionSql extends Sql interface in porsager/postgres

juanluispaz added the question Further information is requested label Dec 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Representing a complex update/insert/delete CTE #62

Representing a complex update/insert/delete CTE #62

lithdew commented Dec 6, 2022

juanluispaz commented Dec 6, 2022 •

edited

Loading

juanluispaz commented Dec 6, 2022

lithdew commented Dec 6, 2022

juanluispaz commented Dec 6, 2022 •

edited

Loading

lithdew commented Dec 7, 2022 •

edited

Loading

juanluispaz commented Dec 7, 2022

Representing a complex update/insert/delete CTE #62

Representing a complex update/insert/delete CTE #62

Comments

lithdew commented Dec 6, 2022

juanluispaz commented Dec 6, 2022 • edited Loading

juanluispaz commented Dec 6, 2022

lithdew commented Dec 6, 2022

juanluispaz commented Dec 6, 2022 • edited Loading

lithdew commented Dec 7, 2022 • edited Loading

juanluispaz commented Dec 7, 2022

juanluispaz commented Dec 6, 2022 •

edited

Loading

juanluispaz commented Dec 6, 2022 •

edited

Loading

lithdew commented Dec 7, 2022 •

edited

Loading