update custom column names on renaming/dropping columns #2933

rakeshkky · 2019-09-24T13:11:03Z

Description

Update custom column names in catalogue on dropping/renaming columns.

Customising column names in generated GraphQL schema is introduced in #2509. However, handling the custom column names metadata during altering/dropping columns was not included in original PR.

Affected components

Server
Tests

Related Issues

N/A. Follow up PR for #2509

Solution and Design

Remove dropped columns (if any) from custom column names map
Update custom column names map with renamed columns (if any)

Steps to test and verify

Follow the added test case
or
Defined custom column names on a table and try to alter through console.

netlify · 2019-09-24T13:12:16Z

Deploy preview for hasura-docs ready!

Built with commit 7b1e4ca

https://deploy-preview-2933--hasura-docs.netlify.com

hasura-bot · 2019-09-24T14:24:45Z

Review app for commit 63f764f deployed to Heroku: https://hge-ci-pull-2933.herokuapp.com
Docker image for server: hasura/graphql-engine:pull2933-63f764f7

lexi-lambda

LGTM aside from two minor comments.

server/src-lib/Hasura/RQL/DDL/Schema/Rename.hs

lexi-lambda · 2019-09-26T02:17:32Z

server/src-lib/Hasura/RQL/DDL/Schema/Table.hs

-    procAddedCols tn =
+    possiblyDropCustomColumnNames tn = do
+      let TableConfig customFields customColumnNames = _tiCustomConfig ti
+          modifiedCustomColumnNames = foldl (flip M.delete) customColumnNames droppedCols


This should probably be foldl'.

@lexi-lambda
From my understanding, foldl' is "strict" version of foldl. Am I right? :)

Yes, that’s right. It never makes sense to use foldl on lists because it never has any benefit and will always leak space.

To explain why, I wrote a mini blog post explaining the difference between foldl and foldr in Haskell. To start, you have to understand that foldl and foldr are not folds “from the left” and “from the right.” Both foldl and foldr traverse the structure in the same order, which in the case of lists means left to right. The difference is the fold’s associativity.

foldl vs foldr illustrated

The best way to think about this is with an illustration. When you write

foldl (⨂) v [e₀, e₁, e₂, ..., e_n−1, e_n]

you’re performing the following computation:

( ... (((v ⨂ e₀) ⨂ e₁) ⨂ e₂) ⨂ ... ⨂ e_n−1) ⨂ e_n

In contrast, when you write

foldr (⨂) v [e₀, e₁, e₂, ..., e_n−1, e_n]

you’re performing this computation:

e₀ ⨂ (e₁ ⨂ (e₂ ⨂ ... ⨂ (e_n−1 ⨂ (e_n ⨂ v)) ... ))

See the difference? In both expressions, the elements of the list appear in the expression in the same order—from left to right—but the grouping changes. With foldl, the applications of (⨂) are left-associated, while with foldr, they’re right-associated.

foldl vs. foldr, strictly

The question is: how does this difference actually impact the behavior of a program? Well, let’s start by first thinking about what the difference would be in a strict language. In a strict language, evaluation order always proceeds from the “inside out,” starting with the most deeply nested expression.

Let’s think about that in the context of foldl first. Let’s say we wrote this expression:

foldl (+) 0 [1, 2, 3, 4]

By the above illustration, we know that expression is equivalent to this one:

(((0 + 1) + 2) + 3) + 4

Reducing from the inside out, we get the following reduction sequence:

foldl (+) 0 [1, 2, 3, 4] = (((0 + 1) + 2) + 3) + 4 = (( 1 + 2) + 3) + 4 = ( 3 + 3) + 4 = 6 + 4 = 10

In contrast, if we had used foldr, we’d get the same result (since (+) is an associative, commutative operation), but with a slightly different reduction sequence:

foldr (+) 0 [1, 2, 3, 4] = 1 + (2 + (3 + (4 + 0))) = 1 + (2 + (3 + 4 )) = 1 + (2 + 7 ) = 1 + 9 = 10

What’s the practical difference between these two things? Well, note the following detail: with foldl, to start reducing, we only need the first element of the list, but with foldr, we have to start from the end of the list and reduce “backwards.”¹ Practically, this means foldl can be tail-recursive, reducing as it traverses the list in constant space, while foldr cannot be: to reduce a list of length n with foldr, you need to create n stack frames before any reduction can start.

_{¹ This is why foldr is sometimes described as “folding from the right”, even though it traverses the list from left to right. As we’ll see, however, this doesn’t actually hold in lazy languages!}

foldl, lazily

But what about in lazy languages, like Haskell? In a lazy language, evaluation order doesn’t proceed from the “inside out” like it does in strict languages, but rather from the “outside in,” evaluating expressions only as their results are demanded.

In a strict language, foldl (+) 0 [1, 2, 3, 4] doesn’t actually get turned into the expression (((0 + 1) + 2) + 3) + 4; as I mentioned earlier, it’s implemented as a tail-recursive loop. But in Haskell, it basically does get expanded into that expression before any reduction starts—each application of (+) is lazily suspended in a thunk.

If we explicitly denote thunks with ⟨⟩ brackets, the thunk we’ll end up with looks like this:

⟨⟨⟨⟨0 + 1⟩ + 2⟩ + 3⟩ + 4⟩

These thunks will only get forced when the outermost thunk is evaluated, which will cause (+) to be applied to the arguments ⟨⟨⟨0 + 1⟩ + 2⟩ + 3⟩ and 4. Since (+) is strict in both its arguments, it will force the next thunk, which will in turn apply (+) to ⟨⟨0 + 1⟩ + 2⟩ and 3, and so on until the whole thunk tree is reduced.

The result ends up being the same, but from a practical point of view, this is really bad, because instead of reducing the list in constant space, like we did in the strict language, we’re now creating a thunk linear in the size of the input list! It’s even worse that that, though, because in a strict language, the input list takes up space linear to its own size, so our overall space consumption for producing/consuming the list would simply be a constant factor increase, but in a lazy language, the list itself is more like a stream, which may actually use constant space if the whole stream is not fully realized in memory. By using the lazy foldl, we’ve possibly gone from a constant-space algorithm to a linear-space algorithm, which is bad!

What we want is to recover the behavior of the strict language’s foldl, efficiently updating an accumulator as we traverse the list, not building thunks. Therefore, we need a stricter version of foldl, which is exactly what foldl' is. foldl' places a demand on the ⟨0 + 1⟩ thunk before moving onto the next element of the list, so instead of building a larger ⟨⟨0 + 1⟩ + 2⟩ thunk, it simply builds a ⟨1 + 2⟩ thunk. foldl' continues traversing the list in constant space, never building a thunk larger than a single application of (+).

foldr, lazily

But what about foldr? Remember that in a strict language, foldr already needed to consume space linear in the size of the input list, since it fundamentally needed the last element in the list before it could start reducing. Indeed, if we consider a lazy foldr with a strict operation like (+), this is still true—but in an interestingly different way from foldl.

With foldl, we ended up building thunks incrementally as we traversed the list, leading to a very large, nested thunk. But with foldr, that doesn’t actually happen. Why? Well, consider the expansion again for just a moment:

foldr (+) 0 [1, 2, 3, 4] = 1 + (2 + (3 + (4 + 0)))

To consider how we end up with this expansion, let’s write the expansion out in an explicitly inductive way:

foldr (+) 0 [1, 2, 3, 4] = 1 + foldr (+) 0 [2, 3, 4] = 1 + (2 + foldr (+) 0 [3, 4]) = 1 + (2 + (3 + foldr (+) 0 [4])) = 1 + (2 + (3 + (4 + foldr (+) 0 []))) = 1 + (2 + (3 + (4 + 0)))

This makes the recursive calls to foldr more explicit. In a strict language, as soon as we call foldr, we need to demand the result, so we have to traverse the whole list. But here’s where things get interesting—in a lazy language, we can actually just return the following result, with a suspended thunk:

foldr (+) 0 [1, 2, 3, 4] = 1 + ⟨foldr (+) 0 [2, 3, 4]⟩

This might seem totally irrelevant, since once that result is forced, the ⟨foldr (+) 0 [2, 3, 4]⟩ will be forced by (+), and we’ll get the same reduction sequence we had before. But note that this is only true because (+) is a strict operation. What if we instead used a lazy operation, like (:)? In that case, we’d get the following expansion:

foldr (:) [] [1, 2, 3, 4] = 1 : ⟨foldr (:) [] [2, 3, 4]⟩

Guess what? That result is already in weak-head normal form (WHNF)! So evaluation just stops there until the rest of the result is explicitly demanded by something else. Now, in this case, this is a silly operation, since foldr (:) [] is just a complicated identity function on lists, but we could imagine a slightly more complicated function, such as one that doubles each element in a list:

let f x xs = (x * 2) : xs in foldr f [] [1, 2, 3, 4]

This will expand into the following:

foldr f [] [1, 2, 3, 4] = ⟨1 * 2⟩ : ⟨foldr f [] [2, 3, 4]⟩

…and again, it will just stop there, since it’s already in WHNF. How is this useful? Well, what if we didn’t actually consume the entire result list, like this?

sum (take 2 (foldr f [] [1, 2, 3, 4]))

Because take 2 will only return the first two elements of the list, then when sum forces the list and its values to add them together, it will never even evaluate the thunk ⟨foldr f [] [3, 4]⟩, and the list will only be partially-traversed.

What are the implications of this? Well, it means that foldr can possibly save on work if the reducing function is lazy in its second argument, and the result list is not entirely consumed. In fact, foldr can operate on infinite lists this way, while foldl cannot. It also means that foldr may be subject to more list fusion than foldl, though that’s another discussion entirely.

foldl vs. foldr, lazily

Okay, so, to briefly recap, here’s what I’ve said so far:

In a lazy language, foldl on lists is bad because it’s too lazy, and it builds up big thunks. Use foldl' instead to force the thunks incrementally and consume the list in constant space.

In a lazy language, foldr on lists is good because it’s lazy, so if the reducing function is lazy in its second argument, it can save on work.

These two things might seem a little contradictory. Why is foldl bad because it’s too lazy while foldr is good because it’s lazy?

To understand the difference, let’s expand foldl inductively like we did with foldr:

foldl (+) 0 [1, 2, 3, 4] = foldl (+) (0 + 1) [2, 3, 4] = foldl (+) ((0 + 1) + 2) [3, 4] = foldl (+) (((0 + 1) + 2) + 3) [4] = foldl (+) ((((0 + 1) + 2) + 3) + 4) [] = ((((0 + 1) + 2) + 3) + 4)

See the difference? With foldr, the recursive call was pushed into a “leaf” of the resulting expression tree, but with foldl, the recursive call is always the root. This is, by the way, why foldl is tail recursive—this is exactly what tail recursion is!—but it means it can’t possibly be lazy, since it will never be in WHNF until the entire list has been traversed.

This gives us a general rule of thumb for using foldl and foldr on lists:

When the accumulation function is strict, use foldl' to consume the list in constant space, since the whole list is going to have to be traversed, anyway.

When the accumulation function is lazy in its second argument, use foldr to do work incrementally to improve streaming and work-saving.

Never use foldl or foldr'; they’re always worse on lists.

In your case, the accumulation function you’re applying is Map.delete, which is strict, so you should use foldl'.

That said, this is often a micro-optimization, so if the list is not large, it usually doesn’t really matter. It’s just a good habit to get into, and it’s worth understanding, since it’s a great example of laziness in practice.

Addendum: foldl and foldr on other data structures

As a final note, you might wonder: if foldl and foldr' are so useless, why do they even exist? Why not just have foldl' and foldr?

The answer is that everything I just said only applies to lists. This behavior happens because, fundamentally, (:) is a right-associative operation, so the “remainder” of the list is on the right. But if we had snoc lists, like this:

data SnocList a = Nil | Snoc (SnocList a) a

…then our lists would be left-associative, and we’d want to use foldr' in situations where we use foldl' on ordinary lists and foldl where we use foldr on ordinary lists. A little confusing, isn’t it?

Ordinary cons lists and snoc lists are basically the two extremes of foldl vs foldr, but in practice, other data structures are a lot fuzzier. For example, if you have a tree, like

data Tree a = Leaf | Branch (Tree a) a (Tree a)

…then some elements are on the left and others are on the right, and neither foldl nor foldr are clearly better. In that case, if you really, really care about performance, foldMap and foldMap' are usually your best bet, since they don’t specify any particular associativity of calls to (<>). However, we don’t have foldMap' until base-4.13.0.0, which won’t be available until we switch to GHC 8.8.1. (But in truth, it probably doesn’t matter, anyway.)

@lexi-lambda

As suggested by @lexi-lambda

hasura-bot · 2019-09-26T06:36:17Z

Review app for commit 53b7760 deployed to Heroku: https://hge-ci-pull-2933.herokuapp.com
Docker image for server: hasura/graphql-engine:pull2933-53b7760a

hasura-bot · 2019-09-28T00:30:06Z

Review app for commit 4718af7 deployed to Heroku: https://hge-ci-pull-2933.herokuapp.com
Docker image for server: hasura/graphql-engine:pull2933-4718af70

hasura-bot · 2019-10-03T06:39:46Z

Review app for commit 7b1e4ca deployed to Heroku: https://hge-ci-pull-2933.herokuapp.com
Docker image for server: hasura/graphql-engine:pull2933-7b1e4ca3

hasura-bot · 2019-10-03T07:45:54Z

Review app https://hge-ci-pull-2933.herokuapp.com is deleted

hasura/graphql-engine#2933 (comment) Fixes #2482

rakeshkky added the c/server Related to server label Sep 24, 2019

rakeshkky requested a review from lexi-lambda as a code owner September 24, 2019 13:11

rakeshkky self-assigned this Sep 24, 2019

lexi-lambda suggested changes Sep 26, 2019

View reviewed changes

rakeshkky added 2 commits September 26, 2019 11:30

update custom column names on altering/dropping columns

a7e2d6d

no-op refactor; resolve review comments

53b7760

As suggested by @lexi-lambda

rakeshkky force-pushed the custom-column-names-update branch from 63f764f to 53b7760 Compare September 26, 2019 06:12

lexi-lambda approved these changes Sep 26, 2019

View reviewed changes

tirumaraiselvan mentioned this pull request Sep 26, 2019

remote relationship create api #2850

Closed

6 tasks

Merge branch 'master' into custom-column-names-update

4718af7

Merge branch 'master' into custom-column-names-update

7b1e4ca

shahidhk merged commit 55a7885 into hasura:master Oct 3, 2019

folkertdev mentioned this pull request Nov 8, 2019

Fix #642 Bad performance/memory leak for long list literals avh4/elm-format#643

Merged

polRk pushed a commit to polRk/graphql-engine that referenced this pull request Feb 12, 2020

update custom column names on renaming/dropping columns (hasura#2933)

b9d0cdb

sjakobi mentioned this pull request Jul 8, 2020

feat(dhall-docs): Add test-setup and assertions as examples dhall-lang/dhall-haskell#1899

Merged

arybczak mentioned this pull request May 9, 2022

Add support for triggers and associated functions scrive/hpqtypes-extras#68

Merged

doyougnu mentioned this pull request Sep 9, 2022

Folding the right way optimization chapter haskellfoundation/hs-opt-handbook.github.io#43

Open

martijnbastiaan added a commit to clash-lang/clash-compiler that referenced this pull request Mar 9, 2024

Make foldl strict, like foldl'

9dc3fda

hasura/graphql-engine#2933 (comment) Fixes #2482

martijnbastiaan mentioned this pull request Mar 9, 2024

Make foldl strict, like foldl' clash-lang/clash-compiler#2691

Merged

2 tasks

NorfairKing mentioned this pull request May 7, 2024

Consider using less morally-loaded language in hints. ndmitchell/hlint#1584

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

update custom column names on renaming/dropping columns #2933

update custom column names on renaming/dropping columns #2933

rakeshkky commented Sep 24, 2019

netlify bot commented Sep 24, 2019 •

edited

Loading

hasura-bot commented Sep 24, 2019

lexi-lambda left a comment

lexi-lambda Sep 26, 2019

rakeshkky Sep 26, 2019

lexi-lambda Sep 26, 2019 •

edited

Loading

hasura-bot commented Sep 26, 2019

hasura-bot commented Sep 28, 2019

hasura-bot commented Oct 3, 2019

hasura-bot commented Oct 3, 2019

update custom column names on renaming/dropping columns #2933

update custom column names on renaming/dropping columns #2933

Conversation

rakeshkky commented Sep 24, 2019

Description

Affected components

Related Issues

Solution and Design

Steps to test and verify

netlify bot commented Sep 24, 2019 • edited Loading

hasura-bot commented Sep 24, 2019

lexi-lambda left a comment

Choose a reason for hiding this comment

lexi-lambda Sep 26, 2019

Choose a reason for hiding this comment

rakeshkky Sep 26, 2019

Choose a reason for hiding this comment

lexi-lambda Sep 26, 2019 • edited Loading

Choose a reason for hiding this comment

foldl vs foldr illustrated

foldl vs. foldr, strictly

foldl, lazily

foldr, lazily

foldl vs. foldr, lazily

Addendum: foldl and foldr on other data structures

hasura-bot commented Sep 26, 2019

hasura-bot commented Sep 28, 2019

hasura-bot commented Oct 3, 2019

hasura-bot commented Oct 3, 2019

netlify bot commented Sep 24, 2019 •

edited

Loading

lexi-lambda Sep 26, 2019 •

edited

Loading

`foldl` vs `foldr` illustrated

`foldl` vs. `foldr`, strictly

`foldl`, lazily

`foldr`, lazily

`foldl` vs. `foldr`, lazily

Addendum: `foldl` and `foldr` on other data structures