feat: add traverse function #59

Merged
merged 17 commits into from
Jul 12, 2024

aleclarson
Member

@aleclarson aleclarson commented Jul 2, 2024

Tip

The owner of this PR can publish a preview release by commenting /publish in this PR. Afterwards, anyone can try it out by running pnpm add radashi@pr<PR_NUMBER>.

Summary

The traverse function helps you iterate the properties of an object or the values of an iterable, while also iterating any nested plain objects and arrays deep within.

Terminology

  • visitor: The callback passed to traverse that receives each value found in the root object or an object nested within the root object.
  • visit: To “visit” a value is to invoke the visitor callback with the value.
  • traverse: To “traverse an object” is to visit the object's properties or, if the object is an iterable, its elements (a.k.a. its generated values).

Here's a (hopefully) exhaustive list of its behavior:

  • The object passed in as the root argument of traverse is always traversed. If it's an array, it's iterated with .forEach (so that sparse arrays won't have their holes visited). If it's any other iterable (e.g. a Map or a Set), it's iterated with a for..of loop. Otherwise, the object's properties are iterated.
  • By default, only the enumerable, string-keyed properties of a non-iterable object are iterated. That's because we use Object.keys by default, but the caller can provide their own “get object keys” function (the 4th argument). For example, you may pass Reflect.ownKeys to iterate all own properties, whether they're enumerable or non-enumerable and whether they have a string or symbol key.
  • The visitor callback receives these arguments: (value, key, parent, context). The context argument contains a path array (the keys used to reach this value) and a parents array (the chain of objects used to reach this value), along with some others.
  • You can skip the traversal of a nested object. When the current value is a traversable object, just call context.skip(). The skip() method also accepts an optional object argument, which allows skipping a nested object before it's been visited.
  • You can end traversal early. Return false from your visitor callback to end traversal immediately (no more values will be visited). When traversal is ended early, the traverse call itself returns false.
  • If a nested object is encountered, it won't be traversed unless it's a plain object or an array. To traverse other nested objects, you can call traverse within your visitor callback. To preserve the context.path and context.parents arrays in your nested traverse call, be sure to pass in the context from the parent traversal.
  • Circular references are accounted for.
  • The visitor callback can return a “leave callback”, which is (1) called only for traversed objects and (2) called once all properties/items of the object have been traversed.
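
To make a few of the rules above concrete, here is a minimal sketch of a traverse-like function. It is not the implementation from this PR: traverseSketch and its types are hypothetical names, it leaves out context.skip(), leave callbacks and the outer-context argument (so the key function is its 3rd argument rather than the 4th), and using a running index as the key for non-array iterables is an assumption of the sketch.

```typescript
// Hypothetical, simplified stand-in for traverse. Not the PR's actual code.
type SketchContext = { path: (keyof any)[]; parents: object[] }
type SketchVisitor = (
  value: unknown,
  key: keyof any,
  parent: object,
  context: SketchContext,
) => boolean | void

function isPlainObject(value: unknown): value is Record<keyof any, unknown> {
  if (typeof value !== 'object' || value === null) return false
  const proto = Object.getPrototypeOf(value)
  return proto === Object.prototype || proto === null
}

function traverseSketch(
  root: object,
  visitor: SketchVisitor,
  ownKeys: (parent: object) => (keyof any)[] = Object.keys,
): boolean {
  const context: SketchContext = { path: [], parents: [] }
  const seen = new Set<object>() // objects already traversed (circular guard)
  let ok = true // becomes false once the visitor returns false

  const visit = (value: unknown, key: keyof any, parent: object): void => {
    if (!ok) return
    context.path.push(key)
    if (visitor(value, key, parent, context) === false) {
      ok = false
    } else if (Array.isArray(value) || isPlainObject(value)) {
      // Only nested arrays and plain objects are traversed automatically.
      walk(value)
    }
    context.path.pop()
  }

  const walk = (parent: object): void => {
    if (seen.has(parent)) return
    seen.add(parent)
    context.parents.push(parent)
    if (Array.isArray(parent)) {
      // forEach does not call back for holes in sparse arrays.
      parent.forEach((value, index) => visit(value, index, parent))
    } else if (Symbol.iterator in parent) {
      let index = 0 // assumption: iterable items are keyed by position
      for (const value of parent as Iterable<unknown>) {
        visit(value, index++, parent)
        if (!ok) break
      }
    } else {
      for (const key of ownKeys(parent)) {
        visit((parent as Record<keyof any, unknown>)[key], key, parent)
        if (!ok) break
      }
    }
    context.parents.pop()
  }

  walk(root) // the root is always traversed
  return ok
}

// Usage: record the path of every visited value.
const sample = { a: 1, b: { c: [2, 3] }, d: new Date(0) }
const visitedPaths: string[] = []
const completed = traverseSketch(sample, (_value, _key, _parent, context) => {
  visitedPaths.push(context.path.map(String).join('.'))
})
```

In this sketch the Date stored under d is visited but not traversed, since it is neither a plain object nor an array.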

Note: We'll likely use traverse to implement deep-cloning in Radashi.

Naming

While traverse is a decent name, I'm open to other suggestions. Other common names for this type of function include:

  • walk
  • deepForEach
  • depthFirstScan

What kind of change does this PR introduce?

Feature

For any code change,

  • Related documentation has been updated, if needed
  • Related tests have been added or updated, if needed

Does this PR introduce a breaking change?

No

Bundle impact

Status  File                      Size
A       src/object/traverse.ts    1340
A       src/typed/isIterable.ts   97

@aleclarson aleclarson added the new feature This PR adds a new function or extends an existing one label Jul 2, 2024
@aleclarson aleclarson marked this pull request as draft July 3, 2024 18:08
@aleclarson aleclarson mentioned this pull request Jul 4, 2024
3 tasks
export function traverse(
root: object,
visitor: TraverseVisitor,
outerContext?: TraverseContext | null,
Member

Do we want to make outerContext a public API? Do you have any nice use case examples for that?

I think it's better to create an internalTraverse that is not exported, and export traverse as the root call of internalTraverse. That helps us keep the API simple, and we can always add the outerContext argument in the future if we find a nice use case for it.
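
For readers following along, the wrapper pattern proposed here could look roughly like the sketch below. All names and types are hypothetical and heavily simplified (string keys only, no circular-reference guard); it only illustrates the shape of the suggestion, not any code from the PR.

```typescript
// Hypothetical sketch of the "internal function + public root call" pattern.
type Ctx = { path: string[] }
type Visit = (value: unknown, key: string, parent: object, context: Ctx) => void

// Would stay unexported: threads the context through recursive calls.
function internalTraverse(root: object, visitor: Visit, context: Ctx): void {
  for (const [key, value] of Object.entries(root)) {
    context.path.push(key)
    visitor(value, key, root, context)
    if (typeof value === 'object' && value !== null) {
      internalTraverse(value, visitor, context)
    }
    context.path.pop()
  }
}

// Would be the only exported function: callers cannot supply a context.
function publicTraverse(root: object, visitor: Visit): void {
  internalTraverse(root, visitor, { path: [] })
}

const wrapperPaths: string[] = []
publicTraverse({ a: { b: 1 } }, (_value, _key, _parent, context) => {
  wrapperPaths.push(context.path.join('.'))
})
```

The trade-off is that a visitor can no longer hand its context to a nested call, which is the use case raised in the reply.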

Member Author

It's intentional. You can call traverse from within another traverse visitor if you need to traverse a class instance or a non-array iterable. By passing the context along, you maintain the context.parents and context.path properties.
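
A rough illustration of that use case, using a hypothetical walkSketch stand-in rather than the real traverse (it handles only string keys and has no circular-reference guard): the visitor re-enters the walk for a class instance and passes its context along, so the recorded paths continue from the parent traversal.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the PR's code.
interface WalkContext {
  path: (keyof any)[]
  parents: object[]
}
type WalkVisitor = (
  value: unknown,
  key: keyof any,
  parent: object,
  context: WalkContext,
) => void

// True for arrays and plain objects, the only values walked automatically.
const isPlain = (value: unknown): value is object => {
  if (typeof value !== 'object' || value === null) return false
  if (Array.isArray(value)) return true
  const proto = Object.getPrototypeOf(value)
  return proto === Object.prototype || proto === null
}

function walkSketch(
  root: object,
  visitor: WalkVisitor,
  outerContext?: WalkContext | null,
): void {
  // Reusing the outer context keeps path/parents continuous across calls.
  const context: WalkContext = outerContext ?? { path: [], parents: [] }
  context.parents.push(root)
  for (const key of Object.keys(root)) {
    const value = (root as Record<string, unknown>)[key]
    context.path.push(key)
    visitor(value, key, root, context)
    if (isPlain(value)) walkSketch(value, visitor, context)
    context.path.pop()
  }
  context.parents.pop()
}

class Point {
  x: number
  y: number
  constructor(x: number, y: number) {
    this.x = x
    this.y = y
  }
}

const pointPaths: string[] = []
const pointVisitor: WalkVisitor = (value, _key, _parent, context) => {
  pointPaths.push(context.path.map(String).join('.'))
  // Class instances are not walked automatically, so opt in explicitly.
  if (value instanceof Point) walkSketch(value, pointVisitor, context)
}
walkSketch({ origin: new Point(0, 0) }, pointVisitor)
```

Without the third argument, the nested call would start from an empty path and the x and y entries would lose their origin prefix.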

export type TraverseVisitor = (
value: unknown,
key: keyof any,
parent: object,
Member

We already have parent inside context, so we could drop that argument altogether:

 _.traverse(obj, (value, key, {parent}) => {
})

For me, that API looks nicer. What do you think?

Member Author

@aleclarson aleclarson Jul 4, 2024

This was intentional. The idea is to support any function that implements a “standard” enumeration signature. By matching the signature of Array.prototype.forEach callbacks for example, we leave open opportunities for code reuse.
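
As a small illustration of that reuse argument, the sketch below passes one callback, written against the (value, key, parent) shape, to both Array.prototype.forEach and Map.prototype.forEach. The callback name is hypothetical. Per the signature described in this PR, the same function should also be usable as a traverse visitor, though that is not exercised here.

```typescript
// One callback written against the "standard" (value, key, parent) shape.
const seenEntries: string[] = []
const logEntry = (value: unknown, key: keyof any, parent: object): void => {
  const kind = Array.isArray(parent) ? 'array' : 'object'
  seenEntries.push(`${String(key)}=${String(value)} (${kind})`)
}

// Both built-in forEach methods call back with (value, key, container).
const letters = ['a', 'b']
letters.forEach(logEntry)

const scores = new Map([['x', 1]])
scores.forEach(logEntry)
```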

* the `visitor` callback. If that's necessary, you'll want to clone
* it first.
*/
readonly parents: readonly object[]
Member

I have a hacky solution for this problem. We can expose parents through a getter (or maybe an explicit getParents method would be even better?) and copy the array inside the getter, so the copy is lazy. To prevent copying a stale array (outside the visitor callback), we can keep an iterationIdx in the context and increment it each time we make a recursive call. The code could look something like:

const currentRunIdx = iterationIdx;
const context = {
    ...outerContext,
    // Copy the parents array lazily, and only while this visit is current.
    get parents() {
         if (currentRunIdx !== iterationIdx) {
              throw new Error("traverse context was accessed outside of the callback")
         }

         return [...parents];
    }
}

The good thing is that we don't shadow the error here. But with such an API, we copy the array every time the user accesses it, which is not performant.

In the end, I don't know how to solve it better.
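
A self-contained version of that idea, for illustration only. makeContext and the module-level iterationIdx counter are hypothetical; the sketch shows how the lazy copy and the stale-access check would behave, not how traverse actually works.

```typescript
// Hypothetical sketch of the lazy-copy getter idea; not code from the PR.
let iterationIdx = 0

function makeContext(parents: object[]): { readonly parents: object[] } {
  const currentRunIdx = iterationIdx
  return {
    get parents(): object[] {
      if (currentRunIdx !== iterationIdx) {
        throw new Error('traverse context was accessed outside of the callback')
      }
      return [...parents] // copied on every access, never ahead of time
    },
  }
}

const liveParents: object[] = [{ id: 1 }]
const ctx = makeContext(liveParents)

// Accessing the getter yields a snapshot that later mutation cannot change.
const snapshot = ctx.parents
liveParents.push({ id: 2 })

// Simulate the traversal moving on: the old context is now stale.
iterationIdx++
let staleAccessThrew = false
try {
  void ctx.parents
} catch {
  staleAccessThrew = true
}
```

Each read of ctx.parents allocates a new array, which is the performance cost mentioned above.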

Member Author

The getter isn't a bad idea, but it means we'd have to define another undocumented property for the mutable array. I believe the current behavior is well-documented enough and misuse is noticeable enough, so it should be quick/easy for developers to debug their misuse without needing to complicate traverse further.

@aleclarson aleclarson marked this pull request as ready for review July 11, 2024 20:28
@aleclarson aleclarson force-pushed the main branch 2 times, most recently from 2154f96 to 6a4b4f6 Compare July 12, 2024 00:23
@aleclarson aleclarson merged commit 2231c0e into main Jul 12, 2024
4 checks passed
@aleclarson aleclarson deleted the feat/traverse branch July 12, 2024 00:39