-
Notifications
You must be signed in to change notification settings - Fork 11.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core): Dedupe #10101
feat(core): Dedupe #10101
Conversation
fb0a47d
to
509ebed
Compare
509ebed
to
ce2690d
Compare
ce2690d
to
357f95e
Compare
3b024d1
to
b1d9b74
Compare
97f5711
to
ca02cf9
Compare
6b1d5b1
to
0ea7889
Compare
e82deb5
to
2a1182c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing all the comments 💟 Couple comments mainly about code readability that would be nice to address and a question about the sorting. But otherwise LGTM 🚀
items: DeduplicationItemTypes[], | ||
mode: DeduplicationMode, | ||
): DeduplicationItemTypes[] { | ||
return items.slice().sort((a, b) => (DeduplicationHelper.compareValues(mode, a, b) ? 1 : -1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sorting is not currently stable
(i.e. equal items don't have guaranteed order). Does it need to be? If it does, we should return 0
when the items are equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it took me a while to get :) and it was actually an issue, so I fixed it
return createHash('md5').update(value.toString()).digest('base64'); | ||
} | ||
|
||
private async fetchProcessedData( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally fetch sounds like it's making an http request (because of fetch api). Maybe we could rename this queryProcessedData
or selectProcessedData
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to findProcessedData to convey that it is a wrapper for findOne
export type DeduplicationScope = 'node' | 'workflow'; | ||
export type DeduplicationItemTypes = string | number; | ||
export type DeduplicationMode = 'entries' | 'latestIncrementalKey' | 'latestDate'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, more descriptive now 👌
this.validateMode(processedData, options); | ||
|
||
if (['latestIncrementalKey', 'latestDate'].includes(options.mode)) { | ||
const incomingItems = DeduplicationHelper.sortEntries(items, options.mode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is quite long and seems to consists of two branches. WDYT about moving the branches to their own methods? I.e. something like this:
if (options.mode === 'latestIncrementalKey' || options.mode === 'latestDate') {
return await this.dedupeAndRecordByLatest(...)
} else {
return await this.dedupeAndRecordByEntries(...)
}
n8n Run #7316
Run Properties:
|
Project |
n8n
|
Run status |
Passed #7316
|
Run duration | 54m 04s |
Commit |
c29aecbc15: 🌳 🖥️ browsers:node18.12.0-chrome107 🤖 ShireenMissi 🗃️ e2e/*
|
Committer | Shireen Missi |
View all properties for this run ↗︎ |
Test results | |
---|---|
Failures |
0
|
Flaky |
0
|
Pending |
0
|
Skipped |
0
|
Passing |
438
|
✅ All Cypress E2E specs passed |
Co-authored-by: Tomi Turtiainen <10324676+tomi@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
|
Co-authored-by: Tomi Turtiainen <10324676+tomi@users.noreply.github.com>
|
✅ All Cypress E2E specs passed |
Got released with |
Summary
Building on the work done in #5178
This PR creates a new version of Remove Duplicate node with the ability to remove duplicates across executions. it supports three logic types:
the deduplication happens in two contexts Node and Workflow.
Related Linear tickets, Github issues, and Community forum posts
https://linear.app/n8n/issue/NODE-1487/dedupe-p0
Review / Merge checklist
release/backport
(if the PR is an urgent fix that needs to be backported)