When linking 3+ datasets, all records that match against the master source should have same z_cluster #108

Closed
delta824 opened this issue Jan 4, 2022 · 2 comments

@delta824

delta824 commented Jan 4, 2022

When linking three or more datasets, all records that match against the master source should have the same z_cluster. The output is currently returned in pairs instead.

@sonalgoyal
Member

Thanks for reporting this. It makes sense to give the output records the same z_cluster as their match in the master source. Will fix!

@navinrathore
Contributor

For the linking phase only, the cluster id could be derived from z_id, in this case the z_id of source[0]. All records related to that source[0] record would then automatically share one unique cluster id.

Otherwise, each set of records with the same z_id would have to be given some other unique identifier.
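
A minimal sketch of the first option, in plain Python rather than the project's own code. Everything here is an assumption made for illustration: the function name `assign_cluster_ids`, the `(source_name, z_id)` record keys, and the sample pairs are hypothetical and do not come from the codebase. It only shows how pairwise link output could be collapsed so that every record matched to the same source[0] record carries that record's z_id as its cluster id.

```python
def assign_cluster_ids(linked_pairs):
    """Map each record to a cluster id taken from its master record's z_id.

    linked_pairs: iterable of (master_key, other_key) tuples, where each key
    is a hypothetical (source_name, z_id) tuple and master_key always refers
    to a record from source[0].
    """
    cluster_of = {}
    for master_key, other_key in linked_pairs:
        master_z_id = master_key[1]
        # The master record and everything linked to it share one cluster id.
        cluster_of[master_key] = master_z_id
        cluster_of[other_key] = master_z_id
    return cluster_of


# Hypothetical pairwise output from linking a master source with two others.
pairs = [
    (("master", 1), ("crm", 7)),
    (("master", 1), ("erp", 4)),
    (("master", 2), ("crm", 9)),
]

clusters = assign_cluster_ids(pairs)
for record, cluster_id in clusters.items():
    print(record, cluster_id)
```

In this sketch, a non-master record that links to two different master records keeps the cluster id of the pair seen last. Handling that case properly would need a merge step, which is closer to the second option above.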

sonalgoyal added a commit that referenced this issue Jan 5, 2022
Giving same cluster id to all records linked from multiple sources #108