Successful clustering methods for village column #52

Closed
tgnadt opened this issue Oct 7, 2019 · 1 comment

tgnadt commented Oct 7, 2019

In the [second lesson, "Working with OpenRefine"](https://datacarpentry.org/openrefine-socialsci/02-working-with-openrefine),
under "Using clustering to detect possible typing errors", step 6 states:

> You should find no more clusters are found. None of the available methods offered to cluster Ruaca-Nhamuenda with Ruaca or Chirdozo with Chirodzo.

This is incorrect. For example, the nearest neighbor method with PPM selected, Radius set to "4" and Block Chars set to "1" identifies both clusters correctly, as the attached screenshot shows.
There are also other clustering settings that lead to this result.
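To illustrate why the outcome depends on the chosen settings, here is a rough sketch of nearest-neighbor style clustering in Python. It is not OpenRefine's implementation: it swaps in Levenshtein distance instead of PPM, and the blocking rule (`shares_block`) and the helper `near_pairs` are hypothetical names made up for this example. Because the distance function differs, the radius values are not comparable with the PPM setting above.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def shares_block(a: str, b: str, block_chars: int) -> bool:
    """Assumed blocking rule: only compare strings that share a
    substring of length block_chars (case-sensitive)."""
    grams = {a[i:i + block_chars] for i in range(len(a) - block_chars + 1)}
    return any(b[i:i + block_chars] in grams
               for i in range(len(b) - block_chars + 1))


def near_pairs(values, radius, block_chars):
    """Return all pairs of distinct values within `radius` edits."""
    return [
        (a, b)
        for a, b in combinations(sorted(set(values)), 2)
        if shares_block(a, b, block_chars) and levenshtein(a, b) <= radius
    ]


villages = ["Chirodzo", "Chirdozo", "Ruaca", "Ruaca-Nhamuenda"]

pairs_r4 = near_pairs(villages, radius=4, block_chars=1)
pairs_r10 = near_pairs(villages, radius=10, block_chars=1)

print(pairs_r4)
print(pairs_r10)
```

With this stand-in distance, a radius of 4 pairs Chirdozo with Chirodzo only, while a radius of 10 also pairs Ruaca with Ruaca-Nhamuenda. The point is only that the clusters found change with the distance function and the radius.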

Directly below the steps, the lesson notes that using different or additional clustering and merging methods here will lead to different results later on. I am not sure what a change here would affect, which is why I am submitting this as an issue.

@mariadelmarq
Contributor

The latest pull request, #78, addresses this issue, so it can be safely closed now.

@bencomp bencomp closed this as completed Nov 20, 2021