Feature: apply links based on Levenshtein distance #531

Closed
akosbalasko opened this issue Sep 21, 2023 · 15 comments

Comments

@akosbalasko
Owner

matboehmer's great idea: to increase the number of links whose two ends can be matched, Yarle could calculate the Levenshtein distance between the text of the link and the titles of the notes already created, and apply the link to the note with the minimal distance.

If more than one note has the minimal distance, Yarle could, based on another setting, do one of the following:

  1. do not add a link to any of them (results in link loss)
  2. link to all of the notes (results in extra links that were not set in Evernote)
  3. link to the first of them (may result in mixed-up links)

As an MVP, I would implement case 3.
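
The proposal above could be sketched roughly as follows. This is a minimal illustration, not Yarle's actual code: `levenshtein`, `pickLinkTargets`, and the `TiePolicy` values are hypothetical names, and the three policies simply mirror the three cases listed above.

```typescript
// Hypothetical sketch; names are illustrative and not part of Yarle's API.
type TiePolicy = "none" | "all" | "first";

// Dynamic-programming Levenshtein distance, keeping only two rows in memory.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      // deletion, insertion, substitution (or match when cost is 0)
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the note titles a link should point to, given the tie policy.
function pickLinkTargets(linkText: string, titles: string[], policy: TiePolicy): string[] {
  if (titles.length === 0) return [];
  const distances = titles.map((title) => levenshtein(linkText, title));
  const min = Math.min(...distances);
  const closest = titles.filter((_, i) => distances[i] === min);
  if (closest.length === 1 || policy === "all") return closest; // unique match, or case 2
  return policy === "first" ? [closest[0]] : [];                // case 3, or case 1
}
```

With the `"first"` policy, a link text that is equally close to several titles resolves to whichever title comes first in the list, so the order in which notes are collected would affect the result.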

@matboehmer

Thanks! Really looking forward to this one. Happy to serve as a tester.

@akosbalasko
Owner Author

Hi @matboehmer !
I've created a pre-release with this Levenshtein-distance linking feature; feel free to download it from https://github.com/akosbalasko/yarle/releases/tag/v5.8.0 and test it.
Thanks a lot!

@matboehmer

Thanks, great! How can I run the code using npx or any other way? I am not sure if npx -p yarle-evernote-to-md@5.8.0 yarle --configFile config.json uses the latest code.

@akosbalasko
Owner Author

@matboehmer
Yes, it should work as you wrote; just extend your config.json with a new property:

useLevenshteinForLinks: true

@matboehmer

Thanks, got it! However, it does not work for me. It seems that applyLinks in apply-links.js is only called once, and the if (options.useLevenshteinForLinks) block is also only entered once (I added a console output for debugging). However, the test set I posted in #530 contains 4 links. So, from my understanding, the Levenshtein lookup should also be done 4 times?

@akosbalasko
Owner Author

Hm... it iterates through the recognized links and replaces the link URLs everywhere in the notes folder. Let me check.

@akosbalasko
Owner Author

It's hard to create a real test for multiple links, because Evernote currently fails to sync for me. So it will take a bit of time, sorry.

@akosbalasko
Owner Author

@matboehmer could you please give it a try via the UI?
Thanks a lot!

@matboehmer

Same result; also does not work using the app UI. Does it work for you? Do you have some test data you could share?

@matboehmer

It works for me with your data set, but not with the one I posted here: #530 (comment)

@akosbalasko
Owner Author

I think the one you shared in the comment reflects a different issue, which cannot be resolved easily.
What I implemented is that the referenced note is recognized by the shortest Levenshtein distance to its note text.
For instance, if the text of the note is mistyped, say notA is typed instead of noteA, and no other note has a more similar name, then notA is going to be picked.

@matboehmer

matboehmer commented Oct 2, 2023

In my example data in #530 (comment), the wrong link is created as [[first-note|second note]] in both files, first-note and second-note. However, the link [[first-note|second note]] could be fixed to [[second-note|second note]] (i.e., replacing first-note with second-note) by looking up the proper link target using the Levenshtein distance.
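
A rough sketch of that lookup, assuming links have the [[target|alias]] form shown above and that the list of generated note names is available. `relinkByAlias` and `levenshtein` are hypothetical helpers written for this illustration, not functions from Yarle:

```typescript
// Hypothetical sketch; names are illustrative and not part of Yarle's API.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Rewrites each [[target|alias]] link so the target becomes the note name
// closest to the alias. Ties go to the earliest name in noteNames.
function relinkByAlias(markdown: string, noteNames: string[]): string {
  if (noteNames.length === 0) return markdown;
  const linkPattern = /\[\[([^\]|]+)\|([^\]]+)\]\]/g;
  return markdown.replace(linkPattern, (_match: string, _target: string, alias: string) => {
    let best = noteNames[0];
    let bestDistance = levenshtein(alias, best);
    for (const name of noteNames.slice(1)) {
      const distance = levenshtein(alias, name);
      if (distance < bestDistance) {
        best = name;
        bestDistance = distance;
      }
    }
    return `[[${best}|${alias}]]`;
  });
}

const fixed = relinkByAlias("See [[first-note|second note]].", ["first-note", "second-note"]);
console.log(fixed);
```

This sketch rewrites every aliased link unconditionally. A real implementation would probably restrict the lookup to links that could not be resolved otherwise, and might cap the accepted distance so that unrelated alias text is not forced onto some note.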

@akosbalasko
Owner Author

@matboehmer ,
Okay, I found a bug in the unique ID recognizer that could cause links to overlap each other. It is fixed now; I checked with your example, and as far as I can see it fixes your issue, but please confirm.
Thanks a lot!

@matboehmer

Great, thank you! Works perfectly now on my test data set and already really good on my real data set. Thank you very much for adding this feature!
