Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the algorithm for satisfying the min length #9

Closed
wants to merge 1 commit into from

Conversation

aradalvand
Copy link
Contributor

@aradalvand aradalvand commented Sep 1, 2023

Resolves #6

I was told that if I came up with an "elegant fix (meaning does not increase complexity of code and has clear benefits)" I could make a PR.

The benefits that this change offers:

  • Adds only as many junk characters as necessary to satisfy the min length requirement (and thereby fixes this problem, bringing about a more sane behavior in this regard)
  • Simplifies the min length algorithm — bar the comments, there are now fewer lines of code in the encoder in total and fewer branches than there were before (so, it doesn't increase the complexity).

The only "con" of this approach (or you could say its "cost"), is one additional reserved character. But I think the advantages justify that.

This also does NOT suffer from the "accidentally appending a blocked substring" problem which you alluded to in this comment, because if the junk characters that the min length algorithm adds either happen to be a blocked word themselves, or happen to form a blocked word when prepended to the rest of the ID, the if (this.isBlockedId(id)) block right after it would take care of that, by adding a throwaway number and re-encoding. So that's a problem that's elegantly and almost "automatically" solved, so to speak.

From a more "theoretical" PoV, you could also argue that the previous technique of adding throwaway numbers is more suited to avoiding blocked words (since it causes the entire ID to change), but is a bit too "radical" as a way of just satisfying the min-length, whereas surgically adding junk characters (like this PR is doing) is more appropriate as the min-length algo.

Again, no tests yet to first get feedback on the change itself.

@aradalvand
Copy link
Contributor Author

aradalvand commented Sep 4, 2023

@4kimov ping

@4kimov
Copy link
Member

4kimov commented Sep 5, 2023

Adding an additional reserved character means taking away from the encoding alphabet, which means you're making IDs longer. There's a new design being tested with at most one reserved character.

Coincidentally, it would also solve the problem you describe.

@aradalvand
Copy link
Contributor Author

aradalvand commented Sep 5, 2023

@4kimov The new design is fantastic! Thank you so much!

Feel free to close this PR then if #11 is likely to get implemented.

@aradalvand aradalvand closed this Sep 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sqids adds too many junk characters when encoding multiple numbers with a min length specified
2 participants