From 4c10dc4bac2cf6ad1fd25062d08707f0eb7c3076 Mon Sep 17 00:00:00 2001
From: lawrencegripper
Date: Thu, 25 Jul 2024 21:22:36 +0100
Subject: [PATCH 1/2] =?UTF-8?q?Speedier=20`Sqids.new`=20=F0=9F=A6=91?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

With the current implementation, a call to `Sqids.new` can take ~3-4ms.

This change improves that in two ways:

1. If the default alphabet and blocklist are used, skip filtering, as they
   already meet the rules.

   That gets `Sqids.new` down from ~3ms to 0.07ms.

2. When not using the defaults, move `alphabet.downcase` out of the `select`
   loop so we don't call it once per blocklist item.

   That gets us from 3.315ms to 0.95ms 🥳

Numbers and benchmarking approach are in this issue:

- https://github.com/sqids/sqids-ruby/issues/6
---
 lib/sqids.rb | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/lib/sqids.rb b/lib/sqids.rb
index 7f2b470..d278b01 100644
--- a/lib/sqids.rb
+++ b/lib/sqids.rb
@@ -29,9 +29,19 @@ def initialize(options = {})
             "Minimum length has to be between 0 and #{min_length_limit}"
     end
 
-    filtered_blocklist = blocklist.select do |word|
-      word.length >= 3 && (word.downcase.chars - alphabet.downcase.chars).empty?
-    end.to_set(&:downcase)
+    filtered_blocklist = if blocklist == DEFAULT_BLOCKLIST && alphabet == DEFAULT_ALPHABET
+                           # If the blocklist and alphabet are the defaults, we don't need to filter:
+                           # we already know the list is valid (lowercase, words of at least 3 chars)
+                           blocklist
+                         else
+                           # Downcase the alphabet once, rather than in the loop, to save significant time
+                           # with large blocklists
+                           downcased_alphabet = alphabet.downcase.chars
+                           # Filter the blocklist
+                           blocklist.select do |word|
+                             word.length >= 3 && (word.downcase.chars - downcased_alphabet).empty?
+                           end.to_set(&:downcase)
+                         end
 
     @alphabet = shuffle(alphabet)
     @min_length = min_length

From 6d0e5b4a53984b4d2a37e2781ec317d444b57f7e Mon Sep 17 00:00:00 2001
From: lawrencegripper
Date: Thu, 25 Jul 2024 21:29:31 +0100
Subject: [PATCH 2/2] Add note in docs about cost of `Sqids.new` with large custom blocklists.

---
 README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/README.md b/README.md
index f1eb3cd..30792d0 100644
--- a/README.md
+++ b/README.md
@@ -86,6 +86,11 @@ id = sqids.encode([1, 2, 3]) # 'se8ojk'
 numbers = sqids.decode(id) # [1, 2, 3]
 ```
 
+> [!WARNING]
+> If you provide a large custom blocklist and/or custom alphabet, calls to `Sqids.new` can take
+> ~1ms. You should create a singleton instance of `Sqids` at service start and reuse it rather than
+> repeatedly calling `Sqids.new`.
+
 ## 📝 License
 
 [MIT](LICENSE)
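
The second optimisation in patch 1/2 (hoisting `alphabet.downcase.chars` out of the `select` block) can be checked in isolation. The sketch below is a minimal, standalone illustration and does not use the gem: `ALPHABET`, `BLOCKLIST`, `filter_per_word` and `filter_hoisted` are hypothetical names and sample data invented for this example. It only demonstrates that the two shapes of the filter produce the same set; it makes no claim about timings.

```ruby
require 'set'

# Hypothetical sample data for illustration only; these are NOT the gem's
# DEFAULT_ALPHABET or DEFAULT_BLOCKLIST.
ALPHABET = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
BLOCKLIST = %w[Foo bar ab b@d quux].freeze

# Pre-patch shape: `alphabet.downcase.chars` is recomputed for every word.
def filter_per_word(blocklist, alphabet)
  blocklist.select do |word|
    word.length >= 3 && (word.downcase.chars - alphabet.downcase.chars).empty?
  end.to_set(&:downcase)
end

# Post-patch shape: downcase and split the alphabet once, before the loop.
def filter_hoisted(blocklist, alphabet)
  downcased_alphabet = alphabet.downcase.chars
  blocklist.select do |word|
    word.length >= 3 && (word.downcase.chars - downcased_alphabet).empty?
  end.to_set(&:downcase)
end

per_word = filter_per_word(BLOCKLIST, ALPHABET)
hoisted = filter_hoisted(BLOCKLIST, ALPHABET)

# Both variants keep the same words: "ab" is too short and "b@d" contains a
# character outside the alphabet, so only the other three survive.
puts per_word == hoisted
```

Because the alphabet does not change while the blocklist is being filtered, computing its downcased character array once is behaviour-preserving; the only difference is how many times that array is built.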