Commit

wip
4kimov committed Jul 2, 2023
1 parent e0833a4 commit 616cae1
Showing 5 changed files with 36 additions and 36 deletions.
2 changes: 1 addition & 1 deletion .gitattributes
@@ -1,2 +1,2 @@
package-lock.json binary
src/blacklist.json binary
src/blocklist.json binary
12 changes: 6 additions & 6 deletions README.md
@@ -31,7 +31,7 @@ npm run lint
1. The user is not required to provide randomized input anymore (there's still support for custom IDs).
1. Better internal alphabet shuffling function.
1. With default alphabet - Hashids is using base 49 for encoding-only, whereas Sqids is using base 60.
1. Safer public IDs, with support for custom blacklist of words.
1. Safer public IDs, with support for custom blocklist of words.
1. Separators are no longer limited to characters "c, s, f, h, u, i, t". Instead, it's one rotating separator assigned on the fly.
1. Simpler & smaller implementation: only "encode", "decode", "minValue", "maxValue" functions.

@@ -53,22 +53,22 @@ Here's how encoding works:
- If this is the first time, a throwaway number is prepended to the input array.
- Numbers are encoded again to generate a new ID (this time partitioned).
- If the `minLength` requirement is still not met, a new ID is composed in this way: the `prefix` character + a slice of the alphabet to make up the missing length + the rest of the ID without the `prefix` character.
1. If the blacklist function matches the generated ID:
1. If the blocklist function matches the generated ID:
- If this is the first time, a throwaway number is prepended to the input array & encoding restarts (this time partitioned).
- If the throwaway number has also matched the blacklist, then the throwaway number is incremented & encoding restarts.
- If the throwaway number has also matched the blocklist, then the throwaway number is incremented & encoding restarts.

Decoding is the same process but in reverse, with a few exceptions:

- Once the `partition` character is found, everything to the left of it gets thrown away.
- There is nothing done regarding `blacklist` and `minLength` requirements, those are used for encoding.
- There is nothing done regarding `blocklist` and `minLength` requirements, those are used for encoding.

## 📦 Porting to a new language

Implementations of new languages are more than welcome! To start:

1. Make sure you have access to the org's repo. The format is `https://github.com/sqids/sqids-[LANGUAGE]`. If you don't have access, ask one of the maintainers to add you; if it doesn't exist, ask [@4kimov](https://github.com/4kimov).
1. The main spec is here: <https://github.com/sqids/sqids-spec/blob/main/src/index.ts>. It's under 400 lines of code and heavily commented. Comments are there for clarity, they don't have to exist in your own implementation.
1. Please use the blacklist from <https://github.com/sqids/sqids-blacklist> (copy and paste the output it gives you into your own code). It will contain the most up-to-date list. Do not copy and paste the blacklist from other implementations, as they might not be up-to-date.
1. Please use the blocklist from <https://github.com/sqids/sqids-blocklist> (copy and paste the output it gives you into your own code). It will contain the most up-to-date list. Do not copy and paste the blocklist from other implementations, as they might not be up-to-date.
1. Be sure to implement unit tests. We want to make sure all implementations produce the same IDs. Unit tests are here: <https://github.com/sqids/sqids-spec/tree/main/tests>.
1. If you're publishing to a package manager, please add a co-maintainer so more than one person has access.
1. When done, please let [@4kimov](https://github.com/4kimov) know so we can update the website.
@@ -77,7 +77,7 @@ Implementations of new languages are more than welcome! To start:

- The reason `prefix` character is used is to randomize sequential inputs (eg: [0, 1], [0, 2], [0, 3]). Without the extra `prefix` character embedded into the ID, the output would start with the same characters.
- Internal shuffle function does not use random input. It consistently produces the same output.
- If new words are blacklisted (or removed from the blacklist), the `encode()` function might produce new IDs, but the `decode()` function would still work for old/blocked IDs, plus new IDs. So, there's more than one ID that can be produced for same numbers.
- If new words are blocked (or removed from the blocklist), the `encode()` function might produce new IDs, but the `decode()` function would still work for old/blocked IDs, plus new IDs. So, there's more than one ID that can be produced for same numbers.

## 🍻 License

src/blacklist.json → src/blocklist.json
File renamed without changes.
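
The blocklist handling described in the README hunk above (prepend a throwaway number on the first match, then increment it on each further match) can be sketched as a small retry loop. This is a minimal illustration, not the spec's implementation: `encodeWithBlocklist`, `encodeOnce`, and `isBlocked` are hypothetical names, the encoder is injected rather than real, and starting the throwaway number at 0 is an assumption.

```typescript
// Hypothetical signature for a single encoding pass; `partitioned` marks
// that the first number in the array is a throwaway number.
type EncodeOnce = (numbers: number[], partitioned: boolean) => string;

function encodeWithBlocklist(
  numbers: number[],
  encodeOnce: EncodeOnce,
  isBlocked: (id: string) => boolean,
  maxValue: number = Number.MAX_SAFE_INTEGER
): string {
  // First attempt: plain encoding, no throwaway number.
  let id = encodeOnce(numbers, false);

  // -1 means "no throwaway number has been prepended yet".
  let throwaway = -1;

  while (isBlocked(id)) {
    // First match: start the throwaway number at 0 (assumed starting value).
    // Later matches: increment it and encode again.
    const next = throwaway + 1;
    if (next > maxValue) {
      throw new Error('Ran out of range checking against the blocklist');
    }
    throwaway = next;
    id = encodeOnce([throwaway].concat(numbers), true);
  }

  return id;
}
```

With a toy encoder that joins numbers with `-`, blocking the first two candidates makes the loop settle on the third one. Decoding is unaffected, because everything to the left of the `partition` character is thrown away.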
32 changes: 16 additions & 16 deletions src/index.ts
@@ -1,9 +1,9 @@
import defaultBlacklist from './blacklist.json';
import defaultBlocklist from './blocklist.json';

type SqidsOptions = {
alphabet?: string;
minLength?: number;
blacklist?: Set<string>;
blocklist?: Set<string>;
};

export const defaultOptions = {
@@ -12,18 +12,18 @@ export const defaultOptions = {
// `minLength` is the minimum length IDs should be
minLength: 0,
// a list of words that should not appear anywhere in the IDs
blacklist: new Set<string>()
blocklist: new Set<string>()
};

export default class Sqids {
private alphabet: string;
private minLength: number;
private blacklist: Set<string>;
private blocklist: Set<string>;

constructor(options?: SqidsOptions) {
const alphabet = options?.alphabet ?? defaultOptions.alphabet;
const minLength = options?.minLength ?? defaultOptions.minLength;
const blacklist = options?.blacklist ?? new Set<string>(defaultBlacklist);
const blocklist = options?.blocklist ?? new Set<string>(defaultBlocklist);

// check the length of the alphabet
if (alphabet.length < 5) {
@@ -46,25 +46,25 @@ export default class Sqids {
);
}

// clean up blacklist:
// 1. all blacklist words should be lowercase
// clean up blocklist:
// 1. all blocklist words should be lowercase
// 2. no words less than 3 chars
// 3. if some words contain chars that are not in the alphabet, remove those
const filteredBlacklist = new Set<string>();
const filteredBlocklist = new Set<string>();
const alphabetChars = alphabet.split('');
for (const word of blacklist) {
for (const word of blocklist) {
if (word.length >= 3) {
const wordChars = word.split('');
const intersection = wordChars.filter((c) => alphabetChars.includes(c));
if (intersection.length == wordChars.length) {
filteredBlacklist.add(word.toLowerCase());
filteredBlocklist.add(word.toLowerCase());
}
}
}

this.alphabet = this.shuffle(alphabet);
this.minLength = minLength;
this.blacklist = filteredBlacklist;
this.blocklist = filteredBlocklist;
}

/**
@@ -98,7 +98,7 @@ export default class Sqids {
* Internal function that encodes an array of unsigned integers into an ID
*
* @param {array.<number>} numbers Positive integers to encode into an ID
* @param {boolean} partitioned If true, the first number is always a throwaway number (used either for blacklist or padding)
* @param {boolean} partitioned If true, the first number is always a throwaway number (used either for blocklist or padding)
* @returns {string} Generated ID
*/
private encodeNumbers(numbers: number[], partitioned = false): string {
@@ -114,7 +114,7 @@ export default class Sqids {
// prefix is the first character in the generated ID, used for randomization
const prefix = alphabet.charAt(0);

// partition is the character used instead of the first separator to indicate that the first number in the input array is a throwaway number. this character is used only once to handle blacklist and/or padding. it's omitted completely in all other cases
// partition is the character used instead of the first separator to indicate that the first number in the input array is a throwaway number. this character is used only once to handle blocklist and/or padding. it's omitted completely in all other cases
const partition = alphabet.charAt(1);

// alphabet should not contain `prefix` or `partition` reserved characters
@@ -170,7 +170,7 @@ export default class Sqids {
if (partitioned) {
/* c8 ignore next 2 */
if (numbers[0] + 1 > this.maxValue()) {
throw new Error('Ran out of range checking against the blacklist');
throw new Error('Ran out of range checking against the blocklist');
} else {
numbers[0] += 1;
}
@@ -304,7 +304,7 @@ export default class Sqids {
private isBlockedId(id: string): boolean {
id = id.toLowerCase();

for (const word of this.blacklist) {
for (const word of this.blocklist) {
// no point in checking words that are longer than the ID
if (word.length <= id.length) {
if (id.length <= 3 || word.length <= 3) {
@@ -318,7 +318,7 @@ export default class Sqids {
return true;
}
} else if (id.includes(word)) {
// otherwise, check for blacklisted word anywhere in the string
// otherwise, check for blocked word anywhere in the string
return true;
}
}
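
The blocklist cleanup in the constructor hunk above can be read as a standalone function. The sketch below follows the three rules in the comment (lowercase, at least 3 characters, only characters from the alphabet); `cleanBlocklist` is a hypothetical name, and taking a plain array instead of a `Set` is a simplification for illustration.

```typescript
// Returns the subset of `words` that can be used as blocklist entries
// for the given alphabet, normalized to lowercase.
function cleanBlocklist(words: string[], alphabet: string): Set<string> {
  const alphabetChars = new Set<string>(alphabet.split(''));
  const filtered = new Set<string>();

  for (const word of words) {
    // Rule 2: skip words shorter than 3 characters.
    if (word.length < 3) {
      continue;
    }

    // Rule 3: skip words that use characters outside the alphabet.
    // As in the diff, this check runs on the word as given, before lowercasing.
    const allInAlphabet = word.split('').every((c) => alphabetChars.has(c));
    if (!allInAlphabet) {
      continue;
    }

    // Rule 1: store the word in lowercase.
    filtered.add(word.toLowerCase());
  }

  return filtered;
}
```

Because the alphabet check happens before lowercasing, an uppercase word is dropped when the alphabet contains only lowercase characters. With the alphabet `abcABC`, the inputs `ab`, `ABC`, `x!z`, and `abc` reduce to the single entry `abc`.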
26 changes: 13 additions & 13 deletions tests/blacklist.test.ts → tests/blocklist.test.ts
@@ -1,42 +1,42 @@
import { expect, test } from 'vitest';
import Sqids from '../src/index.ts';

test('if no custom blacklist param, use the default blacklist', () => {
test('if no custom blocklist param, use the default blocklist', () => {
const sqids = new Sqids();

expect.soft(sqids.decode('sexy')).toEqual([200044]);
expect.soft(sqids.encode([200044])).toBe('d171vI');
});

test(`if an empty blacklist param passed, don't use any blacklist`, () => {
test(`if an empty blocklist param passed, don't use any blocklist`, () => {
const sqids = new Sqids({
blacklist: new Set([])
blocklist: new Set([])
});

expect.soft(sqids.decode('sexy')).toEqual([200044]);
expect.soft(sqids.encode([200044])).toBe('sexy');
});

test('if a non-empty blacklist param passed, use only that', () => {
test('if a non-empty blocklist param passed, use only that', () => {
const sqids = new Sqids({
blacklist: new Set([
blocklist: new Set([
'AvTg' // originally encoded [100000]
])
});

// make sure we don't use the default blacklist
// make sure we don't use the default blocklist
expect.soft(sqids.decode('sexy')).toEqual([200044]);
expect.soft(sqids.encode([200044])).toBe('sexy');

// make sure we are using the passed blacklist
// make sure we are using the passed blocklist
expect.soft(sqids.decode('AvTg')).toEqual([100000]);
expect.soft(sqids.encode([100000])).toBe('7T1X8k');
expect.soft(sqids.decode('7T1X8k')).toEqual([100000]);
});

test('blacklist', () => {
test('blocklist', () => {
const sqids = new Sqids({
blacklist: new Set([
blocklist: new Set([
'8QRLaD', // normal result of 1st encoding, let's block that word on purpose
'7T1cd0dL', // result of 2nd encoding
'UeIe', // result of 3rd encoding is `RA8UeIe7`, let's block a substring
@@ -49,9 +49,9 @@ test('blacklist', () => {
expect.soft(sqids.decode('TM0x1Mxz')).toEqual([1, 2, 3]);
});

test('decoding blacklisted words should still work', () => {
test('decoding blocklist words should still work', () => {
const sqids = new Sqids({
blacklist: new Set(['8QRLaD', '7T1cd0dL', 'RA8UeIe7', 'WM3Limhw', 'LfUQh4HN'])
blocklist: new Set(['8QRLaD', '7T1cd0dL', 'RA8UeIe7', 'WM3Limhw', 'LfUQh4HN'])
});

expect.soft(sqids.decode('8QRLaD')).toEqual([1, 2, 3]);
@@ -61,9 +61,9 @@ test('decoding blacklisted words should still work', () => {
expect.soft(sqids.decode('LfUQh4HN')).toEqual([1, 2, 3]);
});

test('match against a short blacklisted word', () => {
test('match against a short blocklist word', () => {
const sqids = new Sqids({
blacklist: new Set(['pPQ'])
blocklist: new Set(['pPQ'])
});

expect.soft(sqids.decode(sqids.encode([1000]))).toEqual([1000]);
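
The first three tests above depend on how the constructor picks a blocklist: `options?.blocklist ?? new Set<string>(defaultBlocklist)` falls back to the default only when the option is missing, so an empty `Set` disables blocking entirely. A minimal sketch of that resolution step follows; `resolveBlocklist` and `BlocklistOptions` are hypothetical names, and the default words are passed in rather than imported from `blocklist.json`.

```typescript
type BlocklistOptions = {
  blocklist?: Set<string>;
};

// Picks the caller's blocklist if one was passed (even an empty one),
// otherwise builds a Set from the default words.
function resolveBlocklist(
  options: BlocklistOptions | undefined,
  defaultWords: string[]
): Set<string> {
  // `??` only falls through on null/undefined, so an empty Set is respected.
  return options?.blocklist ?? new Set<string>(defaultWords);
}
```

This is why `encode([200044])` returns `'sexy'` when an empty or custom blocklist is passed, but `'d171vI'` when no option is given.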
