Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect matrix initialization #291

Merged
merged 3 commits into from
Aug 24, 2023

Conversation

rexxars
Copy link
Contributor

@rexxars rexxars commented Jun 16, 2020

Unless I am missing something, the initialization of the matrix in the LCS filter is wrong. It currently puts a number (length of first array + 1) into the matrix, then immediately overwrites it.

While this shouldn't produce any bugs, I found it a little confusing when reading the code. As I see it there are two potential fixes:

  1. Start with an empty array
  2. Use the new Array(length) syntax

I assume the latter was the intention, so that's what the PR implements, but happy to change it if you want.

@amiller-gh
Copy link

You'll want to do the same on line 21 🙂

I've noticed some performance lag in this implementation of LCS with exceptionally long lists as well. In part because it creates so many objects, forcing a number of garbage collections throughout, and because of a number of repeat comparisons, nested loops, and duplicated logic.
Screen Shot 2020-07-06 at 7 35 58 PM

The entire matrix can be represented in a single array of

const matrix = new Array((len1+1) * (len2+1));

You can access any index (computed as needed) in the nested for loops by using:

const idx = (x * len1) + y;
const above = (x * len1) + y - 1;
const left = ((x - 1) * len1) + y ;

And most importantly, the new Array call does not need any instantiation. Instead, start the loops from 0 and use:

if (x === 0) { matrix[idx] = len1; }
else if (y === 0) { matrix[idx] = len2; )

before an other logic to fill default values in the same nested loop as the rest of the LCS logic.

Math.max can also be a hair slower than a simple ternary operation too and should be replaced.

Experimenting with the above changes, I was able to eke out an half second of performance gains from a ~3s) LCS comparison of two 10,000 item string arrays. Would be great to see it formalized and integrated!

@rexxars
Copy link
Contributor Author

rexxars commented Jul 19, 2020

@amiller-gh Why don't you send a pull request for these changes? That sounds like a great improvement!

Copy link
Collaborator

@Methuselah96 Methuselah96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Sorry for the delayed response.

@Methuselah96 Methuselah96 changed the title fix incorrect matrix initialization Fix incorrect matrix initialization Aug 24, 2023
@Methuselah96 Methuselah96 merged commit 4cb5805 into benjamine:master Aug 24, 2023
1 check passed
@rexxars rexxars deleted the fix-lcs-matrix-init branch December 15, 2023 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants