Fix diffWords treating numbers and underscores as not being word characters (#554)

* Add test for bug #553

* Fix bug

* Add release notes
ExplodingCabbage authored Sep 5, 2024
1 parent f9972d6 commit 7d113b6
Showing 3 changed files with 33 additions and 1 deletion.
4 changes: 4 additions & 0 deletions release-notes.md
@@ -1,5 +1,9 @@
# Release Notes

## Future 7.0.0 release

- [#554](https://github.com/kpdecker/jsdiff/pull/554) **`diffWords` treats numbers and underscores as word characters again.** This behaviour was broken in v6.0.0.

## 6.0.0

This is a release containing many, *many* breaking changes. The objective of this release was to carry out a mass fix, in one go, of all the open bugs and design problems that required breaking changes to fix. A substantial, but exhaustive, changelog is below.
2 changes: 1 addition & 1 deletion src/diff/word.js
@@ -19,7 +19,7 @@ import { longestCommonPrefix, longestCommonSuffix, replacePrefix, replaceSuffix,
// - U+02DC ˜ ˜ Small Tilde
// - U+02DD ˝ ˝ Double Acute Accent
// Latin Extended Additional, 1E00–1EFF
-const extendedWordChars = 'a-zA-Z\\u{C0}-\\u{FF}\\u{D8}-\\u{F6}\\u{F8}-\\u{2C6}\\u{2C8}-\\u{2D7}\\u{2DE}-\\u{2FF}\\u{1E00}-\\u{1EFF}';
+const extendedWordChars = 'a-zA-Z0-9_\\u{C0}-\\u{FF}\\u{D8}-\\u{F6}\\u{F8}-\\u{2C6}\\u{2C8}-\\u{2D7}\\u{2DE}-\\u{2FF}\\u{1E00}-\\u{1EFF}';

// Each token is one of the following:
// - A punctuation mark plus the surrounding whitespace
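To make the effect of this one-line change concrete, here is a minimal standalone sketch (not the library's actual tokenizer). It builds a splitting regex from each version of the character class and compares the results. The helper `splitRuns` and its three-way alternation are assumptions made for illustration; only the two character-class strings are taken from the diff.

```javascript
// Character classes copied from the diff: before and after the fix.
const oldWordChars = 'a-zA-Z\\u{C0}-\\u{FF}\\u{D8}-\\u{F6}\\u{F8}-\\u{2C6}\\u{2C8}-\\u{2D7}\\u{2DE}-\\u{2FF}\\u{1E00}-\\u{1EFF}';
const newWordChars = 'a-zA-Z0-9_\\u{C0}-\\u{FF}\\u{D8}-\\u{F6}\\u{F8}-\\u{2C6}\\u{2C8}-\\u{2D7}\\u{2DE}-\\u{2FF}\\u{1E00}-\\u{1EFF}';

// Hypothetical helper: split text into runs of word characters,
// runs of whitespace, and single non-word characters.
function splitRuns(text, wordChars) {
  const pattern = new RegExp(`[${wordChars}]+|\\s+|[^${wordChars}]`, 'ug');
  return text.match(pattern) || [];
}

// Without digits in the class, each digit becomes its own one-character run.
console.log(splitRuns('T2 had 57m', oldWordChars));
// [ 'T', '2', ' ', 'had', ' ', '5', '7', 'm' ]

// With digits and underscore in the class, mixed words stay intact.
console.log(splitRuns('T2 had 57m', newWordChars));
// [ 'T2', ' ', 'had', ' ', '57m' ]
console.log(splitRuns('snake_case', newWordChars));
// [ 'snake_case' ]
```

Under this sketch, the old class breaks `T2` and `57m` apart at every digit, which is the kind of over-splitting the fix is meant to remove.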
28 changes: 28 additions & 0 deletions test/diff/word.js
@@ -31,6 +31,34 @@ describe('WordDiff', function() {
        '.'
      ]);
    });

    // Test for bug reported at https://github.com/kpdecker/jsdiff/issues/553
    it('should treat numbers as part of a word if not separated by whitespace or punctuation', () => {
      expect(
        wordDiff.tokenize(
          'Tea Too, also known as T2, had revenue of 57m AUD in 2012-13.'
        )
      ).to.deep.equal([
        'Tea ',
        ' Too',
        ', ',
        ' also ',
        ' known ',
        ' as ',
        ' T2',
        ', ',
        ' had ',
        ' revenue ',
        ' of ',
        ' 57m ',
        ' AUD ',
        ' in ',
        ' 2012',
        '-',
        '13',
        '.'
      ]);
    });
  });

  describe('#diffWords', function() {
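The expected tokens in the new test carry their neighbouring whitespace (for example `' also '` and `' T2'`). The following is a simplified sketch of one way to produce tokens of that shape. It is an assumption-laden illustration, not the code in `src/diff/word.js`: the function `sketchTokenize` is hypothetical, and only the character class is taken from the commit.

```javascript
// Word-character class as it stands after this commit.
const wordChars = 'a-zA-Z0-9_\\u{C0}-\\u{FF}\\u{D8}-\\u{F6}\\u{F8}-\\u{2C6}\\u{2C8}-\\u{2D7}\\u{2DE}-\\u{2FF}\\u{1E00}-\\u{1EFF}';

// Hypothetical tokenizer: split into word runs, whitespace runs and single
// punctuation characters, then attach each whitespace run to the
// non-whitespace parts on either side of it.
function sketchTokenize(text) {
  const pattern = new RegExp(`[${wordChars}]+|\\s+|[^${wordChars}]`, 'ug');
  const parts = text.match(pattern) || [];
  const isSpace = (part) => part !== undefined && /^\s+$/.test(part);

  const tokens = [];
  parts.forEach((part, i) => {
    if (isSpace(part)) {
      return; // whitespace is only emitted as part of a neighbouring token
    }
    const before = isSpace(parts[i - 1]) ? parts[i - 1] : '';
    const after = isSpace(parts[i + 1]) ? parts[i + 1] : '';
    tokens.push(before + part + after);
  });
  return tokens;
}

console.log(sketchTokenize('known as T2, had 57m in 2012-13.'));
// [ 'known ', ' as ', ' T2', ', ', ' had ', ' 57m ', ' in ', ' 2012', '-', '13', '.' ]
```

Because digits are now word characters, `T2`, `57m`, `2012` and `13` each survive as a single token, while the hyphen and full stop remain separate punctuation tokens. The sketch ignores edge cases such as whitespace-only input, which the real implementation may treat differently.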
