Use array path when delimiter is false #29

Open
wants to merge 2 commits into
base: master
19 changes: 10 additions & 9 deletions README.md
@@ -87,19 +87,20 @@ HashDiff.unpatch!(b, diff).should == a

### Options

There are six options available: `:delimiter`, `:similarity`,
`:strict`, `:numeric_tolerance`, `:strip` and `:case_insensitive`.
There are six options available: `:stringify_keys`, `:similarity`, `:strict`,
`:numeric_tolerance`, `:strip` and `:case_insensitive`.

#### `:delimiter`

You can specify `:delimiter` to be something other than the default dot. For example:
#### `:stringify_keys`

By default, object keys are converted to strings in paths. You can override this behavior by setting `:stringify_keys` to `false`.

```ruby
a = {a:{x:2, y:3, z:4}, b:{x:3, z:45}}
b = {a:{y:3}, b:{y:3, z:30}}
a = {'a' => 1}
b = {'a' => 1, :b => 2}

diff = HashDiff.diff(a, b, :delimiter => '\t')
diff.should == [['-', 'a\tx', 2], ['-', 'a\tz', 4], ['-', 'b\tx', 3], ['~', 'b\tz', 45, 30], ['+', 'b\ty', 3]]
diff = HashDiff.diff(a, b, :stringify_keys => false)
diff.should == [['+', [:b], 2]]
```
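
As a rough illustration of why array paths can be preferable to delimited strings, the sketch below joins path segments with a delimiter and shows two different paths collapsing into the same string. `delimited_path` is a hypothetical helper written for this example only; it is not part of HashDiff.

```ruby
# Hypothetical helper (not part of HashDiff): render path segments the way
# a delimited string path would.
def delimited_path(segments, delimiter = '.')
  segments.map(&:to_s).join(delimiter)
end

flat   = ['a.b']     # a single key that happens to contain the delimiter
nested = ['a', 'b']  # key 'b' nested under key 'a'

# As strings, the two paths cannot be told apart.
delimited_path(flat)    # => "a.b"
delimited_path(nested)  # => "a.b"

# As arrays, they stay distinct, and key types survive too.
flat == nested   # => false
['1'] == [1]     # => false, although both render as "1"
```

With array paths no delimiter has to be chosen or escaped, which is presumably why the `:delimiter` option can be dropped from `diff`.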

#### `:similarity`
@@ -164,7 +165,7 @@ end
diff.should == [['~', 'b', 'boat', 'truck']]
```

The yielded params of the comparison block is `|path, obj1, obj2|`, in which path is the key (or delimited compound key) to the value being compared. When comparing elements in array, the path is with the format `array[*]`. For example:
The params yielded to the comparison block are `|path, obj1, obj2|`, where `path` is the array of keys leading to the value being compared. When comparing elements in an array, the key has the format `array[*]`. For example:

```ruby
a = {a:'car', b:['boat', 'plane'] }
150 changes: 78 additions & 72 deletions lib/hashdiff/diff.rb
@@ -8,9 +8,9 @@ module HashDiff
# @param [Array, Hash] obj2
# @param [Hash] options the options to use when comparing
# * :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Integer, Float, BigDecimal to each other
# * :delimiter (String) ['.'] the delimiter used when returning nested key references
# * :numeric_tolerance (Numeric) [0] should be a positive numeric value. By default, numeric values are compared exactly; with the :numeric_tolerance option, two numeric values are treated as different only if their difference is greater than the given value.
# * :strip (Boolean) [false] whether or not to call #strip on strings before comparing
# * :stringify_keys [true] whether or not to convert object keys to strings
#
# @yield [path, value1, value2] Optional block is used to compare each value, instead of default #==. If the block returns a value other than true or false, then the other specified comparison options will be used to do the comparison.
#
@@ -25,18 +25,19 @@ module HashDiff
#
# @since 0.0.1
def self.best_diff(obj1, obj2, options = {}, &block)
options = { }.merge!(options)
options[:comparison] = block if block_given?

opts = { :similarity => 0.3 }.merge!(options)
diffs_1 = diff(obj1, obj2, opts)
options[:similarity] = 0.3
diffs_1 = diff(obj1, obj2, options)
count_1 = count_diff diffs_1

opts = { :similarity => 0.5 }.merge!(options)
diffs_2 = diff(obj1, obj2, opts)
options[:similarity] = 0.5
diffs_2 = diff(obj1, obj2, options)
count_2 = count_diff diffs_2

opts = { :similarity => 0.8 }.merge!(options)
diffs_3 = diff(obj1, obj2, opts)
options[:similarity] = 0.8
diffs_3 = diff(obj1, obj2, options)
count_3 = count_diff diffs_3

count, diffs = count_1 < count_2 ? [count_1, diffs_1] : [count_2, diffs_2]
@@ -50,9 +51,9 @@ def self.best_diff(obj1, obj2, options = {}, &block)
# @param [Hash] options the options to use when comparing
# * :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Integer, Float, BigDecimal to each other
# * :similarity (Numeric) [0.8] should be between (0, 1]. Meaningful if there are similar hashes in arrays. See {best_diff}.
# * :delimiter (String) ['.'] the delimiter used when returning nested key references
# * :numeric_tolerance (Numeric) [0] should be a positive numeric value. By default, numeric values are compared exactly; with the :numeric_tolerance option, two numeric values are treated as different only if their difference is greater than the given value.
# * :strip (Boolean) [false] whether or not to call #strip on strings before comparing
# * :stringify_keys [true] whether or not to convert object keys to strings
#
# @yield [path, value1, value2] Optional block is used to compare each value, instead of default #==. If the block returns a value other than true or false, then the other specified comparison options will be used to do the comparison.
#
@@ -68,128 +69,133 @@ def self.best_diff(obj1, obj2, options = {}, &block)
#
# @since 0.0.1
def self.diff(obj1, obj2, options = {}, &block)
opts = {
:prefix => '',
options = {
:prefix => [],
:similarity => 0.8,
:delimiter => '.',
:strict => true,
:strip => false,
:stringify_keys => true,
:numeric_tolerance => 0
}.merge!(options)

opts[:comparison] = block if block_given?
options[:comparison] = block if block_given?

diff_internal(obj1, obj2, options)
end

# @private
#
# diff two variables

def self.diff_internal(obj1, obj2, options)
prefix = options[:prefix]

# prefer to compare with provided block
result = custom_compare(opts[:comparison], opts[:prefix], obj1, obj2)
result = custom_compare(options, prefix, obj1, obj2)
return result if result

if obj1.nil? and obj2.nil?
return []
end

if obj1.nil?
return [['~', opts[:prefix], nil, obj2]]
return [['~', prefix, nil, obj2]]
end

if obj2.nil?
return [['~', opts[:prefix], obj1, nil]]
return [['~', prefix, obj1, nil]]
end

unless comparable?(obj1, obj2, opts[:strict])
return [['~', opts[:prefix], obj1, obj2]]
unless comparable?(obj1, obj2, options[:strict])
return [['~', prefix, obj1, obj2]]
end

result = []
if obj1.is_a?(Array)
changeset = diff_array(obj1, obj2, opts) do |lcs|
return diff_array(obj1, obj2, options) do |lcs|
# use a's index for similarity
lcs.each do |pair|
result.concat(diff(obj1[pair[0]], obj2[pair[1]], opts.merge(:prefix => "#{opts[:prefix]}[#{pair[0]}]")))
end
end

changeset.each do |change|
if change[0] == '-'
result << ['-', "#{opts[:prefix]}[#{change[1]}]", change[2]]
elsif change[0] == '+'
result << ['+', "#{opts[:prefix]}[#{change[1]}]", change[2]]
lcs.flat_map do |pair|
diff_internal(obj1[pair[0]], obj2[pair[1]], options.merge(:prefix => prefix + [pair[0]]))
end
end
elsif obj1.is_a?(Hash)
if opts[:prefix].empty?
prefix = ""
else
prefix = "#{opts[:prefix]}#{opts[:delimiter]}"
end
return diff_object(obj1, obj2, options)
else
return [] if compare_values(obj1, obj2, options)
return [['~', prefix, obj1, obj2]]
end
end

deleted_keys = obj1.keys - obj2.keys
common_keys = obj1.keys & obj2.keys
added_keys = obj2.keys - obj1.keys
# @private
#
# diff object
def self.diff_object(a, b, options)
prefix = options[:prefix]
change_set = []
deleted_keys = a.keys - b.keys
common_keys = a.keys & b.keys
added_keys = b.keys - a.keys

# add deleted properties
deleted_keys.sort_by{|k,v| k.to_s }.each do |k|
custom_result = custom_compare(opts[:comparison], "#{prefix}#{k}", obj1[k], nil)
# add deleted properties
deleted_keys.sort_by{|k,v| k.to_s }.each do |k|
subpath = prefix + [options[:stringify_keys] ? "#{k}" : k]
custom_result = custom_compare(options, subpath, a[k], nil)

if custom_result
result.concat(custom_result)
else
result << ['-', "#{prefix}#{k}", obj1[k]]
end
if custom_result
change_set.concat(custom_result)
else
change_set << ['-', subpath, a[k]]
end
end

# recursive comparison for common keys
common_keys.sort_by{|k,v| k.to_s }.each {|k| result.concat(diff(obj1[k], obj2[k], opts.merge(:prefix => "#{prefix}#{k}"))) }
# recursive comparison for common keys
common_keys.sort_by{|k,v| k.to_s }.each do |k|
subpath = prefix + [options[:stringify_keys] ? "#{k}" : k]
change_set.concat(diff_internal(a[k], b[k], options.merge(:prefix => subpath)))
end

# added properties
added_keys.sort_by{|k,v| k.to_s }.each do |k|
unless obj1.key?(k)
custom_result = custom_compare(opts[:comparison], "#{prefix}#{k}", nil, obj2[k])
# added properties
added_keys.sort_by{|k,v| k.to_s }.each do |k|
unless a.key?(k)
subpath = prefix + [options[:stringify_keys] ? "#{k}" : k]
custom_result = custom_compare(options, subpath, nil, b[k])

if custom_result
result.concat(custom_result)
else
result << ['+', "#{prefix}#{k}", obj2[k]]
end
if custom_result
change_set.concat(custom_result)
else
change_set << ['+', subpath, b[k]]
end
end
else
return [] if compare_values(obj1, obj2, opts)
return [['~', opts[:prefix], obj1, obj2]]
end

result
change_set
end
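
The path-building logic in `diff_object` can be tried in isolation. The following is a minimal sketch, not the library's implementation: it handles hashes only (no arrays, no comparison block, no `:strict` or tolerance handling), and `sketch_diff` is a hypothetical name introduced for this example. It mirrors the order used above: deleted keys, then common keys (recursively), then added keys, each sorted by `to_s`.

```ruby
# Minimal sketch of array-path diffing for nested hashes (assumption:
# hashes only; every other HashDiff feature is omitted).
def sketch_diff(a, b, prefix = [], stringify_keys = true)
  changes = []
  # Convert a key into a path segment, optionally as a string.
  segment = ->(k) { stringify_keys ? k.to_s : k }

  # Keys present only in a are reported as deletions.
  (a.keys - b.keys).sort_by(&:to_s).each do |k|
    changes << ['-', prefix + [segment.call(k)], a[k]]
  end

  # Keys present in both are compared, recursing into nested hashes.
  (a.keys & b.keys).sort_by(&:to_s).each do |k|
    path = prefix + [segment.call(k)]
    if a[k].is_a?(Hash) && b[k].is_a?(Hash)
      changes.concat(sketch_diff(a[k], b[k], path, stringify_keys))
    elsif a[k] != b[k]
      changes << ['~', path, a[k], b[k]]
    end
  end

  # Keys present only in b are reported as additions.
  (b.keys - a.keys).sort_by(&:to_s).each do |k|
    changes << ['+', prefix + [segment.call(k)], b[k]]
  end

  changes
end

a = { a: { x: 2, y: 3 }, b: { z: 45 } }
b = { a: { y: 3 }, b: { z: 30, 'w' => 1 } }

sketch_diff(a, b)
# => [['-', ['a', 'x'], 2], ['~', ['b', 'z'], 45, 30], ['+', ['b', 'w'], 1]]

sketch_diff(a, b, [], false)
# => [['-', [:a, :x], 2], ['~', [:b, :z], 45, 30], ['+', [:b, 'w'], 1]]
```

Because `prefix + [...]` builds a new array at every level, sibling branches never share or mutate each other's paths.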

# @private
#
# diff array using LCS algorithm
def self.diff_array(a, b, options = {})
opts = {
:prefix => '',
:similarity => 0.8,
:delimiter => '.'
}.merge!(options)

prefix = options[:prefix]
change_set = []

if a.size == 0 and b.size == 0
return []
elsif a.size == 0
b.each_index do |index|
change_set << ['+', index, b[index]]
change_set << ['+', prefix + [index], b[index]]
end
return change_set
elsif b.size == 0
a.each_index do |index|
i = a.size - index - 1
change_set << ['-', i, a[i]]
change_set << ['-', prefix + [i], a[i]]
end
return change_set
end

links = lcs(a, b, opts)
links = lcs(a, b, options)

# yield common
yield links if block_given?
change_set.concat(yield links) if block_given?

# padding the end
links << [a.size, b.size]
@@ -201,12 +207,12 @@ def self.diff_array(a, b, options = {})

# remove from a, beginning from the end
(x > last_x + 1) and (x - last_x - 2).downto(0).each do |i|
change_set << ['-', last_y + i + 1, a[i + last_x + 1]]
change_set << ['-', prefix + [last_y + i + 1], a[i + last_x + 1]]
end

# add from b, beginning from the head
(y > last_y + 1) and 0.upto(y - last_y - 2).each do |i|
change_set << ['+', last_y + i + 1, b[i + last_y + 1]]
change_set << ['+', prefix + [last_y + i + 1], b[i + last_y + 1]]
end

# update flags
8 changes: 3 additions & 5 deletions lib/hashdiff/lcs.rb
@@ -3,10 +3,8 @@ module HashDiff
#
# calculate array difference using LCS algorithm
# http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
def self.lcs(a, b, options = {})
opts = { :similarity => 0.8 }.merge!(options)

opts[:prefix] = "#{opts[:prefix]}[*]"
def self.lcs(a, b, options)
options = options.merge({:prefix => options[:prefix] + [-1]})
Review comment (Owner):
Could you add a comment here why plus -1?
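
One plausible reading of the `-1`, offered as an assumption rather than the PR author's stated rationale: the removed line set the prefix to `"#{opts[:prefix]}[*]"`, a wildcard index used while `similar?` compares candidate element pairs whose final positions are not yet known. With array paths, `-1` appears to take over the role of `[*]`. The sketch below renders an array path in the old string format under that assumption; `format_path` is a hypothetical helper, not part of HashDiff.

```ruby
# Hypothetical helper: render an array path in the old delimited style,
# treating -1 as the "any index" placeholder (assumed equivalent of "[*]").
def format_path(path, delimiter = '.')
  path.inject('') do |out, segment|
    case segment
    when -1      then out + '[*]'            # wildcard array index
    when Integer then out + "[#{segment}]"   # concrete array index
    else
      # Hash key: prepend the delimiter except at the start of the path.
      out.empty? ? segment.to_s : out + delimiter + segment.to_s
    end
  end
end

format_path(['b', -1])       # => "b[*]"
format_path(['a', 0, 'x'])   # => "a[0].x"
format_path([:a, :b])        # => "a.b"
```

Note that this rendering is lossy: with `:stringify_keys => false`, an Integer hash key would be indistinguishable from an array index, which is one more reason to document the meaning of `-1` in the code.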


return [] if a.size == 0 or b.size == 0

@@ -19,7 +17,7 @@ def self.lcs(a, b, options = {})
(b_start..b_finish).each do |bi|
lcs[bi] = []
(a_start..a_finish).each do |ai|
if similar?(a[ai], b[bi], opts)
if similar?(a[ai], b[bi], options)
topleft = (ai > 0 and bi > 0)? lcs[bi-1][ai-1][1] : 0
lcs[bi][ai] = [:topleft, topleft + 1]
elsif
24 changes: 14 additions & 10 deletions lib/hashdiff/patch.rb
@@ -6,9 +6,11 @@ module HashDiff
# Apply patch to object
#
# @param [Hash, Array] obj the object to be patched, can be an Array or a Hash
# @param [Array] changes e.g. [[ '+', 'a.b', '45' ], [ '-', 'a.c', '5' ], [ '~', 'a.x', '45', '63']]
# @param [Array] changes with either string paths or array paths:
# [[ '+', 'a.b', '45' ], [ '-', 'a.c', '5' ], [ '~', 'a.x', '45', '63']]
# [[ '+', ['a', 'b'], '45' ], [ '-', ['a', 'c'], '5' ], [ '~', ['a', 'x'], '45', '63']]
# @param [Hash] options supports following keys:
# * :delimiter (String) ['.'] delimiter string for representing nested keys in changes array
# * :delimiter (String) ['.'] delimiter string for representing nested keys in changes array, ignored for array paths
#
Review comment (Owner):
I think it's fine just to remove delimiter.

# @return the object after patch
#
@@ -17,19 +19,19 @@ def self.patch!(obj, changes, options = {})
delimiter = options[:delimiter] || '.'

changes.each do |change|
parts = decode_property_path(change[1], delimiter)
parts = change[1].is_a?(Array) ? change[1] : decode_property_path(change[1], delimiter)
last_part = parts.last

parent_node = node(obj, parts[0, parts.size-1])

if change[0] == '+'
if last_part.is_a?(Integer)
if parent_node.is_a?(Array)
parent_node.insert(last_part, change[2])
else
parent_node[last_part] = change[2]
end
elsif change[0] == '-'
if last_part.is_a?(Integer)
if parent_node.is_a?(Array)
parent_node.delete_at(last_part)
else
parent_node.delete(last_part)
@@ -45,9 +47,11 @@ def self.patch!(obj, changes, options = {})
# Unpatch an object
#
# @param [Hash, Array] obj the object to be unpatched, can be an Array or a Hash
# @param [Array] changes e.g. [[ '+', 'a.b', '45' ], [ '-', 'a.c', '5' ], [ '~', 'a.x', '45', '63']]
# @param [Array] changes with either string paths or array paths:
# [[ '+', 'a.b', '45' ], [ '-', 'a.c', '5' ], [ '~', 'a.x', '45', '63']]
# [[ '+', ['a', 'b'], '45' ], [ '-', ['a', 'c'], '5' ], [ '~', ['a', 'x'], '45', '63']]
# @param [Hash] options supports following keys:
# * :delimiter (String) ['.'] delimiter string for representing nested keys in changes array
# * :delimiter (String) ['.'] delimiter string for representing nested keys in changes array, ignored for array paths
#
Review comment (Owner):
I think we can just remove the delimiter and break compatibility.

# @return the object after unpatch
#
@@ -56,19 +60,19 @@ def self.unpatch!(obj, changes, options = {})
delimiter = options[:delimiter] || '.'

changes.reverse_each do |change|
parts = decode_property_path(change[1], delimiter)
parts = change[1].is_a?(Array) ? change[1] : decode_property_path(change[1], delimiter)
last_part = parts.last

parent_node = node(obj, parts[0, parts.size-1])

if change[0] == '+'
if last_part.is_a?(Integer)
if parent_node.is_a?(Array)
parent_node.delete_at(last_part)
else
parent_node.delete(last_part)
end
elsif change[0] == '-'
if last_part.is_a?(Integer)
if parent_node.is_a?(Array)
parent_node.insert(last_part, change[2])
else
parent_node[last_part] = change[2]
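
The switch from `last_part.is_a?(Integer)` to `parent_node.is_a?(Array)` in `patch!` and `unpatch!` seems to matter once paths are arrays: with `:stringify_keys => false`, a Hash key can itself be an Integer, so the type of the last path segment no longer says whether the parent is an Array. The sketch below is a simplified stand-in for the patch loop, assuming array paths only; `walk` and `apply_change!` are hypothetical names, and the handling of `'~'` (write the new value, the fourth element) is an assumption, since that branch is not visible in the hunk.

```ruby
# Hypothetical stand-in for HashDiff's private `node` helper: follow `parts`
# from the root and return the container found there.
def walk(obj, parts)
  parts.inject(obj) { |node, part| node[part] }
end

# Apply a single change with an array path, dispatching on the parent's
# type rather than on the type of the last path segment.
def apply_change!(obj, change)
  op, path, *values = change
  parent = walk(obj, path[0...-1])
  key = path.last

  case op
  when '+'
    if parent.is_a?(Array)
      parent.insert(key, values[0])
    else
      parent[key] = values[0]
    end
  when '-'
    if parent.is_a?(Array)
      parent.delete_at(key)
    else
      parent.delete(key)
    end
  when '~'
    parent[key] = values[1] # assumed: write the new value
  end
  obj
end

obj = { 1 => 'one', 'list' => ['a', 'c'] }

apply_change!(obj, ['+', [2], 'two'])         # Integer key on a Hash
apply_change!(obj, ['+', ['list', 1], 'b'])   # Integer index into an Array
apply_change!(obj, ['-', [1], 'one'])
apply_change!(obj, ['~', ['list', 0], 'a', 'A'])

obj # => {"list"=>["A", "b", "c"], 2=>"two"}
```

Checking the last segment's type instead would call `insert` on the Hash for the first change, and Hash has no such method.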