Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly locate part range in which read data. #903

Merged
merged 1 commit into from
Jul 19, 2024
Merged

Conversation

mgautierfr
Copy link
Collaborator

@mgautierfr mgautierfr commented Jul 16, 2024

May fix #kiwix/kiwix-android#3827 (to check).
Fix kiwix/kiwix-android#3827

As describe in https://en.cppreference.com/w/cpp/algorithm/equal_range, equal_range is undefined behavior if bool(comp(elem, value)) does not imply !bool(comp(value, elem)).

This is the case here for exemple with :

  • value = Range{min:10, max:10}

  • elem = Range{min:10, max:11}

  • comp(value, elem)
    => value.min < elem.min && value.max <= elem.min
    => 10 < 10 && 10 <= 10
    => false && true
    => false

  • comp(elem, value)
    => elem.min < value.min && elem.max <= value.min
    => 10 < 10 && 11 <= 10
    => false && false
    => false

lower_bound and upper_bound don't have such requirement on comp.

@kelson42 kelson42 added this to the 9.2.3 milestone Jul 16, 2024
@mgautierfr mgautierfr marked this pull request as ready for review July 17, 2024 16:16
@kelson42
Copy link
Contributor

This PR has been confirmed to fix the big (on kiwix-android). Waiting @veloman-yunkan review to merge and release.

Copy link
Collaborator

@veloman-yunkan veloman-yunkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered an alternative fix - make FileCompound skip 0-sized FileParts?

Comment on lines 90 to 99
#if ! defined(__APPLE__)
return equal_range(Range(offset, offset+size));
#else
// Workaround for https://github.com/openzim/libzim/issues/398
// Under MacOS the implementation of std::map::equal_range() makes
// assumptions about the properties of the key comparison function and
// abuses the std::map requirement that it must contain unique keys. As
// a result, when a map m is queried with an element k that is
// equivalent to more than one keys present in m,
// m.equal_range(k).first may be different from m.lower_bound(k) (the
// latter one returning the correct result).
const Range queryRange(offset, offset+size);
return {lower_bound(queryRange), upper_bound(queryRange)};
#endif // ! defined(__APPLE__)
const Range queryRange(offset, offset+size);
return {lower_bound(queryRange), upper_bound(queryRange)};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Regarding the commit message - it refers to the std::equal_range() algorithm whereas here std::map<K,T,C,A>::equal_range() member function is used which doesn't have the said restriction in the spec.

  2. It would be nice to preserve an adjusted comment explaining why lower_bound() + upper_bound() is used instead of equal_range().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the commit message - it refers to the std::equal_range() algorithm whereas here std::map<K,T,C,A>::equal_range() member function is used which doesn't have the said restriction in the spec.

Indeed. But from https://stackoverflow.com/questions/67042750/should-setequal-range-return-pair-setlower-bound-setupper-bound it seems we have the same issue.

And this is confirmed with https://gist.github.com/mgautierfr/dd2cb595cec4bb9c569244187c88c00d

It would be nice to preserve an adjusted comment explaining why lower_bound() + upper_bound() is used instead of equal_range().

Added

As describe in https://en.cppreference.com/w/cpp/algorithm/equal_range,
equal_range is undefined behavior if `bool(comp(elem, value))` does not imply
`!bool(comp(value, elem))`.

This is the case here for exemple with :
- value = Range{min:10, max:10}
- elem = Range{min:10, max:11}

- comp(value, elem)
 => value.min < elem.min && value.max <= elem.min
 => 10 < 10 && 10 <= 10
 => false && true
 => false

- comp(elem, value)
 => elem.min < value.min && elem.max <= value.min
 => 10 < 10 && 11 <= 10
 => false && false
 => false

 `lower_bound` and `upper_bound` don't have such requirement on `comp`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants