upstream: fix PriorityStateManager indexing #3856

Merged 3 commits on Jul 16, 2018
3 changes: 2 additions & 1 deletion source/common/upstream/upstream_impl.cc
@@ -695,7 +695,8 @@ void PriorityStateManager::updateClusterPrioritySet(
   HostVectorSharedPtr hosts(std::move(current_hosts));
   LocalityWeightsMap empty_locality_map;
   LocalityWeightsMap& locality_weights_map =
-      priority_state_.empty() ? empty_locality_map : priority_state_[priority].second;
+      priority_state_.size() > priority ? priority_state_[priority].second : empty_locality_map;
Member

How do we get into this situation? Should there be some test that covers this?

Contributor Author

I don't actually know enough about the PriorityStateManager to say, but there was a test that was invoking this undefined behavior and getting lucky, I guess, because it continued to pass. I only saw it failing when I added the assert in 89f27cb. @dio, thoughts?
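
To make the failure mode concrete, here is a minimal standalone sketch of the guarded lookup from the diff. The type aliases and the `weightsFor` helper are hypothetical stand-ins, not Envoy's actual definitions; only the `size() > priority` check and the assertion mirror the change. The old `empty()` check falls through to `operator[]` whenever the vector is non-empty but still shorter than the requested priority.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the Envoy types; only the shapes matter here.
using HostList = std::vector<std::string>;
using LocalityWeights = std::map<std::string, uint32_t>;
using PriorityState = std::vector<std::pair<HostList, LocalityWeights>>;

// Returns the weights stored for `priority`, or `fallback` when the vector has
// no entry at that index. The caller is expected to pass an empty `fallback`.
// Comparing size() against the index (rather than calling empty()) also covers
// a non-empty vector that is too short for the requested priority.
const LocalityWeights& weightsFor(const PriorityState& state, uint32_t priority,
                                  const LocalityWeights& fallback) {
  const LocalityWeights& weights =
      state.size() > priority ? state[priority].second : fallback;
  // Mirrors the ASSERT in the diff: an out-of-range priority yields an empty map.
  assert(state.size() > priority || weights.empty());
  return weights;
}
```

With one priority level populated, asking for priority 2 returns the empty fallback instead of indexing past the end of the vector.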

Member

Sorry about that. Thanks, @akonradi, for catching this. The relevant case is when we are clearing endpoints in this test case:

TEST_F(EdsTest, RemoveUnreferencedLocalities) {

Member

Should we keep the assertion?

Member

Yeah, it seems like there should be some test that fails without this fix, either via an assertion or something else?

Contributor Author

I did some digging. We caught this during the Google import because our std::vector::operator[] implementation does a bounds check, which is not in the spec. Unfortunately, there's no clang sanitizer we can use to do this here since the std::vector implementation is in code, and indexing out-of-bounds doesn't trigger an ASAN violation if the vector allocated extra space.
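
As an illustration of the difference between checked and unchecked access, the sketch below uses `std::vector::at()`, which the standard requires to throw `std::out_of_range`. The helper names are hypothetical. It deliberately never performs the out-of-bounds `operator[]` read itself, since that is undefined behavior; it only sets up the case described above, where the vector holds spare capacity beyond its size.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading `index` through the checked accessor at() is rejected.
// at() must throw std::out_of_range when index >= size(), whereas operator[]
// with the same index is undefined behavior and may appear to work.
bool checkedAccessRejects(const std::vector<int>& values, std::size_t index) {
  try {
    (void)values.at(index);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}

// Builds a vector whose capacity exceeds its size: the situation in which an
// out-of-bounds operator[] can land inside storage the vector already owns.
std::vector<int> vectorWithSpareCapacity() {
  std::vector<int> values;
  values.reserve(8);
  values.push_back(42);
  return values;
}
```

Index 3 is inside the reserved storage but outside the vector's size, so `at(3)` throws even though an unchecked read could go unnoticed.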

It looks like both libstdc++ and libc++ have their own ways to enable bounds checks. I tried enabling _GLIBCXX_DEBUG locally but it causes compile errors elsewhere that I don't want to fix in this PR. Moving forward, we should probably define both _GLIBCXX_DEBUG and _LIBCPP_DEBUG for ASAN builds, but that's going to be a future PR. Happy to open an issue if that sounds reasonable.

For now, I've added back in the ASSERT. It feels a little redundant now that the bug has been fixed but it sounds like that's preferred over doing nothing.
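
If the debug macros named in this comment were adopted for some builds, a small helper like the following could report which mode a given binary was compiled with. This is only a sketch: `boundsCheckMode` is a hypothetical name, and it assumes the macros are passed on the compiler command line (for example `-D_GLIBCXX_DEBUG`), which is not something this PR sets up.

```cpp
#include <string>

// Reports which standard-library debug macro, if any, was defined at compile time.
// The macro names are the ones mentioned in the review comment; whether they are
// actually set depends entirely on the build flags.
std::string boundsCheckMode() {
#if defined(_GLIBCXX_DEBUG)
  return "libstdc++ debug mode";
#elif defined(_LIBCPP_DEBUG)
  return "libc++ debug mode";
#else
  return "unchecked";
#endif
}
```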

Member

OK, that's fine. There is another issue already open on using _GLIBCXX_DEBUG: #2556

+  ASSERT(priority_state_.size() > priority || locality_weights_map.empty());
   LocalityWeightsSharedPtr locality_weights;
   std::vector<HostVector> per_locality;
