release-20.2: kvcoord: fix rangefeed retries on transport errors #67024

Merged

Commits on Jun 30, 2021

  1. kvcoord: fix rangefeed retries on transport errors

    `DistSender.RangeFeed()` was meant to retry transport errors after
    refreshing the range descriptor (invalidating the cached entry).
    However, due to an incorrect error type check (`*sendError` vs
    `sendError`), these errors failed the range feed without invalidating
    the cached range descriptor.
    
    This was particularly severe when a large number of nodes had been
    decommissioned and stale range descriptors on some nodes listed only
    decommissioned nodes. Since change feeds set up range feeds across
    many nodes and ranges in the cluster, they were likely to encounter
    these decommissioned nodes and return an error. Because the
    descriptor cache wasn't invalidated, they would keep erroring until
    the nodes were restarted and the caches flushed (often requiring a
    full cluster restart).
    
    Release note (bug fix): Change feeds now properly invalidate cached
    range descriptors and retry when encountering decommissioned nodes.
    erikgrinaker committed Jun 30, 2021
    fd7d9b7
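
The class of bug described in the commit message can be illustrated with a small, self-contained Go sketch. It is not the actual `DistSender` code. The `sendError` type here is a stand-in named after the one in the commit message, and `descCache`, `send`, `rangeFeed`, and `newCache` are hypothetical helpers invented for illustration. The sketch also assumes the error is returned by value and was checked against the pointer type; the commit message only says the two were confused.

```go
package main

import "fmt"

// sendError is a stand-in for a transport-level error (hypothetical).
type sendError struct{ msg string }

func (e sendError) Error() string { return e.msg }

// isTransportErrBuggy asserts the pointer type, so it never matches an
// error that was returned as a sendError value.
func isTransportErrBuggy(err error) bool {
	_, ok := err.(*sendError)
	return ok
}

// isTransportErrFixed asserts the value type that send actually returns.
func isTransportErrFixed(err error) bool {
	_, ok := err.(sendError)
	return ok
}

// descCache is a toy descriptor cache: range ID -> node to contact.
// "fresh" plays the role of the authoritative lookup (hypothetical).
type descCache struct {
	cached map[int]string
	fresh  map[int]string
}

func (c *descCache) lookup(rangeID int) string {
	if node, ok := c.cached[rangeID]; ok {
		return node
	}
	node := c.fresh[rangeID]
	c.cached[rangeID] = node
	return node
}

func (c *descCache) evict(rangeID int) { delete(c.cached, rangeID) }

// send fails with a sendError value when the target node is gone.
func send(node string) error {
	if node == "decommissioned" {
		return sendError{msg: "node unavailable: " + node}
	}
	return nil
}

// rangeFeed retries transport errors after evicting the cached descriptor.
// Any other error fails immediately and leaves the cache untouched.
func rangeFeed(c *descCache, rangeID int, isTransportErr func(error) bool, maxAttempts int) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = send(c.lookup(rangeID)); err == nil {
			return attempt, nil
		}
		if !isTransportErr(err) {
			return attempt, err
		}
		c.evict(rangeID)
	}
	return maxAttempts, err
}

// newCache returns a cache whose entry for range 1 is stale.
func newCache() *descCache {
	return &descCache{
		cached: map[int]string{1: "decommissioned"},
		fresh:  map[int]string{1: "n2"},
	}
}

func main() {
	_, err := rangeFeed(newCache(), 1, isTransportErrBuggy, 3)
	fmt.Println("buggy check:", err)

	attempts, err := rangeFeed(newCache(), 1, isTransportErrFixed, 3)
	fmt.Println("fixed check:", attempts, err)
}
```

With the pointer assertion, the transport error is treated as fatal and the stale entry stays in the cache, so every later call fails the same way. With the value assertion, the entry is evicted and the second attempt reaches a live node.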