-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add script to quickly reproduce issue 14370 #14912
Closed
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -129,6 +129,27 @@ gofail-disable: install-gofail | |
install-gofail: | ||
cd tools/mod; go install go.etcd.io/gofail@${GOFAIL_VERSION} | ||
|
||
# Reproduce historical issues | ||
|
||
.PHONY: reproduce-issue14370 | ||
reproduce-issue14370: ./bin/etcd-v3.5.4-failpoints | ||
cp ./bin/etcd-v3.5.4-failpoints ./bin/etcd | ||
GO_TEST_FLAGS='-v --run=TestLinearizability/Issue14370 --count 100 --failfast' make test-linearizability | ||
|
||
./bin/etcd-v3.5.4-failpoints: | ||
rm -rf /tmp/etcd-release-v3.5.4/ | ||
mkdir -p /tmp/etcd-release-v3.5.4/ | ||
cd /tmp/etcd-release-v3.5.4/; \ | ||
git clone https://github.com/etcd-io/etcd.git .; \ | ||
git checkout v3.5.4; \ | ||
go get go.etcd.io/gofail@${GOFAIL_VERSION}; \ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that lines L145-L149,L151 are generic code that could have it's 'makefile' it's
|
||
(cd server; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \ | ||
(cd etcdctl; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \ | ||
(cd etcdutl; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \ | ||
FAILPOINTS=true ./build; | ||
mkdir -p ./bin | ||
cp /tmp/etcd-release-v3.5.4/bin/etcd ./bin/etcd-v3.5.4-failpoints | ||
|
||
# Cleanup | ||
|
||
clean: | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two comments:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ad 1. It will work as both gofail binary and libraries are from same version taken from main branch
GOFAIL_VERSION
.Ad 2. I can agree that we shouldn't put everything into root scripts, however I'm against adding this as a script. Makefile ensures that we keep code simple, which we will loose when we migrate everything to bash scripts. See 700 line
./scripts/tests.sh
.I think we our makefile is still pretty simple and readable. Splitting things too early is also a bad pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sooner or later, eventually we will bump new gofail version, and update all the existing failpoint names (e.g.
beforeCommit
-->backendBeforeCommit
), and also update linearizability test (e.g.backend/beforeCommit
-->backendBeforeCommit
). Will the script in this PR be still working at that time?FYI. I just created a tag v0.1.0 for gofail: https://github.com/etcd-io/gofail/releases/tag/v0.1.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Benjamin:
- Makefile shouldn't cary project's history. It should contains up-to-date receipts to work with the current code.
We should answer what's the purpose of the code:
a) Just documentation - historical note how the problem was reproductionable in 3.5.4 ?
b) Potential regression test - that can be applied to up-to-date sources to make sure the similar scenario does not happens (or even can be used with bisect).
I prefer the latter.
I'm thinking about putting this code to
/tests/repros
that would lead to:/tests/repros/Issue14370.sh
it would make sense to code only i--local
mode to cover the current code.v3.5.4-repro
with cherry picked/tests/repros/Issue14370.sh
to have a historical repro with all the toolkit compatible with v3.5.4 (assuming toolkit longterm will change and we want to keep it working)Testing any revision since v3.5.4-repro forward is as simple as:
And this mitigates risks at changes at HEAD to the new toolkit will break the script working for the older revisions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Purpose of this code is to ensure that during development of linearizability testing, we don't loose ability to reproduce historical issues. We are testing 1to1 same scenarios on main branch, however due to nature of failure injection testing it's hard to guarantee those scenarios will maintain ability to reproduce those issues. Imagine that we have a issue that can be only reproduce under high volume of PUT requests, if during development of linearizability tests we change request time proportions we might no longer be able to reproduce this.
I use those scripts as a sanity checks during linearizability test development. Unfortunately we cannot automatically test this as issues don't have 100% reproducability as they are based on race condition. I hope that in future improvements to gofail will allow us to do that by using sleeps to always reproduce a race, but we are far from it.
Those scenarios are already running on main branch. No need for local mode.
Really don't like idea of maintaining large number of repro only tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - I wrote local to stress it doesn't have a code to to
git checkout
of another branch intmp
dirI hope this will be not a large number. It's worth doing only for big issues.
Thinking what's the reliable path to replay on 3.5.4 when the main will be 3.7.x and making sure all the scripts from 3.7.x work well both: on 3.5.4 and on 3.7.4 ?
And if we don't want the repro tag... the only reliable alternative I see is:
Where
./tests/repros/repro_at.sh
have on orchestration of checkout ofref
but finally calls Issues14370.sh from current branch on thetmp
checked out code.