services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. #4889

Merged

Conversation

urvisavla
Contributor

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

Protect the currentState variable with a mutex to avoid a race condition: UpdateCoreLedgerState() reads currentState in one goroutine while the ingestion loop updates it in another.
Since the ingestion loop updates the state only every few seconds and UpdateCoreLedgerState() executes once per second, lock contention should be rare and the performance impact is expected to be minimal.
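
For readers unfamiliar with the pattern, here is a minimal sketch of guarding a state field with a mutex. It is not the actual Horizon code: the `state` type, the `mu` field name, and `setCurrentState` are hypothetical stand-ins, and only `currentState` and `GetCurrentState` are names taken from this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// state is a hypothetical stand-in for the ingestion state machine's states.
type state string

// system is a simplified, hypothetical version of the ingestion system.
// It holds only the current state and the mutex that guards it.
type system struct {
	mu           sync.Mutex // assumed field name; guards currentState
	currentState state
}

// GetCurrentState returns a copy of the state while holding the lock,
// so a concurrent writer can never be observed mid-update.
func (s *system) GetCurrentState() state {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.currentState
}

// setCurrentState replaces the state while holding the same lock.
func (s *system) setCurrentState(next state) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.currentState = next
}

func main() {
	s := &system{currentState: "start"}
	s.setCurrentState("resume")
	fmt.Println(s.GetCurrentState())
}
```

Both accessors take the same lock, so every read of the field is ordered with respect to every write. Callers that bypass the accessors and touch the field directly would reintroduce the race.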

Why

see #4888

@urvisavla urvisavla force-pushed the 4888/integration-test-race-condition branch from 5ac40f1 to b8ddf99 Compare June 1, 2023 19:36
@urvisavla urvisavla changed the title Protect 'currentState' variable using Mutex to prevent race condition. services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. Jun 1, 2023
@urvisavla urvisavla marked this pull request as ready for review June 1, 2023 20:03
@urvisavla
Contributor Author

@tsachiherman I'm not sure how to test this race condition in a unit test, but the integration tests with -race seem to be passing.

@urvisavla urvisavla self-assigned this Jun 1, 2023
@tsachiherman
Contributor

tsachiherman commented Jun 1, 2023

@tsachiherman I'm not sure how to test this race condition in a unit test, but the integration tests with -race seem to be passing.

I was hoping for something along the lines of

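func TestCurrentStateRaceCondition(t *testing.T) {
	// Minimal config: just enough for NewSystem to build the ingestion system.
	config := Config{
		CoreSession:              &db.Session{DB: &sqlx.DB{}},
		HistorySession:           &db.Session{DB: &sqlx.DB{}},
		DisableStateVerification: true,
		HistoryArchiveURLs:       []string{"https://history.stellar.org/prd/core-live/core_live_001"},
		CheckpointFrequency:      64,
	}

	sys, err := NewSystem(config)
	if err != nil {
		t.Fatal(err)
	}
	s := sys.(*system)

	// Run the reader and the writer against each other for two seconds.
	timer := time.NewTimer(2 * time.Second)
	defer timer.Stop()
	getCh := make(chan bool, 1)
	doneCh := make(chan bool, 1)

	// Writer: each signal on getCh runs the state machine, which updates currentState.
	go func() {
		var cur stopState
		for range getCh {
			s.runStateMachine(cur)
		}
		close(doneCh)
	}()

	// Reader: poll GetCurrentState until the timer fires, signaling the writer each round.
forloop:
	for {
		s.GetCurrentState()
		select {
		case <-timer.C:
			break forloop
		default:
		}
		getCh <- true
	}

	// Stop the writer and wait for it to finish before the test returns.
	close(getCh)
	<-doneCh
}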

That is, emulate what happened in this case and show that the unit test, which used to fail, no longer fails.

@urvisavla urvisavla merged commit b060996 into stellar:master Jun 5, 2023
urvisavla added a commit that referenced this pull request Jun 5, 2023
urvisavla added a commit that referenced this pull request Jun 5, 2023
* services/horizon: Suppress Core timeout error (#4860)
* services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. (#4889)
* services/horizon: Modify the tests due to changes in the Begin function signature.
tsachiherman added a commit that referenced this pull request Jun 13, 2023
* updated core git ref for tests (#4879)

* update toml to support LIMIT_TX_QUEUE_SOURCE_ACCOUNT (#4882)

* horizon/ingest/processors: SAC storage entry by different key name (#4884)

* services/horizon: Reenable InvokeHostFunction integration tests (#4887)

* horizon: 4446/supress core timeout error (#4894)

* services/horizon: Suppress Core timeout error (#4860)
* services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. (#4889)
* services/horizon: Modify the tests due to changes in the Begin function signature.

* xdr: changes for auth and removal of mutli-invoke (wip)

* services/horizon: Reenable SAC integration tests (#4891)

* services/horizon: optionally add soroban-rpc to integration tests (#4892)

* Update core image

* Update stellar-xdr next commit

* Update horizon ingestion

* update txnbuild

* Updates

* Update fmt

* fix txnbuild tests

* updates

* Update horizon tests

* add ingest test

* Update operation processor test

* Update tests

* Update tests

* Remove sac test changes to resolve merge conflicts for now

* Formatting

* 4902 Add mutex for concurrent access in GetLatestLedgerSequence (#4903)

* Update tests

* Update sac test

* update tests

---------

Co-authored-by: shawn <sreuland@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>
Co-authored-by: Alfonso Acosta <alfonso@stellar.org>
Co-authored-by: urvisavla <urvi.savla@stellar.org>
@tsachiherman tsachiherman mentioned this pull request Jun 13, 2023
7 tasks
@urvisavla urvisavla linked an issue Jun 15, 2023 that may be closed by this pull request
tsachiherman added a commit that referenced this pull request Jul 3, 2023
* all: enforce simplified Golang code (#4852)

* Update completed sprint on issue/pr closed (#4857)

* Bump core image to latest stable release v19.10.0

* Add a simple test for asset case sorting in ascii (#4876)

* services/horizon: Suppress Core timeout error (#4860)

Suppress Core timeout error when ingestion state machine is in build state.

* Update CHANGELOG.md for latest release (#4828)

* Bump core image to latest release v19.11.0 (#4885)

* services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. (#4889)

* services/horizon: Update default for  --captive-core-use-db to true (#4877)

* 4856: Update default for  --captive-core-use-db to true
* Update CHANGELOG.md

* services/horizon: Improve error handling for when stellar-core crashes (#4893)

* Parse LIMIT_TX_QUEUE_SOURCE_ACCOUNT in core config

* updated changelog for 2.26.0 release notes

* Pinning and updates golang and ubuntu images

* services/horizon: Fix ledger endpoint url in HAL (#4928)

* Goreplay middleware (#4932)

* tools/goreplay-middleware: Add goreplay middleware
* Fix linter errors

---------
Co-authored-by: Bartek Nowotarski <bartek@nowotarski.info>

* all: Fix improper use of errors.Wrap (#4926)

* all: Fix improper use of errors.Wrap

`errors.Wrap` method returns nil if the first argument passed is also nil.
If `errors.Wrap` is copied from a condition like `if err != nil` to another
one which  also returns `errors.Wrap` but does not overwrite `err` before
the returned value will always be `nil`.

* Update services/horizon/internal/db2/history/claimable_balances.go

Co-authored-by: George <Shaptic@users.noreply.github.com>

---------

Co-authored-by: George <Shaptic@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>

* fix apt repo reference to focal now (#4929)

* fixed go fmt on bindata

* fixed merge conflict snippet

* fixed manual merge commit omition

---------

Co-authored-by: Alfonso Acosta <alfonso@stellar.org>
Co-authored-by: Paul Bellamy <paul@stellar.org>
Co-authored-by: Mehmet <119539688+mbsdf@users.noreply.github.com>
Co-authored-by: mlo <marta.lohova@gmail.com>
Co-authored-by: urvisavla <urvi.savla@stellar.org>
Co-authored-by: stellarsaur <126507441+stellarsaur@users.noreply.github.com>
Co-authored-by: Molly Karcher <molly@stellar.org>
Co-authored-by: Bartek Nowotarski <bartek@nowotarski.info>
Co-authored-by: George <Shaptic@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>
tsachiherman added a commit that referenced this pull request Jul 11, 2023
* all: enforce simplified Golang code (#4852)

* Update completed sprint on issue/pr closed (#4857)

* Bump core image to latest stable release v19.10.0

* Add a simple test for asset case sorting in ascii (#4876)

* services/horizon: Suppress Core timeout error (#4860)

Suppress Core timeout error when ingestion state machine is in build state.

* Update CHANGELOG.md for latest release (#4828)

* Bump core image to latest release v19.11.0 (#4885)

* services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. (#4889)

* services/horizon: Update default for  --captive-core-use-db to true (#4877)

* 4856: Update default for  --captive-core-use-db to true
* Update CHANGELOG.md

* xdr: changes for auth and removal of mutli-invoke (wip) (#4900)

* updated core git ref for tests (#4879)

* update toml to support LIMIT_TX_QUEUE_SOURCE_ACCOUNT (#4882)

* horizon/ingest/processors: SAC storage entry by different key name (#4884)

* services/horizon: Reenable InvokeHostFunction integration tests (#4887)

* horizon: 4446/supress core timeout error (#4894)

* services/horizon: Suppress Core timeout error (#4860)
* services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. (#4889)
* services/horizon: Modify the tests due to changes in the Begin function signature.

* xdr: changes for auth and removal of mutli-invoke (wip)

* services/horizon: Reenable SAC integration tests (#4891)

* services/horizon: optionally add soroban-rpc to integration tests (#4892)

* Update core image

* Update stellar-xdr next commit

* Update horizon ingestion

* update txnbuild

* Updates

* Update fmt

* fix txnbuild tests

* updates

* Update horizon tests

* add ingest test

* Update operation processor test

* Update tests

* Update tests

* Remove sac test changes to resolve merge conflicts for now

* Formatting

* 4902 Add mutex for concurrent access in GetLatestLedgerSequence (#4903)

* Update tests

* Update sac test

* update tests

---------

Co-authored-by: shawn <sreuland@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>
Co-authored-by: Alfonso Acosta <alfonso@stellar.org>
Co-authored-by: urvisavla <urvi.savla@stellar.org>

* services/horizon: Improve error handling for when stellar-core crashes (#4893)

* Parse LIMIT_TX_QUEUE_SOURCE_ACCOUNT in core config

* updated changelog for 2.26.0 release notes

* Pinning and updates golang and ubuntu images

* xdr: update per 7b403105 (#4923)

* Simplify ScError.Equals

* Add missing case for ScVal.Equals

* Add LedgerEntry.SetContractData and LedgerEntry.SetContractCode methods

* xdr updates

* fix formatting

* update

* update transfer_event_xdr.bin

* update

* update

* update

* bugfix

* update

* update

* update

* handle nil entries

* fix few linting issues

* additional linting

* Fix missing err check

* fix missing nil check

* Minor code checker fix

* Fix minor naming

* fix hard-coded values.

* warn and don't set expiration ledger

* Warn and don't set expirationLedger

* update new xdr

* update per peer review.

* Add effect for bumpFootprintExpirationOp

* Fix govet-caught bug

* update per 7b403105788e33044e089c4c2f957df8ddabaca8

* update fmt

* fix unit test

* Update core image

---------

Co-authored-by: Paul Bellamy <paul@stellar.org>
Co-authored-by: Simon Chow <simon.chow@stellar.org>

* services/horizon: Fix ledger endpoint url in HAL (#4928)

* Don't crash on LedgerCloseMetaV2 ingestion (#4927)

* xdr: update per xdr version e372df9 (#4930)

* update

* update

* update

* bugfix

* update

* update per linter

* update per peer review.

* Update to use 19.11.1-1349.fae91b092.focal~soroban

* Goreplay middleware (#4932)

* tools/goreplay-middleware: Add goreplay middleware
* Fix linter errors

---------
Co-authored-by: Bartek Nowotarski <bartek@nowotarski.info>

* Update to final core preview 10 image

* all: Fix improper use of errors.Wrap (#4926)

* all: Fix improper use of errors.Wrap

`errors.Wrap` method returns nil if the first argument passed is also nil.
If `errors.Wrap` is copied from a condition like `if err != nil` to another
one which  also returns `errors.Wrap` but does not overwrite `err` before
the returned value will always be `nil`.

* Update services/horizon/internal/db2/history/claimable_balances.go

Co-authored-by: George <Shaptic@users.noreply.github.com>

---------

Co-authored-by: George <Shaptic@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>

* fix apt repo reference to focal now (#4929)

* LedgerChangeReader: Support State Expiration & Eviction (#4941)

* Simplify ScError.Equals

* Add missing case for ScVal.Equals

* Add LedgerEntry.SetContractData and LedgerEntry.SetContractCode methods

* update generated xdr to 0f5e556

* Add LedgerCloseMetaV2 support

* Update ledgerTransaction.GetOperationEvents

* Make LedgerChangeReader emit evictions

* Fixing up after merge

* Add test for LedgerChangeReader extensions and evictions

* Fix typo

* Add xdr.LedgerEntryData.ExpirationLedgerSeq helper

* review feedback

* merge latest master to soroban-xdr-next-next (#4937)

* all: enforce simplified Golang code (#4852)

* Update completed sprint on issue/pr closed (#4857)

* Bump core image to latest stable release v19.10.0

* Add a simple test for asset case sorting in ascii (#4876)

* services/horizon: Suppress Core timeout error (#4860)

Suppress Core timeout error when ingestion state machine is in build state.

* Update CHANGELOG.md for latest release (#4828)

* Bump core image to latest release v19.11.0 (#4885)

* services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. (#4889)

* services/horizon: Update default for  --captive-core-use-db to true (#4877)

* 4856: Update default for  --captive-core-use-db to true
* Update CHANGELOG.md

* services/horizon: Improve error handling for when stellar-core crashes (#4893)

* Parse LIMIT_TX_QUEUE_SOURCE_ACCOUNT in core config

* updated changelog for 2.26.0 release notes

* Pinning and updates golang and ubuntu images

* services/horizon: Fix ledger endpoint url in HAL (#4928)

* Goreplay middleware (#4932)

* tools/goreplay-middleware: Add goreplay middleware
* Fix linter errors

---------
Co-authored-by: Bartek Nowotarski <bartek@nowotarski.info>

* all: Fix improper use of errors.Wrap (#4926)

* all: Fix improper use of errors.Wrap

`errors.Wrap` method returns nil if the first argument passed is also nil.
If `errors.Wrap` is copied from a condition like `if err != nil` to another
one which  also returns `errors.Wrap` but does not overwrite `err` before
the returned value will always be `nil`.

* Update services/horizon/internal/db2/history/claimable_balances.go

Co-authored-by: George <Shaptic@users.noreply.github.com>

---------

Co-authored-by: George <Shaptic@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>

* fix apt repo reference to focal now (#4929)

* fixed go fmt on bindata

* fixed merge conflict snippet

* fixed manual merge commit omition

---------

Co-authored-by: Alfonso Acosta <alfonso@stellar.org>
Co-authored-by: Paul Bellamy <paul@stellar.org>
Co-authored-by: Mehmet <119539688+mbsdf@users.noreply.github.com>
Co-authored-by: mlo <marta.lohova@gmail.com>
Co-authored-by: urvisavla <urvi.savla@stellar.org>
Co-authored-by: stellarsaur <126507441+stellarsaur@users.noreply.github.com>
Co-authored-by: Molly Karcher <molly@stellar.org>
Co-authored-by: Bartek Nowotarski <bartek@nowotarski.info>
Co-authored-by: George <Shaptic@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>

* Refactor `xdr.LedgerEntry.LedgerKey` method (#4942)

* Simplify ScError.Equals

* Add missing case for ScVal.Equals

* Add LedgerEntry.SetContractData and LedgerEntry.SetContractCode methods

* update generated xdr to 0f5e556

* Add LedgerCloseMetaV2 support

* Update ledgerTransaction.GetOperationEvents

* Make LedgerChangeReader emit evictions

* Fixing up after merge

* Add test for LedgerChangeReader extensions and evictions

* Fix typo

* Add xdr.LedgerEntryData.ExpirationLedgerSeq helper

* review feedback

* Refactor LedgerEntry.LedgerKey method to avoid panics, and only have a single copy

* Fix govet.sh

* Fixing govet

* Fix bug

* Remove unneeded code bits

* s/marshalling/marshaling

* Review feedback

* integration tests: Add horizon test support for new bump/restore footprint ops (#4944)

* use the new core image for local integration tests

* Add bump/restoreFootprint ops to the horizon reingestion integration tests

* services/horizon: Remove command line flag --remote-captive-core-url (#4940)

* txnbuild: Make bump and restore footprint soroban operations (#4946)

* integration tests: fix integration tests for preview 10 (#4938)

this pr fixes the invokehostfunciton_test only, sac_test and contract event tests will be addressed in separate, follow-onw pr.

* fix sac tests for preview 10 data model (#4951)

* fix merge issue.

---------

Co-authored-by: Alfonso Acosta <alfonso@stellar.org>
Co-authored-by: Paul Bellamy <paul@stellar.org>
Co-authored-by: Mehmet <119539688+mbsdf@users.noreply.github.com>
Co-authored-by: mlo <marta.lohova@gmail.com>
Co-authored-by: urvisavla <urvi.savla@stellar.org>
Co-authored-by: stellarsaur <126507441+stellarsaur@users.noreply.github.com>
Co-authored-by: chowbao <simon.chow765@gmail.com>
Co-authored-by: shawn <sreuland@users.noreply.github.com>
Co-authored-by: Molly Karcher <molly@stellar.org>
Co-authored-by: Shawn Reuland <shawn@stellar.org>
Co-authored-by: Simon Chow <simon.chow@stellar.org>
Co-authored-by: Bartek Nowotarski <bartek@nowotarski.info>
Co-authored-by: George <Shaptic@users.noreply.github.com>
tsachiherman added a commit that referenced this pull request Jul 28, 2023
* Bump golang.org/x/text from 0.3.7 to 0.3.8

Bumps [golang.org/x/text](https://github.com/golang/text) from 0.3.7 to 0.3.8.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](golang/text@v0.3.7...v0.3.8)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* all: enforce simplified Golang code (#4852)

* Update completed sprint on issue/pr closed (#4857)

* Bump core image to latest stable release v19.10.0

* Add a simple test for asset case sorting in ascii (#4876)

* services/horizon: Suppress Core timeout error (#4860)

Suppress Core timeout error when ingestion state machine is in build state.

* Update CHANGELOG.md for latest release (#4828)

* Bump core image to latest release v19.11.0 (#4885)

* services/horizon: Protect 'currentState' variable using Mutex to prevent race condition. (#4889)

* services/horizon: Update default for  --captive-core-use-db to true (#4877)

* 4856: Update default for  --captive-core-use-db to true
* Update CHANGELOG.md

* services/horizon: Improve error handling for when stellar-core crashes (#4893)

* Parse LIMIT_TX_QUEUE_SOURCE_ACCOUNT in core config

* updated changelog for 2.26.0 release notes

* Pinning and updates golang and ubuntu images

* services/horizon: Fix ledger endpoint url in HAL (#4928)

* Goreplay middleware (#4932)

* tools/goreplay-middleware: Add goreplay middleware
* Fix linter errors

---------
Co-authored-by: Bartek Nowotarski <bartek@nowotarski.info>

* all: Fix improper use of errors.Wrap (#4926)

* all: Fix improper use of errors.Wrap

`errors.Wrap` method returns nil if the first argument passed is also nil.
If `errors.Wrap` is copied from a condition like `if err != nil` to another
one which  also returns `errors.Wrap` but does not overwrite `err` before
the returned value will always be `nil`.

* Update services/horizon/internal/db2/history/claimable_balances.go

Co-authored-by: George <Shaptic@users.noreply.github.com>

---------

Co-authored-by: George <Shaptic@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>

* fix apt repo reference to focal now (#4929)

* services/horizon: Remove command line flag --remote-captive-core-url (#4940)

* Small changes - 1

* Remove ingestion filtering flag

* services/horizon: Fix Horizon connectivity to core in standalone docker (#4956)

The default value of STELLAR_CORE_URL (localhost:11626) in standalone network mode doesn't work. We need to explictly set STELLAR_CORE_URL to http://host.docker.internal:11626, to allow Horizon to access the host container's port to connect with the core container.

* Bump core image to latest release v19.12.0 (#4953)

* Add new function HideFlag

* Fix lint warnings

* services/horizon: Add optional configuration parameter NETWORK (#4949)

The PR introduces a new optional Horizon configuration parameter called NETWORK. This parameter allows users to specify the desired Stellar network, pubnet or testnet. When the NETWORK parameter is set, Horizon automatically adjusts the remaining configuration settings and generates the corresponding captive core config file.

* Update flags.go

* Add IsHidden variable

* Update IsHidden

* Remove individual IsHidden option

* Change to Hidden

* Add tests for ingestion-filtering cmd flag

* Make changes - 1

* Make changes - 2

* Remove race condition in test

* Update command_line_args_test.go

* Update command_line_args_test.go

* Update command_line_args_test.go

* Update command_line_args_test.go

* Services/horizon: Skip querying stellar-core on 127.0.0.1 when Horizon is in build state (#4977)

* Resolve Horizon CI failures caused by the failure to install the right version of libc++  (#4986)

* Update command_line_args_test.go

Update command_line_args_test.go

Remove command_line_args_test.go

* Extend timeout for integration tests

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Paul Bellamy <paul@stellar.org>
Co-authored-by: Mehmet <119539688+mbsdf@users.noreply.github.com>
Co-authored-by: mlo <marta.lohova@gmail.com>
Co-authored-by: urvisavla <urvi.savla@stellar.org>
Co-authored-by: stellarsaur <126507441+stellarsaur@users.noreply.github.com>
Co-authored-by: Molly Karcher <molly@stellar.org>
Co-authored-by: Shawn Reuland <shawn@stellar.org>
Co-authored-by: shawn <sreuland@users.noreply.github.com>
Co-authored-by: Bartek Nowotarski <bartek@nowotarski.info>
Co-authored-by: George <Shaptic@users.noreply.github.com>
Co-authored-by: Tsachi Herman <24438559+tsachiherman@users.noreply.github.com>
Co-authored-by: Aditya Vyas <aditya.vyas@stellar.org>
Co-authored-by: Aditya Vyas <adityavyas17@gmail.com>
Development

Successfully merging this pull request may close these issues.

services/horizon: Race condition found in integration tests
3 participants