Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: make algorand client persistent for test fixture #356

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

joe-p
Copy link
Contributor

@joe-p joe-p commented Dec 13, 2024

Proposed Changes

  • Restores algorand client to be persistent for a given fixture as it was in ad7dc8e

@joe-p
Copy link
Contributor Author

joe-p commented Dec 13, 2024

This is seemingly breaking tests inconsistently but not entirely sure why. I can investigate further but if anyone has some insight that would be appreciated. @robdmoore do you know why this was changed in the first place?

@neilcampbell
Copy link
Contributor

neilcampbell commented Dec 14, 2024

@joe-p Yeah that's why it was changed.
From memory it was due to parallel test execution resulting in a non thread safe like behaviour (obviously no threads in JS) on the AlgorandClient.

Because AlgorandClient is stateful, it's possible for a test to change the AlgorandClient internal state, whilst another test is also being executing. I'm assuming it's whilst an async function is being awaited or something like that.

@joe-p
Copy link
Contributor Author

joe-p commented Dec 14, 2024

Ok a couple of things

  1. This is a breaking change that was not mentioned in the release notes or migration guide
  2. This creates a noticeably worse developer experience when testing. For example, what needed to change in the reti tests is quite ugly (and also does not properly work):
    beforeEach(async () => {
        await fixture.beforeEach()
        // Propagate signers from any `beforeAll` calls
        fixture.algorand.account.setSigners(validatorMasterAlgorandClient.account)
        // Register the test account for this test with the validatorMasterClient
        validatorMasterAlgorandClient.setSignerFromAccount(fixture.context.testAccount)
        // Register any generated test accounts for this test with the validatorMasterClient
        const generator = fixture.context.generateAccount
        fixture.context.generateAccount = async (params) => {
            const account = await generator({ initialFunds: params.initialFunds, suppressLog: true })
            validatorMasterAlgorandClient.setSignerFromAccount(account)
            return account
        }
    })

As such we should try to fix this behavior.

In 5918017 I have fixed the problems caused by the race conditions, but the tests are failing locally for me because waitForIndexer seemingly does not resolve correctly. As noted in the comment, const waitForIndexer = () => new Promise((resolve) => setTimeout(resolve, 15_000)) will result in passing tests so we need to further debug waitForIndexer.

@joe-p
Copy link
Contributor Author

joe-p commented Dec 16, 2024

Once #358 is merged I can rebase this branch which should make it ready to merge

@neilcampbell
Copy link
Contributor

I'm honestly not sure that we should be caching the AlgorandClient between calls to beforeEach. Most of the time I want a completely fresh state for each test, as it generally helps with determinism.

Additionally with this change, there isn't a way to easily reset the AlgorandClient without creating a new AlgorandFixture.

Would it work in the reti case to run fixture.beforeEach using beforeAll, so beforeAll(localnet.beforeEach, 10_000)?

@joe-p
Copy link
Contributor Author

joe-p commented Dec 17, 2024

This is an undocumented breaking change that is blocking developers from updating. That alone I think is a good enough reason to revert it back. Developers I've talked (and myself) to expect the fixture to be persistent (like pytest fixtures) which is why this was made persistent in v6 in the first place. If we want to expose a way to generate a new client we can add a function to the fixture context.

@neilcampbell
Copy link
Contributor

Playing devils advocate, this change to revert to the v6 behaviour would introduce a breaking change to both v7 and v8, so could impact devs who have migrated from pre v6 versions to v7/8.

I just want to make it clear that in the current state (unless I'm missing something), developers can control the lifetime of the AlgorandClient instance, by deciding where they call beforeEach. If you want a singleton, you can use localnet.beforeEach() directly after the const localnet = algorandFixture(...). If you want per suite semantics you can use beforeAll(localnet.beforeEach) and if you want per test semantics you can use beforeEach(localnet.beforeEach). I still feel that per test semantics as the default is the safest option.

@lempira Any thoughts on this one?

@joe-p
Copy link
Contributor Author

joe-p commented Dec 17, 2024

I think v7/v8 is new enough that it is an acceptable change to make, especially considering it was undocumented. If we need to release as v9 then so be it. It's better to restore functionality than try to stay on v8.

I just want to make it clear that in the current state (unless I'm missing something), developers can control the lifetime of the AlgorandClient instance, by deciding where they call beforeEach. If you want a singleton, you can use localnet.beforeEach() directly after the const localnet = algorandFixture(...). If you want per suite semantics you can use beforeAll(localnet.beforeEach) and if you want per test semantics you can use beforeEach(localnet.beforeEach). I still feel that per test semantics as the default is the safest option.

The problem is that tests might be using other parts of the fixture, such as testAccount. There is also educational resources (ie. bootcamps) that use v6

@pbennett
Copy link

I started work on upgrading my NFD contracts and tests to algokit v8. I've wasted at least two days to no avail - it's completely broken and I've stopped entirely.
This isn't changing a few arguments to fix things (because v6->v7/v8 required touching basically EVERY single call) - this is wasting time over a broken framework.
The recent changes to v7/v8 were decided with no feedback from the people actually using these things. So far it seems like I'm the only one that's actually exercised these libraries and I'm telling you, this makes me want to abandon it all.
In some cases, the literal raw sdk would probably be more obvious and cleaner.

It just takes one look at that monstrosity of a before call that was added to Reti that Joe quoted above as part of v6->v7 integration to say that none of that has ANY business being in a test for end-user developers.
If that simple block isn't as big of a red flag as you can imagine, I don't know what to tell you. It shows something is very wrong.

If I create test accounts as a part of tests, often I'll have a series of tests that build upon state (because the state itself might be complex or time consuming to create - so its layered). Any accounts created, should be able to be signed anywhere after that point. In tests there's no reason to reset keys nor is there a risk. We're not testing vulnerabilities of in-memory key storage here guys.
There's no way signing should be isolated like this - at all. You could just had a 'signer' interface -have that be undefined by default and when undefined always use a single common 'default' global signer (which is how v6 worked?). ANY account that's generated or loaded from kmd localnet, etc. is available there for signing via ANY client instance. Need something different for whatever reason? override signer implementation. Then you're free to reset the world all you want.

@SilentRhetoric
Copy link
Contributor

A few comments, in no particular order:

  1. I would expect a fixture to persist across tests by default. Adding accounts to the client, as a specific example, does not seem like something that would break integration tests that run in parallel, as it is just adding keys to the "keyring" of the client, figuratively speaking. I have general concerns about how much state is opaquely managed by the AlgorandClient class.
  2. The suggested pattern of doing beforeAll(localnet.beforeEach) is unintuitive to me. I would expect to compose my beforeAll and beforeEach logic in my testing code, not to use beforeEach functionality baked into the AlgorandClient class.
  3. Algorand integration testing is challenging because it often requires composing together a combination of three approaches, and I think the Utils fixture should primarily enable the first one in its role as a testing fixture:
  • Persisted state, like an admin account that controls an app, "user" accounts, singleton smart contracts, etc.
  • Semi-persisted state, like a contract that experiences a series of interactions in a defined order
  • Atomic actions, like individual one-off interactions with a smart contract factory pattern

@lempira
Copy link
Contributor

lempira commented Dec 17, 2024

I haven't delved deep into the utils-ts code and tests, particularly for the v7/8 release, so take my comments and suggestions in that context.

From what I understand, the reason for removing the client's persistence from the fixture was to better control the client state when setting up and tearing down each test. If that's the case, then I agree with the general philosophy that it's better to have better control of a fixture test state over what the test may look like. The reason is that I have been burnt many times by flaky tests that work locally and don't in CI because they are usually run in multiple threads where the order of the tests usually matters because tests are trying to access or mutate the same state. I agree that the snippet that Joe added looks more gnarly, but we could make composable fixtures that abstract the most common setups so they don't look so complex when used in each test. It would also allow the reuse of those different setups as necessary.

I am surprised that client persistence (non-determinism) hasn't been an issue before. It sounds like the tests are indeed idempotent despite having that shared state. Perhaps this is because each account that is created in the tests is different and thus is isolated from other tests. This makes me less wary to revert the change as proposed in the PR. I agree with Joe since this testing fixture is used in many places and seems to have effects on external repos like RETI; this should have been documented as part of the major breaking changes. Given the importance of RETI to staking rewards, I don't think we shouldn't add additional friction to the deployment, unless it's something critical, which I don't deem this to be.

Suggested path forward:

  • Add persistence back to fixtures (i.e. merge this PR)
  • Have a separate meeting to come to an agreement on idempotent tests and fixtures as a general philosophy for all TS testing going forward. Depending on decision, we could build that into a future major version of utils. (I say TS testing, but it's really for all testing. Pytest makes this much easier, IMO, through the use of fixtures scope than the "before" in JS) I do have strong opinions about idempotent tests, as I am sure others do, and this deserves its own discussion outside of this PR.

Other questions re: this PR

  • If we merge the persistence back in, won't we also have to change all the utils tests that use the fixture as well?
  • I am not understanding why RETI tests have to be changed with this upgrade? I am guessing that the RETI code uses or has a dep on this fixture. If that's the case, why wouldn't RETI create its own fixtures for its tests?

@lempira
Copy link
Contributor

lempira commented Dec 18, 2024

I was playing around this a little more. I merged #358 to this branch, and the tests passed, but I am going to amend my suggested path forward.

One thing I noticed with this approach is that it ties you into having the AlgorandClient run in the global scope of tests. If you did want to have a test run standalone so that it sets up and tears down in a beforeEach, for example, you wouldn't be able to use this fixture because it would have the state and particularly the previous transaction logs from the previously run tests. It would be better to have the flexibility to determine the scope of the fixture that is appropriate for your test. This is particularly needed now since v7/v8 introduced the stateful AlgorandClient.

Perhaps the 'beforeEach' fn in the current implement is a misnomer and could be replaced with something that would let the dev know that you can use that in whatever scope you want. I think that's what Neil was suggesting abovebeforeAll(localnet.beforeEach).Instead of the beforeEach, it could be like beforeAll(localnet.setUp), showing that it's clear that beforeAll is determining the scope.

Since this thread is long, I am going to set up a meeting on Thursday morning to discuss this. It might be easier to talk through the details.

@robdmoore
Copy link
Contributor

robdmoore commented Dec 18, 2024

Finally had a chance to look into this.

Apart from renames, from the v6 version with AlgorandClient the following were the key changes that were made in relation to algorandFixture:

  • 05b8a7 - algorandClient was added to the context created in beforeEach so that you could destructure it in tests more conveniently
  • b2c0f9 - A fix was made to resolve a behavioural issue where algorandClient was globally cached, which broke the transaction logs and indexer waiting behaviour - this is because previously transactionLogger was recreated every beforeEach, but algorandClient was only set the first time it was executed (with the transactionLogger from that first test run)

An inadvertent change from this was what is raised here, which is technically that was a different behaviour to what was in v6 so it should have been mentioned in the release notes. This was unintentional and was simply missed given the sheer number of changes being made along with the fact that beforeEach was always meant to set up a fresh context for each test so it just naturally felt like the way it should be.

It's fair that there are situations where it may be useful to share an AlgorandClient across test runs, noting there are some downsides in terms of test isolation, but if you know what you are doing then it's fine as long as things like testAccount get regenerated each test execution.

To David's point, both behaviours are potentially useful in different circumstances so what makes the most sense is what he suggested - that this is a configurable behaviour. As such, I've put up a pull request #359 which has an implementation that provides this functionality.

It does this by allowing a new configuration to algorandFixture to control the scoping of algorand client and transaction logger. To get the v6 like behaviour (noting this implementation doesn't have the bug that was in v6 where the transaction logger in the context was incorrect in subsequent test runs, so everything runs as you would expect) with this new change you can do:

describe('MY MODULE', () => {
  const fixture = algorandFixture({ algorandScope: 'fixture' })
  beforeEach(fixture.beforeEach, 10_000)

  test('test1', async () => {
    const { algorand, transactionLogger, testAccount } = fixture.context
    // ...
  })

  test('test2', async () => {
    const { algorand, transactionLogger, testAccount } = fixture.context
    // algorand and transactionLogger are the same as test1, testAccount is different
  })
})

By default it makes sense to encourage test isolation so if you don't specify it then we assume test scoping rather than fixture scoping, which means it regenerates every time you call beforeEach. The naming of beforeEach thus still makes sense so I don't think we need to rename it.

I added a test to illustrate it: https://github.com/algorandfoundation/algokit-utils-ts/blob/algorand-fixture-scoping/src/testing/fixtures/algorand-fixture.spec.ts.

The problem with the waitForIndexer timing out was because the previous implementation was looping through all transactions and waiting for them, but when the transaction logger is global and you run lots of tests the number of transactions that are logged increases so those calls stack up and get slow. This was easy to resolve by simply waiting for the last transaction ID, this should be safe to do given transactions are generally issued sequentially in a test.

@robdmoore
Copy link
Contributor

robdmoore commented Dec 18, 2024

@pbennett re your comments:

This isn't changing a few arguments to fix things (because v6->v7/v8 required touching basically EVERY single call) - this is wasting time over a broken framework.

Can you be more specific please so we can address specific bits of feedback.

This particular issue raised by Joe is fair feedback, but as you can see above this was an unintended side-effect not a deliberate change and there's a straightforward way to resolve it once we merge #359 where you can find all algorandFixture() calls (generally only one of these per test file?) and add in{algorandScope: 'fixture'} and the v6 behaviour will return (except with the bug that was in v6 corrected).

The recent changes to v7/v8 were decided with no feedback from the people actually using these things. So far it seems like I'm the only one that's actually exercised these libraries and I'm telling you, this makes me want to abandon it all.

I'm sorry you feel that way. We did the best we could to get betas out and ask for feedback, including from yourself and devrel and made many changes as a result of that feedback and also used it on all of our codebases to try it out. I personally went out of my way to help you get this stuff working in reti and take on board your feedback and adjust the libraries based on it. The comments we received from you were that the end result of the changes were a lot clearer than the previous versions of utils.

I acknowledge that you've been on the bleeding edge of this stuff, and were using the beta versions of algorand client in v6 as an early adopter, which we appreciate, but naturally is a more painful experience than someone following the bouncing ball now it's all released as non-beta. We've put a lot of effort into making a migration experience from stateless functions to algorand client smooth and did our best to minimise breaking changes for v6 beta usage of algorand client in v6, but unfortunately had to make a series of breaking changes to get the right end result (again, largely based on feedback).

Ultimately, this series of changes are very complex, it's a big set of changes and hard to get perfect, but we remain open to and reactive to tangible feedback and truly believe the end result is a much better library for interacting with Algorand.

In some cases, the literal raw sdk would probably be more obvious and cleaner.

I'm sorry, but I have to disagree with you there. Raw alsosdk is very low level and clunky to use. But in saying that, if you have specific examples happy to hear them so we can keep iterating based on feedback as we have already been doing.

It just takes one look at that monstrosity of a before call that was added to Reti that Joe quoted above as part of v6->v7 integration to say that none of that has ANY business being in a test for end-user developers.

Agreed, but that isn't the code that makes sense to write and #359 will nullify the need for that kind of code and allow you to do the kind of shared setup state that you want to do easily.

@pbennett
Copy link

My frustration stemmed from wasting 2 days trying to figure out (unsuccessfully) why code that worked fine stopped working, combined with having to convert everything over to the complete algokit v6->[v7|v8] overhaul. Signing not working for no obvious reason, logging failing with bigint errors (in the framework) that weren't there before (and so not seeing any issues at all). It seems like I break something every time I try any new version so it feels like what I'm using hasn't actually been tested. Documentation is basically non-existent and between my inexperience w/ JS/TS, its test frameworks and what is/isn't algokit's bit of 'opinion' on how it should/shouldn't be done it feels like I'm mostly throwing darts at things hoping it works that I shouldn't have to.

Converting Reti from v6 to v7 then v8 was beyond painful. You helped immensely w/ the v6 to v7 - but didn't even touch the UI part of it and I think you probably realized how much of a change it was for people. So going from v6 to v8 - with even a little bit of a code - you basically have to touch everything.

Even in your example, I see problems:

describe('MY MODULE', () => {
  const fixture = algorandFixture({ algorandScope: 'fixture' })
  beforeEach(fixture.beforeEach, 10_000)

  test('test1', async () => {
    const { algorand, transactionLogger, testAccount } = fixture.context
    // ...
  })

  test('test2', async () => {
    const { algorand, transactionLogger, testAccount } = fixture.context
    // algorand and transactionLogger are the same as test1, testAccount is different
  })
})

the beforeEach that calls.. beforeEach with 10,000 argument (ok? 10,000 ... what?)
I would expect to define that single fixture there - once.. the fixture is already scoped for all tests in that describe via that closure.

I would expect something more like this:

describe('MY MODULE', () => {
    const { algorand, transactionLogger, testAccount } = algorandFixture().context

  test('test1', async () => {
    // ...
  })

  test('test2', async () => {
  })
})

If I want to generate new accounts, I will generate new accounts. I 100% do not expect every test to literally create a new accounts nor start in a 'clean state'. I control that with my scoping already. We're testing contracts - with state. It's like you're expecting every test to literally re-bootstrap everything which doesn't match at all how you need to test real-world contracts.

@joe-p
Copy link
Contributor Author

joe-p commented Dec 18, 2024

A fix was made to resolve a behavioural issue where algorandClient was globally cached, which broke the transaction logs and indexer waiting behaviour

The problem with the waitForIndexer timing out was because the previous implementation was looping through all transactions and waiting for them, but when the transaction logger is global and you run lots of tests the number of transactions that are logged increases so those calls stack up and get slow.

These are a mischaracterization of the bug and its relation to the AlgorandClient. The current implementation of the transaction logger and its waitForIndexer method was implemented in a way that prevents it from working when one of the transactions never gets confirmed. That's why #358 uses round, otherwise we'd need to check if each transaction is still pending or dropped by the node.

I think this is worth pointing out because it is a key factor in the crux of the issue here. This change was made based on a false premise. I believe that for any change, especially breaking ones, we should be able to map the change to a problem it is solving. Even more importantly, we should be able to map that problem back to users otherwise we will find ourselves over-engineering. Obviously missing the bug in #358 was an oversight but once the change is made and the affects downstream become evident (i.e. the above Reti code) there should be some further though on whether it really is the best path forward. I see points being raised about determinism, but it's not really clear to me what the concern is. What is the state persisted by AlgorandClient that is believed to be potentially harmful to testing? This is a breaking change, so I think we should be very clear in why it's being made.

I also think a key issue here is naming. AlgorandFixture is not really a fixture because fixtures are by definition fixed. What AlgorandFixture really provides is a bunch of helper methods that are called within beforeEach. beforeEach is also poorly named because it doesn't describe what it actually does. Stuff just happens in the background. This is why it's been quite common to see people (myself included) confused over testAccount changing between every test, because you'd assume an account from a fixture would be fixed.

@robdmoore
Copy link
Contributor

robdmoore commented Dec 19, 2024

@pbennett

My frustration stemmed from wasting 2 days trying to figure out (unsuccessfully) why code that worked fine stopped working, combined with having to convert everything over to the complete algokit v6->[v7|v8] overhaul. Signing not working for no obvious reason, logging failing with bigint errors (in the framework) that weren't there before (and so not seeing any issues at all). It seems like I break something every time I try any new version so it feels like what I'm using hasn't actually been tested. Documentation is basically non-existent and between my inexperience w/ JS/TS, its test frameworks and what is/isn't algokit's bit of 'opinion' on how it should/shouldn't be done it feels like I'm mostly throwing darts at things hoping it works that I shouldn't have to.

As I said, I totally understand your frustration and appreciate you've felt it the most since you've stayed at the bleeding edge the whole way through. The bigint thing was because algosdk 3 introduced bigints into core objects that previously weren't there so there was never a need to use a custom replacer in JSON.stringify and was a straightforward fix. The signing problem I assume was because of the issue raised in this issue, which we have a proposed fix for. We absolutely have testing in place both within the library and in other libraries and applications, which we've been trying these changes on. These libraries are generic and can be used in so many different ways so it's impossible to catch every combination of things when releasing such a massive set of changes like this. We have put a lot of effort to try and document everything, but the act of writing documentation is a neverending battle of course and very open to understanding the specific things that were not clear so we can address them with even more docs. If you do have problems like you're describing feel free to reach out in the discord forums or here on github issues and we can try and help diagnose.

So going from v6 to v8 - with even a little bit of a code - you basically have to touch everything.

If you are going from v6 pre-AlgorandClient (i.e. stateless functions) then no, you don't need to touch anything* and you can do a gradual migration bit by bit. Usage of AlgorandClient in v6 for sure would have required some changes, but again, we did try and minimise that where we could and create a migration guide to describe what changes needed to be made.

*Outside of the breaking changes in algosdk@3 like string -> Address.

the beforeEach that calls.. beforeEach with 10,000 argument (ok? 10,000 ... what?)

The 10_000 is standard vite / jest stuff - it sets the test timeout. We recommend setting that in our standard code snippets since the first time tests start calling localnet there can be some delays and the 10_000 gives it 10s to avoid intermittent failures from that. You don't have to specify that if you set the global test timeout to be higher in the vite/jest config.

I 100% do not expect every test to literally create a new accounts nor start in a 'clean state'. I control that with my scoping already.

A core part of good practice testing is isolation between tests, so yes a clean state should be the default, and all of the tests we have written across various projects generally make use of that. Of course in some instances you may want to have shared state / setup and there's a number of ways to do that using the testing frameworks. Two examples below that will work in the way you want with the new change I put up yesterday (these will actually also both work with the current implementation on npm too without the new { algorandScope: 'fixture' } bit):

describe('MY MODULE', () => {
  const fixture = algorandFixture({ algorandScope: 'fixture' })

  let algorand: AlgorandClient
  let testAccount: Address
  beforeAll(async () => {
    await fixture.beforeEach() // Run beforeEach once only (in beforeAll) so the context doesn't change across tests
    ;({ algorand, testAccount } = fixture.context)
  }, 10_000)

  test('test1', async () => {
    console.log(testAccount)
  })
})

describe('MY MODULE', () => {
  const fixture = algorandFixture({ algorandScope: 'fixture' })
  beforeAll(fixture.beforeEach, 10_000) // Run beforeEach once only (in beforeAll) so the context doesn't change across tests

  test('test1', async () => {
    const { algorand, testAccount } = fixture.context
    console.log(testAccount)
  })
})

I would expect something more like this:

describe('MY MODULE', () => {
    const { algorand, transactionLogger, testAccount } = algorandFixture().context

  test('test1', async () => {
    // ...
  })

  test('test2', async () => {
  })
})

Unfortunately, that kind of syntax isn't possible with vite/jest because the fixture has to be async and I don't believe you can have an async definition function.

@neilcampbell
Copy link
Contributor

neilcampbell commented Dec 19, 2024

A fix was made to resolve a behavioural issue where algorandClient was globally cached, which broke the transaction logs and indexer waiting behaviour

The problem with the waitForIndexer timing out was because the previous implementation was looping through all transactions and waiting for them, but when the transaction logger is global and you run lots of tests the number of transactions that are logged increases so those calls stack up and get slow.

These are a mischaracterization of the bug and its relation to the AlgorandClient. The current implementation of the transaction logger and its waitForIndexer method was implemented in a way that prevents it from working when one of the transactions never gets confirmed. That's why #358 uses round, otherwise we'd need to check if each transaction is still pending or dropped by the node.

It's probably worth describing in detail how this change came to be, as it's actually not a mischaracterization. The reason this change was originally made was for a different reason to what you are mentioning. We started noticing weird test failures inside utils-ts whilst working on the v7 branch and writing more tests leveraging the AlgorandClient. The issue experienced was intermittent test failures when testing functions like getCreatorAppsByName which leverages indexer. We noticed that whilst we were waiting for indexer using fixture.config.waitForIndexer, the test could fail in a way which in fact didn't appear to be waiting for indexer.

In investigating this issue, we realised it was related to the scoping of the algorand (AlgorandClient) instance. The fixture code at the time was as below:

const beforeEach = async () => {
    // ...
    const transactionLogger = new TransactionLogger()
    const transactionLoggerAlgod = transactionLogger.capture(algod)
    algorand = algorand ?? AlgorandClient.fromClients({ algod: transactionLoggerAlgod, indexer, kmd })
    // ...
    context = {
      // ...
      transactionLogger: transactionLogger,
      waitForIndexer: () => transactionLogger.waitForIndexer(indexer),
      waitForIndexerTransaction: (transactionId: string) => runWhenIndexerCaughtUp(() => 
      indexer.lookupTransactionByID(transactionId).do()),
    }
}

Because algorand had a different lifetime scope to transactionLogger, it meant that first instance of transactionLogger was captured and cached inside algorand so all new transactions being sent were accumulated inside the first transactionLogger instance. Tests leveraging either transactionLogger, waitForIndexer or waitForIndexerTransaction would experience issues as they were likely using a transactionLogger that has accumulated no transactions, so test success was entirely dependent on the speed at which indexer was running.

Our frame of context at the time was that we were fixing a bug and correcting the lifetime scoping to match our documented usage. In hindsight I can definitely see your point that it is behaviourally a breaking change that should have been captured in the v6 -> v7/v8 migration guide. Which we can totally still do.

The problem you noticed in this PR, which you added the commented todo block of code for is due to the above.

I think this is worth pointing out because it is a key factor in the crux of the issue here. This change was made based on a false premise. I believe that for any change, especially breaking ones, we should be able to map the change to a problem it is solving. Even more importantly, we should be able to map that problem back to users otherwise we will find ourselves over-engineering. Obviously missing the bug in #358 was an oversight but once the change is made and the affects downstream become evident (i.e. the above Reti code) there should be some further though on whether it really is the best path forward. I see points being raised about determinism, but it's not really clear to me what the concern is. What is the state persisted by AlgorandClient that is believed to be potentially harmful to testing? This is a breaking change, so I think we should be very clear in why it's being made.

In reading this it made me realised that we hadn't done a good job telling the story of how the change came to be. Hopefully my description clarifies. The change wasn't made based on a false premise, and I believe that v6 actually still has the flaw.

I also think a key issue here is naming. AlgorandFixture is not really a fixture because fixtures are by definition fixed. What AlgorandFixture really provides is a bunch of helper methods that are called within beforeEach. beforeEach is also poorly named because it doesn't describe what it actually does. Stuff just happens in the background. This is why it's been quite common to see people (myself included) confused over testAccount changing between every test, because you'd assume an account from a fixture would be fixed.

Test fixture is a colloquial term to define a test context. The purpose is to ensure a fixed, deterministic environment for the test rather than always having a singleton lifetime/scope. I definitely think it's useful to be able to control the lifetime/scope. Pytest fixtures for example have a default 'function' (per test) scope, but can be controlled depending on your needs.

@pbennett
Copy link

If you are going from v6 pre-AlgorandClient (i.e. stateless functions) then no, you don't need to touch anything* and you can do a gradual migration bit by bit. Usage of AlgorandClient in v6 for sure would have required some changes, but again, we did try and minimize that where we could and create a migration guide to describe what changes needed to be made.

This isn't true. Literally every contract call had to change in the generated clients for example. .compose() became newGroup(). All the .send additions. send instead of execute.. args not being the first parameter but now being part of embedded object as 'args:', fee definitions changing, etc. etc. It was basically a touch everything process.

A core part of good practice testing is isolation between tests, so yes a clean state should be the default,

Sure, but the developer should decide where that isolation is, not you. Smart contracts require state and in some cases that state might literally require hundreds of transactions to create properly - specific balances, specific accounts, box state, etc. etc. That should NOT be done PER TEST. For some? Sure. For all? Absolutely not - and that's not the frameworks job to force on me. The framework is literally overriding scope expectations and expecting me to 'override' it to make it work like expected.

Two examples below that will work in the way you want with the new change I put up yesterday (these will actually also both work with the current implementation on npm too without the new { algorandScope: 'fixture' } bit):

You listed two examples that seem to show having to do extra (and non-intuitive) work to make it.. stay out of my way. 😞 A fixture that I have to tell to scope to .. a fixture ?
The framework should help me, not be an obtuse PITA.

@robdmoore
Copy link
Contributor

@pbennett

This isn't true. Literally every contract call had to change in the generated clients for example. .compose() became newGroup()...

You could have avoided this by using the previous version of the typed client generator and keeping the existing typed clients while migrating gradually. Because v7/v8 was backwards compatible to v8 for stateless functions (which the previous version of the typed clients used) they would still work. @neilcampbell did we mention this in the migration guide? If not we should add it.

Sure, but the developer should decide where that isolation is, not you.

That's what I proposed in the PR I raised yesterday - that the developer can control the scope?

You listed two examples that seem to show having to do extra (and non-intuitive) work to make it.. stay out of my way.

The first example, sure (but I only included it because it was the closest to the syntax you suggested where there are test global variables so you don't have to destructure in every test - and the pattern I included is typical jest/vitest code for doing that kind of thing though, not something we are forcing on people), the second example I gave though is exactly the same as the typical example except beforeEach is swapped for beforeAll - a standard jest/vitest mechanism - i.e. it's in your control how the scoping works using standard testing primitives.

A fixture that I have to tell to scope to .. a fixture ?

The reason I used fixture was that the scoping is tied to the lifetime of the fixture object itself, the other option being test since it's recreated for each beforeEach call. Open to better naming suggestions for both though if you have ideas?

@neilcampbell
Copy link
Contributor

neilcampbell commented Dec 19, 2024

I feel like we are not too far from a decision to make the experience more flexible for devs and solve the issues mentioned in this thread. It'd be good to try make this change as backwards compatible as possible, so we can update both v7/v8.

The options as I see them are:

Option 1 - Do nothing

Always an option worth calling out, however in this case I'm not sure it's desirable.

Developers who want to have fixture per test suite scoping can use the below.

const fixture = algorandFixture()
beforeAll(fixture.beforeEach)

This would be documented clearly and we'd explicitly call out the fact that beforeAll(fixture.context.beforeEach) is a bit of a smell.

Per test semantics would remain as already documented in the repo.

Pros

  • Only requires documentation changes

Cons

  • The beforeEach naming is confusing
  • When opting into a per test suite scope, the testAccount will be shared by the suite. For some this may actually be a pro, however it can result in "transaction already in ledger" if test are using the testAccount and sending transactions with the same data between tests.

Option 2 - Cache AlgorandClient inside the fixture

This option would be largely based on #356 and #358 which caches the AlgorandClient instance (and related objects like transactionLogger) inside the fixture.

Any calls to beforeEach would update the testAccount, however no other state in the fixture will be changed.

Developers who want to have fixture per test suite scoping can use the below.

const fixture = algorandFixture()
beforeEach(fixture.beforeEach)

Developers who want to have fixture per test scoping can use the below.

let fixture = algorandFixture()
beforeEach(async () => {
  fixture = algorandFixture()
  await fixture.beforeEach()
})

Pros

  • From an AlgorandClient perspective it's behaviourally consistent with v6.
  • testAccount has per test scoping

Cons

  • Fixture per test scoping is more verbose to setup.
  • The testLogger state would accumulate in the fixture, so a test couldn't use testLogger to see the transactions it has sent. This is a behaviour change.
  • The per test scoping feels more awkward than per test suit scoping.
  • waitForIndexer would wait for the latest block to be indexed, rather than the transactions sent by the test, which is possibly slower?

Option 3 - Rename beforeEach to newContext

This option would involve renaming beforeEach to newContext (or something similar) to better indicate what it is doing. We'd keep the beforeEach function for backwards compatibility, however it would be marked as deprecated.

Developers who want to have fixture per test suite scoping can use the below.

const fixture = algorandFixture()
beforeAll(fixture.newContext)

Developers who want to have fixture per test scoping can use the below.

const fixture = algorandFixture()
beforeEach(fixture.newContext)

Both usage patterns would be documented and the choice would be entirely up to the developer.

Pros

  • Non breaking change

Cons

  • When opting into a per test suite scope, the testAccount will be shared by the suite. For some this may actually be a pro, however it can result in "transaction already in ledger" if test are using the testAccount and sending transactions with the same data between tests.

Option 4 - Add an algorandScope option to the fixture

This option would involve adding an algorandScope setting which allows the developer to control the context scope/lifetime and would be based on #359

Developers who want to have fixture per test suite scoping can use the below.

const fixture = algorandFixture({ algorandScope: 'fixture' })
beforeEach(fixture.beforeEach)

Developers who want to have fixture per test scoping can use the below.

const fixture = algorandFixture({ algorandScope: 'test' }) // or algorandFixture()
beforeEach(fixture.beforeEach)

Pros

  • Non breaking change
  • testAccount has per test scoping

Cons

  • Introduces a new setting

@pbennett
Copy link

You could have avoided this by using the previous version of the typed client generator and keeping the existing typed clients while migrating gradually.

No actually I couldn't - since I needed latest arc56 support etc - which means you have to kind of lock-step go with latest of everything (which I wanted to do). srcmaps have been completely broken and I was hoping the latest and greatest would maybe fix it along with getting current w/ sdk v3, etc.

@lempira
Copy link
Contributor

lempira commented Dec 19, 2024

Thanks for creating these options @neilcampbell. They help gain a better understanding of the landscape for a path forward.

I am in favor of option 3. I agree with Joe that the beforeEach fn name in the fixture is confusing because it does not describe what it's doing, so this option gives more clarity. As I mentioned in a previous comment, deterministic and idempotent tests are things for which I strongly advocate. My biggest issue with the PR changes is that I can't use this fixture and choose where that isolation is. It's defaulting to the global scope. I personally think that each test, however cumbersome, should have its own scope. I understand that SC testing usually involves a series of steps which could benefit from state sharing, and this option (which is also the current behavior) allows the dev to choose the scope.

As in option 1, I also am in favor of adding additional docs regarding testing that would explain how to use this fixture and perhaps a recommendation on test isolation.

@robdmoore
Copy link
Contributor

robdmoore commented Dec 20, 2024

@pbennett is there an open issue for the source map thing you mentioned?

@robdmoore
Copy link
Contributor

@joe-p

That's why #358 uses round, otherwise we'd need to check if each transaction is still pending or dropped by the node.

By the way, the main issue was that because the algorand client was globally scoped and all transactions were being logged from all previous test runs (sure, including potentially failed ones) and the current implementation of waitForIndexer in the transactionLogger loops through all transactions to wait for them. This also causes huge delays and test timeouts as the number of indexer calls increases (particularly given the now long standing indexer bug where it waits 18s every now and then).

By changing waitForIndexer to wait for the last issued transaction it overcomes that problem regardless of whether transactionLogger is scoped globally or not. In the case you did issue a failed transaction and it gets logged, you wouldn't in practice have a problem with the waitForIndexer call because the failed transaction would generally cause a test to fail before you even got to a waitForIndexer call.

We could of course replace to call to transactionLogger.waitForIndexer() with a different waitForIndexer call that checked round number, but the downside of that is you would be waiting for longer than you need to when running tests in parallel, the nice thing about using transactionLogger is your waiting is scoped to the actual transaction you just issued in that test.

@robdmoore
Copy link
Contributor

FYI I've added a proposed pull request that implements option 3 along with something that propagates registered signing accounts with each successive newScope call to avoid potential lack of signer troubles when using multiple scopes together (e.g. when deciding to create a scope in a global context and different scopes in per-test contexts).

#362

@pbennett
Copy link

@pbennett is there an open issue for the source map thing you mentioned?

It's something I've talked w/ Joe about. The existing mapping code is set up to handle simple single contract calls, not contracts that call other contracts (where the failure is in that nested call) as lookup is going to be in the context of a simple 'client X calls contract X'. Kind of independent of this PR though.

The explict newContext option 3 is definitely a better choice. Reading this PR I learned that algokit was repeatedly generating 'new' test accounts for me nonstop as well (which I never asked or expected it to do). Explains a lot of issues I noticed. I'd simply expect a new test account whenever I explicitly generated one. I assumed getting testAccount from the fixture was getting me the same test account - not generating a new one.

With all the comments about the intended design ideals of per-test clean-slate contract testing, could someone point me to what they consider a good production example of this w/ algokit and algorand smart contracts? I'd really like to see these contracts and what they do to have clean-slate testing (for 'live' integration tests!) be a preferred approach.
At least with anything I've used it with (Reti and NFDs), literally reproducing 'all state, per-test' is the absolute last thing I'd ever want to do.

@robdmoore
Copy link
Contributor

@pbennett it's a bit older so isn't using typed clients, but this example shows isolated tests: https://github.com/algorandfoundation/nft_voting_tool/blob/develop/src/algorand/smart_contracts/tests/voting.spec.ts

It's worth pointing out, there's nothing wrong with setting up common state across a series of tests - if it takes a lot of time to do and the individual tests are able to execute with relative isolation from each other i.e. you know each tests is in a clean state of some sort then it's fine. The problem occurs when tests aren't isolated from each other and tests fail when executed in a different order etc.

The good thing about the option 3 suggestion is it makes it clear that the scoping of the fixture context is up to the developer (which it was already). For instance, here's a test in utils-ts itself that used beforeAll:

.

@joe-p
Copy link
Contributor Author

joe-p commented Dec 20, 2024

By the way, the main issue was that because the algorand client was globally scoped and all transactions were being logged from all previous test runs (sure, including potentially failed ones) and the current implementation of waitForIndexer in the transactionLogger loops through all transactions to wait for them. This also causes huge delays and test timeouts as the number of indexer calls increases (particularly given the now long standing indexer bug where it waits 18s every now and then).

Yes sorry I had forgotten that there was two issues.

  1. As you mentioned, transaction logger was global so every test was waiting for every past transaction
  2. The transaction logger was waiting for failed transactions, which became evident because of 1

I just want to make it clear to readers that problems with the AlgorandClient being global is NOT inherit to the AlgorandClient being global. #358 solves this but does also change the scope but that is not crucial to the solution. An alternative solution would be popping transactions that are already waited for successfully and popping transactions that are failed but #358 seemed like the simpler solution. Regardless of the solution here, I think we need to fix these issues. I can create an issue for this effort (created #363).

In the case you did issue a failed transaction and it gets logged, you wouldn't in practice have a problem with the waitForIndexer call because the failed transaction would generally cause a test to fail before you even got to a waitForIndexer call.

Problems could arise if you are doing something with a try ... catch block which I don't think is entirely unreasonable to assume it might be a part of a sufficiently complex app, especially if you are calling a library for an application where you might not have control over try ... catch blocks.

but the downside of that is you would be waiting for longer than you need to when running tests in parallel,

I'm confused on why this would be the case... "wait for all transactions in algod to show up" and "wait for the last round to show up" are the same questions as far as indexer is concerned. I don't want to sidetrack the main discussion here, so I've created #363 to continue the discussion.

I agree 3 is the best solution. I also think for everything exposed through the context there should be a helper function for generating a new instance. Similar to how there's testAccount and generateAccount

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants