Centralize Go test suite #48

aulorbe · 2024-07-19T23:09:19Z

Problem

Our current test setup is not as useful as it could be. Some issues are:

The IDs, metadata, and vectors used in tests are generated in non-deterministic ways
The (integration) tests rely on humans manually creating indexes in our test project; those indexes must have names that match GitHub secret env vars; if this isn't done correctly, tests fail
The tests do not differentiate between serverless and pod indexes, which makes things cumbersome to work with (and makes confirming updated vector values, etc. practically impossible)
Our tests are redundant and cost-inefficient -- we spin up indexes multiple times and either delete them or just let them live forever in our test project

Solution

Make it all better!

The current architecture now looks like this:

Two top-level test infra files:

integration_test_suite.go: This file defines a single test struct called IntegrationTests that holds the fields for everything we need wrt integration testing across all go files in our project
a. Importantly, this file also contains testify's SetupSuite and TearDownSuite methods attached to this struct, so that indexes are always torn down after testing completes
run_integration_test_suites.go: This file actually runs the test suites defined in integration_test_suite.go via testify's suite.Run command. It runs 2 suites: 1 for pods (podTestSuite) and 1 for serverless (serverlessTestSuite)

Individual test files:
Each file still has a complementary ..._test.go file that contains its integration and unit tests. The main difference this PR introduces is that each of these files no longer contains a redundant SetupSuite function, etc. Instead, they simply call run_integration_test_suites.go's RunSuites() method, and everything is automagically created/destroyed.
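As a rough illustration of the lifecycle described above, here is a stdlib-only sketch. It does not use testify: runSuite is a hypothetical stand-in for suite.Run, and the struct fields and index names are assumptions made for the example, not the PR's actual code. It only shows the shape: one shared struct, setup once, teardown always, one instance per index type.

```go
package main

import "fmt"

// IntegrationTests is a hypothetical stand-in for the shared suite struct.
type IntegrationTests struct {
	indexType string
	idxName   string
	log       []string // records lifecycle events for illustration only
}

// SetupSuite runs once before any test and "creates" the index.
func (ts *IntegrationTests) SetupSuite() {
	ts.idxName = "test-" + ts.indexType
	ts.log = append(ts.log, "create "+ts.idxName)
}

// TearDownSuite runs once after all tests and "deletes" the index.
func (ts *IntegrationTests) TearDownSuite() {
	ts.log = append(ts.log, "delete "+ts.idxName)
}

// runSuite mimics the suite lifecycle: set up once, run every test,
// and tear down on the way out via defer.
func runSuite(ts *IntegrationTests, tests ...func(*IntegrationTests)) {
	ts.SetupSuite()
	defer ts.TearDownSuite()
	for _, test := range tests {
		test(ts)
	}
}

func main() {
	upsertTest := func(ts *IntegrationTests) {
		ts.log = append(ts.log, "upsert into "+ts.idxName)
	}
	// One suite instance per index type, as in the PR description.
	for _, ts := range []*IntegrationTests{{indexType: "pods"}, {indexType: "serverless"}} {
		runSuite(ts, upsertTest)
		fmt.Println(ts.log)
	}
}
```

Because teardown is deferred, the index is cleaned up even when a test exits early, which is the property the centralized setup is meant to guarantee.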

Genesis

This refactor arose from Audrey trying to write integration tests for update, but being unable to do so, since she could not easily compare IDs, vector values or metadata before vs after update operations.

Misc.:

There are still a lot of things we can do to make our tests better and more efficient, I'm sure. This is just one baby step on the longer journey towards test-suite maturity.

FAQs

Why do we need two infra-type files (integration_test_suite.go and run_integration_test_suites.go)?

I don't like it either, but apparently this is what is needed for testify to work 😢. You can't have the suite.Run call in the same file as the SetupSuite and TearDownSuite methods.

Does the fact that all (integration) tests now share the same struct (IntegrationTests) mean that when you run the integration tests in a specific file (e.g. client_test.go), all integration tests actually run?

Yes. This obviously isn't ideal for dev work, but you can figure out how to run individual tests via the command line by reading up on go test.

Why have different Suites for pods vs serverless indexes, when they share most fields?

This is totally fair and tbh I simply didn't refactor this part because this PR is getting gigantic and it seemed like it would add unnecessary complexity to it. But we should go over the pros/cons of having everything in a single Suite later! For now, splitting them out produced the invaluable outcome of allowing me to test different things per index type.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update
Infrastructure change (CI configs, etc)
Non-code change (docs, etc)
None of the above: (explain here)

Test Plan

CI passes.

To see the specific tasks where the Asana app for GitHub is being used, see below:
- https://app.asana.com/0/0/1207910317683711
- https://app.asana.com/0/0/1207937493683805

aulorbe · 2024-07-23T18:31:40Z

pinecone/index_connection_test.go


-	namespace, err := uuid.NewV7()


Just changed this to make it more up to date w/current Go code I was seeing online

aulorbe · 2024-07-23T18:31:58Z

pinecone/index_connection_test.go

-	namespace, err := uuid.NewV7()
-	assert.NoError(ts.T(), err)
+	namespace, err := uuid.NewUUID()
+	require.NoError(ts.T(), err)


Changed some asserts to requires since that's what I was seeing as more common with testify.

aulorbe · 2024-07-23T19:01:26Z

.github/workflows/ci.yaml

@@ -17,7 +17,8 @@ jobs:
        run: |
          go get ./pinecone
      - name: Run tests
-        run: go test ./pinecone
+        #        run: go test -count=1 -v ./pinecone


We will re-enable this (and remove the line below) once the tests for client are working

Would it make sense to just do that in this PR as well rather than having tests commented out between cycles?

Yeah -- lemme rebase your changes (where you added the new env vars for the client integration tests) into mine once you merge, and then it'll be done

aulorbe · 2024-07-23T19:01:55Z

pinecone/index_connection_test.go

-	if err != nil {
-		t.FailNow()
-	}
+	client, err := NewClient(NewClientParams{ApiKey: apiKey, Headers: map[string]string{"content-type": "application/json"}})


Had to add new header for content-type. Not entirely sure why but the backend was yelling at me that it wasn't set before this line.

We shouldn't need to do this here. It's confusing to me that we'd need to when all we did was change test code rather than the implementation code in the SDK.

What was the backend yelling at you? This header should be handled by the underlying generated code for all requests where it's necessary. Look at all the places it's applied in control_plane.oas.go:

go-pinecone/internal/gen/control/control_plane.oas.go

Line 757 in be9fe92

return NewCreateCollectionRequestWithBody(server, "application/json", bodyReader)

Yeah it was yelling at me that the content type header was missing, but lemme try again -- maybe that was a red herring for something else at the time

I've removed it and CI still passes... I'm not sure why this was happening but it's not happening any longer.

austin-denoble

Took a first pass, nice work getting us onto the path of having our integration tests fully isolated! I think the most important feedback here is around allowing the tests to generate index names, and then rely on those for the test run and cleanup. I'd prefer to move away from storing index names in environment variables if possible.

austin-denoble · 2024-07-24T21:48:57Z

pinecone/index_connection_test.go

-	if err != nil {
-		t.FailNow()
-	}
+	client, err := NewClient(NewClientParams{ApiKey: apiKey, Headers: map[string]string{"content-type": "application/json"}})


We shouldn't need to do this here. It's confusing to me that we'd need to when all we did was change test code rather than the implementation code in the SDK.

What was the backend yelling at you? This header should be handled by the underlying generated code for all requests where it's necessary. Look at all the places it's applied in control_plane.oas.go:

go-pinecone/internal/gen/control/control_plane.oas.go

Line 757 in be9fe92

return NewCreateCollectionRequestWithBody(server, "application/json", bodyReader)

austin-denoble · 2024-07-24T21:51:41Z

.github/workflows/ci.yaml

@@ -17,7 +17,8 @@ jobs:
        run: |
          go get ./pinecone
      - name: Run tests
-        run: go test ./pinecone
+        #        run: go test -count=1 -v ./pinecone


Would it make sense to just do that in this PR as well rather than having tests commented out between cycles?

austin-denoble · 2024-07-24T21:52:02Z

.github/workflows/ci.yaml

          TEST_PODS_INDEX_NAME: ${{ secrets.TEST_PODS_INDEX_NAME }}
          TEST_SERVERLESS_INDEX_NAME: ${{ secrets.TEST_SERVERLESS_INDEX_NAME }}


The strings that these secrets represent changed in GitHub for this repository, right?

austin-denoble · 2024-07-24T21:56:11Z

pinecone/index_connection_test.go

+	fmt.Printf("Creating Serverless index: %s\n", idxName)
+	serverlessIdx, err := in.CreateServerlessIndex(ctx, &CreateServerlessIndexRequest{
+		Name:      idxName,
+		Dimension: int32(setDimensionsForTestIndexes()),


Are we not able to just use Dimension: 5 here? Like here:

go-pinecone/pinecone/client_test.go

Line 572 in be9fe92

Dimension: 10,

I just use this new helper function in multiple places, so I thought it was better to call a function than to hardcode it to a number

austin-denoble · 2024-07-24T21:59:02Z

pinecone/index_connection_test.go

+	return array
+}
+
+func getStatus(ts *IndexConnectionTestsIntegration, ctx context.Context) (bool, error) {


getStatus of what? This should probably be more descriptive. Since it's operating on indexes depending on which is targeted via ts.IndexType and then polling for Ready and a boolean I'd try and make that more clear.

austin-denoble · 2024-07-24T21:59:26Z

pinecone/index_connection_test.go

+	return desc.Status.Ready, nil
+}
+
+func upsert(ts *IndexConnectionTestsIntegration, ctx context.Context, vectors []*Vector) error {


Same as above regarding naming.

austin-denoble · 2024-07-24T22:09:34Z

pinecone/index_connection_test.go

+	})
+	assert.NoError(ts.T(), err)
+
+	time.Sleep(5 * time.Second)


Has 5 seconds worked well for this so far? We had to use retries and longer wait windows in some other tests to handle upsert and update. Just curious how it's performed for you so far.

Yeah I tried longer and shorter, but 5 was the shortest I found where it still passed
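For reference, the retry approach mentioned above could look something like the sketch below: poll a check until it passes or a deadline expires, instead of sleeping for a fixed window. waitUntilReady and demoPolling are hypothetical names, and the fake check stands in for whatever fetch or describe call the test would actually make.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitUntilReady calls check immediately and then once per interval until it
// reports true, returns an error, or ctx is done. Hypothetical helper.
func waitUntilReady(ctx context.Context, interval time.Duration, check func(context.Context) (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ready, err := check(ctx)
		if err != nil {
			return err
		}
		if ready {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("gave up waiting: %w", ctx.Err())
		case <-ticker.C:
			// Next attempt.
		}
	}
}

// demoPolling runs waitUntilReady against a fake check that succeeds on the third poll.
func demoPolling() (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	calls := 0
	err := waitUntilReady(ctx, 10*time.Millisecond, func(context.Context) (bool, error) {
		calls++
		return calls >= 3, nil
	})
	return calls, err
}

func main() {
	calls, err := demoPolling()
	fmt.Println(calls, err)
}
```

The trade-off versus time.Sleep(5 * time.Second) is that a fast backend finishes early and a slow one gets more time, at the cost of a few extra requests.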

austin-denoble · 2024-07-24T22:11:35Z

pinecone/index_connection_test.go

+func generateFloat32Array(n int) []float32 {
+	array := make([]float32, n)
+	for i := 0; i < n; i++ {
+		array[i] = float32(i)


This is minor - would be nice to generate random floats or ints for these helpers rather than just using the same index values every time. I guess it ultimately doesn't matter much, but random values could be a nice stress test.

I tried doing it by setting the start of the range to a random int, but because n is always so low (5 or less), it often fails. I think we can just leave as is?

We can leave as-is, it's nitpicky.

I think you could write a helper function that uses math/rand maybe:

func RandomFloat32() float32 {
	rand.Seed(time.Now().UnixNano()) // Seed the random number generator
	return rand.Float32()
}
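One possible variation on the helper above, sketched under a few assumptions: it uses a local generator from rand.New(rand.NewSource(seed)) rather than rand.Seed (which is deprecated as of Go 1.20), and it takes the seed as a parameter so a run stays reproducible, which matters for comparing vector values before and after an update. The function name and signature are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// generateFloat32Vector returns n pseudo-random values in [0, 1).
// The same seed always yields the same vector, so a test can compare
// values before and after an update. Hypothetical helper.
func generateFloat32Vector(n int, seed int64) []float32 {
	r := rand.New(rand.NewSource(seed)) // local generator; no global seeding
	vec := make([]float32, n)
	for i := range vec {
		vec[i] = r.Float32()
	}
	return vec
}

func main() {
	a := generateFloat32Vector(5, 42)
	b := generateFloat32Vector(5, 42)
	fmt.Println(len(a), a[0] == b[0])
}
```

Logging the seed at the start of a run would let a failing run be replayed with the same values.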

austin-denoble · 2024-07-24T22:17:47Z

pinecone/index_connection_test.go

-	}
+	client, err := NewClient(NewClientParams{ApiKey: apiKey, Headers: map[string]string{"content-type": "application/json"}})
+	require.NotNil(t, client, "Client should not be nil after creation")
+	require.NoError(t, err)

 	podIndexName := os.Getenv("TEST_PODS_INDEX_NAME")


Rather than hard-coding this name into an invisible secret in GitHub, I'd prefer if we could generate a random name to use each run since the suite is managing its resources anyway.

If we hard-code things like this we'd also run into problems if we want to ever run integration tests in parallel, because the tests are no longer isolated and are all pointing at this predefined set of index names.

Can we look at adding a helper function that generates a random name, possibly with a seed string? Something like we do in our other test suites:

TypeScript: https://github.com/pinecone-io/pinecone-ts-client/blob/b5a66e6216d7615576dd881a67db627be2aa7bba/src/integration/test-helpers.ts#L92

Python: https://github.com/pinecone-io/pinecone-python-client/blob/d9df37558ba35b7097dc4bc5c8101b576019caa2/tests/integration/helpers/helpers.py#L13

If we allow the test runs to create their own names and resources and then clean those up, that feels a bit more robust.

Yes, fantastic idea!
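A minimal sketch of what such a name generator could look like in Go, using only the standard library. generateTestIndexName is a hypothetical name, and the 12-character hex suffix is an arbitrary choice; any length limits or character rules the backend enforces on index names would need to be checked separately.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// generateTestIndexName returns "<prefix>-<12 hex chars>", lowercased.
// Hypothetical helper; the suffix length is an arbitrary choice.
func generateTestIndexName(prefix string) (string, error) {
	suffix := make([]byte, 6) // 6 random bytes encode to 12 hex characters
	if _, err := rand.Read(suffix); err != nil {
		return "", err
	}
	return strings.ToLower(prefix) + "-" + hex.EncodeToString(suffix), nil
}

func main() {
	name, err := generateTestIndexName("serverless")
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // the suffix differs on every run
}
```

Since every run gets its own names, parallel CI runs would no longer collide on a shared, predefined index.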

aulorbe · 2024-07-25T20:33:45Z

.env.example

-TEST_PODS_INDEX_NAME="<Pod based Index name>"
-TEST_SERVERLESS_INDEX_NAME="<Serverless based Index name>"


austin-denoble

This makes a lot of sense to me, and I really appreciate the clean up here, and putting us in a better direction in terms of integration test robustness. Nice work! 👏

I do have some questions and follow up around how the IntegrationTests suite is created and run. I feel like I'm possibly missing something about our integration testing dependencies. We can follow up offline, or in an additional PR to further refine things.

For now though we've retained our coverage and better organized the setup and teardown into a centralized location which I love, and is similar to our approach in Java.

austin-denoble · 2024-07-25T19:04:43Z

pinecone/client_test_2.go

@@ -0,0 +1 @@
+package pinecone


I think this can be removed, right?

(I know this file is renamed, but answering anyways -- apparently no it needs to be there (says my IDE))

austin-denoble · 2024-07-26T18:13:22Z

.env.example

-TEST_PODS_INDEX_NAME="<Pod based Index name>"
-TEST_SERVERLESS_INDEX_NAME="<Serverless based Index name>"


austin-denoble · 2024-07-26T18:15:47Z

.github/workflows/ci.yaml

@@ -17,8 +17,6 @@ jobs:
        run: |
          go get ./pinecone
      - name: Run tests
-        run: go test ./pinecone
+        run: go test -count=1 -v ./pinecone


I think the default for count is 1 so you can probably remove explicitly setting it.

I thinkkkk from the documentation it's not, actually! Check this out:

austin-denoble · 2024-07-26T18:16:34Z

README.md

@@ -101,7 +103,7 @@ Then, execute `just bootstrap` to install the necessary Go packages
 ### .env Setup

 To avoid race conditions or having to wait for index creation, the tests require a project with at least one pod index


nit: I think we can remove this whole first part since it's no longer true that we're avoiding waiting for indexes to create.

austin-denoble · 2024-07-26T18:20:40Z

pinecone/integration_test_suite.go

+	return nil
+}
+
+// TODO: how to get this func to work for client tests too


I think this TODO can come out, right?

Whoopsie yes :)

austin-denoble · 2024-07-26T18:41:42Z

pinecone/integration_test_suite.go

+	podIdxName        string
+	serverlessIdxName string


What's the reason for needing both of these names in each struct? It seems like in run_integration_test_suites.go, we're creating two different IntegrationTests objects. Do they not run separately?

I'm just thinking it feels easier to reason about each instance of the struct handling its own index. I know you mentioned both serverless and pod index tests run regardless of what file you're trying to test, so there might be something I'm misunderstanding. Basically, my initial thoughts were "why can't this just be idxName and each suite handles one index?"

Yeah....I think you're right! Lemme try it out and see if CI still passes...

austin-denoble · 2024-07-26T18:45:52Z

pinecone/run_integration_test_suites.go

+	"github.com/stretchr/testify/suite"
+)
+
+func RunSuites(t *testing.T) {


I wonder if there's a way to add an additional parameter here like a pod vs. serverless enum. We could then directly control what's run when each client_test.go or index_connection_test.go runs RunSuites(), so you could control behavior more directly via integration test file. That feels like it might be nice.

Like you said, there's a lot of ways we could take this and we should probably just take this first step and then play around.
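To make the enum idea concrete, here is one possible shape, sketched without testify. IndexType, its constants, and suitesFor are all hypothetical; in a real version the loop body would call suite.Run(t, ...) on the matching suite rather than collecting names.

```go
package main

import "fmt"

// IndexType is a hypothetical enum selecting which suites to run.
type IndexType int

const (
	Pods IndexType = iota
	Serverless
)

// suiteNames maps each index type to the suite variable named in the PR.
var suiteNames = map[IndexType]string{
	Pods:       "podTestSuite",
	Serverless: "serverlessTestSuite",
}

// suitesFor returns the suites a caller asked for. With no arguments it
// falls back to both, which matches what RunSuites does today.
func suitesFor(types ...IndexType) []string {
	if len(types) == 0 {
		types = []IndexType{Pods, Serverless}
	}
	names := make([]string, 0, len(types))
	for _, t := range types {
		names = append(names, suiteNames[t])
	}
	return names
}

func main() {
	fmt.Println(suitesFor())           // [podTestSuite serverlessTestSuite]
	fmt.Println(suitesFor(Serverless)) // [serverlessTestSuite]
}
```

A variadic parameter keeps existing RunSuites(t) call sites working unchanged while letting a single test file narrow what it runs.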

austin-denoble · 2024-07-26T18:49:41Z

pinecone/index_connection_test.go

+func TestIndexConnectionIntegration(t *testing.T) {
+	RunSuites(t)


So we call RunSuites() from here in each test file (client_test.go and index_connection_test.go). Then in RunSuites() we're creating two IntegrationTest structs and then calling suite.Run on both of them:

suite.Run(t, podTestSuite)
suite.Run(t, serverlessTestSuite)

This may be a testify thing I'm not clear on, but is there a possibility we're running duplicates of the test suites? Like if we trigger go test on both files, does it spawn different instances of testing.T and RunSuites() etc?

Gone! Now we have RunSuites in suite_runner_test.go (it has to be suffixed with _test so that go test runs the RunSuites func) and test_suite.go, which is where the setup/teardown lives.

austin-denoble · 2024-07-26T18:53:40Z

pinecone/integration_test_suite.go

+	if ts.indexType == "serverless" {
+		indexName = ts.serverlessIdxName
+	} else if ts.indexType == "pods" {
+		indexName = ts.podIdxName
+	}


I feel like needing to do these indexType checks inside of this method is harder to reason about, couldn't we just take in an indexName as an argument and then make this function handle one index no matter what?
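Roughly what that could look like, as a sketch: the helper takes the index name as an argument and never inspects the index type. indexDeleter and deleteTestIndex are hypothetical names, and the fake deleter stands in for the real client call.

```go
package main

import (
	"errors"
	"fmt"
)

// indexDeleter is a hypothetical stand-in for the client call that deletes an index.
type indexDeleter func(name string) error

// deleteTestIndex handles exactly one index; the caller decides which one.
func deleteTestIndex(del indexDeleter, indexName string) error {
	if indexName == "" {
		return errors.New("index name must not be empty")
	}
	if err := del(indexName); err != nil {
		return fmt.Errorf("deleting index %q: %w", indexName, err)
	}
	return nil
}

func main() {
	var deleted []string
	fake := func(name string) error {
		deleted = append(deleted, name)
		return nil
	}

	// Each suite passes its own index name; the helper has no indexType branching.
	for _, name := range []string{"pods-abc123", "serverless-def456"} {
		if err := deleteTestIndex(fake, name); err != nil {
			panic(err)
		}
	}
	fmt.Println(deleted) // [pods-abc123 serverless-def456]
}
```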

austin-denoble

Thanks for addressing all that feedback and getting to the root of the test run stuff! 🚢

aulorbe added 15 commits July 19, 2024 16:09

Save work

fbe03b0

Save work

58a7f90

Got pods working, provided index does not already exist

b999395

Save work

3d551e4

Got most things to work

317a661

Holy crap it is working

8016f9d

All tests working for index_connection now

17e7f35

Update CI

e7d8985

Update env var to correct

146d041

Update env var name in client_test.go

4d87263

Merge branch 'main' into Audrey/integration-tests

f270ea4

Move index deletion process to teardown

f4bf712

Try to force no caching and tests to run sequentially

ff07545

Try to just run index_connection tests

80d338b

gofmt

3994fa1

aulorbe changed the title ~~Save work~~ Rework index_connection tests suite + add int. tests for Update, Create Jul 23, 2024

aulorbe changed the title ~~Rework index_connection tests suite + add int. tests for Update, Create~~ Rework index_connection tests suite + add int. tests for Update Jul 23, 2024

aulorbe commented Jul 23, 2024


aulorbe added 4 commits July 23, 2024 11:41

Remove test for a diff pr

e8d814e

cleanup

b59874b

remove changes to client_test

7cd2da1

remove changes to client_test

9b086f4

aulorbe marked this pull request as ready for review July 23, 2024 18:49

aulorbe requested a review from austin-denoble July 23, 2024 18:49

aulorbe commented Jul 23, 2024


austin-denoble reviewed Jul 24, 2024


aulorbe added 2 commits July 24, 2024 15:42

Try to remove content-type header to see what happens

1aadcdb

Remove hard-coded env vars

eab6b0b

aulorbe added 5 commits July 24, 2024 15:54

Randomize Array generation funcs

ce7cd8b

Make naming more descriptive

e3aa2bf

pull out test suite into own file to use for all tests

ac4d29d

index_connection tests working

e46d008

New arch for centralized go testing

5bdf6d0

aulorbe changed the title ~~Rework index_connection tests suite + add int. tests for Update~~ Centralize Go test suite Jul 25, 2024

Update CI

5630a58

aulorbe commented Jul 25, 2024


aulorbe requested a review from austin-denoble July 26, 2024 00:32

austin-denoble approved these changes Jul 26, 2024


aulorbe added 6 commits July 26, 2024 12:21

Update .env part of README

600c06b

Experiment with condensing names into simply ts.IdxName

8b79a56

Update struct in experimentation

b91dfa9

Cleanup

cf5d0eb

New setup, global RunSuites so no duplication, yay!

517e8ed

Remove integration-test-only command bc no longer works

bc8a4f5

austin-denoble approved these changes Jul 26, 2024


aulorbe merged commit 3da0e9d into main Jul 26, 2024
3 checks passed

aulorbe deleted the Audrey/integration-tests branch July 26, 2024 21:25
