Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data plane integration testing suite #268

Merged
merged 19 commits into from
Jan 12, 2024
Merged

Conversation

jhamon
Copy link
Collaborator

@jhamon jhamon commented Jan 10, 2024

Problem

Want to add a much more comprehensive integration testing suite against the data plane. This is a large undertaking because of the amount of variables involved, but this is a start.

Some variables to consider:

  • REST vs GRPC
  • pod vs serverless
  • environment/cloud/region
  • freshness timeout
  • index metric, since hybrid search only makes sense for dotproduct

Solution

Conceptually, I want every test run on github actions to target one combination of the factors above. This means some things which might otherwise be handled through test parameterization in a big monolithic test suite that is run with a single command are now toggled through environment variables. This makes it very easy to setup parallel runs with different configurations, but at the cost of some additional complexity. There's currently no single command to run every combo from my local machine in one go.

To run these locally right now, I might have an incantation like this:

PINECONE_API_KEY='nana-nana-nana-nana-batman' \
  METRIC='cosine' \
  USE_GRPC='false' \
  SPEC='{"serverless": {"region": "us-west-2", "cloud": "aws"}}' \
  poetry run pytest tests/integration/data/ -s -vv
  • In CI, I shove this ugliness into a reusable github action test-data-plane that allow tests to easily be run using different configurations for target environment/cloud/region, REST vs GRPC, and metric type (which is relevant because hybrid search / sparse vector stuff only makes sense in the context of indexes with metric=dotproduct).
  • Since waiting on data freshness is another bottleneck in the execution, I try to seed some data upfront and use it for all my query/fetch testing.
  • By using the exact same tests via GRPC and REST for the first time, I'm discovering a lot of small details that don't match. So I'm making adjustments as I go to bring them into alignment, at least on the happy path. Error handling is a complex beast and will need more iteration in the future.
    • Migrated some logic from IndexGRPC in to another vector factory class to allow IndexGRPC#upsert to accept Vector or VectorGRPC args. This greatly simplifies test reuse and will be a nice UX improvement for users also because it simplifies docs.
    • Refactored a bit in the GRPC "unit" tests which were and still are a disaster. I found myself crawling through those after refactoring the IndexGRPC class to use the new VectoryFactoryGRPC class and triggering a few failures due to missed a few edge cases expressed through the unit tests. I split things up into smaller files and transitioned from using self.<field> everywhere to using pytest fixtures instead. It still sucks, but readability maybe slightly improved.
    • An example of something that changed (for alignment reasons) is that in the past an empty fetch result would be None via GRPC or FetchResult(vectors=[]) via REST. The simplicity of None was nice, but now that there is read_unit information to display even when there are no results, the FetchResponse return value makes more sense.
  • Implemented a vector factory for GRPC that is very similar to the one I created a while ago for the REST code path. It's much much easier to test edge cases in that class alone than in a more integrationy unit test that requires me to mock/patch a bunch of stuff that would otherwise fire off network calls.

To save resources/time while initially building out these tests, for now I'm only running with serverless indexes. But it should be easy to add other configurations soon.

Some possible bugs/questions to follow up:

  • Seems like $ne and $nin aren't working as documented
  • Freshness seems much different on dotproduct indexes. Not sure if this is expected.
  • Need to rename readUnits to read_units before shipping 3.0

High-level coverage so far

  • Upsert
  • Upsert (sparse vec)
  • Fetch
  • Query
  • Query (metadata filters)
  • Query (hybrid / sparse vec)
  • Update
  • DescribeIndexStats
  • Delete

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)

@jhamon jhamon changed the title Data plane integration tests Data plane integration testing Jan 12, 2024
@jhamon jhamon changed the title Data plane integration testing Data plane integration testing suite Jan 12, 2024
@jhamon jhamon marked this pull request as ready for review January 12, 2024 10:20
Copy link
Contributor

@austin-denoble austin-denoble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for all the nitpicks and questions, you can obviously ignore and land as-is. This is fantastic, thanks for expanding our coverage, streamlining the integration flows, and packing in additional GRPC quality of life improvements. Also hot on the heels of all the dependency coverage work, wow! 🚢

outputs:
index_name:
description: 'The name of the index, including randomized suffix'
value: ${{ steps.create-index.outputs.index_name }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this referring to a different create-index step that's present in this action, or is it assuming it's been run prior? It's working properly so I'm just going to assume the scope for steps.<whatever> is global to the whole workflow. 😅

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this outputs declaration is just accidental copy/pasta from the create-index action I made earlier this week for dependency testing purposes so it can be removed. If I wanted to use it I could by having my test suite write the name of the index it creates to the GITHUB_OUTPUT location (see here in script used by create-index action) but in this case it's not needed since the test creates and deletes the index.

Comment on lines +15 to +16
# - euclidean
# - dotproduct
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming you want to keep these commented out for now, but if not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, leaving them out while still fleshing out the tests and iterating heavily. Dotproduct is having some kind of performance issue at the moment the db team are looking into.

Comment on lines +111 to +114
- 3.9
- '3.10'
- 3.11
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything but 3.8 is commented out in dependency-matrix-grpc, is that intentional?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, not exactly. Mainly just trying to save some time and resources while iterating. I should bring them back on the grpc job.

SparseValues as NonGRPCSparseValues
)

class VectorFactoryGRPC:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work getting the grpc side of things a bit more aligned with the non-grpc paths. 🙏

def build(item: Union[GRPCVector, NonGRPCVector, Tuple, Dict]) -> GRPCVector:
if isinstance(item, GRPCVector):
return item
elif isinstance(item, NonGRPCVector):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would, but would the VectorFactory benefit from having the flexibility of GRPCVector being added to the union in build?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps in some circumstances, but it's probably a lot less likely to happen than people trying to pass the NonGRPCVector object into into the GRPC-flavored client. The GRPC-flavored version of this object is annoying because you have to convert metadata dicts yourself into Struct before passing the kwarg param. So the ergonomics leave a lot to be desired.

return random_string(10)

@pytest.fixture(scope='session')
def idx(client, index_name, index_host):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Any specific reason for using idx?

total_time += delta_t
time.sleep(delta_t)

def poll_fetch_for_ids_in_namespace(idx, ids, namespace):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Lots of goodies for improving thinking around how to streamline the integration flows on typscript side, thank you again.

@@ -1,5 +1,6 @@
import pytest
from pinecone.utils import convert_to_list
from pinecone import SparseValues
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Might not be needed.

Comment on lines +39 to +44
def md1():
return {"genre": "action", "year": 2021}


@pytest.fixture
def md2():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: More descriptive metadata1() and metadata2()? Very nitpicky, but since these fixtures are used across tests in other files might help readability.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find everything about these tests loathsome even after some refactoring. 💀

They were previously using attributes on a test class called self.md1, etc, so I kept the names the same for now just to simplify the refactor.

self.config = Config(api_key="test-api-key", host="foo")
self.index = GRPCIndex(config=self.config, index_name="example-name", _endpoint_override="test-endpoint")

def test_describeIndexStats_callWithoutFilter_CalledWithoutFilter(self, mocker):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: I was going to ask about the duplicative seeming naming for tests in the grpc suite, but it seems to be across the board in some of these files.

@jhamon jhamon merged commit 93a04ca into spruce Jan 12, 2024
64 checks passed
@jhamon jhamon deleted the jhamon/data-plane-testing branch January 12, 2024 20:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants