Releases: datastax/astrapy

Release v2.0.0rc1

20 Dec 15:17
e968998
Pre-release

Changes w.r.t 2.0.0-preview:

Support for TIMEUUID and COUNTER columns:
    - enlarged ColumnType enum (used by the API to describe non-creatable columns)
    - readable through find/find_one operations (now returned by the API when reading)
shortened string representation of table column descriptors for improved readability
added 'filter' to the `TableAPISupportDescriptor` structure (now returned by the Data API)
added optional `api_support` member to all column descriptors (as the Data API returns it for various columns)
restore support for Python 3.8, 3.9
maintenance: full restructuring of tests and CI (tables+collections on same footing+other)
maintenance: adopt `blockbuster` in async tests to detect (and bust) any blocking call

Release v2.0.0-preview

06 Dec 01:32
a5484b9
Pre-release

v2.0.0-preview

Introduction of full Tables support.
Major revision of overall interface including Collection support.

Introduced new astrapy-specific data types for full expressivity (see `serdes_options` below):
    - `DataAPIVector` data type
    - `DataAPIDate`, `DataAPITime`, `DataAPITimestamp`, `DataAPIDuration`
    - `DataAPISet`, `DataAPIMap`

Typing support for Collections (optional):
    - `get_collection` and `create_collection` get a `document_type` parameter to go with the type hint `Collection[MyType]`
    - if unspecified fall back to `DefaultCollection = Collection[DefaultDocumentType]` (where `DefaultDocumentType = dict[str, Any]`)
    - cursors from `find` also allow strict typechecking
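
As a rough illustration of how a `document_type` parameter can go together with a generic type hint, here is a toy model using only the standard library. `ToyCollection`, `get_toy_collection` and `Movie` are hypothetical names made up for this sketch; this is not astrapy's code, and the fallback to a default document type is not modeled.

```python
from typing import Generic, List, Optional, Type, TypedDict, TypeVar

DOC = TypeVar("DOC")


class ToyCollection(Generic[DOC]):
    """Hypothetical stand-in for a typed collection (not astrapy's class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._docs: List[DOC] = []

    def insert_one(self, document: DOC) -> None:
        self._docs.append(document)

    def find_one(self) -> Optional[DOC]:
        return self._docs[0] if self._docs else None


def get_toy_collection(name: str, *, document_type: Type[DOC]) -> ToyCollection[DOC]:
    # document_type only informs the static type checker;
    # nothing is validated at run time in this sketch.
    return ToyCollection(name)


class Movie(TypedDict):
    title: str
    year: int


movies = get_toy_collection("movies", document_type=Movie)
movies.insert_one({"title": "Metropolis", "year": 1927})
first = movies.find_one()  # a type checker sees Optional[Movie] here
```

The point of the pattern is that the type of `first` follows from the `document_type` argument, so a static checker can flag a misspelled key without any run-time cost.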

Introduced a consistent API Options system:
    - an APIOptions object inherited at each "spawn" operation, with overrides
    - environment-dependent defaults if nothing supplied
    - `serdes_options` to control data types accepted for writes and to select data types for reads
    - `serdes_options`: Collections default to using custom types for lossless and full-range expression of database content
        - `serdes_options.binary_encode_vectors`, to control usage of binary-encoding for writing vectors.
        - e.g. instead of 'datetime.datetime', instances of `DataAPITimestamp` are returned
        - Exception: numbers are treated by default as ints and floats. To have them all Decimal, set `serdes_options.use_decimals_in_collections` to True.
        - Use the options' `serdes_options` to opt out and revert to non-custom data types
        - For datetimes, fine control over naive-datetime tolerance and timezone is introduced. Usage of naive datetime is now OPT-IN.
    - Support for arbitrary 'database' and 'admin' headers throughout the object chain
    - Fully reworked timeout options through all abstractions:
        - `TimeoutOptions` has six classes of timeouts, applying differently to various methods according to the kind of method. Timeouts can be overridden per-method-call
        - removal of the 'max_time_ms' parameter: a quick migration path is to replace it with `timeout_ms` throughout
        - timeout of 0 means that timeout is disabled
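
The timeout notes above (rename 'max_time_ms' to `timeout_ms`, and 0 meaning "no timeout") can be sketched with two small helpers. Both `migrate_timeout_kwargs` and `timeout_seconds` are hypothetical names for this illustration, not part of astrapy, and the conversion to seconds is an assumption about how a caller might consume the value.

```python
from typing import Any, Dict, Optional


def migrate_timeout_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs with 'max_time_ms' renamed to 'timeout_ms'."""
    migrated = dict(kwargs)
    if "max_time_ms" in migrated:
        if "timeout_ms" in migrated:
            raise ValueError("pass only one of 'max_time_ms' and 'timeout_ms'")
        migrated["timeout_ms"] = migrated.pop("max_time_ms")
    return migrated


def timeout_seconds(timeout_ms: int) -> Optional[float]:
    """Convert milliseconds to seconds; 0 is read as 'timeout disabled'."""
    if timeout_ms < 0:
        raise ValueError("timeout_ms must be non-negative")
    return None if timeout_ms == 0 else timeout_ms / 1000.0


legacy_call = {"filter": {}, "max_time_ms": 5000}
new_call = migrate_timeout_kwargs(legacy_call)
```

The helper leaves the input dictionary untouched, so it can be applied to shared keyword-argument dictionaries without side effects.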

Reworked and enriched `FindCursor` interface:
    - Cursors are typed, similarly to Tables and Collections. The `find` method has an optional `document_type` parameter for typechecking.
    - Cursor classes renamed to `[Async]CollectionCursor`
    - Base class for all (find) cursors renamed to `FindCursor`
    - introduced `map` and `to_list` methods
    - `cursor.state` now has values in `FindCursorState` enum (take `cursor.state.value` for a string)
    - 'cursor.address' is removed from the API
    - `cursor.rewind()` returns None, mutates cursor in-place
    - removed 'cursor.distinct()': use the corresponding collection(/table) method.
    - removed cursor '.keyspace' property
    - removed 'retrieved' for cursors: use `consumed`
    - added many cursor management methods (see docstrings for details)
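
To make the cursor semantics listed above more concrete, here is a toy in-memory cursor. It is a sketch under assumptions: `ToyCursor` and `ToyCursorState` are hypothetical, the state names are guesses, and the real `FindCursor` classes fetch pages from the API and may behave differently in details (for instance in when `map` is allowed).

```python
from enum import Enum
from typing import Callable, Generic, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class ToyCursorState(Enum):
    # State names are assumptions made for this sketch.
    IDLE = "idle"
    STARTED = "started"
    CLOSED = "closed"


class ToyCursor(Generic[T]):
    def __init__(self, items: List[T]) -> None:
        self._items = list(items)
        self._position = 0
        self.state = ToyCursorState.IDLE

    @property
    def consumed(self) -> int:
        """Number of items handed out so far."""
        return self._position

    def __iter__(self) -> Iterator[T]:
        return self

    def __next__(self) -> T:
        if self._position >= len(self._items):
            self.state = ToyCursorState.CLOSED
            raise StopIteration
        self.state = ToyCursorState.STARTED
        item = self._items[self._position]
        self._position += 1
        return item

    def map(self, function: Callable[[T], U]) -> "ToyCursor[U]":
        # Returns a new cursor over the transformed, not-yet-consumed items.
        return ToyCursor([function(item) for item in self._items[self._position:]])

    def to_list(self) -> List[T]:
        # Drains whatever is left in the cursor.
        return list(self)

    def rewind(self) -> None:
        # Mutates the cursor in place and returns None.
        self._position = 0
        self.state = ToyCursorState.IDLE
```

In this model, `cursor.state.value` yields a plain string, and `rewind()` is called for its side effect only, matching the notes above.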

Other changes to existing API:
    - `Database.create_collection`: signature change (now accepts a single "collection definition")
        - added parameter `definition` to method (a CollectionDefinition, plain dictionary or None)
        - (support for `source_model` vector index setting within the `definition` parameter)
        - removed 'dimension', 'metric', 'source_model', 'service', 'indexing', 'default_id_type' (all of them subsumed in `definition`)
        - removed parameters 'additional_options' and 'timeout_ms' as part of the broader timeout rework
    - renamed 'CollectionOptions' class to `CollectionDefinition` (return type of `Collection.options()`):
        - renamed its 'options' attribute into `definition` (although the API payload calls it "options")
        - removed its 'raw_options' attribute (redundant w.r.t `CollectionDescriptor.raw_descriptor`)
        - `CollectionDefinition`: implemented fluent interface to build collection definition objects
    - renamed `CollectionVectorServiceOptions` class to `VectorServiceOptions`
    - renamed `astrapy.constants.SortDocuments` to `SortMode`
    - renamed (collection-specific) "Result" classes like this:
        - 'DeleteResult' ==> `CollectionDeleteResult`
        - 'InsertOneResult' ==> `CollectionInsertOneResult`
        - 'InsertManyResult' ==> `CollectionInsertManyResult`
        - 'UpdateResult' ==> `CollectionUpdateResult`
    - signature change from `-> {"ok": 1}` to `-> None` for some admin and schema methods:
        - `AstraDBAdmin`: `drop_database` (+ async)
        - `AstraDBDatabaseAdmin`, `DataAPIDatabaseAdmin`: `create_keyspace`, `drop_keyspace`, `drop` (+ async)
        - `Database`, `AsyncDatabase`: `drop_collection`, `drop_table`
        - `Collection`, `AsyncCollection`: `drop`
    - renamed parameter 'collection_name' to `collection_or_table_name` and allow for `keyspace=None` in database `command()` method
    - [Async]Database `drop_collection` method now accepts a keyspace parameter.
    - `AsyncDatabase` methods `get_collection` and `get_table` are not async functions anymore (remove the await when calling them)
    - the following "info" methods are made async (= awaitable): `AsyncDatabase.info`, `AsyncDatabase.name`, `AsyncCollection.info`, `AsyncTable.info`, `AsyncDatabase.list_collections`, `AsyncDatabase.list_tables`
    - Database info structure: changed class name and reworked attributes of `AstraDBAdminDatabaseInfo` (formerly 'AdminDatabaseInfo') and `AstraDBDatabaseInfo` (formerly 'DatabaseInfo')
    - `[Async]Collection` and `[Async]Database`: `info` method now accepts the relevant timeout parameters
    - removed 'check_exists' from `[Async]Database.create_collection` method (the client does no checks now)
    - removed AstraDBDatabaseAdmin's `from_api_endpoint` static method (reason: unused)
    - removed 'database' parameter from the `to_sync()` and `to_async()` conversion methods for collections
    - `[Async]Database.drop_collection` method accepts only the string name of the target to drop (no collection objects anymore)
    - removed the 'CommandCursor'/'AsyncCommandCursor' classes:
        - `AstraDBAdmin`: `list_databases`, `async_list_databases` methods return regular lists
        - `[Async]Database`: `list_collections`, `list_tables` methods return regular lists
    - `[Async]Database`: added a `.region` property
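
Since `create_collection` now takes a single `definition` (which may be a plain dictionary), one possible migration aid is a helper that folds the removed keyword parameters into such a dictionary. The sketch below is an assumption-laden illustration: `legacy_kwargs_to_definition` is a hypothetical name, and the key names ("vector", "sourceModel", "defaultId", ...) are assumed to mirror the Data API's collection options payload; check them against the API reference before relying on them.

```python
from typing import Any, Dict, Optional


def legacy_kwargs_to_definition(
    *,
    dimension: Optional[int] = None,
    metric: Optional[str] = None,
    source_model: Optional[str] = None,
    service: Optional[Dict[str, Any]] = None,
    indexing: Optional[Dict[str, Any]] = None,
    default_id_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Fold the removed create_collection keywords into one dictionary."""
    # Key names below are assumptions about the API payload layout.
    vector_candidates = {
        "dimension": dimension,
        "metric": metric,
        "sourceModel": source_model,
        "service": service,
    }
    vector = {key: value for key, value in vector_candidates.items() if value is not None}

    definition: Dict[str, Any] = {}
    if vector:
        definition["vector"] = vector
    if indexing is not None:
        definition["indexing"] = indexing
    if default_id_type is not None:
        definition["defaultId"] = {"type": default_id_type}
    return definition


definition = legacy_kwargs_to_definition(
    dimension=3, metric="cosine", default_id_type="uuid"
)
```

The resulting dictionary would then be passed as the `definition` argument in place of the individual keywords.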

Exceptions hierarchy reworked:
    - removed 'CursorIsStartedException': now `CursorException` raised for all state-related illegal calls in cursors
    - removed 'CollectionNotFoundException', replaced by a ValueError in the few cases it's needed
    - removed `CollectionAlreadyExistsException` class (not used anymore without `check_exists`)
    - introduced `InvalidEnvironmentException` for operations invalid on some Data API environments.
    - renamed 'InsertManyException' ==> `CollectionInsertManyException`
    - renamed 'DeleteManyException' ==> `CollectionDeleteManyException`
    - renamed 'UpdateManyException' ==> `CollectionUpdateManyException`
    - renamed 'DevOpsAPIFaultyResponseException' ==> `UnexpectedDevOpsAPIResponseException`
    - renamed 'DataAPIFaultyResponseException' ==> `UnexpectedDataAPIResponseException`
    - (improved string representation of DataAPIResponseException cases with multiple error descriptors)
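
For codebases that reference the renamed exception classes in many places, a small text-rewriting helper may speed up the migration. The mapping below is taken from the renames listed above; `rename_exceptions` itself is a hypothetical helper for illustration, and a plain regex rewrite like this one does not understand Python syntax, so its output should be reviewed.

```python
import re

# Old name -> new name, as listed in the release notes above.
RENAMED_EXCEPTIONS = {
    "InsertManyException": "CollectionInsertManyException",
    "DeleteManyException": "CollectionDeleteManyException",
    "UpdateManyException": "CollectionUpdateManyException",
    "DevOpsAPIFaultyResponseException": "UnexpectedDevOpsAPIResponseException",
    "DataAPIFaultyResponseException": "UnexpectedDataAPIResponseException",
}

# Word boundaries keep already-migrated names (e.g. the "Collection..."
# variants) from being rewritten a second time.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMED_EXCEPTIONS) + r")\b"
)


def rename_exceptions(source: str) -> str:
    """Replace old exception class names with their new names."""
    return _PATTERN.sub(lambda match: RENAMED_EXCEPTIONS[match.group(1)], source)


migrated = rename_exceptions("except InsertManyException as exc:")
```

Because of the word-boundary anchors, running the helper twice over the same source gives the same result as running it once.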

Removal of deprecated modules, objects, patterns and parameters:
    - 'core' (i.e. pre-1.0) library
    - 'collection.bulk_write' and the associated result and exception classes
    - 'vector=', 'vectorize=' and 'vectors=' parameters from collection methods
    - 'set_caller' method of `DataAPIClient`, `AstraDBAdmin`, `DataAPIDatabaseAdmin`, `AstraDBDatabaseAdmin`, `[Async]Database`, `[Async]Collection`
    - 'caller_name' and 'caller_version' parameters. A single list-of-pairs `callers` is now expected
    - 'id' and 'region' to DataAPIClient's 'get_database' (and async version). Use `api_endpoint` which is now the one positional parameter.
    - Accordingly, the syntax `client[api_endpoint]` also does not accept a database ID anymore.
    - 'region' parameter of `AstraDBDatabaseAdmin.get[_async]_database` (was ignored already in the method)
    - 'namespace' parameter of several methods of: DataAPIClient, admin objects, Database and Collection (use `keyspace`)
    - 'namespace' property of CollectionInfo, DatabaseInfo, CollectionNotFoundException, CollectionAlreadyExistsException (use `keyspace`)
    - 'namespace' property of `Database` and `Collection` (switch to `keyspace`)
    - 'update_db_namespace' parameter for keyspace admin methods (use `update_db_keyspace`)
    - 'use_namespace' for `Database` (switch to `use_keyspace`)
    - 'delete_all' method of `Collection` and `AsyncCollection` (use `delete_many({})`)

API payloads are encoded with full Unicode (not encoded in ASCII anymore) for HTTP requests

Revision of all "spawning and copying" methods for abstractions. Parameters added/removed/renamed (switch to the corresponding parameters inside the APIOptions instead of the removed keyword parameters):
    - All the client/admin/database/table/collection classes have an `api_options` parameter in their `with_options/to_[a]sync` method
    - `DataAPIClient`
        - `_copy()`, `with_options()`: removed 'callers'
        - `get_..._database...()`: removed 'api_path', 'api_version'
        - `get_admin()`: removed 'dev_ops_url', 'dev_ops_api_version'
    - `AstraDBAdmin`
        - `(_copy)`: removed 'environment', 'dev_ops_url', 'dev_ops_api_version', 'callers'
        - `(with_options)`: removed 'callers'
        - `(create..._database)`: added `token`, `spawn_api_options`
        - `(get..._database)`: removed 'api_path', 'api_version', 'database_request_timeout_ms', 'database_timeout_ms'; renamed 'database_api_options' => `spawn_api_options`
        - `(get_da...

Release v1.5.2

08 Oct 21:52
4601c5f

v1.5.2

Bugfix: Database.get_collection uses callers inheritance (same for async)

Release v1.5.1

08 Oct 14:17
37bd0ea

v. 1.5.1

Switching to endpoint as the only/primary way of specifying databases:

  • AstraDBClient tolerates (deprecated, removal in 2.0) id[/region] in get_database
  • (internal-use constructors and utilities only accept API Endpoint)
  • AstraDBAdmin is the only place where id[/region] will remain an allowed path in 2.0
  • all tests adapted to reflect this simplification

Admins: resilience against DevOps responses omitting 'keyspace'/'keyspaces'
AstraDBAdmin: added filters and automatic pagination to [async_]list_databases
Consistent handling of deletedCount=-1 from the API (always returned as-is)
Cursors: alignment and rework:

  • states are an enum; state names reworked for clarity (better cursor __repr__)
  • _copy and _to_sync methods always return a clean pristine cursor
  • "retrieved" property deprecated (removal 2.0). Use consumed.
  • "collection" property deprecated (removal 2.0). Use data_source.

Deprecation of all set_caller (=> to be set at constructor-time) (removal in 2.0)
Callers and user-agent string:

  • remove RAGStack automatic detection

  • Deprecate "caller_name/caller_version" parameters in favour of callers pair list

  • (minor) breaking change: passing only one of "caller_name/caller_version" to _copy/with_options will override the whole one-item callers pair list

Repo housekeeping:

  • using ruff for imports and formatting (instead of isort+black) by @cbornet

  • add ruff rules UP(pyupgrade) by @cbornet

  • remove cassio unused dependency

Release v1.5.0

21 Sep 15:31
bd99707

v. 1.5.0

Deprecation of "namespace-" terminology, replaced by "keyspace-" (removal in 2.0)

  • deprecation of all namespace method names
  • deprecation of the namespace= named argument to all methods
  • deprecation of the update_db_namespace parameter to create_*space
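
A deprecation of this kind (the old `namespace=` argument still accepted, with a warning, until removal) is commonly implemented with a thin wrapper around the method. The following is a generic sketch using the standard `warnings` module; `accept_deprecated_namespace` and `toy_get_collection` are hypothetical names and do not reproduce the library's internal deprecation utilities.

```python
import functools
import warnings
from typing import Any, Callable, Optional


def accept_deprecated_namespace(func: Callable[..., Any]) -> Callable[..., Any]:
    """Let a function that takes 'keyspace' also accept deprecated 'namespace'."""

    @functools.wraps(func)
    def wrapper(
        *args: Any,
        namespace: Optional[str] = None,
        keyspace: Optional[str] = None,
        **kwargs: Any,
    ) -> Any:
        if namespace is not None:
            if keyspace is not None:
                raise ValueError("pass only one of 'namespace' and 'keyspace'")
            warnings.warn(
                "'namespace' is deprecated, use 'keyspace' instead",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller's line
            )
            keyspace = namespace
        return func(*args, keyspace=keyspace, **kwargs)

    return wrapper


@accept_deprecated_namespace
def toy_get_collection(name: str, *, keyspace: Optional[str] = None) -> str:
    # Stand-in for a real method: just reports what it would target.
    return f"{keyspace}.{name}"
```

With this wrapper, old call sites keep working during the deprecation window, while new call sites using `keyspace=` produce no warning.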

Deprecation of collection bulk_write method (removal in 2.0)
APICommander logs warnings received from the Data API
Full removal of "core library" from the current API:

  • DevOps API accessed through APICommander everywhere
  • Admin objects use APICommander consistently
  • [Async]Database and [Async]Collection directly use APICommander
  • Cursor library uses APICommander directly
  • Importing the core library triggers a submodule-wide deprecation warning
  • (simplification of the vector/vectorize deprecator utility)

Widened exception hierarchy with:

  • DevOpsAPIHttpException
  • DevOpsAPITimeoutException
  • DevOpsAPIFaultyResponseException

Rearrangement into separate modules for:

  • constants, strings, magic numbers and settings
  • request low-level tools
  • payload/response transformations
  • (sometimes with temporary duplication to avoid depending on 'core')

Testing:

  • testing on HCD targets Data API v 1.0.16
  • added tests for APICommander
  • improved tests for admin classes

Logging of API requests made more uniform and easier to read
Replaced collections.abc.Iterator => typing.Iterator for python3.8 compatibility

Release v1.4.2

10 Sep 13:56
474fe4f

v. 1.4.2

Method 'update_one' of [Async]Collection: now invokes the corresponding API command.
Better URL-parsing error messages for the API endpoint (with guidance on expected format)
Improved __repr__ for: token/auth-related items, Database/Client classes, response+info objects
DataAPIErrorDescriptor can parse 'extend error' in the responses
Introduced DataAPIHttpException (subclass of both httpx.HTTPStatusError and DataAPIException)
testing on HCD:

  • DockerCompose tweaked to invoke docker compose
  • HCD 1.0.0 and Data API 1.0.15 as test targets

relaxed dependency on "uuid6" to most recent releases
core:

  • prefetched find iterators: fix second-thread hangups in some cases (by @cbornet)
  • added 'options' parameter to [Async]AstraDBCollection.update_one

Release v1.4.1

31 Jul 13:50
e5eedbc

v. 1.4.1

FindEmbeddingProvidersResult and descendant dataclasses:
- add handling of optional 'hint' and 'displayName' fields for parameters
- knowledge of optional-as-null vs optional-as-possibly-absent ancillary fields
Replace bson dependency with pymongo (#297, by @caseyclements)

Release v1.4.0

09 Jul 21:49
a4885a4

DatabaseAdmin classes retain a reference to the [Async]Database instance that spawned them, if any

  • introduced a spawner_database parameter to database admin constructors
  • database admin can retroactively set the db's working namespace upon creation of same
  • Idiom: database = client.get_database(...); database.get_database_admin().create_namespace("the_namespace", update_db_namespace=True)

Database (and AsyncDatabase) classes admit null namespace:

  • default to "default_namespace" only for Astra, otherwise null
  • as long as null, most operations are unavailable and error out
  • a use_namespace method to (mutably) set the working namespace on a database instance

AstraDBDatabaseAdmin class is fully region-aware:

  • can be instantiated with an endpoint (also id parameter aliased to api_endpoint)
  • requires a region to be specified with an ID, unless auto-guess can be done

VectorizeOps: support for find_embedding_providers Database method

Support for multiple-header embedding api keys:

  • EmbeddingHeadersProvider classes for embedding_api_key parameter
  • AWS header provider in addition to the regular one-header one
  • adapt CI to cover this setup
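
The idea behind header providers is that authentication for an embedding service may need one header or several, so the client asks a provider object for a dictionary of headers instead of taking a single key string. The toy classes below sketch that pattern; all class names are hypothetical, and the header names are assumptions made for this sketch rather than values confirmed by these notes.

```python
from abc import ABC, abstractmethod
from typing import Dict


class ToyHeadersProvider(ABC):
    """Hypothetical base class: anything that can produce auth headers."""

    @abstractmethod
    def get_headers(self) -> Dict[str, str]:
        """Return the HTTP headers to attach to each request."""


class ToyAPIKeyProvider(ToyHeadersProvider):
    """The regular case: a single header carrying one API key."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def get_headers(self) -> Dict[str, str]:
        # Header name is an assumption for this sketch.
        return {"X-Embedding-Api-Key": self.api_key}


class ToyAWSProvider(ToyHeadersProvider):
    """A two-header case, as needed for AWS-style credentials."""

    def __init__(self, access_id: str, secret_id: str) -> None:
        self.access_id = access_id
        self.secret_id = secret_id

    def get_headers(self) -> Dict[str, str]:
        # Header names are assumptions for this sketch.
        return {
            "X-Embedding-Access-Id": self.access_id,
            "X-Embedding-Secret-Id": self.secret_id,
        }


def build_request_headers(provider: ToyHeadersProvider) -> Dict[str, str]:
    """Merge the provider's headers into the base request headers."""
    headers = {"Content-Type": "application/json"}
    headers.update(provider.get_headers())
    return headers
```

The request-building code never needs to know how many headers a given provider contributes, which is what lets a new provider type be added without touching it.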

Testing:

  • restructure CI to fully support HCD alongside Astra DB
  • add details for testing new embedding providers

Release v1.3.1

26 Jun 21:12
85a3077
  • Fixed bug in parsing endpoint domain names containing hyphens (#287), by @bradfordcp
  • Added isort for source code formatting
  • Updated abstractions diagram in README for non-Astra environments
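
To illustrate the kind of parsing the hyphen fix above is about, here is a sketch of an endpoint parser that tolerates hyphens in the region and in the domain. It assumes endpoints of the form `https://<database UUID>-<region>.<domain>`; `parse_endpoint` and `ParsedEndpoint` are hypothetical names, the regular expression is not the library's own, and the domain in the usage example is made up.

```python
import re
from typing import NamedTuple, Optional

# The database ID is matched as a fixed-shape UUID, so the hyphens that
# follow it can safely belong to the region or to the domain.
_ENDPOINT = re.compile(
    r"^https://"
    r"(?P<database_id>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r"-(?P<region>[a-z0-9-]+)"
    r"\.(?P<domain>[a-z0-9.-]+)$"
)


class ParsedEndpoint(NamedTuple):
    database_id: str
    region: str
    domain: str


def parse_endpoint(api_endpoint: str) -> Optional[ParsedEndpoint]:
    """Split an endpoint into ID, region and domain; None if it does not match."""
    match = _ENDPOINT.match(api_endpoint.rstrip("/"))
    if match is None:
        return None
    return ParsedEndpoint(match["database_id"], match["region"], match["domain"])


parsed = parse_endpoint(
    "https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra-dev.datastax.com"
)
```

Anchoring on the UUID shape, instead of splitting on hyphens, is what keeps a hyphenated domain from being mistaken for part of the region.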

Release v1.3.0

22 Jun 07:04
4611be5
  • Integration testing covers Astra and nonAstra smoothly:
    * idiomatic library
    * vectorize_idiomatic
    * nonAstra admin, i.e. namespace crud
  • Add the TokenProvider abstract class => and StaticTokenProvider, UsernamePasswordTokenProvider
  • Introduce CHANGES file.
  • Add __eq__ and _copy methods to APICommander class
  • Allow delete_many({}) with empty filter
  • Implement include_sort_vector option to Collection.find and get_sort_vector to cursors
  • Add Content-Type header to all API requests
  • Added HCD and CASSANDRA Environment values (besides the other non-Astra DSE and OTHER)
  • Clearer string repr of cursors ('retrieved' => 'yielded so far')
  • Deprecation of collection delete_all method in favour of delete_many(filter={})
    * Introduction of a custom deprecation decorator for async method removal tests
  • Deprecation of vector, vectors and vectorize params from collections and Operations
  • Remove several long-deprecated methods from core API (i.e. internal changes):
    • AstraDBCollection.delete => delete_one
    • AstraDBCollection.upsert => upsert_one
    • AsyncAstraDBCollection.upsert => upsert_one
    • AstraDB.truncate_collection => AstraDBCollection.clear
    • AsyncAstraDB.truncate_collection => AsyncAstraDBCollection.clear
  • Add support for null tokens in the core library
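
The token provider abstraction mentioned in this release (an abstract class with a static and a username/password variant) can be pictured with the toy classes below. This is a sketch, not the library's code: the `Toy...` names are hypothetical, and the `Cassandra:<base64 username>:<base64 password>` token layout is an assumption about what non-Astra deployments expect.

```python
import base64
from abc import ABC, abstractmethod
from typing import Optional


class ToyTokenProvider(ABC):
    """Hypothetical base class: anything that can produce a token string."""

    @abstractmethod
    def get_token(self) -> Optional[str]:
        """Return the token to send, or None if there is none."""


class ToyStaticTokenProvider(ToyTokenProvider):
    """Wraps a fixed token; None models the 'null token' case."""

    def __init__(self, token: Optional[str]) -> None:
        self.token = token

    def get_token(self) -> Optional[str]:
        return self.token


class ToyUsernamePasswordTokenProvider(ToyTokenProvider):
    """Builds a token from credentials (layout is an assumption)."""

    def __init__(self, username: str, password: str) -> None:
        self.username = username
        self.password = password

    @staticmethod
    def _b64(text: str) -> str:
        return base64.b64encode(text.encode("utf-8")).decode("ascii")

    def get_token(self) -> Optional[str]:
        return f"Cassandra:{self._b64(self.username)}:{self._b64(self.password)}"


token = ToyUsernamePasswordTokenProvider("cassandra", "cassandra").get_token()
```

Client code that accepts any provider only ever calls `get_token()`, so plain strings, null tokens and credential pairs can all be handled through the same path.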