Releases: datastax/astrapy
Release v2.0.0rc1
Changes w.r.t 2.0.0-preview:
Support for TIMEUUID and COUNTER columns:
- enlarged ColumnType enum (used by the API to describe non-creatable columns)
- readable through find/find_one operations (now returned by the API when reading)
shortened string representation of table column descriptors for improved readability
added 'filter' to the `TableAPISupportDescriptor` structure (now returned by the Data API)
added optional `api_support` member to all column descriptors (as the Data API returns it for various columns)
restore support for Python 3.8, 3.9
maintenance: full restructuring of tests and CI (tables+collections on same footing+other)
maintenance: adopt `blockbuster` in async tests to detect (and bust) any blocking call
Release v2.0.0-preview
Introduction of full Tables support.
Major revision of overall interface including Collection support.
Introduced new astrapy-specific data types for full expressivity (see `serdes_options` below):
- `DataAPIVector` data type
- `DataAPIDate`, `DataAPITime`, `DataAPITimestamp`, `DataAPIDuration`
- `DataAPISet`, `DataAPIMap`
Typing support for Collections (optional):
- `get_collection` and `create_collection` get a `document_type` parameter to go with the type hint `Collection[MyType]`
- if unspecified fall back to `DefaultCollection = Collection[DefaultDocumentType]` (where `DefaultDocumentType = dict[str, Any]`)
- cursors from `find` also allow strict typechecking
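The role of `document_type` can be illustrated without the library itself. The sketch below uses a hypothetical stand-in class, `SketchCollection`, and a hypothetical factory, `get_sketch_collection`; neither is astrapy code, and the real `get_collection` signature may differ. It only shows how a parameter that does nothing at runtime can let a static type checker infer the document type.

```python
from typing import Generic, List, Optional, Type, TypedDict, TypeVar

DOC = TypeVar("DOC")


class Movie(TypedDict):
    """Example document shape (an assumption made for this sketch)."""

    title: str
    year: int


class SketchCollection(Generic[DOC]):
    """Hypothetical stand-in for a typed collection; not the astrapy class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._documents: List[DOC] = []

    def insert_one(self, document: DOC) -> None:
        self._documents.append(document)

    def find_one(self) -> Optional[DOC]:
        return self._documents[0] if self._documents else None


def get_sketch_collection(
    name: str, document_type: Type[DOC]
) -> SketchCollection[DOC]:
    # `document_type` is not used at runtime: it only binds DOC
    # so that a type checker knows what the collection contains.
    return SketchCollection(name)


movies = get_sketch_collection("movies", document_type=Movie)
movies.insert_one({"title": "Alien", "year": 1979})
first = movies.find_one()  # a type checker infers Optional[Movie] here
print(first)
```

With this pattern, a checker such as mypy can flag `movies.insert_one({"title": 1})` without any runtime validation taking place.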
Introduced a consistent API Options system:
- an APIOptions object inherited at each "spawn" operation, with overrides
- environment-dependent defaults if nothing supplied
- `serdes_options` to control data types accepted for writes and to select data types for reads
- `serdes_options`: Collections default to using custom types for lossless and full-range expression of database content
- `serdes_options.binary_encode_vectors`, to control usage of binary-encoding for writing vectors.
- e.g. instead of 'datetime.datetime', instances of `DataAPITimestamp` are returned
- Exception: numbers are treated by default as ints and floats. To have them all returned as `Decimal`, set `serdes_options.use_decimals_in_collections` to True.
- Use the options' `serdes_options` to opt out and revert to non-custom data types
- For datetimes, fine control over naive-datetime tolerance and timezone is introduced. Usage of naive datetime is now OPT-IN.
- Support for arbitrary 'database' and 'admin' headers throughout the object chain
- Fully reworked timeout options through all abstractions:
- `TimeoutOptions` has six classes of timeouts, applying differently to various methods according to the kind of method. Timeouts can be overridden per-method-call
- removal of the 'max_time_ms' parameter ==> a quick migration path is to replace it with `timeout_ms` throughout
- timeout of 0 means that timeout is disabled
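The two timeout rules above (rename 'max_time_ms' to `timeout_ms`; 0 disables the timeout) can be sketched as follows. Both helpers, `migrate_timeout_kwargs` and `timeout_ms_to_seconds`, are hypothetical names introduced for illustration and are not part of astrapy; the sketch assumes call arguments are available as a plain dictionary.

```python
from typing import Any, Dict, Optional


def migrate_timeout_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `kwargs` with 'max_time_ms' renamed to 'timeout_ms'."""
    migrated = dict(kwargs)
    if "max_time_ms" in migrated:
        if "timeout_ms" in migrated:
            raise ValueError("Pass only one of 'max_time_ms' and 'timeout_ms'.")
        migrated["timeout_ms"] = migrated.pop("max_time_ms")
    return migrated


def timeout_ms_to_seconds(timeout_ms: int) -> Optional[float]:
    """Convert milliseconds to seconds; 0 maps to None, meaning 'no timeout'."""
    if timeout_ms < 0:
        raise ValueError("timeout_ms must be non-negative.")
    return None if timeout_ms == 0 else timeout_ms / 1000.0


legacy_call = {"filter": {"name": "x"}, "max_time_ms": 2500}
print(migrate_timeout_kwargs(legacy_call))  # {'filter': {'name': 'x'}, 'timeout_ms': 2500}
print(timeout_ms_to_seconds(2500))  # 2.5
print(timeout_ms_to_seconds(0))  # None
```

Rejecting calls that supply both names avoids silently picking one of two conflicting values during a migration.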
Reworked and enriched `FindCursor` interface:
- Cursors are typed, similarly to Tables and Collections. The `find` method has an optional `document_type` parameter for typechecking.
- Cursor classes renamed to `[Async]CollectionCursor`
- Base class for all (find) cursors renamed to `FindCursor`
- introduced `map` and `to_list` methods
- `cursor.state` now has values in `FindCursorState` enum (take `cursor.state.value` for a string)
- 'cursor.address' is removed from the API
- `cursor.rewind()` returns None, mutates cursor in-place
- removed 'cursor.distinct()': use the corresponding collection(/table) method.
- removed cursor '.keyspace' property
- removed 'retrieved' for cursors: use `consumed`
- added many cursor management methods (see docstrings for details)
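To make the cursor semantics listed above concrete, here is a toy, in-memory cursor. `ToyCursor` and `ToyCursorState` are hypothetical stand-ins written for this note, not the astrapy classes; the state names, the eager `map`, and the rule that `map` is only allowed before iteration starts are assumptions of the sketch.

```python
from enum import Enum
from typing import Callable, Generic, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class ToyCursorState(Enum):
    IDLE = "idle"
    STARTED = "started"
    CLOSED = "closed"


class ToyCursor(Generic[T]):
    """In-memory toy cursor mimicking the behaviours described in the notes."""

    def __init__(self, items: Sequence[T]) -> None:
        self._items = list(items)
        self._position = 0
        self.state = ToyCursorState.IDLE

    def __iter__(self) -> Iterator[T]:
        return self

    def __next__(self) -> T:
        if self.state is ToyCursorState.CLOSED or self._position >= len(self._items):
            self.state = ToyCursorState.CLOSED
            raise StopIteration
        self.state = ToyCursorState.STARTED
        item = self._items[self._position]
        self._position += 1
        return item

    @property
    def consumed(self) -> int:
        """Number of items handed out so far (replaces 'retrieved')."""
        return self._position

    def map(self, function: Callable[[T], U]) -> "ToyCursor[U]":
        # Assumption of this sketch: mapping is only allowed on an idle cursor.
        if self.state is not ToyCursorState.IDLE:
            raise RuntimeError("Cannot map a cursor that has already started.")
        return ToyCursor([function(item) for item in self._items])

    def to_list(self) -> List[T]:
        """Drain the remaining items into a list."""
        return list(self)

    def rewind(self) -> None:
        """Reset in place; returns None rather than the cursor."""
        self._position = 0
        self.state = ToyCursorState.IDLE


cursor = ToyCursor([1, 2, 3]).map(lambda n: n * 10)
first_item = next(cursor)
remaining = cursor.to_list()
print(first_item, remaining, cursor.consumed, cursor.state.value)
```

Because `rewind()` returns None, code that previously chained on its return value needs to call it as a separate statement.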
Other changes to existing API:
- `Database.create_collection`: signature change (now accepts a single "collection definition")
- added parameter `definition` to method (a CollectionDefinition, plain dictionary or None)
- (support for `source_model` vector index setting within the `definition` parameter)
- removed 'dimension', 'metric', 'source_model', 'service', 'indexing', 'default_id_type' (all of them subsumed in `definition`)
- removed parameters 'additional_options' and 'timeout_ms' as part of the broader timeout rework
- renamed 'CollectionOptions' class to `CollectionDefinition` (return type of `Collection.options()`):
- renamed its 'options' attribute into `definition` (although the API payload calls it "options")
- removed its 'raw_options' attribute (redundant w.r.t `CollectionDescriptor.raw_descriptor`)
- `CollectionDefinition`: implemented fluent interface to build collection definition objects
- renamed `CollectionVectorServiceOptions` class to `VectorServiceOptions`
- renamed `astrapy.constants.SortDocuments` to `SortMode`
- renamed (collection-specific) "Result" classes like this:
- 'DeleteResult' ==> `CollectionDeleteResult`
- 'InsertOneResult' ==> `CollectionInsertOneResult`
- 'InsertManyResult' ==> `CollectionInsertManyResult`
- 'UpdateResult' ==> `CollectionUpdateResult`
- signature change from `-> {"ok": 1}` to `-> None` for some admin and schema methods:
- `AstraDBAdmin`: `drop_database` (+ async)
- `AstraDBDatabaseAdmin`, `DataAPIDatabaseAdmin`: `create_keyspace`, `drop_keyspace`, `drop` (+ async)
- `Database`, `AsyncDatabase`: `drop_collection`, `drop_table`
- `Collection`, `AsyncCollection`: `drop`
- renamed parameter 'collection_name' to `collection_or_table_name` and allow for `keyspace=None` in database `command()` method
- [Async]Database `drop_collection` method now accepts a keyspace parameter.
- `AsyncDatabase` methods `get_collection` and `get_table` are not async functions anymore (remove the await when calling them)
- the following "info" methods are made async (= awaitable): `AsyncDatabase.info`, `AsyncDatabase.name`, `AsyncCollection.info`, `AsyncTable.info`, `AsyncDatabase.list_collections`, `AsyncDatabase.list_tables`
- Database info structure: changed class name and reworked attributes of `AstraDBAdminDatabaseInfo` (formerly 'AdminDatabaseInfo') and `AstraDBDatabaseInfo` (formerly 'DatabaseInfo')
- `[Async]Collection` and `[Async]Database`: `info` method now accepts the relevant timeout parameters
- removed 'check_exists' from `[Async]Database.create_collection` method (the client does no checks now)
- removed AstraDBDatabaseAdmin's `from_api_endpoint` static method (reason: unused)
- removed 'database' parameter from the `to_sync()` and `to_async()` conversion methods for collections
- `[Async]Database.drop_collection` method accepts only the string name of the target to drop (no collection objects anymore)
- removed the 'CommandCursor'/'AsyncCommandCursor' classes:
- `AstraDBAdmin`: `list_databases`, `async_list_databases` methods return regular lists
- `[Async]Database`: `list_collections`, `list_tables` methods return regular lists
- `[Async]Database`: added a `.region` property
Exceptions hierarchy reworked:
- removed 'CursorIsStartedException': now `CursorException` raised for all state-related illegal calls in cursors
- removed 'CollectionNotFoundException', replaced by a ValueError in the few cases it's needed
- removed `CollectionAlreadyExistsException` class (not used anymore without `check_exists`)
- introduced `InvalidEnvironmentException` for operations invalid on some Data API environments.
- renamed 'InsertManyException' ==> `CollectionInsertManyException`
- renamed 'DeleteManyException' ==> `CollectionDeleteManyException`
- renamed 'UpdateManyException' ==> `CollectionUpdateManyException`
- renamed 'DevOpsAPIFaultyResponseException' ==> `UnexpectedDevOpsAPIResponseException`
- renamed 'DataAPIFaultyResponseException' ==> `UnexpectedDataAPIResponseException`
- (improved string representation of DataAPIResponseException cases with multiple error descriptors)
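The class renames listed in these notes (result classes and exceptions) lend themselves to a mechanical first pass over existing code. The following is a minimal sketch of such a pass; `RENAMES` only contains names taken from the lists above, while `rename_identifiers` is a hypothetical helper, not a tool shipped with astrapy. A plain regex also rewrites matches inside strings and comments, so the output should be reviewed.

```python
import re
from typing import Dict

# Old name -> new name, as listed in the 2.0.0-preview notes.
RENAMES: Dict[str, str] = {
    "DeleteResult": "CollectionDeleteResult",
    "InsertOneResult": "CollectionInsertOneResult",
    "InsertManyResult": "CollectionInsertManyResult",
    "UpdateResult": "CollectionUpdateResult",
    "InsertManyException": "CollectionInsertManyException",
    "DeleteManyException": "CollectionDeleteManyException",
    "UpdateManyException": "CollectionUpdateManyException",
    "DevOpsAPIFaultyResponseException": "UnexpectedDevOpsAPIResponseException",
    "DataAPIFaultyResponseException": "UnexpectedDataAPIResponseException",
}

# Word boundaries keep already-migrated names (e.g. CollectionDeleteResult)
# from being rewritten a second time.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def rename_identifiers(source: str) -> str:
    """Replace each old class name in `source` with its new name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


old_code = "except InsertManyException as exc:\n    handle(exc)\n"
print(rename_identifiers(old_code))
```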
Removal of deprecated modules, objects, patterns and parameters:
- 'core' (i.e. pre-1.0) library
- 'collection.bulk_write' and the associated result and exception classes
- 'vector=', 'vectorize=' and 'vectors=' parameters from collection methods
- 'set_caller' method of `DataAPIClient`, `AstraDBAdmin`, `DataAPIDatabaseAdmin`, `AstraDBDatabaseAdmin`, `[Async]Database`, `[Async]Collection`
- 'caller_name' and 'caller_version' parameters. A single list-of-pairs `callers` is now expected
- 'id' and 'region' parameters of DataAPIClient's 'get_database' (and async version). Use `api_endpoint`, which is now the one positional parameter.
- Accordingly, the syntax `client[api_endpoint]` also does not accept a database ID anymore.
- 'region' parameter of `AstraDBDatabaseAdmin.get[_async]_database` (was ignored already in the method)
- 'namespace' parameter of several methods of: DataAPIClient, admin objects, Database and Collection (use `keyspace`)
- 'namespace' property of CollectionInfo, DatabaseInfo, CollectionNotFoundException, CollectionAlreadyExistsException (use `keyspace`)
- 'namespace' property of `Database` and `Collection` (switch to `keyspace`)
- 'update_db_namespace' parameter for keyspace admin methods (use `update_db_keyspace`)
- 'use_namespace' for `Databases` (switch to `use_keyspace`)
- 'delete_all' method of `Collection` and `AsyncCollection` (use `delete_many({})`)
API payloads are encoded with full Unicode (not encoded in ASCII anymore) for HTTP requests
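The difference between ASCII-escaped and full-Unicode JSON bodies can be seen with the standard library alone. This is only an illustration of the two encodings: how astrapy serializes its payloads internally is not shown here, and the sample payload is made up.

```python
import json

# Made-up payload with non-ASCII content.
payload = {"insertOne": {"document": {"name": "Zoë", "city": "東京"}}}

ascii_body = json.dumps(payload, ensure_ascii=True)  # non-ASCII becomes \uXXXX escapes
unicode_body = json.dumps(payload, ensure_ascii=False)  # characters kept as-is

ascii_bytes = ascii_body.encode("utf-8")
unicode_bytes = unicode_body.encode("utf-8")

print(ascii_body)
print(unicode_body)
print(len(ascii_bytes), len(unicode_bytes))
```

Both bodies decode to the same document; the Unicode form is more compact on the wire for this payload and easier to read in request logs.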
- Revision of all "spawning and copying" methods for abstractions. Parameters added/removed/renamed (switch to the corresponding parameters inside the APIOptions instead of the removed keyword parameters):
- All the client/admin/database/table/collection classes have an `api_options` parameter in their `with_options/to_[a]sync` method
- `DataAPIClient`
- `_copy()`, `with_options()`: removed 'callers'
- `get_..._database...()`: removed 'api_path', 'api_version'
- `get_admin()`: removed 'dev_ops_url', 'dev_ops_api_version'
- `AstraDBAdmin`
- `(_copy)`: removed 'environment', 'dev_ops_url', 'dev_ops_api_version', 'callers'
- `(with_options)`: removed 'callers'
- `(create..._database)`: added `token`, `spawn_api_options`
- `(get..._database)`: removed 'api_path', 'api_version', 'database_request_timeout_ms', 'database_timeout_ms'; renamed 'database_api_options' => `spawn_api_options`
- `(get_da...
Release v1.5.2
Bugfix: `Database.get_collection` uses callers inheritance (same for async)
Release v1.5.1
Switching to endpoint as the only/primary way of specifying databases:
- `AstraDBClient` tolerates (deprecated, removal in 2.0) id[/region] in `get_database`
- (internal-use constructors and utilities only accept API Endpoint)
- `AstraDBAdmin` is the only place where id[/region] will remain an allowed path in 2.0
- all tests adapted to reflect this simplification
Admins: resilience against DevOps responses omitting 'keyspace'/'keyspaces'
`AstraDBAdmin`: added filters and automatic pagination to `[async_]list_databases`
Consistent handling of `deletedCount=-1` from the API (always returned as-is)
Cursors: alignment and rework:
- states are an enum; state names reworked for clarity (better cursor `__repr__`)
- `_copy` and `_to_sync` methods always return a clean pristine cursor
- "retrieved" property deprecated (removal 2.0). Use `consumed`.
- "collection" property deprecated (removal 2.0). Use `data_source`.
Deprecation of all `set_caller` (=> to be set at constructor-time) (removal in 2.0)
Callers and user-agent string:
- remove RAGStack automatic detection
- Deprecate "caller_name/caller_version" parameters in favour of `callers` pair list
- (minor) breaking change: passing only one of "caller_name/caller_version" to `_copy`/`with_options` will override the whole one-item callers pair list
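The move from "caller_name/caller_version" to a list of pairs can be sketched as below. Both helpers are hypothetical and not part of astrapy; the `name/version` tokens joined by spaces follow the usual HTTP User-Agent product-token convention, and the exact string astrapy sends is an assumption not confirmed by these notes.

```python
from typing import List, Optional, Tuple

# A caller is a (name, version) pair; either element may be missing.
CallerType = Tuple[Optional[str], Optional[str]]


def callers_from_legacy(
    caller_name: Optional[str], caller_version: Optional[str]
) -> List[CallerType]:
    """Turn the deprecated pair of parameters into a one-item callers list."""
    if caller_name is None and caller_version is None:
        return []
    return [(caller_name, caller_version)]


def user_agent_from_callers(callers: List[CallerType]) -> str:
    """Build a User-Agent fragment, skipping callers that have no name."""
    parts = []
    for name, version in callers:
        if name is None:
            continue
        parts.append(name if version is None else f"{name}/{version}")
    return " ".join(parts)


callers = callers_from_legacy("my_app", "1.2") + [("my_lib", None)]
print(callers)  # [('my_app', '1.2'), ('my_lib', None)]
print(user_agent_from_callers(callers))  # my_app/1.2 my_lib
```

Note how `callers_from_legacy("my_app", None)` still yields a full one-item list, which mirrors the breaking change described above: supplying just one of the two legacy parameters replaces the whole pair.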
Repo housekeeping:
- using ruff for imports and formatting (instead of isort+black) by @cbornet
- add ruff rules UP(pyupgrade) by @cbornet
- remove `cassio` unused dependency
Release v1.5.0
Deprecation of "namespace-" terminology, replaced by "keyspace-" (removal in 2.0)
- deprecation of all namespace method names
- deprecation of the `namespace=` named argument to all methods
- deprecation of the `update_db_namespace` parameter to create_*space
Deprecation of collection bulk_write method (removal in 2.0)
APICommander logs warnings received from the Data API
Full removal of "core library" from the current API:
- DevOps API accessed through APICommander everywhere
- Admin objects use APICommander consistently
- [Async]Database and [Async]Collection directly use APICommander
- Cursor library uses APICommander directly
- Importing the core library triggers a submodule-wide deprecation warning
- (simplification of the vector/vectorize deprecator utility)
Widened exception hierarchy with:
- DevOpsAPIHttpException
- DevOpsAPITimeoutException
- DevOpsAPIFaultyResponseException
Rearrangement into separate modules for:
- constants, strings, magic numbers and settings
- request low-level tools
- payload/response transformations
- (sometimes with temporary duplication to avoid depending on 'core')
Testing:
- testing on HCD targets Data API v 1.0.16
- added tests for APICommander
- improved tests for admin classes
Logging of API requests made more uniform and easier to read
Replaced collections.abc.Iterator => typing.Iterator for python3.8 compatibility
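For context on the last item: subscripting `collections.abc.Iterator` at runtime (as in `collections.abc.Iterator[int]`) is only supported from Python 3.9 onward, while `typing.Iterator[int]` also works on 3.8. A minimal example of the portable form, with a made-up `countdown` function:

```python
from typing import Iterator


def countdown(start: int) -> Iterator[int]:
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


print(list(countdown(3)))  # [3, 2, 1]
```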
Release v1.4.2
Method 'update_one' of [Async]Collection: now invokes the corresponding API command.
Better URL-parsing error messages for the API endpoint (with guidance on expected format)
Improved `__repr__` for: token/auth-related items, Database/Client classes, response+info objects
DataAPIErrorDescriptor can parse 'extend error' in the responses
Introduced DataAPIHttpException (subclass of both httpx.HTTPStatusError and DataAPIException)
testing on HCD:
- DockerCompose tweaked to invoke `docker compose`
- HCD 1.0.0 and Data API 1.0.15 as test targets
relaxed dependency on "uuid6" to most recent releases
core:
- prefetched find iterators: fix second-thread hangups in some cases (by @cbornet)
- added 'options' parameter to [Async]AstraDBCollection.update_one
Release v1.4.1
FindEmbeddingProvidersResult and descendant dataclasses:
- add handling of optional 'hint' and 'displayName' fields for parameters
- knowledge of optional-as-null vs optional-as-possibly-absent ancillary fields
Replace bson dependency with pymongo (#297, by @caseyclements)
Release v1.4.0
DatabaseAdmin classes retain a reference to the Async/Database instance that spawned them, if any
- introduced a spawner_database parameter to database admin constructors
- database admin can retroactively set the db's working namespace upon creation of same
- Idiom: `database = client.get_database(...); database.get_database_admin().create_namespace("the_namespace", update_db_namespace=True)`
Database (and AsyncDatabase) classes admit null namespace:
- default to "default_namespace" only for Astra, otherwise null
- as long as null, most operations are unavailable and error out
- a `use_namespace` method to (mutably) set the working namespace on a database instance
AstraDBDatabaseAdmin class is fully region-aware:
- can be instantiated with an endpoint (also `id` parameter aliased to `api_endpoint`)
- requires a region to be specified with an ID, unless auto-guess can be done
VectorizeOps: support for find_embedding_providers Database method
Support for multiple-header embedding api keys:
- `EmbeddingHeadersProvider` classes for `embedding_api_key` parameter
- AWS header provider in addition to the regular one-header one
- adapt CI to cover this setup
Testing:
- restructure CI to fully support HCD alongside Astra DB
- add details for testing new embedding providers
Release v1.3.1
- Fixed bug in parsing endpoint domain names containing hyphens (#287), by @bradfordcp
- Added isort for source code formatting
- Updated abstractions diagram in README for non-Astra environments
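On the endpoint-parsing fix: the details of bug #287 are not given here, so the following is only a sketch of a parser that tolerates hyphens in both the region and the domain. It assumes endpoints of the form `https://<database-id>-<region>.<domain>` with a UUID database ID; `parse_endpoint`, `ParsedEndpoint`, and the sample hostname are illustrative and not taken from astrapy.

```python
import re
from typing import NamedTuple, Optional


class ParsedEndpoint(NamedTuple):
    database_id: str
    region: str
    domain: str


# The database ID is matched as a full UUID first, so the hyphens that
# follow can safely belong to the region; the domain may contain hyphens too.
_ENDPOINT_PATTERN = re.compile(
    r"https://"
    r"(?P<database_id>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r"-(?P<region>[a-z0-9-]+)"
    r"\.(?P<domain>[a-z0-9.-]+)"
)


def parse_endpoint(api_endpoint: str) -> Optional[ParsedEndpoint]:
    """Split an endpoint into its parts, or return None if it does not match."""
    match = _ENDPOINT_PATTERN.fullmatch(api_endpoint.rstrip("/"))
    if match is None:
        return None
    return ParsedEndpoint(
        database_id=match.group("database_id"),
        region=match.group("region"),
        domain=match.group("domain"),
    )


# Illustrative endpoint with hyphens in both region and domain.
sample = "https://01234567-89ab-cdef-0123-456789abcdef-us-east-2.apps.astra-dev.datastax.com"
parsed = parse_endpoint(sample)
print(parsed)
```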
Release v1.3.0
- Integration testing covers Astra and nonAstra smoothly:
    * idiomatic library
    * vectorize_idiomatic
    * nonAstra admin, i.e. namespace crud
- Add the `TokenProvider` abstract class => and `StaticTokenProvider`, `UsernamePasswordTokenProvider`
- Introduce CHANGES file.
- Add `__eq__` and `_copy` methods to `APICommander` class
- Allow `delete_many({})` with empty filter
- Implement `include_sort_vector` option to `Collection.find` and `get_sort_vector` to cursors
- Add Content-Type header to all API requests
- Added HCD and CASSANDRA Environment values (besides the other non-Astra DSE and OTHER)
- Clearer string repr of cursors ('retrieved' => 'yielded so far')
- Deprecation of collection `delete_all` method in favour of `delete_many(filter={})`
    * Introduction of a custom deprecation decorator for async method removal tests
- Deprecation of vector, vectors and vectorize params from collections and Operations
- Remove several long-deprecated methods from core API (i.e. internal changes):
    * `AstraDBCollection.delete` => `delete_one`
    * `AstraDBCollection.upsert` => `upsert_one`
    * `AsyncAstraDBCollection.upsert` => `upsert_one`
    * `AstraDB.truncate_collection` => `AstraDBCollection.clear`
    * `AsyncAstraDB.truncate_collection` => `AsyncAstraDBCollection.clear`
- Add support for null tokens in the core library