📝 Update signature and docs for Collection.mapped #367

Merged
merged 2 commits into from
Apr 12, 2024
24 changes: 17 additions & 7 deletions lnschema_core/models.py
@@ -2134,7 +2134,9 @@ def __init__(

 def mapped(
 self,
-label_keys: str | list[str] | None = None,
+layers_keys: str | list[str] | None = None,
+obs_keys: str | list[str] | None = None,
+obsm_keys: str | list[str] | None = None,
 join: Literal["inner", "outer"] | None = "inner",
 encode_labels: bool | list[str] = True,
 unknown_label: str | dict[str, str] | None = None,
@@ -2153,25 +2155,33 @@ def mapped(
 If your `AnnData` collection is in the cloud, move them into a local
 cache first via :meth:`~lamindb.Collection.stage`.

+`__getitem__` of the `MappedCollection` object takes a single integer index
+and returns a dictionary with the observation data sample for this index from
+the `AnnData` objects in the collection. The dictionary has keys for `layers_keys`
+(`.X` is in `"X"`), `obs_keys`, `obsm_keys` (under `f"obsm_{key}"`) and also `"_store_idx"`
+for the index of the `AnnData` object containing this observation sample.
+
 .. note::

 For a guide, see :doc:`docs:scrna5`.

 This method currently only works for collections of `AnnData` artifacts.

 Args:
-label_keys: Columns of the ``.obs`` slot - the names of the metadata
-features storing labels.
+layers_keys: Keys from the ``.layers`` slot. ``layers_keys=None`` or ``"X"`` in the list
+retrieves ``.X``.
+obsm_keys: Keys from the ``.obsm`` slots.
+obs_keys: Keys from the ``.obs`` slots.
 join: `"inner"` or `"outer"` virtual joins. If ``None`` is passed,
 does not join.
 encode_labels: Encode labels into integers.
-Can be a list with elements from ``label_keys```.
+Can be a list with elements from ``obs_keys``.
 unknown_label: Encode this label to -1.
-Can be a dictionary with keys from ``label_keys`` if ``encode_labels=True```
+Can be a dictionary with keys from ``obs_keys`` if ``encode_labels=True``
 or from ``encode_labels`` if it is a list.
-cache_categories: Enable caching categories of ``label_keys`` for faster access.
+cache_categories: Enable caching categories of ``obs_keys`` for faster access.
 parallel: Enable sampling with multiple processes.
-dtype: Convert numpy arrays from ``.X`` to this dtype on selection.
+dtype: Convert numpy arrays from ``.X``, ``.layers`` and ``.obsm``
 stream: Whether to stream data from the array backend.
 is_run_input: Whether to track this collection as run input.
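
The item layout the updated docstring describes can be illustrated with a small stand-in. This is only a sketch and does not call lamindb: `ToyStore`, `ToyMappedCollection` and `_as_list` are hypothetical names, plain lists replace arrays, and the encoding rule (sorted categories mapped to 0, 1, ..., with the unknown label mapped to -1) is an assumption based on the `encode_labels` and `unknown_label` descriptions, not the library's actual implementation.

```python
from dataclasses import dataclass, field


def _as_list(keys):
    """Normalize None / a single string / a list of strings to a list."""
    if keys is None:
        return []
    return [keys] if isinstance(keys, str) else list(keys)


@dataclass
class ToyStore:
    """Hypothetical stand-in for one AnnData object (one row per observation)."""
    X: list
    layers: dict = field(default_factory=dict)
    obs: dict = field(default_factory=dict)
    obsm: dict = field(default_factory=dict)

    def __len__(self):
        return len(self.X)


class ToyMappedCollection:
    """Hypothetical sketch of the item layout; not lamindb's MappedCollection."""

    def __init__(self, stores, layers_keys=None, obs_keys=None, obsm_keys=None,
                 encode_labels=True, unknown_label=None):
        self.stores = stores
        # layers_keys=None retrieves only .X, as stated in the docstring.
        self.layers_keys = ["X"] if layers_keys is None else _as_list(layers_keys)
        self.obs_keys = _as_list(obs_keys)
        self.obsm_keys = _as_list(obsm_keys)
        # encode_labels is either a bool or a list of obs keys to encode.
        if encode_labels is True:
            to_encode = self.obs_keys
        elif encode_labels is False:
            to_encode = []
        else:
            to_encode = list(encode_labels)
        self.encoders = {}
        for key in to_encode:
            # unknown_label is either one string or a dict keyed by obs key.
            unk = unknown_label.get(key) if isinstance(unknown_label, dict) else unknown_label
            cats = sorted({v for s in stores for v in s.obs[key]} - {unk})
            encoder = {cat: code for code, cat in enumerate(cats)}
            if unk is not None:
                encoder[unk] = -1  # assumed rule: the unknown label encodes to -1
            self.encoders[key] = encoder

    def __len__(self):
        return sum(len(s) for s in self.stores)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Walk the stores to find the one that holds this global index.
        for store_idx, store in enumerate(self.stores):
            if idx < len(store):
                break
            idx -= len(store)
        out = {}
        for key in self.layers_keys:
            out[key] = store.X[idx] if key == "X" else store.layers[key][idx]
        for key in self.obsm_keys:
            out[f"obsm_{key}"] = store.obsm[key][idx]
        for key in self.obs_keys:
            label = store.obs[key][idx]
            out[key] = self.encoders[key][label] if key in self.encoders else label
        out["_store_idx"] = store_idx
        return out


a = ToyStore(X=[[1.0, 0.0], [0.0, 2.0]],
             obs={"cell_type": ["B cell", "unknown"]},
             obsm={"X_pca": [[0.1], [0.2]]})
b = ToyStore(X=[[3.0, 3.0]],
             obs={"cell_type": ["T cell"]},
             obsm={"X_pca": [[0.3]]})
ds = ToyMappedCollection([a, b], obs_keys="cell_type", obsm_keys="X_pca",
                         unknown_label="unknown")
print(ds[2])
# {'X': [3.0, 3.0], 'obsm_X_pca': [0.3], 'cell_type': 1, '_store_idx': 1}
```

In this toy, `ds[1]` is the second observation of the first store, so its `"cell_type"` is `-1` and its `"_store_idx"` is `0`. With the real method, the equivalent call would presumably look like `collection.mapped(obs_keys=..., obsm_keys=...)`, using the parameter names introduced in this diff.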
