
Commit

docs(framework) Add examples to records docstrings (#4021)
Co-authored-by: Heng Pan <pan@flower.ai>
jafermarq and panh99 authored Aug 21, 2024
1 parent 29827fc commit 53176af
Showing 4 changed files with 258 additions and 47 deletions.
64 changes: 49 additions & 15 deletions src/py/flwr/common/record/configsrecord.py
@@ -58,27 +58,61 @@ def is_valid(__v: ConfigsScalar) -> None:


class ConfigsRecord(TypedDict[str, ConfigsRecordValues]):
"""Configs record."""
"""Configs record.
A :code:`ConfigsRecord` is a Python dictionary designed to ensure that
each key-value pair adheres to specified data types. A :code:`ConfigsRecord`
is one of the types of records that a
`flwr.common.RecordSet <flwr.common.RecordSet.html#recordset>`_ supports and
can therefore be used to construct :code:`common.Message` objects.
Parameters
----------
configs_dict : Optional[Dict[str, ConfigsRecordValues]]
A dictionary that stores basic types (i.e. `str`, `int`, `float`, `bytes` as
defined in `ConfigsScalar`) and lists of such types (see
`ConfigsScalarList`).
keep_input : bool (default: True)
A boolean indicating whether the config entries should be kept in the input
dictionary after they are added to the record. When set to True, the data is
duplicated in memory. If memory is a concern, set it to False, in which case
each entry is deleted from the input dictionary immediately after it is added
to the record.
Examples
--------
A :code:`ConfigsRecord` is intended for sending configuration values that tell
the target node how to perform a certain action (e.g. train/evaluate a model).
You can use standard Python built-in types such as :code:`float`, :code:`str`,
and :code:`bytes`. All allowed types are defined in
:code:`flwr.common.ConfigsRecordValues`. While lists are supported, we
encourage you to use a :code:`ParametersRecord` instead if they are of high
dimensionality.
Let's see some examples of how to construct a :code:`ConfigsRecord` from scratch:
>>> from flwr.common import ConfigsRecord
>>>
>>> # A `ConfigsRecord` is a specialized Python dictionary
>>> record = ConfigsRecord({"lr": 0.1, "batch-size": 128})
>>> # You can add more content to an existing record
>>> record["compute-average"] = True
>>> # It also supports lists
>>> record["loss-fn-coefficients"] = [0.4, 0.25, 0.35]
>>> # And string values (among other types)
>>> record["path-to-S3"] = "s3://bucket_name/folder1/fileA.json"
Just like the other types of records in a :code:`flwr.common.RecordSet`, types are
enforced. If you need to add a custom data structure or object, we recommend
serializing it into bytes and saving it as such (bytes are allowed in a
:code:`ConfigsRecord`).
"""

def __init__(
self,
configs_dict: Optional[Dict[str, ConfigsRecordValues]] = None,
keep_input: bool = True,
) -> None:
"""Construct a ConfigsRecord object.
Parameters
----------
configs_dict : Optional[Dict[str, ConfigsRecordValues]]
A dictionary that stores basic types (i.e. `str`, `int`, `float`, `bytes` as
defined in `ConfigsScalar`) and lists of such types (see
`ConfigsScalarList`).
keep_input : bool (default: True)
A boolean indicating whether config passed should be deleted from the input
dictionary immediately after adding them to the record. When set
to True, the data is duplicated in memory. If memory is a concern, set
it to False.
"""

super().__init__(_check_key, _check_value)
if configs_dict:
for k in list(configs_dict.keys()):
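The `keep_input` behaviour described in the `ConfigsRecord` docstring above can be sketched with a plain dictionary subclass. This is a minimal illustration, not Flower's implementation: `SketchRecord` and `Scalar` are hypothetical names, and type checking is left out so that only the copy-versus-drain logic is visible.

```python
from typing import Dict, Optional, Union

# Hypothetical stand-in for the scalar types a config record accepts.
Scalar = Union[int, float, str, bytes, bool]


class SketchRecord(dict):
    """Illustrative dict that either copies or drains its input mapping."""

    def __init__(
        self,
        source: Optional[Dict[str, Scalar]] = None,
        keep_input: bool = True,
    ) -> None:
        super().__init__()
        if source:
            # Iterate over a snapshot of the keys so that deleting from
            # `source` inside the loop is safe.
            for key in list(source.keys()):
                self[key] = source[key]
                if not keep_input:
                    # Drop the entry from the caller's dictionary right away.
                    del source[key]


kept = {"lr": 0.1, "batch-size": 128}
record_a = SketchRecord(kept, keep_input=True)

drained = {"lr": 0.1, "batch-size": 128}
record_b = SketchRecord(drained, keep_input=False)

print(kept)     # the input dictionary is still populated
print(drained)  # the input dictionary has been emptied
```

With `keep_input=True` both the caller's dictionary and the record hold the values; with `keep_input=False` only the record does, which is the memory-saving case the docstring refers to.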
68 changes: 54 additions & 14 deletions src/py/flwr/common/record/metricsrecord.py
@@ -58,26 +58,66 @@ def is_valid(__v: MetricsScalar) -> None:


class MetricsRecord(TypedDict[str, MetricsRecordValues]):
"""Metrics record."""
"""Metrics record.
A :code:`MetricsRecord` is a Python dictionary designed to ensure that
each key-value pair adheres to specified data types. A :code:`MetricsRecord`
is one of the types of records that a
`flwr.common.RecordSet <flwr.common.RecordSet.html#recordset>`_ supports and
can therefore be used to construct :code:`common.Message` objects.
Parameters
----------
metrics_dict : Optional[Dict[str, MetricsRecordValues]]
A dictionary that stores basic types (i.e. `int`, `float` as defined
in `MetricsScalar`) and lists of such types (see `MetricsScalarList`).
keep_input : bool (default: True)
A boolean indicating whether the metrics should be kept in the input
dictionary after they are added to the record. When set to True, the data is
duplicated in memory. If memory is a concern, set it to False, in which case
each entry is deleted from the input dictionary immediately after it is added
to the record.
Examples
--------
The usage of a :code:`MetricsRecord` is envisioned for communicating results
obtained when a node performs an action. A few typical examples include:
communicating the training accuracy after a model is trained locally by a
:code:`ClientApp`, reporting the validation loss obtained at a :code:`ClientApp`,
or, more generally, the output of executing a query by the :code:`ClientApp`.
Common to these examples is that the output can typically be represented by
a single scalar (:code:`int`, :code:`float`) or a list of scalars.
Let's see some examples of how to construct a :code:`MetricsRecord` from scratch:
>>> from flwr.common import MetricsRecord
>>>
>>> # A `MetricsRecord` is a specialized Python dictionary
>>> record = MetricsRecord({"accuracy": 0.94})
>>> # You can add more content to an existing record
>>> record["loss"] = 0.01
>>> # It also supports lists
>>> record["loss-historic"] = [0.9, 0.5, 0.01]
Since types are enforced, the types of the objects inserted are checked. For a
:code:`MetricsRecord`, the allowed value types are those defined in
:code:`flwr.common.MetricsRecordValues`. Similarly, only :code:`str` keys are
allowed.
>>> from flwr.common import MetricsRecord
>>>
>>> record = MetricsRecord() # an empty record
>>> # Add unsupported value
>>> record["something-unsupported"] = {'a': 123} # Will throw a `TypeError`
If you need a more versatile type of record, try :code:`ConfigsRecord` or
:code:`ParametersRecord`.
"""

def __init__(
self,
metrics_dict: Optional[Dict[str, MetricsRecordValues]] = None,
keep_input: bool = True,
):
"""Construct a MetricsRecord object.
Parameters
----------
metrics_dict : Optional[Dict[str, MetricsRecordValues]]
A dictionary that stores basic types (i.e. `int`, `float` as defined
in `MetricsScalar`) and list of such types (see `MetricsScalarList`).
keep_input : bool (default: True)
A boolean indicating whether metrics should be deleted from the input
dictionary immediately after adding them to the record. When set
to True, the data is duplicated in memory. If memory is a concern, set
it to False.
"""
super().__init__(_check_key, _check_value)
if metrics_dict:
for k in list(metrics_dict.keys()):
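The type enforcement described in the `MetricsRecord` docstring above can be illustrated with a small stand-alone sketch. The names `check_metrics_value` and `SketchMetrics` are hypothetical, and the exact rules (for instance, rejecting `bool`) are assumptions made for this example rather than a description of Flower's checks.

```python
def check_metrics_value(value: object) -> None:
    """Raise TypeError unless value is an int, a float, or a list of those."""

    def is_scalar(item: object) -> bool:
        # bool is a subclass of int; this sketch rejects it (an assumption).
        return isinstance(item, (int, float)) and not isinstance(item, bool)

    if is_scalar(value):
        return
    if isinstance(value, list) and all(is_scalar(item) for item in value):
        return
    raise TypeError(f"Unsupported metrics value of type {type(value).__name__}")


class SketchMetrics(dict):
    """Illustrative dict that validates keys and values on item assignment."""

    def __setitem__(self, key: object, value: object) -> None:
        # Only item assignment is validated here; dict.update() and the
        # constructor bypass __setitem__ and are not covered by this sketch.
        if not isinstance(key, str):
            raise TypeError(f"Keys must be str, got {type(key).__name__}")
        check_metrics_value(value)
        super().__setitem__(key, value)


metrics = SketchMetrics()
metrics["accuracy"] = 0.94
metrics["loss-historic"] = [0.9, 0.5, 0.01]

try:
    metrics["something-unsupported"] = {"a": 123}
    rejected = False
except TypeError:
    rejected = True

print(rejected)
```

Validating at assignment time means an unsupported value fails where it is inserted rather than later, when the record is serialized or sent.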
101 changes: 84 additions & 17 deletions src/py/flwr/common/record/parametersrecord.py
@@ -83,33 +83,100 @@ def _check_value(value: Array) -> None:


class ParametersRecord(TypedDict[str, Array]):
"""Parameters record.
r"""Parameters record.
A dataclass storing named Arrays in order. This means that it holds entries as an
OrderedDict[str, Array]. ParametersRecord objects can be viewed as an equivalent to
PyTorch's state_dict, but holding serialised tensors instead.
PyTorch's state_dict, but holding serialised tensors instead. A
:code:`ParametersRecord` is one of the types of records that a
`flwr.common.RecordSet <flwr.common.RecordSet.html#recordset>`_ supports and
can therefore be used to construct :code:`common.Message` objects.
Parameters
----------
array_dict : Optional[OrderedDict[str, Array]]
A dictionary that stores serialized array-like or tensor-like objects.
keep_input : bool (default: False)
A boolean indicating whether the parameters should be kept in the input
dictionary after they are added to the record. If False, each entry is deleted
from the input dictionary immediately after it is added to the record, so the
dictionary passed to `set_parameters()` will be empty once exiting from that
function. This is the desired behaviour when working with very large
models/tensors/arrays. However, if you plan to continue working with your
parameters after adding them to the record, set this flag to True. When set
to True, the data is duplicated in memory.
Examples
--------
A :code:`ParametersRecord` is intended for storing data arrays (e.g. the
parameters of a machine learning model). These first need to be serialized into
a :code:`flwr.common.Array` data structure.
Let's see some examples:
>>> import numpy as np
>>> from flwr.common import ParametersRecord
>>> from flwr.common import array_from_numpy
>>>
>>> # Let's create a simple NumPy array
>>> arr_np = np.random.randn(3, 3)
>>>
>>> # If we print it
>>> arr_np
array([[-1.84242409, -1.01539537, -0.46528405],
       [ 0.32991896,  0.55540414,  0.44085534],
       [-0.10758364,  1.97619858, -0.37120501]])
>>>
>>> # Let's create an Array out of it
>>> arr = array_from_numpy(arr_np)
>>>
>>> # If we print it you'll see (note the binary data)
>>> arr
Array(dtype='float64', shape=[3,3], stype='numpy.ndarray', data=b'@\x99\x18...')
>>>
>>> # Adding it to a ParametersRecord:
>>> p_record = ParametersRecord({"my_array": arr})
Now that the NumPy array is embedded into a :code:`ParametersRecord`, it can be
sent as part of a :code:`common.Message` or saved as persistent state of a
:code:`ClientApp` via its context. Regardless of the use case, we will sooner or
later want to recover the array in its original NumPy representation. For the
example above, where the array was serialized using the built-in utility
function, deserialization can be done as follows:
>>> # Use the Array's built-in method
>>> arr_np_d = arr.numpy()
>>>
>>> # If printed, it will show the exact same data as above:
>>> arr_np_d
array([[-1.84242409, -1.01539537, -0.46528405],
       [ 0.32991896,  0.55540414,  0.44085534],
       [-0.10758364,  1.97619858, -0.37120501]])
If you need finer control over how your arrays are serialized and deserialized, you
can construct :code:`Array` objects directly like this:
>>> from flwr.common import Array
>>> # Serialize your NumPy array and construct an Array object
>>> arr = Array(
...     data=arr_np.tobytes(),
...     dtype=str(arr_np.dtype),
...     stype="",  # Could be used in a deserialization function
...     shape=list(arr_np.shape),
... )
>>>
>>> # Then you can deserialize it like this
>>> arr_np_d = np.frombuffer(
...     buffer=arr.data,
...     dtype=arr.dtype,
... ).reshape(arr.shape)
Note that different arrays (e.g. from PyTorch, TensorFlow) might require different
serialization mechanisms. However, they often support conversion to NumPy,
which allows you to use the same or similar steps as in the example above.
"""

def __init__(
self,
array_dict: Optional[OrderedDict[str, Array]] = None,
keep_input: bool = False,
) -> None:
"""Construct a ParametersRecord object.
Parameters
----------
array_dict : Optional[OrderedDict[str, Array]]
A dictionary that stores serialized array-like or tensor-like objects.
keep_input : bool (default: False)
A boolean indicating whether parameters should be deleted from the input
dictionary immediately after adding them to the record. If False, the
dictionary passed to `set_parameters()` will be empty once exiting from that
function. This is the desired behaviour when working with very large
models/tensors/arrays. However, if you plan to continue working with your
parameters after adding it to the record, set this flag to True. When set
to True, the data is duplicated in memory.
"""
super().__init__(_check_key, _check_value)
if array_dict:
for k in list(array_dict.keys()):
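The manual serialization path described in the `ParametersRecord` docstring above (raw bytes plus `dtype` and `shape` metadata) can be sketched without any third-party dependency. This example deliberately swaps NumPy for the standard-library `array` module, so `dtype` here is an `array` typecode rather than a NumPy dtype string; `SketchArray`, `serialize` and `deserialize` are hypothetical names and not part of Flower.

```python
from array import array
from dataclasses import dataclass
from typing import List


@dataclass
class SketchArray:
    """Illustrative container mirroring the dtype/shape/stype/data fields."""

    dtype: str        # an `array` module typecode, e.g. "d" for C double
    shape: List[int]
    stype: str        # free-form tag a deserializer could dispatch on
    data: bytes


def serialize(rows: List[List[float]]) -> SketchArray:
    """Flatten a rectangular 2-D list of floats into bytes plus metadata."""
    flat = array("d", [value for row in rows for value in row])
    return SketchArray(
        dtype="d",
        shape=[len(rows), len(rows[0])],
        stype="stdlib.array",
        data=flat.tobytes(),
    )


def deserialize(arr: SketchArray) -> List[List[float]]:
    """Rebuild the 2-D list from the bytes using the stored dtype and shape."""
    flat = array(arr.dtype)
    flat.frombytes(arr.data)
    n_rows, n_cols = arr.shape
    return [list(flat[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]


matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
packed = serialize(matrix)
restored = deserialize(packed)

print(packed.shape, len(packed.data))
print(restored == matrix)
```

The bytes alone are not enough to recover the array: the element type and the shape have to travel with them, which is why the `Array` constructor in the docstring takes `dtype` and `shape` alongside `data`.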
72 changes: 71 additions & 1 deletion src/py/flwr/common/record/recordset.py
@@ -86,7 +86,77 @@ def _check_fn_configs(self, record: ConfigsRecord) -> None:


class RecordSet:
"""RecordSet stores groups of parameters, metrics and configs."""
"""RecordSet stores groups of parameters, metrics and configs.
A :code:`RecordSet` is the unified mechanism by which parameters,
metrics and configs can be either stored as part of a
`flwr.common.Context <flwr.common.Context.html>`_ in your apps
or communicated as part of a
`flwr.common.Message <flwr.common.Message.html>`_ between your apps.
Parameters
----------
parameters_records : Optional[Dict[str, ParametersRecord]]
A dictionary of :code:`ParametersRecord` that can be used to record
and communicate model parameters and high-dimensional arrays.
metrics_records : Optional[Dict[str, MetricsRecord]]
A dictionary of :code:`MetricsRecord` that can be used to record
and communicate scalar-valued metrics that are the result of performing
an action, for example, by a :code:`ClientApp`.
configs_records : Optional[Dict[str, ConfigsRecord]]
A dictionary of :code:`ConfigsRecord` that can be used to record
and communicate configuration values to an entity (e.g. to a
:code:`ClientApp`)
for it to adjust how an action is performed.
Examples
--------
A :code:`RecordSet` can hold three types of records, each designed
with a specific purpose. What is common to all of them is that they
are Python dictionaries designed to ensure that each key-value pair
adheres to specified data types.
Let's see an example.
>>> from flwr.common import RecordSet
>>> from flwr.common import ConfigsRecord, MetricsRecord, ParametersRecord
>>>
>>> # Let's begin with an empty record
>>> my_recordset = RecordSet()
>>>
>>> # We can create a ConfigsRecord
>>> c_record = ConfigsRecord({"lr": 0.1, "batch-size": 128})
>>> # Adding it to the record_set would look like this
>>> my_recordset.configs_records["my_config"] = c_record
>>>
>>> # We can create a MetricsRecord following a similar process
>>> m_record = MetricsRecord({"accuracy": 0.93, "losses": [0.23, 0.1]})
>>> # Adding it to the record_set would look like this
>>> my_recordset.metrics_records["my_metrics"] = m_record
Adding a :code:`ParametersRecord` follows the same steps as above but first,
the array needs to be serialized and represented as a :code:`flwr.common.Array`.
If the array is a :code:`NumPy` array, you can use the built-in utility function
`array_from_numpy <flwr.common.array_from_numpy.html>`_. It is often possible to
convert an array first to :code:`NumPy` and then use the aforementioned function.
>>> import numpy as np
>>> from flwr.common import array_from_numpy
>>> # Creating a ParametersRecord would look like this
>>> arr_np = np.random.randn(3, 3)
>>>
>>> # You can use the built-in tool to serialize the array
>>> arr = array_from_numpy(arr_np)
>>>
>>> # Finally, create the record
>>> p_record = ParametersRecord({"my_array": arr})
>>>
>>> # Adding it to the record_set would look like this
>>> my_recordset.parameters_records["my_parameters"] = p_record
For additional examples of how to construct each of the record types shown
above, please refer to the documentation for :code:`ConfigsRecord`,
:code:`MetricsRecord` and :code:`ParametersRecord`.
"""

def __init__(
self,
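The grouping described in the `RecordSet` docstring above, three dictionaries of named records held side by side, can be sketched with a small dataclass. `SketchRecordSet` is a hypothetical stand-in that uses plain dictionaries for the records; it does no type enforcement and is not Flower's class.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# Simplified value types, assumed for this sketch only.
MetricValue = Union[int, float, List[int], List[float]]
ConfigValue = Union[int, float, str, bytes, bool, List[int], List[float], List[str]]


@dataclass
class SketchRecordSet:
    """Illustrative container with one named group per kind of record."""

    # default_factory gives every instance its own, initially empty dicts.
    parameters_records: Dict[str, Dict[str, bytes]] = field(default_factory=dict)
    metrics_records: Dict[str, Dict[str, MetricValue]] = field(default_factory=dict)
    configs_records: Dict[str, Dict[str, ConfigValue]] = field(default_factory=dict)


first = SketchRecordSet()
first.configs_records["my_config"] = {"lr": 0.1, "batch-size": 128}
first.metrics_records["my_metrics"] = {"accuracy": 0.93, "losses": [0.23, 0.1]}
first.parameters_records["my_parameters"] = {"my_array": b"\x00\x01"}

# A second instance starts empty and does not share state with the first.
second = SketchRecordSet()

print(sorted(first.configs_records))
print(second.configs_records)
```

Keeping the three groups separate lets each one apply its own rules to what it stores, which matches the docstring's point that each record type has a specific purpose.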
