Data processing #22

Merged
merged 56 commits into from
Apr 23, 2021
3d25f40
* Carved out data processor from PR #20
eggerdj Mar 26, 2021
375caa0
* Added node_output property.
eggerdj Mar 26, 2021
9994758
* Ran Black.
eggerdj Mar 26, 2021
0fc66ba
* Added a better methodology for checking the input requirements.
eggerdj Mar 26, 2021
a0cbd65
* Lint fix.
eggerdj Mar 26, 2021
839ab97
* Added population shots fix and corresponding tests.
eggerdj Mar 29, 2021
38d6675
* Unified ToReal and ToImag.
eggerdj Mar 29, 2021
0be6f1f
* Reformatted the DataProcessor to a list of nodes rather than pointers.
eggerdj Mar 31, 2021
3f95b1e
* Added history functionality to the data processor.
eggerdj Mar 31, 2021
1883a91
* Made history of data processor a property.
eggerdj Apr 1, 2021
6b6051c
* Fixed docstring.
eggerdj Apr 1, 2021
e32341e
* Added _process to the IQPart data actions.
eggerdj Apr 1, 2021
681546f
* Changed node_output to a class variable.
eggerdj Apr 1, 2021
93cdfae
* Added the option to initialize the DataProcessor with given DataAct…
eggerdj Apr 1, 2021
21374e5
* Removed Kernel and Discriminator. They will be for a future PR.
eggerdj Apr 1, 2021
5f7c082
* Added docstring from Will.
eggerdj Apr 9, 2021
8838f62
* Moved docstring.
eggerdj Apr 12, 2021
77d0bb0
Update qiskit_experiments/data_processing/base.py
eggerdj Apr 12, 2021
81ddf28
Update qiskit_experiments/data_processing/base.py
eggerdj Apr 12, 2021
6af38d9
* Aligned code to _process.
eggerdj Apr 12, 2021
bd230b5
* Made data processor callable.
eggerdj Apr 12, 2021
c9801a7
* Renamed base.py to data_action.py.
eggerdj Apr 12, 2021
ca2d365
* Made nodes callable.
eggerdj Apr 12, 2021
06095d2
* Removed history property, added call_with_history.
eggerdj Apr 12, 2021
22acc21
* Renamed Population to Probability.
eggerdj Apr 13, 2021
cefcf73
* Metadata in processed_data.
eggerdj Apr 13, 2021
a0d0903
* Refactored _process(Dict[str, Any]) -> Dict[str, Any] to _process(A…
eggerdj Apr 13, 2021
1bc7c83
* Added option to specify which nodes to include in the history.
eggerdj Apr 13, 2021
02c46a1
Merge branch 'main' into data_processor
eggerdj Apr 13, 2021
650d9c6
Update qiskit_experiments/data_processing/nodes.py
eggerdj Apr 16, 2021
078fd07
* Removed __init__ from DataAction.
eggerdj Apr 16, 2021
d0a85c1
Merge branch 'data_processor' of github.com:eggerdj/qiskit-experiment…
eggerdj Apr 16, 2021
4899bfa
* Added the option to turn off validation.
eggerdj Apr 16, 2021
b06e3eb
Update qiskit_experiments/data_processing/data_processor.py
eggerdj Apr 16, 2021
597d60c
* Simplified validation of IQ data.
eggerdj Apr 16, 2021
055d6cc
Update qiskit_experiments/data_processing/nodes.py
eggerdj Apr 16, 2021
a498bf0
Update qiskit_experiments/data_processing/nodes.py
eggerdj Apr 16, 2021
a35b522
* Removed unnecessary wrapping of _process.
eggerdj Apr 16, 2021
74ec8f8
* Polished docstrings and ran black.
eggerdj Apr 16, 2021
ec911bd
Update qiskit_experiments/data_processing/data_action.py
eggerdj Apr 16, 2021
83ed8ff
* Removed unnecessary code in DataProcessingError.
eggerdj Apr 16, 2021
2ffb0d3
* Rewrote doc string.
eggerdj Apr 16, 2021
657f17b
* IQ data is now of type float and not complex.
eggerdj Apr 16, 2021
58ca872
* Fixed validate issue.
eggerdj Apr 20, 2021
d749d95
* Added error message to __call__ and call_with_history.
eggerdj Apr 20, 2021
cab9339
* Improved docstring.
eggerdj Apr 20, 2021
4c9acae
* Improved class docstring.
eggerdj Apr 20, 2021
d910933
* Changed how DataProcessor._nodes are initialized in __init__.
eggerdj Apr 20, 2021
c250ad8
* Changed behavior of empty data processor.
eggerdj Apr 20, 2021
bc00e26
* Refactored call and call_with_history to use the call_internal func…
eggerdj Apr 21, 2021
81caca7
* Fixed lint, black, and docstrings.
eggerdj Apr 21, 2021
bcce8eb
Merge branch 'main' into data_processor
eggerdj Apr 21, 2021
2452a86
Update qiskit_experiments/data_processing/data_action.py
eggerdj Apr 22, 2021
a79d270
* Added type hint to call_with_history
eggerdj Apr 22, 2021
b19c31b
Merge branch 'data_processor' of github.com:eggerdj/qiskit-experiment…
eggerdj Apr 22, 2021
5942ede
Update qiskit_experiments/data_processing/data_processor.py
eggerdj Apr 22, 2021
22 changes: 22 additions & 0 deletions qiskit_experiments/data_processing/__init__.py
@@ -0,0 +1,22 @@
# This code is part of Qiskit.
#
# (C) Copyright IBM 2021.
#
# This code is licensed under the Apache License, Version 2.0. You may
# obtain a copy of this license in the LICENSE.txt file in the root directory
# of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.
#
# Any modifications or derivative works of this code must retain this
# copyright notice, and modified files need to carry a notice indicating
# that they have been altered from the originals.

"""Qiskit experiments calibration data processing roots."""

from .data_action import DataAction
from .nodes import (
    Probability,
    ToImag,
    ToReal,
)

from .data_processor import DataProcessor
75 changes: 75 additions & 0 deletions qiskit_experiments/data_processing/data_action.py
@@ -0,0 +1,75 @@
# This code is part of Qiskit.
#
# (C) Copyright IBM 2021.
#
# This code is licensed under the Apache License, Version 2.0. You may
# obtain a copy of this license in the LICENSE.txt file in the root directory
# of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.
#
# Any modifications or derivative works of this code must retain this
# copyright notice, and modified files need to carry a notice indicating
# that they have been altered from the originals.

"""Defines the steps that can be used to analyse data."""

from abc import ABCMeta, abstractmethod
from typing import Any


class DataAction(metaclass=ABCMeta):
    """
    Abstract action done on measured data to process it. Each subclass of DataAction must
    define the way it formats, validates and processes data.
    """

    def __init__(self, validate: bool = True):
        """
        Args:
            validate: If set to False the DataAction will not validate its input.
        """
        self._validate = validate

    @abstractmethod
    def _process(self, datum: Any) -> Any:
        """
        Applies the data processing step to the datum.

        Args:
            datum: A single item of data which will be processed.

        Returns:
            processed data: The data that has been processed.
        """

    @abstractmethod
    def _format_data(self, datum: Any) -> Any:
        """
        Check that the given data has the correct structure. This method may
        additionally change the data type, e.g. converting a list to a numpy array.

        Args:
            datum: The data instance to check and format.

        Returns:
            datum: The data that was checked.

        Raises:
            DataProcessorError: If the data does not have the proper format.
        """

    def __call__(self, data: Any) -> Any:
        """
        Call the data action of this node on the data.

        Args:
            data: The data to process. The action nodes in the data processor will
                raise errors if the data does not have the appropriate format.

        Returns:
            processed data: The data processed by self.
        """
        return self._process(self._format_data(data))

    def __repr__(self):
        """String representation of the node."""
        return f"{self.__class__.__name__}(validate={self._validate})"
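To make the subclassing contract above concrete, here is a minimal sketch of a node. It is an illustration only: `ScaleBy` is a hypothetical node that is not part of this PR, the `DataAction` class is re-declared as a stand-in so the snippet runs without the package installed, and `TypeError` is used in place of `DataProcessorError` for the same reason.

```python
from abc import ABCMeta, abstractmethod
from typing import Any, List


class DataAction(metaclass=ABCMeta):
    """Minimal stand-in mirroring the DataAction interface shown in this diff."""

    def __init__(self, validate: bool = True):
        self._validate = validate

    @abstractmethod
    def _process(self, datum: Any) -> Any:
        """Apply the processing step to the datum."""

    @abstractmethod
    def _format_data(self, datum: Any) -> Any:
        """Check and format the datum."""

    def __call__(self, data: Any) -> Any:
        # Format (and optionally validate) first, then process.
        return self._process(self._format_data(data))


class ScaleBy(DataAction):
    """Hypothetical node (not part of this PR): multiply each value by a factor."""

    def __init__(self, factor: float, validate: bool = True):
        super().__init__(validate)
        self._factor = factor

    def _format_data(self, datum: Any) -> List[float]:
        # Only check the structure when validation is enabled.
        if self._validate and not isinstance(datum, (list, tuple)):
            raise TypeError(f"Expected a list or tuple, got {type(datum).__name__}.")
        return [float(value) for value in datum]

    def _process(self, datum: List[float]) -> List[float]:
        return [self._factor * value for value in datum]


scaled = ScaleBy(2.0)([1, 2, 3])
```

Because `__call__` chains `_format_data` into `_process`, a subclass only has to implement those two methods; whether it honors `self._validate` is left to each node.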
131 changes: 131 additions & 0 deletions qiskit_experiments/data_processing/data_processor.py
@@ -0,0 +1,131 @@
# This code is part of Qiskit.
#
# (C) Copyright IBM 2021.
#
# This code is licensed under the Apache License, Version 2.0. You may
# obtain a copy of this license in the LICENSE.txt file in the root directory
# of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.
#
# Any modifications or derivative works of this code must retain this
# copyright notice, and modified files need to carry a notice indicating
# that they have been altered from the originals.

"""Actions done on the data to bring it in a usable form."""

from typing import Any, Dict, List, Set, Tuple, Union

from qiskit_experiments.data_processing.data_action import DataAction
from qiskit_experiments.data_processing.exceptions import DataProcessorError


class DataProcessor:
    """
    A DataProcessor defines a sequence of operations to perform on experimental data.
    Calling an instance of DataProcessor applies this sequence on the input argument.
    A DataProcessor is created with a list of DataAction instances. Each DataAction
    applies its _process method on the data and returns the processed data. The nodes
    in the DataProcessor may also perform data validation and some minor formatting.
    The output of one data action serves as input for the next data action.
    DataProcessor.__call__(datum) usually takes in an entry from the data property of
    an ExperimentData object (i.e. a dict containing metadata and memory keys and
    possibly counts, like the Result.data property) and produces the formatted data.
    DataProcessor.__call__(datum) extracts the data from the given datum under
    DataProcessor._input_key (which is specified at initialization).
    """

    def __init__(self, input_key: str, data_actions: List[DataAction] = None):
        """Create a chain of data processing actions.

        Args:
            input_key: The initial key in the datum Dict[str, Any] under which the data processor
                will find the data to process.
            data_actions: A list of data processing actions to construct this data processor with.
                If None is given an empty DataProcessor will be created.
        """
        self._input_key = input_key
        self._nodes = data_actions if data_actions else []

    def append(self, node: DataAction):
        """
        Append new data action node to this data processor.

        Args:
            node: A DataAction that will process the data.
        """
        self._nodes.append(node)

    def __call__(self, datum: Dict[str, Any]) -> Any:
        """
        Call self on the given datum. This method sequentially calls the stored data actions
        on the datum.

        Args:
            datum: A single item of data, typically from an ExperimentData instance, that needs
                to be processed. This dict also contains the metadata of each experiment.

        Returns:
            processed data: The data processed by the data processor.
        """
        return self._call_internal(datum, False)

    def call_with_history(
        self, datum: Dict[str, Any], history_nodes: Set = None
    ) -> Tuple[Any, List]:
        """
        Call self on the given datum. This method sequentially calls the stored data actions
        on the datum and also returns the history of the processed data.

        Args:
            datum: A single item of data, typically from an ExperimentData instance, that
                needs to be processed.
            history_nodes: The nodes, specified by index in the data processing chain, to
                include in the history. If None is given then all nodes will be included
                in the history.

        Returns:
            processed data: The datum processed by the data processor.
            history: The datum processed at each node of the data processor.
        """
        return self._call_internal(datum, True, history_nodes)

    def _call_internal(
        self, datum: Dict[str, Any], with_history: bool, history_nodes: Set = None
    ) -> Union[Any, Tuple[Any, List]]:
        """
        Internal function to process the data with or without storing the history of the
        computation.

        Args:
            datum: A single item of data, typically from an ExperimentData instance, that
                needs to be processed.
            with_history: If True the history is returned, otherwise it is not.
            history_nodes: The nodes, specified by index in the data processing chain, to
                include in the history. If None is given then all nodes will be included
                in the history.

        Returns:
            datum_ and history if with_history is True or datum_ if with_history is False.

        Raises:
            DataProcessorError: If the input key of the data processor is not contained in datum.
        """

        if self._input_key not in datum:
            raise DataProcessorError(
                f"The input key {self._input_key} was not found in the input datum."
            )

        datum_ = datum[self._input_key]

        history = []
        for index, node in enumerate(self._nodes):
            datum_ = node(datum_)

            if with_history and (
                history_nodes is None or (history_nodes and index in history_nodes)
            ):
                history.append((node.__class__.__name__, datum_, index))

        if with_history:
            return datum_, history
        else:
            return datum_
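The history selection in `_call_internal` can be sketched in isolation as follows. This is a standalone mirror of the loop under stated assumptions, not the library code: `run_with_history`, `Double` and `Total` are hypothetical names, plain callables stand in for `DataAction` nodes, and `KeyError` replaces `DataProcessorError` so the snippet runs without Qiskit.

```python
from typing import Any, Callable, Dict, List, Optional, Set, Tuple


def run_with_history(
    datum: Dict[str, Any],
    input_key: str,
    nodes: List[Callable[[Any], Any]],
    history_nodes: Optional[Set[int]] = None,
) -> Tuple[Any, List[Tuple[str, Any, int]]]:
    """Hypothetical mirror of _call_internal with with_history=True."""
    if input_key not in datum:
        raise KeyError(f"The input key {input_key} was not found in the input datum.")

    value = datum[input_key]
    history = []
    for index, node in enumerate(nodes):
        # The output of one node is the input of the next.
        value = node(value)
        # None records every node; otherwise only the listed indices are kept.
        if history_nodes is None or index in history_nodes:
            history.append((node.__class__.__name__, value, index))
    return value, history


class Double:
    """Hypothetical node: double every value."""

    def __call__(self, values):
        return [2 * v for v in values]


class Total:
    """Hypothetical node: sum the values."""

    def __call__(self, values):
        return sum(values)


datum = {"memory": [1, 2, 3], "metadata": {"xval": 0.1}}
result, full_history = run_with_history(datum, "memory", [Double(), Total()])
_, last_only = run_with_history(datum, "memory", [Double(), Total()], history_nodes={1})
```

Each history entry is a `(class name, intermediate value, index)` tuple, so passing `history_nodes={1}` keeps only the output of the second node, while an empty set records nothing.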
19 changes: 19 additions & 0 deletions qiskit_experiments/data_processing/exceptions.py
@@ -0,0 +1,19 @@
# This code is part of Qiskit.
#
# (C) Copyright IBM 2021.
#
# This code is licensed under the Apache License, Version 2.0. You may
# obtain a copy of this license in the LICENSE.txt file in the root directory
# of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.
#
# Any modifications or derivative works of this code must retain this
# copyright notice, and modified files need to carry a notice indicating
# that they have been altered from the originals.

"""Exceptions for data processing."""

from qiskit.exceptions import QiskitError


class DataProcessorError(QiskitError):
    """Errors raised by the data processing module."""
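Since `DataProcessorError` derives from `QiskitError`, callers can presumably catch it either specifically or through the base class. The sketch below illustrates this with a stand-in `QiskitError` (a plain `Exception` subclass, so the snippet runs without Qiskit) and a hypothetical `extract` helper that mirrors the input-key check in `_call_internal`.

```python
class QiskitError(Exception):
    """Stand-in for qiskit.exceptions.QiskitError (assumption: plain Exception)."""


class DataProcessorError(QiskitError):
    """Errors raised by the data processing module."""


def extract(datum: dict, input_key: str):
    """Hypothetical helper mirroring the input-key check in _call_internal."""
    if input_key not in datum:
        raise DataProcessorError(
            f"The input key {input_key} was not found in the input datum."
        )
    return datum[input_key]


try:
    # The datum only has counts, so asking for "memory" fails.
    extract({"counts": {"0": 10}}, "memory")
    caught = None
except QiskitError as error:
    # Catching the base class also catches the more specific subclass.
    caught = error
```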