Unified Primitive Container Types #53
Co-authored-by: Blake Johnson <blakejohnson04@gmail.com>
Co-authored-by: Takashi Imamichi <31178928+t-imamichi@users.noreply.github.com>
Comments due on this by EOD today.
TBH, I'm kind of confused by this proposal. To me it feels like extra class hierarchy and inheritance for things that don't share much in common. Besides the name `run()`, the interface in `BasePrimitive` is so generic that I feel like you could take any function, wrap its input in a `Task` and its output in a `Job`, and call it a primitive. I think defining common container classes for the I/O to primitives is reasonable, but I'm unsure what such a generic base class provides us in practice. Specifically, if you had the container format definitions in a shared module and just defined `BaseEstimatorV2` and `BaseSamplerV2` using those in the type signature, wouldn't that work just as well with less code and complexity?
I guess maybe my disconnect is that this is treating both the sampler and estimator as something you would want to potentially swap between, but in practice I don't think there is a use case for that. They're too unique in their usage to facilitate that in a generic way.
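A minimal sketch of the alternative suggested in this comment: shared container types, two independent base classes, and no common parent. The container fields, the string stand-ins for circuits and observables, and the `shots`/`precision` parameters are assumptions made for illustration, not the RFC's or Qiskit's actual signatures.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Iterable


# Hypothetical shared containers; in the suggestion these would live in one common module.
@dataclass(frozen=True)
class TaskResult:
    data: dict[str, Any]
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class PrimitiveResult:
    task_results: tuple[TaskResult, ...]


class BaseSamplerV2(ABC):
    """Toy sampler interface: shares containers, but has no common primitive parent."""

    @abstractmethod
    def run(self, tasks: Iterable[str], shots: int) -> PrimitiveResult: ...


class BaseEstimatorV2(ABC):
    """Toy estimator interface with its own, different run() signature."""

    @abstractmethod
    def run(self, tasks: Iterable[tuple[str, str]], precision: float) -> PrimitiveResult: ...


class ToySampler(BaseSamplerV2):
    """Illustrative implementation that just echoes its inputs."""

    def run(self, tasks: Iterable[str], shots: int) -> PrimitiveResult:
        return PrimitiveResult(
            tuple(TaskResult({"task": t, "shots": shots}) for t in tasks)
        )


result = ToySampler().run(["circuit_a", "circuit_b"], shots=100)
```

Both interfaces return the same `PrimitiveResult` container, yet each keeps a `run()` signature suited to its own inputs, which is the trade-off the comment is pointing at.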
0016-base-primitive-unification.md
Outdated
The central motivation of this RFC is to more clearly emphasize that primitives are a well-defined execution framework on top of which applications can be built.
We aim to force each category of primitive to clearly state what units of quantum work (tasks) it is able to perform, and what outputs can be expected.
It is understood that having the proposed abstract base method alone will not necessarily improve the day-to-day life of a typical user of the primitives, and neither will it necessarily improve the ease of implementing new primitive types or implementations: this change is mainly about clarifying the shared nature of the primitives, and reflecting this view in abstractions.
I'm not sure I fully understand this. The motivation here is just to have a common subclass and to say that all primitives have a `run()` method via that common parent?
0016-base-primitive-unification.md
Outdated
As an example, `qiskit_experiments` defines a suite of common diagnostic, calibration, and characterization experiments.
Currently, it is designed around the `Backend.run` execution interface.
One could imagine that, for example, instead of each `Experiment` providing a `circuits` attribute specifying the circuits to run through `Backend.run`, they could rather each specify a `tasks` attribute indicating which primitive units of work need to be performed.
The `qiskit_experiments` execution and analysis machinery could then rely on the abstraction of `BasePrimitive.run` when reasoning about dispatching and collecting results.
I'm not sure I follow. The abstract classes defined below are so broadly defined that I can't imagine anything ever declaring that it accepts `BasePrimitive` as a type. Specifically, the estimator and sampler behave quite differently, both in the allowed inputs and the data returned. For `qiskit-experiments`, do you have a specific case where you think this applies?
0016-base-primitive-unification.md
Outdated
class BasePrimitive(ABC, Generic[In, Out]):
    @abstractmethod
    def run(self, tasks: In | Iterable[In]) -> Job[PrimitiveResult[Out]]:
Which `Job` class is this?
`qiskit.providers.JobV1`, where I am pretending it has a `Generic[T]`. Aside: any reason it couldn't?
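For illustration, here is a rough sketch of what a job class parameterized by its result type could look like. `GenericJob` and `ImmediateJob` are hypothetical names invented for this example; they are not Qiskit classes, and nothing here reflects how `JobV1` is actually implemented.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Generic, TypeVar

T = TypeVar("T")


class GenericJob(ABC, Generic[T]):
    """Hypothetical stand-in for a job class parameterized by its result type."""

    @abstractmethod
    def result(self) -> T: ...


class ImmediateJob(GenericJob[T]):
    """Toy job that already holds its result when it is created."""

    def __init__(self, value: T) -> None:
        self._value = value

    def result(self) -> T:
        return self._value


# The annotation tells a static type checker what result() returns.
job: GenericJob[list[int]] = ImmediateJob([1, 2, 3])
counts = job.result()
```

The type parameter only affects static analysis; at runtime the job behaves like any other object.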
Ian's example of qiskit-experiments is already a case where we'd like to accept user workloads that consist of either sampler or estimator calls. With this proposal, we at least have some very basic assumptions of how to execute those workloads. The basic promises are:
The proposal here seems to provide nice functionality to facilitate dispatch on the specific types of
Note that this RFC doesn't actually call for any new classes or inheritance, it just calls for 3 new methods on the
In case it helps, any particular algorithm or experiment certainly can't swap between the two of them without, if even possible at all, a re-implementation. However, application frameworks built on top of the primitives may wish to reason about the primitives more abstractly without explicitly enumerating them, or talking about their inputs as union types of various signatures.
It does implicitly mean new classes, because of #51 going to a new v2 version and also the proposed
The abstract part that Experiments could actually use from this interface (if I understand correctly) is that they want to do something like:

    def run_experiment(primitive: BasePrimitive[TaskT, ResultT], tasks: Iterable[TaskT]) -> list[ResultT]:
        return primitive.run(tasks)

If the goal of this RFC is purely to ensure that the innermost "call" operation just has the same name between each primitive, that's ok by me, though personally I'd err on the side of being less constraining. It's not clear to me that this

Given that Experiments will already be needing to dispatch on "is input Sampler or Estimator?", is it still worth having this abstraction to enable:

    if is_sampler(primitive):
        tasks = generate_sampler_tasks(inputs)
    else:
        tasks = generate_estimator_tasks(inputs)
    # Typing is the abstract `Primitive.run`
    results = primitive.run(tasks)
    if is_sampler(primitive):
        # Required for strong typing, because the above call was abstract.
        results = typing.cast(SamplerResults, results)
        process_sampler_results(results)
    else:
        results = typing.cast(EstimatorResults, results)
        process_estimator_results(results)

when the alternative would be:

    if is_sampler(primitive):
        tasks = generate_sampler_tasks(inputs)
        # Now constrained to `Sampler.run`.
        results = primitive.run(tasks)
        process_sampler_results(results)
    else:
        tasks = generate_estimator_tasks(inputs)
        # Constrained to `Estimator.run`
        results = primitive.run(tasks)
        process_estimator_results(results)

My feeling is that in this example, the former is not (sufficiently) better than the latter to motivate inclusion in a base interface.
This is where I'm struggling with the value here TBH, because I'm not sure there is a use case where you would not want to use a union type. Let's use a concrete example. If you look at what exists in I'm struggling to think of a case where someone would create something that could handle any
Based upon the feedback here, this RFC will be updated to focus on new primitive-centric container types instead of introducing a
Thanks @mtreinish and @jakelishman for your thoughtful comments. @jakelishman, I think you are exactly right to focus specifically on the mechanics of dispatching. While it might intuitively feel like, unifying to a common
Co-authored-by: Blake Johnson <blakejohnson04@gmail.com>
Looks great, @ihincks
I'm fine with this now in the reduced scope to just defining common data containers. Thanks for updating this.
Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Viewing this as a sort of "advisory" for new primitives interfaces, I'm fine with it bar the two minor comments I had, and that I don't feel really strongly about.
I think it's very good that this design document explicitly spells out that it's about the look and feel of the objects, rather than trying to enforce abstractions that don't permit any concrete use, and I can see the value in providing the helper classes.
I'm fine for this to merge as-is - neither of my comments are likely worth holding up the merge.
### TaskResult

A `TaskResult` is the result of running a single `Task` and does three things:

* Stores a `DataBin` instance containing the data from execution.
* Stores the metadata that is possibly implementation-specific, and always specific to the executed task.
* (subclasses) Contains methods to help transform the data into standard formats, including migration helpers.

We generally expect each primitive type to define its own subclass of `TaskResult` to accommodate the third item, though `TaskResult` has no methods that need to be abstract.

We elect to have a special container for the data (`DataBin`) so as not to pollute the `TaskResult` namespace with task-specific names, and to keep data quite distinct from metadata.

```python
# return a DataBin
task_result.data

# return a metadata dictionary
task_result.metadata
```
Is this implying that all subclasses of `TaskResult` must have exactly two stateful attributes: `data` and `metadata`? I think the implication of the above paragraphs is "yes", but it might be worth spelling out in the interface. It leads to some tricks, though - if the intent is to leave the class open to later expansion, the allowance of subclasses to define arbitrary methods gets in the way; without explicitly reserving other names, there's no safe point for expansion.
If the answer is instead "no", what's the intended purpose of forcibly putting `metadata` in a separate namespace?
I'm having trouble understanding. This is not how I'd actually do it, but close enough for discussion:

    @dataclass(<appropriate choices>)
    class TaskResult:
        data: DataBin
        metadata: dict[str, Any] = field(default_factory=dict)

> the allowance of subclasses to define arbitrary methods gets in the way

I'm not sure how having a `get_counts()` method, for example, would get in the way.

> what's the intended purpose of forcibly putting metadata in a separate namespace?

Is the alternative being considered here just putting the contents of metadata as attributes on `TaskResult`? This would mean that every implementation of Estimator would need its own `EstimatorTaskResult` in order to configure its own possible metadata values. It would also be quite annoying for the IBM primitives workflow: if you wanted to modify the allowed metadata, you'd have to get the change into a tagged release of `qiskit_ibm_runtime`, then get that tagged release to be the default version on the server side, and only then could you do the thing you wanted to.
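To make the shape under discussion concrete, here is a small runnable sketch of the two-attribute layout with a subclass helper. `SimpleNamespace` stands in for `DataBin`, and `SamplerTaskResult`, the `meas` field, and the body of `get_counts()` are assumptions for illustration rather than the RFC's definitions.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any


@dataclass
class TaskResult:
    data: SimpleNamespace  # stand-in for the RFC's DataBin
    metadata: dict[str, Any] = field(default_factory=dict)


class SamplerTaskResult(TaskResult):
    """Hypothetical subclass that adds a helper method but no new state."""

    def get_counts(self) -> dict[str, int]:
        # Tally the bitstrings stored under the (assumed) data.meas field.
        return dict(Counter(self.data.meas))


res = SamplerTaskResult(
    data=SimpleNamespace(meas=["00", "11", "00"]),
    metadata={"backend_specific_key": 1.5},
)
```

Because metadata is a plain dictionary, an implementation can add or rename keys without changing the class definition, which is the workflow concern raised in the comment above.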
### PrimitiveResult

`PrimitiveResult` is the type returned by `Job.result()` and is primarily a `Sequence[TaskResult]`, with one entry for each `Task` input into the primitive's run method.
Super super minor, but Python's `Sequence` requires a few extra methods that we maybe don't want to actually imply: `index` and `count`.
That's a good point, I don't want cluttering methods in the classes if we can avoid it. We can chat later about protocols.
It's kind of annoying how Sequence is something close but not equal to the thing everyone wants it to be.
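The difference is easy to see in a small sketch. Subclassing `collections.abc.Sequence` brings `index()` and `count()` along as mixin methods, while a class that defines only the dunder methods it needs does not. Both class names below are hypothetical, and the choice of `__len__`, `__getitem__`, and `__iter__` as the "wanted" operations is an assumption.

```python
from __future__ import annotations

from collections.abc import Sequence
from typing import Iterator


class SequenceResult(Sequence):
    """Subclassing Sequence: index() and count() are inherited as mixin methods."""

    def __init__(self, items: list[str]) -> None:
        self._items = list(items)

    def __getitem__(self, i):
        return self._items[i]

    def __len__(self) -> int:
        return len(self._items)


class MinimalResult:
    """Defines only len(), indexing, and iteration; no index() or count()."""

    def __init__(self, items: list[str]) -> None:
        self._items = list(items)

    def __getitem__(self, i: int) -> str:
        return self._items[i]

    def __len__(self) -> int:
        return len(self._items)

    def __iter__(self) -> Iterator[str]:
        return iter(self._items)


seq = SequenceResult(["a", "b", "a"])
minimal = MinimalResult(["a", "b", "a"])
```

A `typing.Protocol` listing just those three methods would be one way to describe `MinimalResult` in type signatures without implying the extra methods.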
Further discussion may continue in issues, PRs, and Slack.