Candid: Make services self-describing #1510
We always had the vision that you can just take a canister ID and find out what its interface is. This is crucial for use cases like

* importing external canisters in your code (where `dfx` would fetch the `.did` file and provide it to `moc`)
* @chenyan-dfinity’s Candid interface should work for all canisters
* command line tools can provide an interface for all canisters with auto-complete etc.

Why haven’t we had this for months already? The main reason is that for a long while I expected this to fall under “static front-end assets”, which I imagined would be uploaded by `dfx`, separately from the `.wasm`, and served by the system, independent of WasmCode, in a special “static content serving mode”, and the IDL would just be one of these files (with a well-known name). One reason for this design was that we thought this was the only way to make access to these assets trustworthy.

The feature of front-end assets has been contentious, prone to feature creep, and got entangled with other issues (like how to return data from queries in trustworthy ways). So nothing moved.

But recently things started to move there again. We have a vision for how general queries can actually be trustworthy, and are leaning towards moving the frontend handling into the canister, allowing canisters to somehow fuel HTTP endpoints.

And this led me to the conclusion that the Candid definition is _not_ a front-end asset, and we can and should just design the “canister can indicate its own interface” separately.

This proposal is a very simple idea: A canister, by convention, reports its IDL via

`candid_interface : () -> (Text) query`

For Motoko, this would happen automatically.

Benefits of this design:

* We can implement it right away.
* The interface is bundled with the `.wasm`. No more complex steps to be taken by `dfx` to create the `.did` file, keep it in sync, and somehow bundle it with the `.wasm` upon uploading. (We still want `moc --idl` for local development though, unless we write a tool that simulates the IC System API and locally calls `candid_interface()`.)
* It _can_ be dynamic, if a canister changes its interface without redeployment.
  - Maybe some methods only become available after some flag got set?
  - Or we have a completely dynamic canister – I recently played around with a Python canister, and there I would expect to upload some code using a regular call, and the interface should match the installed code.
  - Or maybe we have a proxy canister that fetches its Candid from the canister it proxies.
* As @rossberg points out: Candid is not actually tied to the IC system, and could be used in other “service RPC” contexts as well. The presented design will neatly apply to other environments as well.
* Treating this differently from front-end assets is sensible:
  - Front-end assets need to be reachable via HTTP, it seems. But the Candid interface does not: Any tool that cares about Candid is able to speak Candid and IC.
  - Front-end assets may be developed and deployed independently from the backend. The Candid interface is tied to the canister.
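To make the convention concrete, here is a minimal sketch of a canister whose reported interface follows its state. It is a plain in-memory model, not the IC System API: `ToyCanister`, `enable_admin` and the `greet`/`reset` methods are hypothetical names chosen for illustration; only `candid_interface` comes from the proposal.

```python
class ToyCanister:
    """In-memory stand-in for a canister (hypothetical, not the IC System API)."""

    def __init__(self):
        # Method name -> Candid-style signature text.
        self._methods = {"greet": "(text) -> (text) query"}

    def enable_admin(self):
        # A state change (no redeployment) that exposes one more method.
        self._methods["reset"] = "() -> ()"

    def candid_interface(self):
        # The conventional query: render the *current* interface as text.
        body = "".join(
            f"  {name} : {sig};\n" for name, sig in sorted(self._methods.items())
        )
        return "service : {\n" + body + "}"


canister = ToyCanister()
before = canister.candid_interface()
canister.enable_admin()
after = canister.candid_interface()
```

In this model `after` lists `reset` while `before` does not, which is the “methods only become available after some flag got set” case from the list above.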
This PR does not affect the produced WebAssembly code. |
I'm a bit puzzled. The underlying idea is to stuff assets and everything into a single Wasm module? But then you can only get them out by expensive calls into Wasm! And more expensive copying at the boundaries. And how will this support multiple modules? Embedding them as binary blobs into another module? Sorry, that all sounds super-backwards to me. |
Also, I'm not fond of conflating domain and meta-domain, by having reflection as a regular method. PL has long moved away from that, in favour of mirror-based approaches, because they are a more capability-conform setup. |
Maybe I shouldn’t have spent so many words on frontend assets, but let’s keep the discussion on the IDL description … should we pull you into the ongoing meetings about assets and certified variables?
What does mirror-based mean, concretely? And how would you support a canister with a dynamic interface? What if the Oh, and: Candid messages are self-describing, so why not Candid services? |
Not that I'm keen on it, but if this is where it moves then I may want to have a word.
See https://en.wikipedia.org/wiki/Mirror_(programming) Another way of describing it is that a mirror makes reflection a separate capability from regular access to an object.
That would be somewhat better conceptually, but still mixes up capabilities.
Well, plain data does not encapsulate anything, especially not behaviour. Also, subtyping is message coercive on data, so you don't get to downcast back to rediscover elided information, while actor reflection allows you to do that. Edit: And I should have added: the types of message data are a low-level encoding detail, they are not accessible from the higher-level typed programming model, unlike the proposed method. |
But how would it be different from what we thought we’d do earlier: In both cases it would be a read query, just to a static part of the canister module.
…as you would in any other model we talked about (e.g. IDL as part of system-provided static assets). What do you have in mind instead? |
Good question. If interface queries are a system call, then at least there are possible ways in which the system could limit access to them (the system would be the mirror). One possibility would be that the system distinguishes public, reflective actor ids from private, non-reflective ones. At least that path would remain open as a future feature. |
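The “system would be the mirror” idea can be sketched roughly as follows. This is only an illustration of separating the ability to call an actor from the ability to reflect on it; `ToySystem`, `Actor`, `Mirror`, `install` and `reflect` are hypothetical names, and nothing here corresponds to an existing system API.

```python
class Actor:
    """Regular reference: lets the holder send messages, nothing more."""

    def __init__(self, handlers):
        self._handlers = handlers

    def call(self, method, *args):
        return self._handlers[method](*args)


class Mirror:
    """Reflective capability, handed out separately from Actor references."""

    def __init__(self, interface):
        self._interface = dict(interface)

    def methods(self):
        return sorted(self._interface)


class ToySystem:
    """Hypothetical system that stores interface metadata next to each actor."""

    def __init__(self):
        self._mirrors = {}
        self._reflective = set()

    def install(self, actor_id, handlers, interface, reflective):
        self._mirrors[actor_id] = Mirror(interface)
        if reflective:
            self._reflective.add(actor_id)
        return Actor(handlers)

    def reflect(self, actor_id):
        # Reflection is granted (or refused) by the system, not by the actor.
        if actor_id not in self._reflective:
            raise PermissionError(f"{actor_id} is not reflective")
        return self._mirrors[actor_id]


system = ToySystem()
handlers = {"greet": lambda name: f"hello {name}"}
interface = {"greet": "(text) -> (text) query"}
public = system.install("public", handlers, interface, reflective=True)
private = system.install("private", handlers, interface, reflective=False)
```

Both actors answer `call("greet", ...)`, but only `system.reflect("public")` yields a mirror; `system.reflect("private")` raises `PermissionError`.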
The proposed interface is a “should”. If you create internal, private actors that should not be inspectable, then just don’t implement Both these points speak in favor of a flexible, programmatic interface, in contrast to a “the system always stores an IDL file in a way that everyone can access it”. |
Then it is impossible to separate the ability to use an actor from the ability to reflect on it. You may want to provide one but not the other.
That would be ACLs instead of capabilities. I thought you were in the other camp? :)
Why would everyone have access? That depends on how that access is handled in the system. For a private actor, it may only be the owner. |
Once we have capabilities, and they allow you to control access to certain methods, then this design will neatly map to capabilities – without special-casing “get interface” compared to any other “do X”. So this, too, seems to be in favor of a uniform way of accessing canister features (including getting the interface).
Yes, but again that is a problem that we have to solve for any other query as well. It is not specific to getting the IDL, or getting the |
I suggest reading Gilad's paper (linked from the Wikipedia article). He analyses multiple case studies, some of which are relevant to our scenario. :) |
Gave it a glance. Not sure how the listed “The advantages of mirrors include:” apply here (but I don't understand all of them). Anyways, that would suggest we have a well-known “candid interface store”, a canister that maybe has this interface
and somehow (how?) knows the interface for all canisters? Is that what you are proposing? If not, then what are you proposing? |
Could we not just store the interface in an optional custom section? We could support a system method that allows a canister to query its own custom section, if it wants to navel gaze or pass it on to some client. And the system could return custom sections without entering wasm. |
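As a rough sketch of the custom-section idea: a Wasm custom section is section id 0, followed by a LEB128 size, a length-prefixed name and the payload, so a host can read it without executing any Wasm. The section name `candid-interface` below is an assumption made up for this example, as are the helper names.

```python
WASM_HEADER = b"\x00asm\x01\x00\x00\x00"  # magic number + version 1
SECTION_NAME = "candid-interface"  # hypothetical name, not an agreed convention


def encode_uleb128(n):
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(data, pos):
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def custom_section(name, payload):
    name_bytes = name.encode("utf-8")
    content = encode_uleb128(len(name_bytes)) + name_bytes + payload
    return b"\x00" + encode_uleb128(len(content)) + content


def read_custom_section(module, name):
    """Return the payload of the first custom section called `name`, or None."""
    if module[:8] != WASM_HEADER:
        raise ValueError("not a Wasm binary")
    pos = 8
    while pos < len(module):
        section_id = module[pos]
        size, pos = decode_uleb128(module, pos + 1)
        end = pos + size
        if section_id == 0:  # custom section
            name_len, name_start = decode_uleb128(module, pos)
            name_end = name_start + name_len
            if module[name_start:name_end].decode("utf-8") == name:
                return module[name_end:end]
        pos = end  # skip every other section unread
    return None


DID = "service : { greet : (text) -> (text) query; }"
EMPTY_TYPE_SECTION = b"\x01\x01\x00"  # section id 1, size 1, zero types
module = WASM_HEADER + EMPTY_TYPE_SECTION + custom_section(SECTION_NAME, DID.encode())
```

Calling `read_custom_section(module, SECTION_NAME)` returns the stored `.did` bytes. Note that this only covers the static case; a dynamic interface would still need some other mechanism.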
Hmm, seems very similar to just using a data section? (Which, especially with the bulk memory proposal, no longer wastes precious heap space at rest.)
And expose it as a separate read request on the HTTP interface to external users? Or also to other canisters, via a query call to the management canister maybe? Would this allow reading any custom section? What does that buy us over the presented proposal? And how would this support changing the interface without re-uploading the canister (e.g. a proxy or dynamic canister)? |
I guess, but is that yet another proposal we have to wait to arrive?
I guess it uses stuff that's readily available and doesn't so readily conflate static with dynamic, which is what I think Andreas was objecting to.
It doesn't, but perhaps that could be added later. A (universal) proxy would need a more generic entry point anyway, right? But I was just trying to mediate.... |
Yes, sorry, I see that intention and I appreciate that.
No need to wait for it, it just means that we know that one problem (wasting heap space) will go away in the future.
Quite contrary: The present proposal does not require any new features from any existing component; no new request types, no new canister state, no new system API.
That’s maybe the main point of contention: I think this mechanism must be dynamic to be general and flexible enough (see example above). If one considers that to be an anti-feature, then I understand that this design is unappealing. But a canister can upgrade itself. So this means if you need to have a dynamic interface description, you can have it anyway – you just have to jump through some ugly hoops. Re Gilad's paper: I think my proposal actually satisfies Stratification, because the |
I agree with this sentiment: It seems like this proposal is doing stratification. But perhaps we should be more explicit about the "other levels" we are adding to each canister? This feature is adding to some "Candid interface level" of a canister that has yet to be named, right? |
Forgive me if this level has been named, but I don't know it. Provisionally, I see the following two levels, with these "names":
(or maybe "level 0" is really "level 2"? In any case, there is stratification in my view) |
Good point. I think the idea here is to support an "opt-in" form of dynamic reflection, and I favor that kind of design. As mentioned above, I think there is adequate stratification. I can't see a good way to add more without losing the point here. As to the question of whether this reflection should itself be a canister service (one of Joachim's possible interpretations of the "mirror" suggestion from @rossberg): I would hope that the source of this service API information about a canister is that canister itself, perhaps from another "level" of its canister API (to satisfy the need for stratification in the design), much like in this proposal. The external service that caches this information, cross-references it, indexes it for search, etc. can and should be another service. That "mirror" is way more complex and feature-ful. I think that the OOPSLA paper about Mirrors is discussing OO runtimes that do much of this kind of stuff, and perhaps that's all good to keep in mind for those designs as well. Just a thought. BTW -- The 55 Foundry folks have been floating "canister (dev) store" as an app that they'd like to create for other canister developers. Their customers would be other developers that shop around for functionality to build upon. The app store acts as a way to communicate trust and stability, do KYC for business relationships, etc. One can imagine that the "source of truth" for this app store (or any other like it) for any given canister on the IC is the entry point proposed here. On top of that feature, app store developers can build other features, cache the information, etc. |
I still personally think it should be a separate call from But in case the powers that be have made that decision, I suggest using a UTF-8 character that is quasi-impossible to type, like an emoji even, and is an invalid Motoko function name so people cannot define that and break moc. |
To what does "it" refer? ( |
Sorry. My opinion on the matter is that getting the interface of a canister shouldn't be part of the API of the canister (for reasons I think were explained above), and should be something that's guaranteed by the system. There are big advantages to making the API of canisters a first-class citizen of the platform they're running on. One such advantage that I described to Joachim is that we could enforce that canister upgrades cannot break existing code, a guarantee that can be provided because Candid knows about covariance of types. Joachim replied that there is nothing preventing developers from changing the API semantically (e.g. returning an empty array all the time). He also believes that verifying covariance of an API should be left to tooling, not the platform. My position on the subject is: a) you will never be able to prevent people from semantically doing the wrong thing, We have the power to make those decisions now. We should embrace it to make the platform better. |
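An upgrade check of the kind described here could look roughly like the sketch below. It is deliberately conservative and is not Candid's subtyping rule: it only flags removed methods and any textual change to a signature, whereas real Candid subtyping would accept some signature changes. The function name and the dictionary representation are assumptions for illustration.

```python
def breaking_changes(old, new):
    """Compare two interfaces given as {method name: signature text}.

    Simplified stand-in for a subtype check: a method that disappears or
    whose signature text changes is reported; added methods are allowed.
    """
    problems = []
    for name, signature in sorted(old.items()):
        if name not in new:
            problems.append(f"removed method: {name}")
        elif new[name] != signature:
            problems.append(f"changed signature: {name}")
    return problems


v1 = {"greet": "(text) -> (text) query", "count": "() -> (nat) query"}
v2_ok = dict(v1, reset="() -> ()")  # only adds a method
v2_bad = {"greet": "(nat) -> (text) query"}  # changes one, drops another

ok_report = breaking_changes(v1, v2_ok)
bad_report = breaking_changes(v1, v2_bad)
```

Whether such a check rejects the upgrade (platform enforcement) or only warns (tooling) is exactly the point under discussion; the sketch takes no side.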
I think it's better to have a system API to fetch the Candid interface. We can store the .did file as static data in wasm, so you can update the data dynamically as well. |
You cannot prevent people from breaking code. Even if the subtyping rule doesn't allow your breaking change, you delete the code and simply trap. You can WARN people when they break the code.
True, but I can think from the other direction: someone writes an API to steal passwords, and all malicious apps depend on it. Should we allow the author to withdraw that API? |
This is a semantic change though, which should be handled socially. I'm talking specifically about breaking the API, which would trap every time.
Using governance, that'd be fine. Honestly this is one of the first points I brought up and asked about when I was hired: how do people handle API changes in canisters? I'm quite scared that this seems like something we're not spending nearly as much time on as I think it deserves. |
You mean
Can you explain why that is desirable or preferable? What good is enabled or prevented by this? And, crucially, why does the system care about the idl any more than any other query result? |
Here is another try at justifying this approach, explicitly listing the assumptions and goals, and drawing this conclusion – this may provide more tangible points of attack… The main underlying assumptions are:
My requirements and goals are (roughly in descending order of importance):
From these I arrive at the present design:
That leaves, it seems to me, the query method.
Finally, there is the nice-to-have advantage that we can do this right away. So, if you are not convinced (and care), can you please point to the flaws in this reasoning or where I have the wrong assumptions? |
This is a big mistake. It means people can write canisters with different wire formats and they can never communicate with each other. This hurts the ecosystem. Plus, serializing everything twice is redundant. I always thought this is a temporary assumption that we are going to fix at some point. Is this going to change in the future? |
While that is technically true, that's a low-level view and not how we should be thinking about it conceptually from a programming model perspective, at least not by default.
I don't know what this assumption is based on, but I don't buy that at all. In fact, such an approach would seem like an irresponsible security risk, because it's a wide open attack vector -- one mistake in the ACL check of the update method and anybody can replace your code. No, this is another perfect example where you absolutely want to separate interface and meta interface.
I don't see why any of that would require interface reflection? You can forward and log messages just fine without being able to inquire the set of supported messages. Again, different levels.
I mostly agree with these. However, I have an additional requirement:
From that it immediately follows that a canister is not simply a Wasm module. Instead, it is a set of assets. And there needs to be system API to retrieve assets. And once you arrive at that point, the natural solution is to simply store the IDL as an asset and rely on some convention for accessing it. |
I agree it hurts the ecosystem. But there will always be requirements that Candid doesn’t cut (e.g. HTTP Canisters). I hope we can get people to hook into the ecosystem because it is good, not because it is the only choice.
What do you mean? Where do we serialize stuff twice?
Well, conceptually or in the programming model, the interface access is not part of the canister interface. This proposal is all about the low-level view.
That seems to be a hard-to-reconcile difference now … Why do you think a “change code” update call is any more sensitive in general than a “change state” update call? Conceptually, both just change the state machine that is the canister. And pragmatically, most use cases that would want to use our platform for its tamper-proofness have the problem that unauthorized state changes are catastrophic.
This assumption mainly justifies why the interface (of the proxy) must be dynamic (i.e. not hard-coded in the wasm module or uploaded along with it). When the proxy admin reconfigures the proxy to forward to another canister (state change, not code change), the interface evolves. But it also seems to suggest that it might be nice if the proxy can (if that is desired) fetch the evolved interface directly, instead of having the admin manually upload the right one.
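The proxy scenario can be sketched as follows. Again this is an in-memory model with hypothetical names (`Backend`, `Proxy`, `set_target`); only `candid_interface` is taken from the proposal. The point it illustrates is that reconfiguring the proxy is a state change, yet the interface it reports changes with it.

```python
class Backend:
    """Hypothetical target canister with a fixed interface."""

    def __init__(self, interface, handlers):
        self._interface = interface
        self._handlers = handlers

    def candid_interface(self):
        return self._interface

    def call(self, method, *args):
        return self._handlers[method](*args)


class Proxy:
    """Forwards every call; its own interface is whatever the target reports."""

    def __init__(self, target):
        self._target = target

    def set_target(self, target):
        # Admin reconfiguration: state changes, the proxy's code does not.
        self._target = target

    def call(self, method, *args):
        return self._target.call(method, *args)

    def candid_interface(self):
        # Fetched from the target on demand instead of uploaded by the admin.
        return self._target.candid_interface()


greeter = Backend(
    "service : { greet : (text) -> (text) query; }",
    {"greet": lambda name: f"hello {name}"},
)
counter = Backend(
    "service : { next : () -> (nat); }",
    {"next": lambda: 1},
)

proxy = Proxy(greeter)
first = proxy.candid_interface()
proxy.set_target(counter)
second = proxy.candid_interface()
```

With a statically stored interface, the admin would have to upload a matching description after each `set_target`; here `second` already reflects the new target.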
But that would only work if one dismisses the “interface can be state-dependent”, right? Or if assets become dynamically modifiable (and essentially a file system or key-value store that can be written from the running Wasm canister). And I think the “simply” before “store the IDL” is not true (complexity on the dfx side, worries about keeping them in sync). And neither is “some convention for accessing it”, because Wasm modules don’t have to be accessed from the outside (probably shouldn't by default), but the IDL has to. And this access would (likely) be accessible to other canisters somehow (maybe via the management canister, e.g. Maybe it helps to address the two related, but still mostly orthogonal, questions, and recall the options we have:
It seems that we can serve any of the interfaces with any of the implementations (although some combinations are odd). |
But we need a single wire format to transmit data. Why CBOR and not Candid? What feature is missing from Candid that prevents us from using it in the core system? We made the decision to use CBOR at that time because there was no Rust Candid library at all. It made sense to choose any existing wire format and make progress. Now that Candid is more mature, is it a good time to revisit these decisions? And what's HTTP canisters by the way?
We serialize data in Candid and then in CBOR. |
Yes, we could replace CBOR with Candid to let the userlib talk to the HTTP handler in the replica. But that would still encapsulate the application level data that goes from the application frontend to the canister. It’s like Ethernet and IP. Or HTTP and JSON. Or … anything really that has different layers. And this only applies for ingress messages – inter-canister messages don’t need the CBOR/HTTP layer.
Canisters that you can talk to directly over HTTP, and that do the whole HTTP request decoding in canister (i.e. wasm) code. See https://github.com/dfinity-lab/notes/pull/3 |
That's just not true though. There is a need for an API for HTTP Canisters. That proposal could say that the endpoint must look like:
We control the proposal, we can control the interface. My view after reading these:
For people who want schemaless bytes they can always declare their endpoint as It's also a must have for an ecosystem that will exist at some point (and not just built up from scratch). A developer will be asking a lot of "what's the interface for canister ABC". If you have to search stackoverflow you've already lost. Also a good argument for adding support to comments in Candid files. |
Anything that doesn't need to talk to other Candid canisters. You can write I fully appreciate the benefits of the Candid ecosystem! I just don't see any benefits of forcing it on people who want to work at a lower level - the fact that some canisters just shovel bytes doesn't prevent any of the fuzz testing etc of your Candid-using canisters! I don't believe we can manage the complexity of the system without pulling in appropriate layers of abstraction. And I claim that a typing layer on top of an untyped actor model is a good separation of concern - like, say, the various RPC systems on top of TCP out there. |
Agreed. But I'm more concerned about people inventing a new Candid-like format that is not compatible with our ecosystem. Then it becomes a war between iOS and Android again. We can keep the abstraction, but make both of them required as part of the core system, so that we won't have fragmentation. |
I’m not concerned by that. If people feel a need for an alternative, they will build it in any case – if we force them to use Candid they’ll just mark everything |
This is exactly what I want. The benefit is that people using Candid still have a way to process their messages, even though they may need to write their own serializer. But consider the opposite (our current status): the Candid canister has no way of communicating with non-Candid canisters. They will be forced to leave the ecosystem if they want to work with non-Candid canisters. If you really want a way to be flexible, the Motoko compiler needs to provide a flag to opt out of Candid, but do we want to go there?
Many times, people don't have a choice. Suppose a big company builds a non-Candid canister. To interact with their data and API, you have to use whatever protocol they come up with. By enforcing Candid in our core system, people will build tools/libraries to convert between these protocols. Without the enforcement, it's just isolated islands. |
Eventually, I expect that we would expose “raw” IC calls in Motoko, so yes. This feels like FFI to me - not urgently needed, wouldn't exist in an ideal world, something that advanced low-level users use (and often wrap in libraries), but that eventually becomes necessary. And it's not like it's hard - some way to mark a shared function type as “raw” (and then necessarily be typed to be If our IR was expressive enough to implement Candid decoding, we would have had that function type in the IR for a long time, and there is the unwritten law that every feature of the IR eventually makes it to the source language (at least that's my observation so far :-)) |
Okay, this connects the fragmented world then. With Motoko enforcing Candid and replica not enforcing it, there is clearly a gap we cannot fill. Either we both make it required, or both make it optional. |
I guess we have sufficiently discussed why Candid is a layer on top of the core IC system. Maybe not everybody agrees, but for now that’s the status quo. If we change that (and maybe one day the system enforces the Candid types), then I agree that the interface description ought to be hosted in a bespoke way. Until then, I suggest we stick to the current layers of abstraction, and do not assume dedicated system features for Candid. With that assumption, can the present proposal proceed? Also: Given that the If there is still opposition, may I ask for a concrete counter-proposal? |
I'm afraid not. It cannot be a regular method. Whether you give it a funny name or whether it is omitted from an IDL description is immaterial to that -- given the right quoting mechanism and/or type annotation you can still use it as any old method. An IDL description is meta data. That ought to be separate from code. The platform currently lacks a way to associate and access meta data with a canister. We should address that instead of hacking it into the wrong domain. |
So it was a flaw of candid to leave no space in the namespace of “raw canister entry points”? Should we say that a candid-exposed method
If I use
A canister is more than code, and surely can contain metadata. A raw method (not candid-exposed) is a good way for getting data out of a canister. So I don’t think it is a hack. Also, what’s your counter-proposal? |
No, it doesn't belong there. The IDL describes a canister's interface. Reflection is meta information, accessing which does not belong there, nor should it necessarily use a dynamic messaging mechanism. We need to keep our domains straight.
If it provides unsafeCoerce then it is not a strongly typed language. ;)
This is related to the asset discussion. :) I generally think that it's a hack to stuff everything into a dynamic code execution mechanism just because we can -- square peg round hole.
That we fix the platform's overly naive model of canister = wasm module. Or short of that, we use an idea similar to the frontend container you suggested to provide meta data. |
A multi-file canister module (with multiple wasm modules, and separate
The frontend container canister idea works for HTML frontends because the HTML frontend is not in one-to-one connection to canisters. But canister interfaces are, so I don’t see how that would work here. |
How about we store the .did file in the wasm module via This separates the meta-level API from the real canister, and we can use capability/ACL from ic:00. The actual did file still comes with the canister itself. |
This implements the ideas in #1510, albeit with a method name that clearly marks this as scaffolding.

TL;DR: You can use the query method `__get_candid_interface_tmp_hack : () -> (text)` to fetch the Candid interface.

As came up in the discussion at https://dfinity.atlassian.net/browse/NNS-3, the fact that we still can’t get the interface of a running canister, and thus have no good way of exploring the implications on developer workflow, no way to develop UI etc., is embarrassing for a feature that we clearly wanted some 18 months ago.

Why were we blocked? Initially because we expected some form of system-provided assets (which we then did not get), and then because of disagreement about the relation between Candid and the raw IC. Also @hansl writes in NNS-3 that the system-integrated solution requires coordination between too many parties to be done “soon”.

But this is a very dissatisfying state of affairs, because even if we can’t nail down the interface right now, we should at least unblock all use cases that block on this feature.

So in the interest of agility, I want to do, as a temporary work-around, the implementation that we can have now, namely the Candid-internal solution that does not require system features. Let’s start _using_ this, and then decide and implement the “right” approach with less pressure.

I see this as consistent with Dom’s request for more agility; with other scaffolding and experimentation going on right now (`ic1.call_simple`, work on `exec` etc.), and in line with such esteemed and useful hacks as regex-based asset injection, `ic-router` or encoding subnet ids in canister ids.
Closing, as we have no active discussion here. Can be resurrected when needed. |
Candid messages are self-describing, so Candid services ought to be too.