-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ObjectService.Search
server response is always 0 OK
even on partial success
#2721
Comments
We can say that the current SEARCH is best-effort. Even if you get this additional data, there is not a lot you can do. Especially if you're to consider the case of |
Best-effort - yes it is, but client needs to understand how it ended. Being a client, i wanna distinguish complete results from possibly incomplete by any reason. That's what i expect from the responsible system. What to do with this - personal matter of each client if u mean dont respond how much nodes are unavailable - yeah it's more like sugar for now. Being in a free text message, no application will desire to process its content. At the same time, it can be very useful in test/debug setups. Depends
this is another plane of the system. As the network map changes, the execution of the storage policy changes. This topic discusses the state transfer at the op exec time |
As a node, how do you distinguish disk/shard failure? Temporary unstable state caused by the object migrations after another node goes off/online? To me, any GET/PUT operation should fixed then too (and I do not agree with it). |
@carpawell pls clarify what fix r u talking about? lets focus on proposed expected behavior of |
If you receive
Basically, I mean that our protocol is best-effort widewise, I cannot agree with you. |
@carpawell ur talking about clatification of failure responses while this issue is about partially succeeded ones. Neither but since you asked
i wouldn't generalize like that. At the moment, only |
GET and PUT can also be partially successful (every node is ok about ACL rules, all they tried to fetch an object but 404, does it mean the object does not exist? can you ensure this?)
So that is the point where we disagree. Why do you think it is the only one? SEARCH is an extremely generalized GET to me. If a node is up it answers you, if a node is down, it wont. That is why we have REPs and that is why we are decentralized. User can minimize data loss with replicas. How status can help you? |
gotcha, i should've been clarify that under "partial success" i mean details of the scenario described in this issue. Lets stick to "best-effort" term. I agree that PUT and GET can reach partial success (2/3 replicas saved or header+50% payload got), but they never finish with successful status like described SEARCH does. So, at the end of the day, only SEARCH is a best-effort op to me
status is a key point of this issue. I outlined this in expected and current behavior sections. PUT and GET dont (and never will) behave like this. So we need to resolve SEARCH 1st, then we can think about other ops responsiveness |
I understood.
Need for what? |
for this issue to be resolved |
And that is what i suggest to discuss. Is it an issue? And how a user can use the suggested new info? |
Vote up at least for including additional information into SEARCH result about if it's complete or not. NeoGo, as a user of SEARCH, needs to distinguish outcomes of object search described in this issue: either result include all matching objects or some objects may be missing. It's required by NeoGo BlockFetcher/BlockUploader services, ref. nspcc-dev/neo-go#3645, and it's critical for correct application functioning. Regarding the format of this feature: it may be a special response status-code or just some special message, it's actually not important for NeoGo. We're fine with any source of knowledge that request was partially-successful or completely-successful.
Here are some notes from the real user (NeoGo) experience regarding GET: our usage pattern is to SEARCH for the set of objects matching some criteria, then GET every object. If GET returns an error, then we just try several times more because it's clear that this object really exists. So GET is less critical wrt this issue. But of course, it would be beneficial for the client to know that GET failed not because of the fact that the object does not exist. |
"Incomplete success" can be somewhat helpful in case of temporary problems (like in #2979). The original description here is more about permanent ones which just can't be solved (no number of retries would help). And notice that if some node is to go offline you'd get a "complete success" from the other existing one. |
Even with that, BlockUploader may distinguish this result and stop its operation if such problem is located.
But this one is more critical. Do I understand correctly that according to NeoFS design there's completely no way to learn that some more matching object exist if the node that stores them is offline? |
If all nodes storing some object are offline, it's gone. And it's hard to blame NeoFS for this. Got 1 replica and node is down? Sorry, nothing can help. Got three replicas and three nodes down? Well... Synchronized meta solves a part of the problem (you'd know that the object does in fact exist), but it can't resurrect a lost object. |
being executed over multi-node container, object
SEARCH
op can have partial success by design. Current server behavior could be buggySteps to reproduce
having NeoFS with at least 2 online storage nodes (
N1
andN2
), i've done following steps:C
withREP 1 CBF 2
storage policy requiring to select any 2 storage nodes and store 1 replica of any object on themO1
->N1
,O2
->N2
C
objectsN2
Expected Behavior
Current Behavior
Possible Solution
respond with specific status non-zero code and number of unavailable container nodes. Respond with
0 OK
only when all container nodes respondedContext
i dive into
SEARCH
server within #2692, but dont think this really mattersRegression
No
The text was updated successfully, but these errors were encountered: