don't get links from network if we are the authority #2183

zippy · 2020-04-24T16:43:33Z

PR summary

This PR adds:

- An optimization: get links requests no longer query the DHT when we know that we are the DHT authority for the base of the query.
- Two conductor fixes for error cases (such as the base not having arrived yet) in hold_aspects requests:
  - stop incorrectly recording aspects as held
  - stop locking up the future
- Because sim2h doesn't get error messages from a hold_aspect request, it must also no longer record aspects sent in those messages as held. The conductor must therefore explicitly send back a list of aspects held after a hold_aspect request.
- Get authoring list doesn't really work, because the correct items won't be gossiped, and the optimization now breaks this. This is fixed by not responding with the list and re-publishing instead.
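
A minimal sketch of the first item, for readers who want the shape of the optimization. Only the name `am_i_dht_authority_for_base` comes from this PR's discussion; `LocalDht`, `LinkSource`, `get_links`, and the `query_network` callback are hypothetical stand-ins, not the real holochain-rust types.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the node's local DHT shard.
struct LocalDht {
    /// Bases (entry addresses) this node holds as a DHT authority.
    held_bases: HashSet<String>,
    /// Links stored locally, keyed by base address.
    links: HashMap<String, Vec<String>>,
}

/// Where a get-links answer came from.
#[derive(Debug, PartialEq)]
enum LinkSource {
    Local,
    Network,
}

impl LocalDht {
    /// Assumption: authority is modelled as simple membership in `held_bases`.
    fn am_i_dht_authority_for_base(&self, base: &str) -> bool {
        self.held_bases.contains(base)
    }
}

/// Answer from local state when we are the authority for `base`;
/// otherwise fall back to the (hypothetical) network query.
fn get_links<F>(dht: &LocalDht, base: &str, query_network: F) -> (LinkSource, Vec<String>)
where
    F: FnOnce(&str) -> Vec<String>,
{
    if dht.am_i_dht_authority_for_base(base) {
        let links = dht.links.get(base).cloned().unwrap_or_default();
        (LinkSource::Local, links)
    } else {
        (LinkSource::Network, query_network(base))
    }
}
```

The network callback is never invoked on the authority path, which is the whole point of the optimization.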

testing/benchmarking notes

( if any manual testing or benchmarking was/should be done, add notes and/or screenshots here )

followups

( any new tickets/concerns that were discovered or created during this work but aren't in scope for review here )

changelog

if this is a code change that affects some consumer (e.g. zome developers) of holochain core, then it has been added to our between-release changelog with the format

- summary of change [PR#1234](https://github.com/holochain/holochain-rust/pull/1234)

documentation

this code has been documented according to our docs checklist

lucksus

I think we should at least make am_i_dht_authority_for_base() do what the name suggests, as noted in my comment below.

Just a remark that so far we deliberately chose not to put code into core that knows about the DHT role of the node, and to keep that concern confined to the network implementation. Since this is only reading the DHT, I don't see a problem adding it.

crates/core/src/workflows/get_link_result.rs

lucksus

Great!


lucksus · 2020-05-06T20:44:22Z

crates/core/src/dht/actions/hold_aspect.rs

+                .expect("No state present when trying to respond with gossip list");
+
+            let mut address_map = AspectMap::new();
+            address_map.add(&aspect);


Huh, this sends back a GossipList with just this one item in it? Isn't this causing sim2h to be confused about the gossip list of this node?

I really think we shouldn't do this (managing results of holding workflows) here at all and instead have sim2h properly (if it isn't) poll for gossip lists.

Currently the gossip list is additive on the sim2h side, so this really acts as an acknowledgement of the hold_aspect request.

Sim2h's state knowledge is append-only.

We were previously setting "agent-holding" when we sent the hold request, but that could fail - it's much more accurate for core to send a sparse holding list that results in an additional hold record.
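
To illustrate why a one-item list is harmless when the receiver merges additively, here is a rough sketch. This `AspectMap` is a simplified, hypothetical model (entry address to a set of aspect addresses), and `ack_for_hold` is an invented helper; neither mirrors the real types or signatures in the codebase.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified aspect map: entry address -> aspect addresses.
#[derive(Clone, Debug, Default, PartialEq)]
struct AspectMap(HashMap<String, HashSet<String>>);

impl AspectMap {
    fn add(&mut self, entry: &str, aspect: &str) {
        self.0
            .entry(entry.to_string())
            .or_default()
            .insert(aspect.to_string());
    }

    /// Append-only merge: nothing already known is ever removed.
    fn merge(&mut self, other: &AspectMap) {
        for (entry, aspects) in &other.0 {
            self.0
                .entry(entry.clone())
                .or_default()
                .extend(aspects.iter().cloned());
        }
    }

    fn contains(&self, entry: &str, aspect: &str) -> bool {
        self.0.get(entry).map_or(false, |a| a.contains(aspect))
    }
}

/// Conductor side: build a sparse holding list only when the hold succeeded.
/// A failed hold produces no acknowledgement at all.
fn ack_for_hold(entry: &str, aspect: &str, hold_result: &Result<(), String>) -> Option<AspectMap> {
    hold_result.as_ref().ok().map(|_| {
        let mut sparse = AspectMap::default();
        sparse.add(entry, aspect);
        sparse
    })
}
```

Under this model, merging the sparse list into the receiver's view adds one hold record and leaves everything else untouched.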

lucksus · 2020-05-06T20:50:07Z

crates/core/src/dht/dht_reducers.rs

+        r
+    } else {
+        let mut store = (*old_store).clone();
+        store.mark_hold_aspect_complete(id.clone(), hold_result);


Despite my last comment, I think it is a good thing to store the result of the holding workflow in the state as done here (so we don't [always] rerun an already failed validation, and for debugging). I still think introducing the complexity of getting this result back to sim2h as a response to the holding request is too brittle, and having sim2h (re-)poll for the gossip list (instead of marking something as held because the request was sent) is the real fix here.

No, that can't work in the end. Remember that sim2h is where the intelligence lies of who should hold what. It has the knowledge of who is in the sharding neighborhood. The only reason it asks for a gossip list is to update its understanding of who is actually holding what data so it can correctly forward queries. But at the same time it should be making the decisions about who SHOULD hold what. These fixes are about getting that right.

How is this affecting the network code's reasoning about who should hold what?

lucksus · 2020-05-06T20:51:15Z

crates/core/src/dht/dht_store.rs

@@ -49,6 +50,9 @@ pub struct DhtStore {
    /// All the entry aspects that the network has told us to hold
    holding_map: AspectMap,

+    /// Hold aspect requests from the network
+    queued_hold_aspect_requests: HashMap<ProcessUniqueId, Result<(), HolochainError>>,


Confused about the name. Should this be something like holding_attempt_results?
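
For context, a small sketch of what such a map might look like under the suggested name. It assumes a plain `u64` in place of `ProcessUniqueId` and a `String` in place of `HolochainError`; `DhtStoreSketch`, `hold_result`, `take_hold_result`, and `reduce_hold_complete` are hypothetical names, and only `mark_hold_aspect_complete` is taken from the diff in this PR.

```rust
use std::collections::HashMap;

// Assumptions: u64 stands in for ProcessUniqueId, String for HolochainError.
type RequestId = u64;
type HoldResult = Result<(), String>;

#[derive(Clone, Default)]
struct DhtStoreSketch {
    /// Outcome of each holding attempt, keyed by the request that triggered it.
    holding_attempt_results: HashMap<RequestId, HoldResult>,
}

impl DhtStoreSketch {
    /// Record the outcome of a holding attempt (success or failure).
    fn mark_hold_aspect_complete(&mut self, id: RequestId, result: HoldResult) {
        self.holding_attempt_results.insert(id, result);
    }

    /// Look up a recorded outcome without consuming it.
    fn hold_result(&self, id: RequestId) -> Option<&HoldResult> {
        self.holding_attempt_results.get(&id)
    }

    /// Remove and return the outcome once the waiting caller has seen it,
    /// so the map does not grow without bound.
    fn take_hold_result(&mut self, id: RequestId) -> Option<HoldResult> {
        self.holding_attempt_results.remove(&id)
    }
}

/// Reducer-style update: clone the old store, record the result, return the new store.
fn reduce_hold_complete(old: &DhtStoreSketch, id: RequestId, result: HoldResult) -> DhtStoreSketch {
    let mut store = old.clone();
    store.mark_hold_aspect_complete(id, result);
    store
}
```

Keeping failures in the map is what would let a caller see that a validation already failed instead of silently treating the aspect as held.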

lucksus · 2020-05-06T20:53:10Z

crates/core/src/network/handler/lists.rs

+
+        // currently sim2h asks for the authoring list and treats it just the same
+        //as the gossiping list, i.e. as data that you hold.  This is a problem because
+        //it means that gossiping isn't actually working right.  So instead of fixing that in


Woah, if sim2h is doing that, that is a problem and I would consider it a defect.

Last time I looked, sim2h was using the authoring list to check whether this node has something new that the DHT does not store yet, so that sim2h would request that data from the node and publish it to the appropriate authorities.

Yes, that was incorrect. We could, in the sim2h case, actually treat it that way, and we could add to sim2h the tracking of who authored what so that it does treat authors as authoritative on the DHT. The hack of having the conductor simply republish everything it authored when it receives the authoring list request serves the exact same purpose as what you say above, as long as sim2h also checks not to send the data again to nodes that it knows it has already sent the publish data to. I will be adding that.

lucksus · 2020-05-06T20:57:11Z

crates/core/src/network/handler/lists.rs

+                    result
+                );
+            }
+        }


I would expect this hack to cause much more strain on the whole network, since sim2h handles publishes differently from how it handles (or used to handle) authoring lists. Don't publishes always result in store requests sent to the appropriate DHT authorities? That would now happen every time a node comes online...

I don't think this is the right solution. If sim2h does not treat authoring lists as authoring lists anymore, that should get back in.

We cannot have authors marked as holding the data - because they are not authoritative - the meta info will be incomplete.

Because Sim2h maintains central knowledge of who holds what, a publish should become a no-op if the item has already saturated a neighborhood.
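
A rough sketch of how a repeated publish can collapse to a no-op, assuming the server tracks which agents it has already sent each aspect to. `Sim2hSketch` and `handle_publish` are hypothetical names for illustration and are not sim2h's actual API.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical model of the central server's bookkeeping.
#[derive(Default)]
struct Sim2hSketch {
    /// Aspect address -> agents that were already sent a store request for it.
    sent: HashMap<String, HashSet<String>>,
}

impl Sim2hSketch {
    /// Returns the agents in `neighborhood` that still need the aspect and
    /// records them as sent. An empty result means the publish is a no-op.
    fn handle_publish(&mut self, aspect: &str, neighborhood: &[&str]) -> Vec<String> {
        let already_sent = self.sent.entry(aspect.to_string()).or_default();
        let mut targets = Vec::new();
        for &agent in neighborhood {
            // `insert` returns true only for agents not seen before.
            if already_sent.insert(agent.to_string()) {
                targets.push(agent.to_string());
            }
        }
        targets
    }
}
```

With this bookkeeping, a node that republishes everything on coming online only generates traffic for agents that have not yet been sent the data.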

lucksus · 2020-05-06T20:58:55Z

crates/sim2h/src/lib.rs

            sim2h_handle.state().spawn_agent_holds_aspects(
                (&*space_hash).clone(),
                (&*agent_id).clone(),
                data.entry.entry_address.clone(),
                aspect_list.clone(),
-            );
+            );*/


Yes, I regard this as the real fix for handling the case of failed holding requests.


zippy · 2020-05-11T19:47:57Z

This work has been split into multiple PRs: #2184 #2189

don't get links from network if we are the authority

588163c

zippy requested a review from lucksus April 24, 2020 16:43

lucksus reviewed Apr 27, 2020

zippy added 2 commits April 28, 2020 14:09

Merge branch 'develop' into get-link-local-if-authority

3b6486a

also check holding list

5b10a8f

zippy requested a review from lucksus April 29, 2020 19:41

added some extra debugging

f583785

lucksus approved these changes May 1, 2020

zippy added 11 commits May 1, 2020 15:22

remove attempted to end hdkcall error

bf7bf78

remove spurious error

25f70a9

better error messages

68ae99a

don't mark as held if there's an error

81a8f66

fmt

bcf888f

fix fragile test

ab92c9c

another fragile test fix

c473aaa

bad regex

9bcbbbd

now handle error cases for HoldAspect reducer

f3e3dcb

remove test code

4e74cc8

clippy

9368ece

zippy requested a review from lucksus May 5, 2020 00:50

zippy added 10 commits May 4, 2020 21:06

fixed broken tests

453ffab

fix comment

d4a1deb

fix dht_reducer tests

f7536d6

remove erroneous marking of aspect held in sim2h

2a5169c

send back gossip list after hold_aspect request from sim2h to confirm holding

8ec1563

fix memory_server to handle gossip list without request

0056348

fix memory_server to handle gossip list without request

006f169

stop sim2h assuming hold_aspect requests are succesfull

dc8fa77

log error when hold_aspect request fails in conductor

de4411e

fmt

c386098

zippy added 8 commits May 5, 2020 13:48

just send back the one aspect held when confirming a hold_aspect

18d24f1

respond to sim2h getauthoring list with publishing instead of authoring list

2152c09

fix bug in reducer returning state

0aa9d04

fixes for app_spec tests

b644267

sim2h remove storing of holding aspects on publish

422c035

fix get_author_list sending publish to include publish headers

8d7a781

fixes to offline-validation app_spec test

646c285

clippy

34684fb

lucksus suggested changes May 6, 2020

correlate holding attempts with their validation workflow and clean them up

414c4e3

zippy closed this May 11, 2020


Comment authors, in page order: lucksus (May 6, 2020), zippy (May 6, 2020), neonphog (May 6, 2020), lucksus (May 6, 2020), zippy (May 7, 2020), lucksus (May 7, 2020), lucksus (May 6, 2020), lucksus (May 6, 2020), zippy (May 7, 2020), lucksus (May 6, 2020), neonphog (May 6, 2020), lucksus (May 6, 2020)
