Merkle Tuesday #29

mafintosh · 2015-10-08T21:40:26Z

Lets do a community call on strategies for replicating merkle dags

YouTube link: http://youtu.be/DPiOdh8CoS0
Gitter link (chat/ask questions): https://gitter.im/datproject/discussions
Etherpad: https://etherpad.wikimedia.org/p/merkle-tuesday

Theme: Strategies for replicating merkle graphs
Where: Google Hangouts on Air
When: Tuesday, October 13th
Time: 1pm EST (17:00 GMT)

We are doing our 1.0 iteration of our merkle graph module, mafintosh/dat-graph and in the last couple of days we've actively been discussing the various ways of replicating merkle dags and generic hash sets on irc. dat-graph implementation is almost done so some community feedback would be appreciated.

Merkle DAG research so far: #28

Agenda:

Quick introdution to Merkle DAGs
Our requirements and the trade-offs we're willing to make
Discussion of our current dat-graph replication strategy using local graph path enumeration
Discussion of merkle tree / trie strategies

jbenet · 2015-10-12T02:35:53Z

@maxogden got the msg -- this sounds fun i think i can do it.

I think @diasdavid may want to come too.

See also ipfs/notes#12 -- this is WIP.

daviddias · 2015-10-12T17:08:43Z

awesome idea 👍

daviddias · 2015-10-12T17:26:25Z

Made a WorldTimeBuddy clock -> http://www.worldtimebuddy.com/?qm=1&lid=8,5,14,1819729&h=14&date=2015-10-13&sln=17-18

RichardLitt · 2015-10-12T17:37:31Z

Added this to the IPFS community calendar, too.

mikolalysenko · 2015-10-12T21:16:24Z

Will try to make it, currently at a conference in SLC and have to give a talk right after this. But I will try to sit in!

pfrazee · 2015-10-12T23:57:42Z

Will be in transit
/cc @dominictarr

max-mapper · 2015-10-13T16:16:27Z

YouTube link is here https://www.youtube.com/watch?v=DPiOdh8CoS0&feature=youtu.be, starts in 44 minutes

jbenet · 2015-10-13T18:49:40Z

This was great! 👍

mikolalysenko · 2015-10-13T19:12:01Z

Thanks for inviting me!

max-mapper · 2015-10-13T21:25:16Z

Anyone who missed it can watch the youtube: https://www.youtube.com/watch?v=DPiOdh8CoS0&feature=youtu.be

or check out the notes from the call:

 __   __  _______  ______    ___   _  ___      _______                                                                                             
|  |_|  ||       ||    _ |  |   | | ||   |    |       |                                                                                            
|       ||    ___||   | ||  |   |_| ||   |    |    ___|                                                                                            
|       ||   |___ |   |_||_ |      _||   |    |   |___                                                                                             
|       ||    ___||    __  ||     |_ |   |___ |    ___|                                                                                            
| ||_|| ||   |___ |   |  | ||    _  ||       ||   |___           https://gitter.im/datproject/discussions                                          
|_|   |_||_______||___|  |_||___| |_||_______||_______|                                                                                            
 _______  __   __  _______  _______  ______   _______  __   __   https://github.com/datproject/discussions/issues/29                               
|       ||  | |  ||       ||       ||      | |   _   ||  | |  |                                                                                    
|_     _||  | |  ||    ___||  _____||  _    ||  |_|  ||  |_|  |                                                                                    
  |   |  |  |_|  ||   |___ | |_____ | | |   ||       ||       |                                                                                    
  |   |  |       ||    ___||_____  || |_|   ||       ||_     _|                                                                                    
  |   |  |       ||   |___  _____| ||       ||   _   |  |   |                                                                                      
  |___|  |_______||_______||_______||______| |__| |__|  |___|                                                                                      

      a         a = hash(b + d)
     / \        b = hash(c)
    b   d       d = hash(e)
   /     \      
  c       e

    a         a = hash(hash(ant.jpg) + b + c)
   / \        b = hash(bunny.jpg)
  b   c       c = hash(cat.jpg)

  a         a = hash(hash(ant.jpg) + b + c)
  |\        b = hash(bunny.jpg + d)
  b c       c = hash(cat.jpg + d)
  |/        d = hash(dog.jpg)
  d


    x    y
  /   \ /
 a     e
 |    / \
 b   d   f
  \ /    |
   c     g
    \   /  
      j
      |
      k
      |
      l

x a b c j k l <--
y e d
f g

c
mad science from mikola:
- spanning tree + preferred path decomposition
- disjointness problem:
    find if A + B have no intersections (are they disjoint)
    its impossible to find this out in less than O(n)
    bad news: you need O(n)
    good news: dumb solution is probably optimal

- merkle dags are functional data structures
  - any functional data structure can be converted to a merkle dag by replacing pointers with hashes (merkleize)
  - functional data structure theory can be applied here

- http://cbor.io/
https://tools.ietf.org/html/rfc7049
https://github.com/diasdavid/node-ipld
- https://github.com/ipfs/go-ipld/issues

- use cases that merkle dags provide a trust framework for:
- replicate entire graph (e.g. dat where you want a complete dataset replica)
- query specific parts of the graph (e.g. ipfs where graph spans multiple galaxies, query flooding)
- the network runs the queries itself (e.g. ethereum where you have turing complete bytecode running in distributed VMs)
- replication of graphs
  1. mafintosh style preferred path replication
  2. decompose graph into sets of nodes and try to sync nodes
    - put node hashes into a trie
    - walk the trie to sync
    - similar to ethereum patricia merkle tree https://github.com/ethereum/wiki/wiki/Patricia-Tree
    - need to tune radix of the trie (because hashes are large)

"verified computing"

shared goals:
    - low level graph comparison api
      - ask if someone has nodes
      - ask for data for nodes
      - ask if someone has nodes that match a graph selection path (jbenet graph query syntax)
    - syntax for graph pathing
      - xpath might be too complicated
      - main use case is e.g. 'a..b' (all nodes between a and b)
      - another one would be e.g. /*/**/[type="file"]
    - data structures that can enable the above APIs (preferred path indexing, node trie indexing)
    - backwards compat on the public APIs
    - incremental friendly indexing & replication for data structures
    - supporting efficient streaming by selective walking of inner & intermediate nodes
    - uncle hashes & munro hashes may help this 

 agree - can think of this as stacked layers GET/SET/HAVE simple functions
 - add additional query layers on top
 - protocol should have simple extensibility
 - viz https://tools.ietf.org/html/rfc5754
 - consider whether the low-level query layer needs to know about graphs or not

 - collect any names / urls of reseearchers and projects (maybe on wiki) to contact
 - I mentioned ICNRG research group from IETF https://irtf.org/icnrg 
 - binmaps www.pds.ewi.tudelft.nl/fileadmin/pds/reports/2011/PDS-2011-005.pdf & https://github.com/gritzko/binmap & http://scattered-thoughts.net/blog/2012/01/03/binmaps-compressed-bitmaps/
 - need solutions / optimisation choices for different retrieval / replication requirements

Where can people leave feedback? 

- i need to index some stuff
- i need to query some stuff
- i have a content addressable store graph
- maybe a REST-like api for the graph would make sense? then you can layer on indexing and querying

- possible inspiration https://en.wikipedia.org/wiki/Border_Gateway_Protocol
- communication complexity notes: http://theory.stanford.edu/~tim/w15/l/w15.pdf
 {value: 'some new data', link: ['deafbeaf'], key: 'aaaabbbb'} -> {value: 'some data', author: 'mathias', key: 'deafbeaf'}

 bob         alice

 a            a         
 🔼           🔼
 b            b
 🔼           🔼
 c            c
              🔼
              d

scenario 1:
bob asks alice -> do you have any of [a,b,c]?
alice reponds -> [0,1,2]
bob asks -> give me everything from [2, infinity]

scenario 2:
bob asks alice -> do you have any of [c]?
alice reponds -> []
bob asks alice -> do you have any of [a]?
alice reponds -> [0]
bob asks alice -> do you have any of [b]?
alice reponds -> [0]
now bob knows alice has a and b but not c



------------------------

dag path notation https://github.com/ipfs/notes/issues/12

- traversals (or selections) on a graph
- XPath

<hash>/*/**/[type="file"]


<hash>/*
<hash>/*/*/*
<hash>/a/b/c/d/e/*



<hash1>..<hash2>

wrt to erlang, data structures are not replicated in any smart sense - just converted to "flat" binary structure. (dch) see http://www.erlang.org/doc/man/erlang.html http://www.erlang.org/doc/apps/erts/erl_ext_dist.html

max-mapper · 2015-10-13T21:34:53Z

Also check out this blog post: https://gdstechnology.blog.gov.uk/2015/10/13/guaranteeing-the-integrity-of-a-register/

mikolalysenko · 2015-10-13T22:24:15Z

Okasaki's thesis is a good reference on functional data structures: http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf

The main trick in these things is path copying (aka structural sharing) to reduce memory. Lots of data structures are easy to functionalize, like red-black trees, kd-trees, tries, etc.

kumavis · 2015-10-27T08:43:48Z

here's null_radix's merkle-patricia-tree implementation

okdistribute changed the title ~~Merkle Monday~~ Tuerkle Tuesday Oct 9, 2015

max-mapper changed the title ~~Tuerkle Tuesday~~ Merkle Tuesday Oct 9, 2015

max-mapper closed this as completed Oct 13, 2015

daviddias mentioned this issue Oct 19, 2015

Sprint Oct 12th ipfs/team-mgmt#39

Closed

62 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merkle Tuesday #29

Merkle Tuesday #29

mafintosh commented Oct 8, 2015

jbenet commented Oct 12, 2015

daviddias commented Oct 12, 2015

daviddias commented Oct 12, 2015

RichardLitt commented Oct 12, 2015

mikolalysenko commented Oct 12, 2015

pfrazee commented Oct 12, 2015

max-mapper commented Oct 13, 2015

jbenet commented Oct 13, 2015

mikolalysenko commented Oct 13, 2015

max-mapper commented Oct 13, 2015

max-mapper commented Oct 13, 2015

mikolalysenko commented Oct 13, 2015

kumavis commented Oct 27, 2015

Merkle Tuesday #29

Merkle Tuesday #29

Comments

mafintosh commented Oct 8, 2015

jbenet commented Oct 12, 2015

daviddias commented Oct 12, 2015

daviddias commented Oct 12, 2015

RichardLitt commented Oct 12, 2015

mikolalysenko commented Oct 12, 2015

pfrazee commented Oct 12, 2015

max-mapper commented Oct 13, 2015

jbenet commented Oct 13, 2015

mikolalysenko commented Oct 13, 2015

max-mapper commented Oct 13, 2015

max-mapper commented Oct 13, 2015

mikolalysenko commented Oct 13, 2015

kumavis commented Oct 27, 2015