Skip to content

vivintw/CHORD

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

IMPORTANT INFO :
================
root :
	./dht_peer -m 1 -p 16500 -h `hostname`

peer :
	./dht_peer -m 0 -p 16501 -h `hostname` -R baseball.ccs.neu.edu -r 16500

client :
	./client -p 1234 -h `hostname` -r 16500 -R baseball.ccs.neu.edu

test_client.py  : is the pseudo client used to test.
dht_peer : dht_peer code
client : client code


nodes used for testing the code :
['akubra.ccs.neu.edu',
'budenovka.ccs.neu.edu',
'derby.ccs.neu.edu',
'kofia.ccs.neu.edu',
'oxford.ccs.neu.edu',
'baseball.ccs.neu.edu']

max object size = 255 Bytes

protocol :
==========
there are two protocols that are used by my implementation of CHORD,
1. control plane  : for the formation of the ring and making sure that the ring
                    is stable.
2. data plane : for storing and retrieving data from the Distributed hash table.

each packet sent through the network has the following format :
     ____________________
    | packetlen | packet |
     --------------------
     where :
     packetlen : length of the entire packet (excluding the length of packetlen)
                 packetlen = 2 bytes
    packet : data transferred by the underlying layers (control packet and data
             packets)

protocol format for control plane :
=================================
 ___________________________
|type |ip |ip_hash | payload|
 ---------------------------
 type = 1 byte
 ip = 4 bytes
 ip_hash = 160 bites
 payload = (depends on the packet)


type of packets available in the control protocol :

join : this is the join request that is send by a peer to the root peer node
       to indicate that it would like to join the network.

       packet representation :
        __________________________________________
       |join | peer_ip | peer_ip_hash | peer_port |
        ------------------------------------------

init : this packet is sent through a new connection that is established with
       another peer. This packet is sent by the new joinee to its successor and
       predecessor after it establishes new tcp connections with them. These
       packets are also send out after a new connection is established in an
       event of a peer leaving the network or crashing.

       packet representation :
        ______________________________________
       | init | peer_ip | peer_ip_hash | type |
        --------------------------------------
       type = 1 if the type if the peer connecting is the predecessor of the
       peer to which the connection is made. If the connection peer is the
       successor of the peer being connected to then type = 2.

update :  this packet is sent when there is a new node joining or if there is an
          existing node that crashed. These packets have updates that are meant
          for the node to which the packet is addressed to. On receiving  such
          an update packet the peer updates its cache with the mentioned peers
          as successor/predecessors of it.

          packet representation :
           ______________________________
          | update | ip | ip_hash | data |
           ------------------------------

          format for the data section of the update packet :
           ____________________________________________________________________
          |for | succ_ip | succ_hash | succ_port | pre_ip | pre_hash | pre_port|
           --------------------------------------------------------------------
          for = 4 bytes (ip address of the intended recipient)
          succ_ip = ip of the successor peer. (None => no update to be made)
          succ_hash = ip hash of the successor peer.
                        (None => no update to be made)
          succ_port = port number of the successor peer.
                        (None => no update to be made)
          pre_ip = ip of the predecessor peer. (None => no update to be made)
          pre_hash = ip hash of the predecessor peer.
                       (None => no update to be made)
          pre_port = port number of the predecessor peer.
                       (None => no update to be made)


query : this packet is used to query the successor and the predecessor of the
       peer to make sure that they are alive, and also make sure that the
       current peer is listed as its successor's predecessor and predecessor's
       successor.

       packet representation :
        _________________________________
       | query | ip | ip_hash | question |
        ---------------------------------
        question = 1 for successor and 2 for predecessor (1 byte)


answer : this packet is generated by a peer in response to a query posed to it
        by another peer.

        packet representation :
         ______________________________
        | answer | ip | ip_hash | type |
         ------------------------------
         ip = ip address of its predecessor/successor based on the query.
         ip_hash = ip hash  of its predecessor/successor based on the query.
         type = 1 if the query was about successor else 2.

dead : this packet is generated by the successor of the node which left/crashed.

      packet representation :
       __________________________________________
      | dead | ip | ip_hash | ip_hash_pre | port |
       ------------------------------------------
       ip = ip address of the peer sending the packet.
       ip_hash = ip hash of the peer sending the packet.
       ip_hash_pre = ip hash of the dead predecessor.
       port = port number of the peer sending the packet.


protocol format for data plane :
===============================
 _________________________________
|type |client_id |operation |data |
 ---------------------------------
type = 1 byte
client_id = 160 bits
operation  = 1 byte
data = (depends on the packet)


type of packets available in the data protocol :

put : this packet is used to store the data in a peer.

     packet representation :
      _______________________________
     | data | client_id | put | data |
      -------------------------------

      data representation :
       _________________________
      | obj_key | obj_len | obj |
       -------------------------
       obj_key = hash value of the key of the object.(160 bits)
       obj_len = length of the object in bytes
                 (max obj size = 512 bits => 64 bytes)
       obj = actual object that needs to be stored.

get : this packet is used to directly get data from a peer.

      packet representation :
       __________________________________
      | data | client_id | get | obj_key |
       ----------------------------------
       obj_key = hash value of the key of the object. (160 bits)


lookup : packet is used by the client to do a lookup of the ring to
         figure out where the data has to be stored to / retrieved from.

         packet representation :
          ______________________________________________
         | data | client_id | lookup | method | obj_key |
          ----------------------------------------------
          method = 1 if recursive lookup , 2 for iterative lookup (i byte)
          obj_key = hash value of the key of the object.(160 bits)

response : packet generated by a peer after the client contacts it for data.

          packet representation :
           ____________________________________
          | data | client_id | response | data |
           ------------------------------------

           data representation :
            _________________________
           | obj_key | obj_len | obj |
            -------------------------
            obj_key = hash value of the key of the object.(160 bits)
            obj_len = length of the object in bytes
                      (max obj size = 512 bits => 64 bytes)
            obj = actual object that is retrieved.

redirect : packet used by a peer to redirect the client to its successor peer.

          packet representation :
           __________________________________________
          | data | client_id | redirect |  ip | port |
           ------------------------------------------
           ip = ip of the successor peer.
           port = port of the succesor peer.

close : packet sent to the client to close its connection to the peer.

       packet representation :
        __________________________
       | data | client_id | close |
        --------------------------

move : this option is used to transfer objects between nodes.
       this packet type is used when a new node joins the network and there
       are objects in the network stored on other nodes that have to be
       transferred to the new node.

       packet representation :

       ________________________________
      | data | client_id | move | data |
       --------------------------------



Client :
========
language used  : python.
usage :
    ./client -p 1234 -h `hostname` -r 16500 -R baseball.ccs.neu.edu

the client when executed presents the user with a menu on selecting an option
the required function is executed.

on selecting store , the client first does a recursive lookup of the peer on
which the object needs to be stored. On finding the peer the client sends a put
request to that peer.

in order to retrieve the client is presented with 2 options, recursive retrieval
and iterative. In a recursive retrieval, the client makes a request to the root
peer and the peer responds with the details of the peer at which the data has to
be retrieved from. In the iterative scenario, the peer responds with a redirect
message redirecting the client to its successor if it is not the required peer.
If the current peer is the required peer then the peer responds with a redirect
message with the ip and port set to None. On figuring out the right node the
client sends a get message to the right peer.

the last option that is presented is the exit option which causes the client
to terminate.

Peer :
======
language used : python.
usage  :

  root :
	  ./dht_peer -m 1 -p 16500 -h `hostname`

  peer :
	  ./dht_peer -m 0 -p 16501 -h `hostname` -R baseball.ccs.neu.edu -r 16500

the peer runs a thread that periodically queries the successor and the
predecessor of this peer to make sure that they are alive and are responding to
the query posed to them. if the peer is dead then the cache is updated with this
information.

the main thread uses select to check if there are any socket that need to be read
from. If there is some data to be read from any of the sockets that is maintained
(the server socket, the successor and predecessor sockets). the the information
is  read and parsed. If any response is required to be sent by the peer then such
a message is sent to either the successor or predecessor sockets.

when a new peer is added to the chord ring. the predecessor of the new node
notifies its successor that its predecessor is the new node by issuing an update
packet to it. on receiving the update packet the successor node updates its cache
accordingly. The predecessor also sends an update message to the root via its
predecessor (recursively), this update message is for the new node. On receiving
this update from the root node. The new node established connections with its
successor and predecessor as mentioned in the update packet. The new node also
sends out init packets to initialize these newly established sockets and to notify
its predecessor that the new node is its successor and similarly its successor.

when a node dies/leaves the network, a dead message is sent by the predecessor of
the node that died to its predecessor . till the message reaches the node whose
successor was the dead node. On receiving the dead notification the successor
connects to the dead nodes predecessor and the ring is completed.


when a lookup data packet arrives at the root from the client the object key is
compared with the ip hash of the current peer as well as the ip hash of the
predecessor/ successor peer to decide if the object must be stored or retrieved
from the current peer. if so a response message is sent with the port and ip
address of the current peer else, the message is forwarded to the successor if
a recursive lookup is being done, else a redirect message is sent to the client
redirecting the client to the successor of the current peer. once the lookup is
complete the client gets/stores the packet in the correct peer.


In the event of a new node being added to the network objects(pre existing in
the network) that will hash to this node after the node joins the network are
transferred to the node by the new nodes successor and predecessor nodes.




KEY CHALLENGES:
===============
formation of the ring was very challenging taking up a lot of time and effort
in debugging the various connections and making sure that the node joins the
network at its expected location.

closing unwanted connections (connections from clients) and previous successor
and predecessor node was quite confusing and involved a lot of debugging.

Transferring objects that already existed in the network to the new node.


TESTING:
========
Testing the peers :
1. testing if the ring formed correctly :
   This test was conducted with 4 nodes where one node was assigned as the
   root peer and all permutation of the other 3 nodes joining the network were
   tried in different orders. The ring formed correctly at every single try.

2. testing formation of ring with 5 nodes :
    about 5 combination of the exiting 24 combinations were tried with the root
    node fixed and other peers joining in the order of various permutations.
    Each time the ring formed correctly.

3. root has the largest IP_HASH : the root was selected such that the root had
    had the largest IP_HASH in the network and various combination of joins and
    leaves of peers were tested. Each time the ring formed/re-formed correctly.

4. root has the smallest IP_HASH : same test as (3) but with root having
    the smallest IP_HASH in the network.

5. root not the smallest or largest IP_HASH : same test as (3) but with root
   hash not being the largest or smallest in the network.

6. node leaving the network : (send SIGINT to the process)
    when the node leaves the network the predecessor of the dead node contacts
    the successor of the dead node and reforms the ring. The test was executed
    with multiple nodes leaving the network one after another, each leaving
    after the ring reconnects. Each time the ring was able to reform correctly.

7. node leaving the network and rejoining the network : the nodes are able
   to leave the network and rejoin the network each after the ring is given time
   to heal. Each time the node joins the network at the same position and the
   ring is formed correctly.

8. testing the ring with pseudo client :
    a pseudo client was written to generate various objects such that the
    object key hash would lie in between the nodes in the ring and also coincided
    with the hash of the node. Storing and retrieving the objects were successful.

9. testing moving of objects between nodes :
    the same objects where used as in (8) but only the root node was up when
    the data was stored by the client. Each peer was added one by one, checking
    to make sure that the object files and index file reflected the movement of
    objects. Each time the objects were moved correctly to the right nodes.

10. retrieving data stored in a crashed node : once all the data was stored
    the pseudo client was used to retrieve the same object. The client displays
    the received error message and the node from which it got the message from.

Testing client :
1. store : The client was able to store an object successfully at each node
    in the ring. The objects were stored at the expected node.The client success
    fully did the lookup and found the right node at which the data had to be
    stored before storing the data.

2. retrieve recursive : the client was able to retrieve the data from each node
    successfully after it did a look up to find the node where the data is
    stored.

3. retrieve iterative : the client was able to send out iterative look up
    messages to each node starting from the root node, able to successfully
    find the next node from the redirect requests. The client was able to find
    the right node where the object was stored and retrieve it.

4. client exits only when asked to exit :  made sure that the client does not
    exit on any invalid input the client just displays the menu.


IMPORTANT DESIGN FEATURES :
===========================
in order to form the ring the node with the smallest IP_HASH and the largest
IP_HASH have special markers in their global cache which handles edge cases
where a node/data is being put into the network.

TCP data is received as a continuous stream of data hence each packet has a
length field attached to it so that each packet can be separated and processed
individually.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published