Experimentation code for the publication "Hash & Adjust: Competitive Demand-Aware Consistent Hashing".
We leverage consistent hashing to develop H&A, a demand-aware, constant-competitive algorithm that enhances storage utilization and access efficiency while providing robust performance guarantees. This approach is particularly beneficial for distributed systems, which manage dynamic workloads and demand adaptive mechanisms.
Consistent hashing maps data to specific hosts using a hash function for efficient allocation and retrieval. In this implementation, we partition the hosts (servers) into subsequences of servers (ServerLists). Data items are inserted at the beginning of the partition (the head) until that server is full. Once it is full, newly inserted items are forwarded to the subsequent servers until a free spot is found. This way, non-full servers must always be at the end of each partition, which is how isolation between partitions is ensured. When we search for an item, for example, we only need to scan from the head until the first non-full server.
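To make this concrete, here is a minimal Python sketch of the insert/lookup logic described above; the class and method names are illustrative and do not mirror the actual code in `implementation`:

```python
# Minimal sketch of the partition-based insertion/lookup idea.
# Names are illustrative only; see `implementation` for the real code.

class Server:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = set()

    def is_full(self):
        return len(self.items) >= self.capacity


class ServerList:
    """One partition: an ordered subsequence of servers, filled from the head."""

    def __init__(self, servers):
        self.servers = servers  # head first

    def insert(self, item):
        # Forward the item past full servers until a free slot is found,
        # keeping all non-full servers at the tail of the partition.
        for server in self.servers:
            if not server.is_full():
                server.items.add(item)
                return server
        raise RuntimeError("partition is full")

    def lookup(self, item):
        # Scan from the head; we may stop at the first non-full server,
        # because no item is ever stored beyond it.
        for server in self.servers:
            if item in server.items:
                return server
            if not server.is_full():
                break
        return None
```

The invariant that full servers always precede non-full ones within a partition is what bounds the scan: a lookup never needs to look past the first non-full server.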
Figure a illustrates an example of our model with 8 servers, each having a capacity of 4 items. In this scenario, item v was initially assigned to server head(v) (the server with the closest hash value). However, since head(v) and its successor were full, the item was relocated to host(v).
Figure b demonstrates the decomposition of this example into ServerLists, with each ServerList represented by a distinct color. Servers within a ServerList are shaded with varying gradients of the same color to indicate their roles as different heads. Items are colored to match their respective server.
The following plot, which can be generated by running Experiment 1 (see `main.py`), illustrates the storage-utilization benefits of this data structure, combined with the access-cost efficiency of H&A:
With Traditional Consistent Hashing, data items can be accessed immediately; however, this approach lacks the benefits of an optimized data-structure design and wastes significant storage space. Consistent Hashing with Bounded Loads significantly improves storage utilization compared to Traditional Consistent Hashing, but the H&A algorithm goes a step further, enhancing both access efficiency and storage utilization.
What follows is a short guide on how to reproduce the plots of the publication.
To run the simulation code, the modules `bitstring`, `scipy`, and `pandas` are required. For the plotting functionality, additional packages such as `seaborn` are used. A detailed list can be found in `requirements.txt`.
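All dependencies can be installed in one step from the provided requirements file, e.g.:

```bash
pip install -r requirements.txt
```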
The experimentation code can be found in `implementation`. It fetches the datasets from `data` and generates data in `data_handling`.
We advise using PyCharm (it can take care of the dependencies) and setting the working directory to the parent folder `.\Hash-And-Adjust\`.
The parameters (# of servers, # of items, ...) can be set in `params.py`; however, for reproducibility, the parameters are overwritten at the beginning of each paper experiment (see the first few lines of `experiment.py`).
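For illustration, such an override could look roughly like the sketch below; the parameter names are hypothetical and not necessarily those used in `params.py` or `experiment.py`:

```python
# Hypothetical sketch: an experiment pins its parameters before running,
# so results match the paper regardless of what params.py currently holds.
import params

def experiment_1():
    params.NUM_SERVERS = 8   # illustrative name, not the actual identifier
    params.NUM_ITEMS = 32    # illustrative name, not the actual identifier
    params.CAPACITY = 4      # illustrative name, not the actual identifier
    # ... run the simulation with these fixed values ...
```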
All relevant experiments can be run from `main.py` in the root folder. The output will be saved in the `results` folder.
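Assuming the working directory is set as described above, starting the experiments then amounts to:

```bash
python main.py
```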
Temporal-locality data can be generated on the fly. We uploaded the click dataset to `data`, while the CAIDA dataset needs to be requested from the official source.
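As a rough idea of what generating a trace with temporal locality can look like, here is a minimal sketch of one standard approach (re-referencing recently requested items with some probability); it is an illustration only and not necessarily the method implemented in `data_handling`:

```python
import random

def temporal_locality_trace(num_items, length, p_recent=0.7, window=50, seed=0):
    """Generate a request sequence in which recently requested items
    are likely to be requested again (temporal locality)."""
    rng = random.Random(seed)
    recent, trace = [], []
    for _ in range(length):
        if recent and rng.random() < p_recent:
            item = rng.choice(recent[-window:])  # re-reference a recent item
        else:
            item = rng.randrange(num_items)      # draw a fresh item uniformly
        trace.append(item)
        recent.append(item)
    return trace
```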
The plotting code can be found in the `plotter.py` file: it uses the output from `results` and can be run independently from the experimentation code (see comments in `main.py`). Generated plots are saved in the `plots` folder.