mpl: add mpl_hash #21

Open
wants to merge 1 commit into
base: main

hzhou
Owner

@hzhou hzhou commented Sep 25, 2023

Pull Request Description

Add mpl_hash, a string-to-string hash.

uthash is cumbersome to use as a string-to-string hash. It requires users to define the hash struct and to manage the key and value memory allocations themselves, and it performs poorly because those allocations are fragmented.

This PR adds MPL_hash, a specialized string-to-string hash utility. It stores keys and values in a dynamically managed memory slab for better efficiency. Strings are added dynamically but are freed only in MPL_hash_free. To erase a key, set that key's value to NULL.
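To illustrate the slab idea, here is a minimal sketch of an append-only string store. It is not the implementation in this PR: `str_slab`, `slab_add`, `slab_str`, and `slab_free` are hypothetical names, and the growth policy (doubling from 64 bytes) is an assumption. It only shows why a single growing buffer avoids one allocation per string and why everything can be released in one call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical append-only string slab; not the PR's actual data structure. */
struct str_slab {
    char *buf;                  /* one contiguous allocation holding all strings */
    size_t used;                /* bytes currently in use */
    size_t cap;                 /* bytes allocated */
};

/* Append a copy of s (including its NUL terminator).
 * Returns the offset of the copy, or (size_t) -1 on allocation failure.
 * Offsets are returned instead of pointers because realloc may move buf. */
static size_t slab_add(struct str_slab *slab, const char *s)
{
    assert(slab != NULL && s != NULL);
    size_t n = strlen(s) + 1;
    size_t need = slab->used + n;
    if (need > slab->cap) {
        size_t new_cap = slab->cap ? slab->cap : 64;    /* assumed initial size */
        while (new_cap < need) {
            new_cap *= 2;
        }
        char *p = realloc(slab->buf, new_cap);
        if (p == NULL) {
            return (size_t) -1;
        }
        slab->buf = p;
        slab->cap = new_cap;
    }
    size_t off = slab->used;
    memcpy(slab->buf + off, s, n);
    slab->used = need;
    return off;
}

/* Translate an offset back into a string pointer (valid until the next slab_add). */
static const char *slab_str(const struct str_slab *slab, size_t off)
{
    assert(off < slab->used);
    return slab->buf + off;
}

/* Release every stored string with a single free. */
static void slab_free(struct str_slab *slab)
{
    free(slab->buf);
    slab->buf = NULL;
    slab->used = 0;
    slab->cap = 0;
}
```

In a layout like this, a hash entry would hold two offsets (key and value) rather than two separately allocated strings. Erased or overwritten strings simply stay in the buffer until `slab_free`, which matches the "only freed at MPL_hash_free" behavior described above.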

Comparison in usage:

/* uthash */
#include <stdio.h>
#include <string.h>
#include "mpl.h"
#include "uthash.h"

/* The user must define the entry struct. */
struct my_struct {
    char *key;
    char *value;
    UT_hash_handle hh;
};

int main(void)
{
    const char *key = "name";
    const char *val = "value";
    struct my_struct *hash = NULL;

    /* The user allocates the entry and copies both strings. */
    struct my_struct *s;
    s = (struct my_struct *) MPL_malloc(sizeof(*s), MPL_MEM_OTHER);
    s->key = MPL_strdup(key);   /* MPL_strdup so that MPL_free below is a matching free */
    s->value = MPL_strdup(val);
    HASH_ADD_KEYPTR(hh, hash, s->key, strlen(s->key), s, MPL_MEM_OTHER);

    HASH_FIND_STR(hash, "name", s);
    if (s) {
        printf("Found in hash: %s -> %s\n", s->key, s->value);
    }

    /* The user unlinks each entry, then frees its strings and the entry. */
    struct my_struct *tmp;
    HASH_ITER(hh, hash, s, tmp) {
        HASH_DEL(hash, s);
        MPL_free(s->key);
        MPL_free(s->value);
        MPL_free(s);
    }
    return 0;
}

/* mpl_hash */
#include <stdio.h>
#include "mpl_hash.h"

int main(void)
{
    const char *key = "name";
    const char *val = "value";
    struct MPL_hash *hash = NULL;

    hash = MPL_hash_new();

    /* The hash copies the key and value into its own storage. */
    MPL_hash_set(hash, key, val);

    const char *s = MPL_hash_get(hash, key);
    if (s) {
        printf("Found in hash: %s -> %s\n", key, s);
    }

    /* Frees the table and all stored strings. */
    MPL_hash_free(hash);
    return 0;
}

Performance-wise, adding 1 million entries with 1000 unique keys took the following times, measured on my workstation using time:

    uthash:    0.234s
    mpl_hash:  0.095s
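The benchmark source is not shown here, so the following is only a sketch of what a workload of that shape could look like. The `set_fn` adapter type, the `key%d` key format, the fixed `"value"` string, and the `count_set` stand-in are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical adapter type: wraps the "set" operation of the table under test. */
typedef void (*set_fn) (void *table, const char *key, const char *val);

/* Issue n_entries set calls that cycle over n_unique distinct keys.
 * Returns the number of set calls made. */
static long run_workload(set_fn set, void *table, long n_entries, int n_unique)
{
    assert(set != NULL && n_unique > 0);
    char key[32];
    long i;
    for (i = 0; i < n_entries; i++) {
        /* Assumed key format; keys repeat every n_unique iterations. */
        snprintf(key, sizeof(key), "key%d", (int) (i % n_unique));
        set(table, key, "value");
    }
    return i;
}

/* Stand-in table that only counts calls, so the sketch compiles on its own. */
static void count_set(void *table, const char *key, const char *val)
{
    (void) key;
    (void) val;
    ++*(long *) table;
}
```

Calling `run_workload(count_set, &calls, 1000000, 1000)` exercises the loop with the stand-in. To compare real tables, one would wrap `MPL_hash_set` and the uthash find-or-add sequence in functions matching `set_fn`, build one binary per table, and run each under `time`.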

[skip warnings]

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your company's PR approval manager.

@hzhou changed the title from "2309 mpl hash" to "mpl: add mpl_hash" on Sep 25, 2023
Add mpl_hash, a string to string hash.