SQLite and RocksDB support for KVtags (#165)
houjun authored Nov 26, 2023
1 parent 1e03014 commit 23b1fdc
Showing 13 changed files with 1,164 additions and 259 deletions.
14 changes: 14 additions & 0 deletions CMakeLists.txt
@@ -467,7 +467,21 @@ if(PDC_ENABLE_FASTBIT)
set(ENABLE_FASTBIT 1)
endif()


# Metadata with RocksDB
#-----------------------------------------------------------------------------
option(PDC_ENABLE_ROCKSDB "Enable RocksDB (experimental)." OFF)
if(PDC_ENABLE_ROCKSDB)
set(ENABLE_ROCKSDB 1)
endif()

# Metadata with SQLite
#-----------------------------------------------------------------------------
option(PDC_ENABLE_SQLITE3 "Enable SQLite3 (experimental)." OFF)
if(PDC_ENABLE_SQLITE3)
set(ENABLE_SQLITE3 1)
endif()

# Check availability of symbols
#-----------------------------------------------------------------------------
check_symbol_exists(malloc_usable_size "malloc.h" HAVE_MALLOC_USABLE_SIZE)
41 changes: 29 additions & 12 deletions docs/source/api.rst
@@ -253,6 +253,33 @@ PDC object APIs
* Delete data from an object.
* For developers: see pdc_client_connect.c. Use PDC_obj_get_info to retrieve name. Then forward name to servers to fulfill requests.

---------------------------
PDC region APIs
---------------------------


---------------------------
PDC property APIs
---------------------------


---------------------------
PDC metadata APIs
---------------------------
PDC maintains object metadata (obj name, dimension, create time, etc.) in a distributed hash table. Each object's metadata can be
accessed with its object ID. Users can also issue metadata queries to retrieve the object IDs that meet the query constraints.

PDC allows users to add key-value tags to each object, where key is a string and value can be a binary array of any datatype and length.
The key-value tags are stored in an in-memory linked list by default.

PDC supports metadata indexing and querying when DART is enabled. See the ``DART`` section in the Developer Notes.

PDC additionally supports managing the key-value tags with RocksDB or SQLite. Both backends are considered experimental at the moment.
To enable either backend, turn on the ``PDC_ENABLE_ROCKSDB`` or ``PDC_ENABLE_SQLITE3`` option in CMake, set ``ROCKSDB_DIR`` or ``SQLITE3_DIR``,
and set the environment variable ``PDC_USE_ROCKSDB`` or ``PDC_USE_SQLITE3`` to 1 before launching the server.
Users can use the same PDC query APIs when RocksDB or SQLite is enabled.
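
The runtime switch described above is an environment variable that must be set to 1 before the server starts. The following is a minimal sketch of how a process might read such a flag. The helper ``env_flag_enabled`` is hypothetical and is not taken from the PDC source; the server code that actually reads these variables is not shown in this commit excerpt.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the PDC source): returns 1 when the named
 * environment variable is set to exactly "1", and 0 when it is unset or has
 * any other value. */
int
env_flag_enabled(const char *name)
{
    const char *value = getenv(name);

    return (value != NULL && strcmp(value, "1") == 0) ? 1 : 0;
}
```

Under this assumption, a server could call ``env_flag_enabled("PDC_USE_ROCKSDB")`` once at startup and store the result in a flag such as ``use_rocksdb_g``. Treating only the exact string "1" as enabled is a design choice of this sketch, not documented PDC behavior.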


* perr_t PDCobj_put_tag(pdcid_t obj_id, char *tag_name, void *tag_value, psize_t value_size)
* Input:
* obj_id: Local object ID
@@ -285,17 +312,7 @@ PDC object APIs
* For developers: see pdc_client_connect.c. Need to use PDCtag_delete to submit RPCs to the servers for metadata update.
---------------------------
PDC region APIs
---------------------------


---------------------------
PDC property APIs
---------------------------


---------------------------
PDC query APIs
PDC Data query APIs
---------------------------

* pdc_query_t *PDCquery_create(pdcid_t obj_id, pdc_query_op_t op, pdc_var_type_t type, void *value)
@@ -883,4 +900,4 @@ Developers notes
* Object

* Object property See `Object Property <file:///Users/kenneth/Documents/Berkeley%20Lab/pdc/docs/build/html/pdcapis.html#object-property>`_
* Object structure (pdc_obj_pkg.h and pdc_obj.h) See `Object Structure <file:///Users/kenneth/Documents/Berkeley%20Lab/pdc/docs/build/html/pdcapis.html#object-structure>`_
* Object structure (pdc_obj_pkg.h and pdc_obj.h) See `Object Structure <file:///Users/kenneth/Documents/Berkeley%20Lab/pdc/docs/build/html/pdcapis.html#object-structure>`_
8 changes: 7 additions & 1 deletion docs/source/developer-notes.rst
@@ -138,6 +138,13 @@ For No-index approach, here are the APIs you can call for different communicatio
* PDC_Client_query_kvtag (point-to-point)
* PDC_Client_query_kvtag_mpi (collective)

The default PDC kvtags are stored within each object's metadata as a linked list, and any query involves traversing the list in memory.

We additionally support managing the kvtags with RocksDB or SQLite. With this approach, each PDC server creates and accesses its own RocksDB or SQLite database file, which is stored as an in-memory file in the /tmp directory. RocksDB or SQLite is enabled by setting the environment variable ``PDC_USE_ROCKSDB=1`` or ``PDC_USE_SQLITE3=1``.
With the RocksDB implementation, each kvtag is stored as a RocksDB key-value pair. To differentiate the kvtags of different objects, we encode the object ID into the key string used for RocksDB, and store the tag value as the RocksDB value. As a result, a value can be retrieved directly when both its object ID and its key string are known. Otherwise, we must iterate over the entire DB to search for a kvtag.
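
The key encoding described above can be sketched as follows. The exact layout PDC uses is not given in this text, so the format ``<object ID>/<tag name>`` and both function names are assumptions made for illustration only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical key layout "<obj_id>/<tag_name>"; the real PDC encoding may differ.
 * Writes the key into out and returns its length, or -1 if out is too small. */
int
make_kvtag_key(uint64_t obj_id, const char *tag_name, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%" PRIu64 "/%s", obj_id, tag_name);

    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}

/* Returns 1 when key starts with the "<obj_id>/" prefix, 0 otherwise.
 * A full scan over the database could use a check like this to filter keys. */
int
key_belongs_to_object(const char *key, uint64_t obj_id)
{
    char prefix[32];
    int  n = snprintf(prefix, sizeof(prefix), "%" PRIu64 "/", obj_id);

    return (n > 0 && strncmp(key, prefix, (size_t)n) == 0) ? 1 : 0;
}
```

With a composite key of this kind, a point lookup needs only the object ID and the tag name, which matches the direct-retrieval case described above. A query that knows only the tag name or the value has no usable key prefix, so it falls back to iterating over all entries.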
With the SQLite3 implementation, each kvtag is inserted as a row in a SQLite3 table. Currently, the table has the following columns and SQLite3 datatypes: objid (INTEGER), name (TEXT), value_text (TEXT), value_int (INTEGER), value_float (REAL), value_double (REAL), value_blob (BLOB). The server automatically builds a SQL SELECT statement when it receives a query request from the PDC client. Currently, this implementation focuses on string/text affix search and single-value integer/float match search.
Currently, both the RocksDB and the SQLite implementations are developed for benchmarking purposes. The database files are removed at server finalization time, and restart is not supported.
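
As a rough illustration of the SQLite3 affix search described above, the sketch below builds a ``LIKE`` pattern for a prefix, suffix, or infix match and pairs it with a parameterized SELECT. The table name ``objects``, the enum, the function name, and the statement text are all assumptions. The column names come from the list above, but the statements PDC actually generates are not shown here.

```c
#include <stdio.h>

/* Hypothetical statement text; "objects" is an assumed table name.
 * ?1 is the tag name and ?2 is the LIKE pattern built below. */
const char *KVTAG_TEXT_QUERY = "SELECT objid FROM objects WHERE name = ?1 AND value_text LIKE ?2;";

typedef enum { AFFIX_PREFIX, AFFIX_SUFFIX, AFFIX_INFIX } affix_kind_t;

/* Builds a LIKE pattern for the given affix and returns its length, or -1 if
 * out is too small. Wildcards ('%' or '_') inside affix are not escaped here. */
int
make_like_pattern(affix_kind_t kind, const char *affix, char *out, size_t out_size)
{
    int n;

    switch (kind) {
        case AFFIX_PREFIX:
            n = snprintf(out, out_size, "%s%%", affix); /* "abc%"  */
            break;
        case AFFIX_SUFFIX:
            n = snprintf(out, out_size, "%%%s", affix); /* "%abc"  */
            break;
        default:
            n = snprintf(out, out_size, "%%%s%%", affix); /* "%abc%" */
            break;
    }
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

In this sketch the pattern would be bound to the statement with ``sqlite3_prepare_v2`` and ``sqlite3_bind_text`` instead of being concatenated into the SQL string, which avoids quoting problems in tag names and values. A single-value integer match would follow the same shape with ``value_int = ?2`` in place of the ``LIKE`` clause.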

Index-facilitated Approach
---------------------------------------------

@@ -398,7 +405,6 @@ Also, to make sure your code with Julia function calls doesn't get compiled when

For more info on embedded Julia support, please visit: `Embedded Julia <https://docs.julialang.org/en/v1/manual/embedding/>`_.


---------------------------------------------
Docker Support
---------------------------------------------
12 changes: 10 additions & 2 deletions src/api/pdc_client_connect.c
@@ -9020,8 +9020,16 @@ _standard_all_gather_result(int query_sent, int *n_res, uint64_t **pdc_ids, MPI_
uint64_t *all_ids = (uint64_t *)malloc(ntotal * sizeof(uint64_t));
MPI_Allgatherv(*pdc_ids, *n_res, MPI_UINT64_T, all_ids, all_nmeta_array, disp, MPI_UINT64_T, world_comm);

if (*pdc_ids)
free(*pdc_ids);

*n_res = ntotal;
*pdc_ids = all_ids;

free(all_nmeta_array);
free(disp);

return;
}

void
@@ -9127,7 +9135,7 @@ PDC_Client_query_kvtag_mpi(const pdc_kvtag_t *kvtag, int *n_res, uint64_t **pdc_

if (*n_res <= 0) {
*n_res = 0;
*pdc_ids = (uint64_t *)malloc(0);
*pdc_ids = NULL;
}
else {
// print the pdc ids returned by this client, along with the client id
@@ -9349,4 +9357,4 @@ PDC_Client_search_obj_ref_through_dart_mpi(dart_hash_algo_t hash_algo, char *que
}
#endif

/******************** Collective Object Selection Query Ends *******************************/
/******************** Collective Object Selection Query Ends *******************************/
33 changes: 26 additions & 7 deletions src/server/CMakeLists.txt
@@ -6,6 +6,17 @@ if(PDC_ENABLE_FASTBIT)
find_library(FASTBIT_LIBRARY fastbit $ENV{HOME}/cori/fastbit-2.0.3/install)
endif()

if(PDC_ENABLE_ROCKSDB)
add_definitions(-DENABLE_ROCKSDB=1)
find_path(ROCKSDB_INCLUDE_DIR include/db.h)
find_library(ROCKSDB_LIBRARY rocksdb 8.1.1< REQUIRED)
endif()

if(PDC_ENABLE_SQLITE3)
add_definitions(-DENABLE_SQLITE3=1)
find_package(SQLite3 3.31.0 REQUIRED)
endif()

include_directories(
${PDC_COMMON_INCLUDE_DIRS}
${PDC_INCLUDES_BUILD_TIME}
@@ -28,6 +39,7 @@ include_directories(
${PDC_SOURCE_DIR}/src/utils/include
${MERCURY_INCLUDE_DIR}
${FASTBIT_INCLUDE_DIR}
${ROCKSDB_INCLUDE_DIR}
)

add_definitions( -DIS_PDC_SERVER=1 )
@@ -57,9 +69,17 @@ add_library(pdc_server_lib
)
if(PDC_ENABLE_FASTBIT)
message(STATUS "Enabled fastbit")
target_link_libraries(pdc_server_lib mercury ${PDC_COMMONS_LIBRARIES} -lm -ldl ${PDC_EXT_LIB_DEPENDENCIES} ${FASTBIT_LIBRARY}/libfastbit.so)
target_link_libraries(pdc_server_lib ${MERCURY_LIBRARY} ${PDC_COMMONS_LIBRARIES} -lm -ldl ${PDC_EXT_LIB_DEPENDENCIES} ${FASTBIT_LIBRARY}/libfastbit.so)
elseif(PDC_ENABLE_ROCKSDB)
if(PDC_ENABLE_SQLITE3)
target_link_libraries(pdc_server_lib ${MERCURY_LIBRARY} ${PDC_COMMONS_LIBRARIES} -lm -ldl ${PDC_EXT_LIB_DEPENDENCIES} ${ROCKSDB_LIBRARY} SQLite::SQLite3)
else()
target_link_libraries(pdc_server_lib ${MERCURY_LIBRARY} ${PDC_COMMONS_LIBRARIES} -lm -ldl ${PDC_EXT_LIB_DEPENDENCIES} ${ROCKSDB_LIBRARY})
endif()
elseif(PDC_ENABLE_SQLITE3)
target_link_libraries(pdc_server_lib ${MERCURY_LIBRARY} ${PDC_COMMONS_LIBRARIES} -lm -ldl ${PDC_EXT_LIB_DEPENDENCIES} SQLite::SQLite3)
else()
target_link_libraries(pdc_server_lib mercury ${PDC_COMMONS_LIBRARIES} -lm -ldl ${PDC_EXT_LIB_DEPENDENCIES})
target_link_libraries(pdc_server_lib ${MERCURY_LIBRARY} ${PDC_COMMONS_LIBRARIES} -lm -ldl ${PDC_EXT_LIB_DEPENDENCIES})
endif()

add_executable(pdc_server.exe
@@ -78,10 +98,9 @@ if(NOT ${PDC_INSTALL_BIN_DIR} MATCHES ${PROJECT_BINARY_DIR}/bin)
install(
TARGETS
pdc_server.exe
DESTINATION ${PDC_INSTALL_BIN_DIR}
pdc_server_lib
LIBRARY DESTINATION ${PDC_INSTALL_LIB_DIR}
ARCHIVE DESTINATION ${PDC_INSTALL_LIB_DIR}
RUNTIME DESTINATION ${PDC_INSTALL_BIN_DIR}
)
endif()




12 changes: 12 additions & 0 deletions src/server/include/pdc_server.h
@@ -49,6 +49,18 @@
#include "iapi.h"
#endif

#ifdef ENABLE_ROCKSDB
#include "rocksdb/c.h"
extern rocksdb_t *rocksdb_g;
extern int use_rocksdb_g;
#endif

#ifdef ENABLE_SQLITE3
#include "sqlite3.h"
extern sqlite3 *sqlite3_db_g;
extern int use_sqlite3_g;
#endif

#ifdef ENABLE_MULTITHREAD
// Mercury multithread
#include "mercury_thread.h"
10 changes: 10 additions & 0 deletions src/server/include/pdc_server_metadata.h
@@ -62,6 +62,8 @@ extern pdc_remote_server_info_t *pdc_remote_server_info_g;
extern double total_mem_usage_g;
extern int is_hash_table_init_g;
extern int is_restart_g;
extern int use_rocksdb_g;
extern int use_sqlite3_g;

/****************************/
/* Library Private Typedefs */
@@ -83,6 +85,14 @@ typedef struct pdc_cont_hash_table_entry_t {
pdc_kvtag_list_t *kvtag_list_head;
} pdc_cont_hash_table_entry_t;

#ifdef ENABLE_SQLITE3
typedef struct pdc_sqlite3_query_t {
pdcid_t **obj_ids;
int nobj;
int nalloc;
} pdc_sqlite3_query_t;
#endif

/***************************************/
/* Library-private Function Prototypes */
/***************************************/