Add nested JSON example to docs #287

Merged
merged 2 commits into from
Feb 20, 2025
255 changes: 223 additions & 32 deletions docs/user_guide/05_hash_vs_json.ipynb
@@ -27,7 +27,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
@@ -311,36 +311,13 @@
"metadata": {},
"source": [
"### Working with JSON\n",
"Redis also supports native **JSON** objects. These can be multi-level (nested) objects, with full JSONPath support for updating/retrieving sub elements:\n",
"\n",
"```python\n",
"{\n",
" \"name\": \"bike\",\n",
" \"metadata\": {\n",
" \"model\": \"Deimos\",\n",
" \"brand\": \"Ergonom\",\n",
" \"type\": \"Enduro bikes\",\n",
" \"price\": 4972,\n",
" }\n",
"}\n",
"```\n",
"\n",
"JSON is best suited for use cases with the following characteristics:\n",
"- Ease of use and data model flexibility are top concerns\n",
"- Application data is already native JSON\n",
"- Replacing another document storage/db solution"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Full JSON Path support\n",
"Because Redis enables full JSON path support, when creating an index schema, elements need to be indexed and selected by their path with the desired `name` AND `path` that points to where the data is located within the objects.\n",
"\n",
"> By default, RedisVL will assume the path as `$.{name}` if not provided in JSON fields schema."
]
},
{
"cell_type": "code",
"execution_count": 11,
@@ -505,11 +482,230 @@
"source": [
"jindex.delete()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Working with nested data in JSON\n",
"\n",
"Redis also supports native **JSON** objects. These can be multi-level (nested) objects, with full JSONPath support for updating/retrieving sub elements:\n",
"\n",
"```json\n",
"{\n",
" \"name\": \"Specialized Stump jumper\",\n",
" \"metadata\": {\n",
" \"model\": \"Stumpjumper\",\n",
" \"brand\": \"Specialized\",\n",
" \"type\": \"Enduro bikes\",\n",
" \"price\": 3000\n",
" },\n",
"}\n",
"```\n",
"\n",
"#### Full JSON Path support\n",
"Because Redis enables full JSON path support, when creating an index schema, elements need to be indexed and selected by their path with the desired `name` AND `path` that points to where the data is located within the objects.\n",
"\n",
"> By default, RedisVL will assume the path as `$.{name}` if not provided in JSON fields schema. If nested provide path as `$.object.attribute`\n",
"\n",
"### As an example:"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/robert.shelton/.pyenv/versions/3.11.9/lib/python3.11/site-packages/huggingface_hub/file_download.py:1142: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.\n",
" warnings.warn(\n"
]
}
],
"source": [
"from redisvl.utils.vectorize import HFTextVectorizer\n",
"\n",
"emb_model = HFTextVectorizer()\n",
"\n",
"bike_data = [\n",
" {\n",
" \"name\": \"Specialized Stump jumper\",\n",
" \"metadata\": {\n",
" \"model\": \"Stumpjumper\",\n",
" \"brand\": \"Specialized\",\n",
" \"type\": \"Enduro bikes\",\n",
" \"price\": 3000\n",
" },\n",
" \"description\": \"The Specialized Stumpjumper is a versatile enduro bike that dominates both climbs and descents. Features a FACT 11m carbon fiber frame, FOX FLOAT suspension with 160mm travel, and SRAM X01 Eagle drivetrain. The asymmetric frame design and internal storage compartment make it a practical choice for all-day adventures.\"\n",
" },\n",
" {\n",
" \"name\": \"bike_2\",\n",
" \"metadata\": {\n",
" \"model\": \"Slash\",\n",
" \"brand\": \"Trek\",\n",
" \"type\": \"Enduro bikes\",\n",
" \"price\": 5000\n",
" },\n",
" \"description\": \"Trek's Slash is built for aggressive enduro riding and racing. Featuring Trek's Alpha Aluminum frame with RE:aktiv suspension technology, 160mm travel, and Knock Block frame protection. Equipped with Bontrager components and a Shimano XT drivetrain, this bike excels on technical trails and enduro race courses.\"\n",
" }\n",
"]\n",
"\n",
"bike_data = [{**d, \"bike_embedding\": emb_model.embed(d[\"description\"])} for d in bike_data]\n",
"\n",
"bike_schema = {\n",
" \"index\": {\n",
" \"name\": \"bike-json\",\n",
" \"prefix\": \"bike-json\",\n",
" \"storage_type\": \"json\", # JSON storage type\n",
" },\n",
" \"fields\": [\n",
" {\n",
" \"name\": \"model\",\n",
" \"type\": \"tag\",\n",
" \"path\": \"$.metadata.model\" # note the '$'\n",
" },\n",
" {\n",
" \"name\": \"brand\",\n",
" \"type\": \"tag\",\n",
" \"path\": \"$.metadata.brand\"\n",
" },\n",
" {\n",
" \"name\": \"price\",\n",
" \"type\": \"numeric\",\n",
" \"path\": \"$.metadata.price\"\n",
" },\n",
" {\n",
" \"name\": \"bike_embedding\",\n",
" \"type\": \"vector\",\n",
" \"attrs\": {\n",
" \"dims\": len(bike_data[0][\"bike_embedding\"]),\n",
" \"distance_metric\": \"cosine\",\n",
" \"algorithm\": \"flat\",\n",
" \"datatype\": \"float32\"\n",
" }\n",
"\n",
" }\n",
" ],\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [],
"source": [
"# construct a search index from the json schema\n",
"bike_index = SearchIndex.from_dict(bike_schema)\n",
"\n",
"# connect to local redis instance\n",
"bike_index.connect(\"redis://localhost:6379\")\n",
"\n",
"# create the index (no data yet)\n",
"bike_index.create(overwrite=True)"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['bike-json:de92cb9955434575b20f4e87a30b03d5',\n",
" 'bike-json:054ab3718b984532b924946fa5ce00c6']"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bike_index.load(bike_data)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [],
"source": [
"from redisvl.query import VectorQuery\n",
"\n",
"vec = emb_model.embed(\"I'd like a bike for aggressive riding\")\n",
"\n",
"v = VectorQuery(vector=vec,\n",
" vector_field_name=\"bike_embedding\",\n",
" return_fields=[\n",
" \"brand\",\n",
" \"name\",\n",
" \"$.metadata.type\"\n",
" ]\n",
" )\n",
"\n",
"\n",
"results = bike_index.query(v)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** As shown in the example if you want to retrieve a field from json object that was not indexed you will also need to supply the full path as with `$.metadata.type`."
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'id': 'bike-json:054ab3718b984532b924946fa5ce00c6',\n",
" 'vector_distance': '0.519989073277',\n",
" 'brand': 'Trek',\n",
" '$.metadata.type': 'Enduro bikes'},\n",
" {'id': 'bike-json:de92cb9955434575b20f4e87a30b03d5',\n",
" 'vector_distance': '0.657624483109',\n",
" 'brand': 'Specialized',\n",
" '$.metadata.type': 'Enduro bikes'}]"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cleanup"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"bike_index.delete()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.13 ('redisvl2')",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -523,14 +719,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
"version": "3.11.9"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "9b1e6e9c2967143209c2f955cb869d1d3234f92dc4787f49f155f3abbdfb1316"
}
}
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
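The path convention described in the notebook (a field's path defaults to `$.{name}`, and nested data is addressed as `$.object.attribute`) can be illustrated without a running Redis instance. The snippet below is only a sketch of that convention in plain Python, not RedisVL's implementation: `default_path` and `resolve_path` are hypothetical helpers introduced for illustration, and `resolve_path` handles only simple dotted paths (no wildcards, filters, or array indexing).

```python
from typing import Any


def default_path(field: dict) -> str:
    """Return the field's path, falling back to `$.{name}` when none is given.

    Hypothetical helper that mirrors the default described in the notebook.
    """
    return field.get("path", f"$.{field['name']}")


def resolve_path(doc: dict, path: str) -> Any:
    """Resolve a simple dotted path such as `$.metadata.brand` against a dict.

    Hypothetical helper: only the `$.a.b.c` form is supported.
    """
    if not path.startswith("$."):
        raise ValueError(f"expected a path starting with '$.', got {path!r}")
    value: Any = doc
    for key in path[2:].split("."):
        value = value[key]  # descend one nesting level per path segment
    return value


bike = {
    "name": "bike_2",
    "metadata": {
        "model": "Slash",
        "brand": "Trek",
        "type": "Enduro bikes",
        "price": 5000,
    },
}

fields = [
    {"name": "name", "type": "tag"},  # no path given: defaults to $.name
    {"name": "brand", "type": "tag", "path": "$.metadata.brand"},
    {"name": "price", "type": "numeric", "path": "$.metadata.price"},
]

# Map each schema field name to the value its path points at in the document.
resolved = {f["name"]: resolve_path(bike, default_path(f)) for f in fields}
print(resolved)
```

Here the top-level `name` field resolves through the default path, while `brand` and `price` resolve only because their nested paths are spelled out, which is the situation the `bike_schema` in the notebook handles with explicit `path` entries.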