@@ -16,10 +16,10 @@ It is the "official" client of RediSearch, and should be regarded as its canonic
16
16
17
17
## Features
18
18
19
- RediSearch is a source avaliable ([ RSAL] ( https://raw.githubusercontent.com/RediSearch/RediSearch/master/LICENSE ) ), high performance search engine implemented as a [ Redis Module] ( https://redis.io/topics/modules-intro ) .
19
+ RediSearch is a source-available ([RSAL](https://raw.githubusercontent.com/RediSearch/RediSearch/master/LICENSE)), high-performance search engine implemented as a [Redis Module](https://redis.io/topics/modules-intro).
20
20
It uses custom data types to allow fast, stable and feature rich full-text search inside Redis.
21
21
22
- This client is a wrapper around the RediSearch API protocol, that allows you to utilize its features easily.
22
+ This client is a wrapper around the RediSearch API protocol that lets you use its features easily.
23
23
24
24
### RediSearch's features include:
25
25
@@ -35,44 +35,354 @@ This client is a wrapper around the RediSearch API protocol, that allows you to
35
35
36
36
For more details, visit [http://redisearch.io](http://redisearch.io)
37
37
38
- ## Example: Using the Python Client
38
+ ## Examples
39
+
40
+ ### Creating a client instance
41
+
42
+ When you create a redisearch-py client instance, the only required argument
43
+ is the name of the index.
44
+
45
+ ``` py
46
+ from redisearch import Client
47
+
48
+ client = Client("my-index")
49
+ ```
50
+
51
+ To connect with a username and/or password, pass those options to the client
52
+ initializer.
53
+
54
+ ``` py
55
+ client = Client("my-index", username="user", password="my-password")
56
+ ```
57
+
58
+ ### Using core Redis commands
59
+
60
+ Every instance of `Client` also holds a redis-py client, available as the
+ `redis` attribute. Use this object to run core Redis commands.
62
+
63
+ ``` py
64
+ import datetime
65
+
66
+ from redisearch import Client
67
+
68
+ START_TIME = datetime.datetime.now()
69
+
70
+ client = Client(" my-index" )
71
+
72
+ # redis-py cannot store a datetime object directly, so serialize it first.
+ client.redis.set("start-time", START_TIME.isoformat())
73
+ ```
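Redis stores values as strings or bytes, so a `datetime` usually has to be serialized before it is written and parsed again after it is read. The sketch below shows one way to round-trip a timestamp through ISO 8601 text. The helper names `encode_time` and `decode_time` are hypothetical and are not part of redisearch-py or redis-py.

```python
import datetime


def encode_time(moment):
    """Serialize a datetime to ISO 8601 text suitable for a Redis value."""
    return moment.isoformat()


def decode_time(raw):
    """Parse a value read back from Redis; accepts bytes or str."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    return datetime.datetime.fromisoformat(text)


start_time = datetime.datetime(2021, 1, 15, 9, 30, 0)

# The string you would pass to client.redis.set("start-time", ...).
stored = encode_time(start_time)

# Simulates a bytes reply, as returned when responses are not decoded.
restored = decode_time(stored.encode("utf-8"))
```

With a live connection, `stored` is the value you would hand to `client.redis.set()`, and `decode_time()` would wrap the result of `client.redis.get()`.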
74
+
75
+ ### Checking if a RediSearch index exists
76
+
77
+ To check if a RediSearch index exists, use the ` FT.INFO ` command and catch
78
+ the ` ResponseError ` raised if the index does not exist.
79
+
80
+ ``` py
81
+ from redis import ResponseError
82
+ from redisearch import Client
83
+
84
+ client = Client(" my-index" )
85
+
86
+ try:
+     client.info()
+ except ResponseError:
+     # Index does not exist. We need to create it!
+     pass
90
+ ```
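If you check for an index in more than one place, the try/except pattern can be wrapped in a small helper. This is a minimal sketch: `index_exists` is a hypothetical name, and `FakeClient` is a stand-in so the example runs without a Redis server. In real code you would pass a redisearch `Client` and `redis.ResponseError` as the error type.

```python
def index_exists(client, missing_index_error=Exception):
    """Return True if client.info() succeeds, False if it raises."""
    try:
        client.info()
    except missing_index_error:
        return False
    return True


class FakeClient:
    """Stand-in for a redisearch Client so the sketch runs without Redis."""

    def __init__(self, exists):
        self.exists = exists

    def info(self):
        if not self.exists:
            raise RuntimeError("Unknown index name")
        return {"index_name": "my-index"}


found = index_exists(FakeClient(exists=True), RuntimeError)
missing = index_exists(FakeClient(exists=False), RuntimeError)
```

Passing the narrowest error type available keeps unrelated failures, such as connection errors, from being mistaken for a missing index.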
91
+
92
+ ### Defining a search index
93
+
94
+ Use an instance of ` IndexDefinition ` to define a search index. You only need
95
+ to do this when you create an index.
96
+
97
+ RediSearch indexes follow Hashes in your Redis databases by watching *key
+ prefixes*. When a Hash whose key starts with one of the index's configured
+ prefixes is added, updated, or deleted, RediSearch applies that change to the
+ index. You configure an index's key prefixes with the `prefix` parameter of
+ the `IndexDefinition` initializer.
102
+
103
+ ** NOTE** : Once you create an index, RediSearch will continuously index these
104
+ keys when their Hashes change.
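The prefix rule described above can be illustrated in plain Python. This sketch only models the matching rule (a key is followed when it starts with one of the configured prefixes); it is not how RediSearch implements it, and `is_followed` is a hypothetical helper name.

```python
def is_followed(key, prefixes):
    """Return True if the key starts with any of the configured prefixes."""
    return any(key.startswith(prefix) for prefix in prefixes)


# Mirrors IndexDefinition(prefix=['blog:']).
prefixes = ["blog:"]

blog_followed = is_followed("blog:1", prefixes)
doc_followed = is_followed("doc:1", prefixes)
```

A check like this can help when debugging why a Hash does not show up in search results: a key outside the index's prefixes is never indexed.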
105
+
106
+ An index also needs a *schema*. The schema specifies which fields to
+ index from within the Hashes that the index follows. The field types are:
108
+
109
+ * TextField
110
+ * TagField
111
+ * NumericField
112
+ * GeoField
113
+
114
+ For more information on what these field types mean, consult the [RediSearch
+ documentation](https://oss.redislabs.com/redisearch/Commands/#ftcreate) on
+ the `FT.CREATE` command.
117
+
118
+ With redisearch-py, the schema is an iterable of `Field` instances. Once you
+ have an `IndexDefinition` instance, you can create the index by passing the
+ schema iterable and the definition to the `create_index()` method.
39
121
40
122
``` py
41
- from redisearch import Client, TextField, IndexDefinition, Query
123
+ from redis import ResponseError
+ from redisearch import Client, IndexDefinition, TextField
42
124
43
- # Creating a client with a given index name
44
- client = Client(" myIndex" )
125
+ SCHEMA = (
126
+ TextField(" title" , weight = 5.0 ),
127
+ TextField(" body" )
128
+ )
129
+
130
+ client = Client(" my-index" )
45
131
46
- # IndexDefinition is available for RediSearch 2.0+
47
- definition = IndexDefinition(prefix = [' doc:' , ' article:' ])
132
+ definition = IndexDefinition(prefix = [' blog:' ])
133
+
134
+ try:
+     client.info()
+ except ResponseError:
+     # Index does not exist. We need to create it!
+     client.create_index(SCHEMA, definition=definition)
139
+ ```
48
140
49
- # Creating the index definition and schema
50
- client.create_index((TextField(" title" , weight = 5.0 ), TextField(" body" )), definition = definition)
141
+ ### Indexing a document
51
142
52
- # Indexing a document for RediSearch 2.0+
53
- client.redis.hset(' doc:1' ,
54
- mapping = {
55
- ' title' : ' RediSearch' ,
56
- ' body' : ' Redisearch impements a search engine on top of redis'
57
- })
143
+ A RediSearch 2.0 index continually follows Hashes with the key prefixes you
144
+ defined, so if you want to add a document to the index, you only need to
145
+ create a Hash with one of those prefixes.
58
146
147
+ ``` py
148
+ # Indexing a document with RediSearch 2.0.
149
+ doc = {
150
+ ' title' : ' RediSearch' ,
151
+ ' body' : ' Redisearch adds querying, indexing, and full-text search to Redis'
152
+ }
153
+ # The key must start with one of the index's prefixes ('blog:' above).
+ client.redis.hset('blog:1', mapping=doc)
154
+ ```
155
+
156
+ Past versions of RediSearch required that you call the ` add_document() `
157
+ method. This method is deprecated, but we include its usage here for
158
+ reference.
159
+
160
+ ``` py
59
161
# Indexing a document for RediSearch 1.x
60
162
client.add_document(
61
163
" doc:2" ,
62
164
title = " RediSearch" ,
63
165
body = " Redisearch implements a search engine on top of redis" ,
64
166
)
167
+ ```
168
+
169
+ ### Querying
65
170
66
- # Simple search
67
- res = client.search(" search engine" )
171
+ #### Basic queries
68
172
69
- # the result has the total number of results, and a list of documents
70
- print (res.total) # "2"
71
- print (res.docs[0 ].title) # "RediSearch"
173
+ Use the ` search() ` method to perform basic full-text and field-specific
174
+ searches. This method doesn't take many of the options available to the
175
+ RediSearch ` FT.SEARCH ` command -- read the section on building complex
176
+ queries later in this document for information on how to use those.
72
177
73
- # Searching with complex parameters:
74
- q = Query(" search engine" ).verbatim().no_content().with_scores().paging(0 , 5 )
178
+ ``` py
179
+ res = client.search(" evil wizards" )
180
+ ```
181
+ #### Result objects
182
+
183
+ Results are wrapped in a ` Result ` object that includes the number of results
184
+ and a list of matching documents.
185
+
186
+ ``` py
187
+ >>> print(res.total)
+ 2
+ >>> print(res.docs[0].title)
+ Wizard Story 2: Evil Wizards Strike Back
191
+ ```
192
+
193
+ #### Building complex queries
194
+
195
+ You can use the ` Query ` object to build complex queries:
196
+
197
+ ``` py
198
+ from redisearch import Query
+
+ q = Query("evil wizards").verbatim().no_content().with_scores().paging(0, 5)
75
199
res = client.search(q)
200
+ ```
201
+
202
+ For an explanation of these options, see the [RediSearch
+ documentation](https://oss.redislabs.com/redisearch/Commands/#ftsearch) for
+ the `FT.SEARCH` command.
205
+
206
+ #### Query syntax
207
+
208
+ The default behavior of queries is to run a full-text search across all
209
+ ` TEXT ` fields in the index for the intersection of all terms in the query.
210
+
211
+ So the example given in the "Basic queries" section of this README,
+ `client.search("evil wizards")`, runs a full-text search for the intersection
+ of "evil" and "wizards" in all `TEXT` fields.
214
+
215
+ Many more types of queries are possible, however! The string you pass into
216
+ the ` search() ` method or ` Query() ` initializer has the full range of query
217
+ syntax available in RediSearch.
218
+
219
+ For example, a full-text search against a specific ` TEXT ` field in the index
220
+ looks like this:
221
+
222
+ ``` py
223
+ # Full-text search restricted to the title field. The parentheses apply
+ # the field modifier to both terms.
+ res = client.search("@title:(evil wizards)")
225
+ ```
226
+
227
+ Finding books published in 2020 or 2021 looks like this:
228
+
229
+ ``` python
230
+ client.search(" @published_year:[2020 2021]" )
231
+ ```
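Because queries are plain strings, small helpers can keep the syntax in one place. The sketch below builds the two query forms shown above. The function names are hypothetical, and the helpers do not escape special characters, so they assume trusted, simple input.

```python
def field_text_query(field, text):
    """Build a full-text query scoped to a single TEXT field.

    The parentheses apply the field modifier to every term in `text`.
    """
    return "@{}:({})".format(field, text)


def numeric_range_query(field, low, high):
    """Build an inclusive numeric range query for a numeric field."""
    return "@{}:[{} {}]".format(field, low, high)


title_query = field_text_query("title", "evil wizards")
year_query = numeric_range_query("published_year", 2020, 2021)
```

Either string can then be passed to `client.search()` or to the `Query()` initializer.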
232
+
233
+ To learn more, see the [RediSearch
+ documentation](https://oss.redislabs.com/redisearch/Query_Syntax/) on query
+ syntax.
236
+
237
+ ### Aggregations
238
+
239
+ This library contains a programmatic interface to run [aggregation
+ queries](https://oss.redislabs.com/redisearch/Aggregations/) with RediSearch.
241
+
242
+ #### Making an aggregation query
243
+
244
+ To make an aggregation query, pass an instance of the `AggregateRequest`
+ class to the `aggregate()` method of an instance of `Client`.
246
+
247
+ For example, here is what finding the most books published in a single year
248
+ looks like:
249
+
250
+ ``` py
251
+ from redisearch import Client
252
+ from redisearch import reducers
253
+ from redisearch.aggregation import AggregateRequest
254
+
255
+ client = Client(' books-idx' )
256
+
257
+ request = AggregateRequest(' *' ).group_by(
258
+ ' @published_year' , reducers.count().alias(" num_published" )
259
+ ).group_by(
260
+ [], reducers.max(" @num_published" ).alias(" max_books_published_per_year" )
261
+ )
262
+
263
+ result = client.aggregate(request)
264
+ ```
265
+
266
+ #### A redis-cli equivalent query
267
+
268
+ The aggregation query just given is equivalent to the following
269
+ ` FT.AGGREGATE ` command entered directly into the redis-cli:
270
+
271
+ ``` sql
272
+ FT.AGGREGATE books-idx *
273
+ GROUPBY 1 @published_year
274
+ REDUCE COUNT 0 AS num_published
275
+ GROUPBY 0
276
+ REDUCE MAX 1 @num_published AS max_books_published_per_year
277
+ ```
278
+
279
+ #### The AggregateResult object
280
+
281
+ Aggregation queries return an ` AggregateResult ` object that contains the rows
282
+ returned for the query and a cursor if you're using the [ cursor
283
+ API] ( https://oss.redislabs.com/redisearch/Aggregations/#cursor_api ) .
284
+
285
+ ``` py
286
+ from redisearch import reducers
+ from redisearch.aggregation import AggregateRequest, Asc
287
+
288
+ request = AggregateRequest(' *' ).group_by(
289
+ [' @published_year' ], reducers.avg(' average_rating' ).alias(' average_rating_for_year' )
290
+ ).sort_by(
291
+ Asc(' @average_rating_for_year' )
292
+ ).limit(
293
+ 0 , 10
294
+ ).filter(' @published_year > 0' )
295
+
296
+ ...
297
+
298
+
299
+ In [53]: resp = c.aggregate(request)
+ In [54]: resp.rows
+ Out[54]:
+ [['published_year', '1914', 'average_rating_for_year', '0'],
+  ['published_year', '2009', 'average_rating_for_year', '1.39166666667'],
+  ['published_year', '2011', 'average_rating_for_year', '2.046'],
+  ['published_year', '2010', 'average_rating_for_year', '3.125'],
+  ['published_year', '2012', 'average_rating_for_year', '3.41'],
+  ['published_year', '1967', 'average_rating_for_year', '3.603'],
+  ['published_year', '1970', 'average_rating_for_year', '3.71875'],
+  ['published_year', '1966', 'average_rating_for_year', '3.72666666667'],
+  ['published_year', '1927', 'average_rating_for_year', '3.77']]
311
+ ```
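Each row in the output above is a flat list that alternates field names and values, and every value is a string. A minimal sketch of turning such rows into dictionaries follows; `rows_to_dicts` is a hypothetical helper, not part of redisearch-py, and the sample rows are copied from the output shown above.

```python
def rows_to_dicts(rows):
    """Convert flat [name, value, name, value, ...] rows into dicts."""
    return [dict(zip(row[::2], row[1::2])) for row in rows]


rows = [
    ["published_year", "1914", "average_rating_for_year", "0"],
    ["published_year", "2009", "average_rating_for_year", "1.39166666667"],
]

records = rows_to_dicts(rows)

# Values arrive as strings, so convert them before doing arithmetic.
ratings = [float(record["average_rating_for_year"]) for record in records]
```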
312
+
313
+ #### Reducer functions
314
+
315
+ Notice from the example that we used an object from the ` reducers ` module.
316
+ See the [ RediSearch documentation] ( https://oss.redislabs.com/redisearch/Aggregations/#groupby_reducers )
317
+ for more examples of reducer functions you can use when grouping results.
318
+
319
+ Reducer functions include an ` alias() ` method that gives the result of the
320
+ reducer a specific name. If you don't supply a name, RediSearch will generate
321
+ one.
322
+
323
+ #### Grouping by zero, one, or multiple fields
324
+
325
+ The ` group_by ` statement can take a single field name as a string, or multiple
326
+ field names as a list of strings.
327
+
328
+ ``` py
329
+ AggregateRequest(' *' ).group_by(' @published_year' , reducers.count())
330
+
331
+ AggregateRequest(' *' ).group_by(
332
+ [' @published_year' , ' @average_rating' ],
333
+ reducers.count())
334
+ ```
335
+
336
+ To run a reducer function on every result from an aggregation query, pass an
337
+ empty list to ` group_by() ` , which is equivalent to passing the option
338
+ ` GROUPBY 0 ` when writing an aggregation in the redis-cli.
339
+
340
+ ``` py
341
+ AggregateRequest(' *' ).group_by([], reducers.max(" @num_published" ))
342
+ ```
343
+
344
+ ** NOTE** : Aggregation queries require at least one ` group_by() ` method call.
345
+
346
+ #### Sorting and limiting
347
+
348
+ Using an ` AggregateRequest ` instance, you can sort with the ` sort_by() ` method
349
+ and limit with the ` limit() ` method.
350
+
351
+ For example, finding the average rating of books published each year, sorting
352
+ by the average rating for the year, and returning only the first ten results:
353
+
354
+ ``` py
355
+ from redisearch import Client, reducers
+ from redisearch.aggregation import AggregateRequest, Asc
+
+ # Client requires an index name.
+ c = Client('books-idx')
359
+
360
+ request = AggregateRequest(' *' ).group_by(
361
+ [' @published_year' ], reducers.avg(' average_rating' ).alias(' average_rating_for_year' )
362
+ ).sort_by(
363
+ Asc(' @average_rating_for_year' )
364
+ ).limit(0 , 10 )
365
+
366
+ c.aggregate(request)
367
+ ```
368
+
369
+ ** NOTE** : The first option to ` limit() ` is a zero-based offset, and the second
370
+ option is the number of results to return.
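Since `limit()` takes an offset and a count rather than a page number, paging through results needs a small conversion. The sketch below assumes one-based page numbers; `page_to_limit` is a hypothetical helper name.

```python
def page_to_limit(page, page_size=10):
    """Translate a one-based page number into (offset, count) for limit()."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return (page - 1) * page_size, page_size


# Third page of ten results: skip the first twenty rows.
offset, count = page_to_limit(3, page_size=10)
```

The resulting pair can be passed straight through, for example `request.limit(offset, count)` on an `AggregateRequest`.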
371
+
372
+ #### Filtering
373
+
374
+ Use filtering to reject results of an aggregation query after your reducer
375
+ functions run. For example, calculating the average rating of books published
376
+ each year and only returning years with an average rating higher than 3:
377
+
378
+ ``` py
379
+ from redisearch import reducers
+ from redisearch.aggregation import AggregateRequest, Asc
380
+
381
+ req = AggregateRequest(' *' ).group_by(
382
+ [' @published_year' ], reducers.avg(' average_rating' ).alias(' average_rating_for_year' )
383
+ ).sort_by(
384
+ Asc(' @average_rating_for_year' )
385
+ ).filter(' @average_rating_for_year > 3' )
76
386
```
77
387
78
388
## Installing
@@ -115,6 +425,6 @@ Finally, invoke the virtual environment and run the tests:
115
425
116
426
```
117
427
. ./venv3/bin/activate
118
- REDIS_PORT=6379 python test/test.py
428
+ REDIS_PORT=6379 python test/test.py
119
429
REDIS_PORT=6379 python test/test_builder.py
120
430
```