libzim 7.2.0
============
* Add methods to get/print (dependencies) versions (@kelson42, #452)
* Fix Emscripten compilation (@kelson42, @mossroy, #643)
libzim 7.1.0
============
* Fix dirent test on 32-bit architectures (@mgautierfr #632)
* Fix compilation on Alpine - with musl (@amirouche #649)
* Don't crash if the ZIM file has no illustration and no X/W namespace (@mgautierfr #641)
* Switch default suggestion operator to AND (@maneeshpm #644)
* Add a new method Archive::getMetadataItem (@mgautierfr #639)
* Better indexing criteria (@mgautierfr #642)
* Avoid duplicated archives in the searcher (@veloman-yunkan #648)
* Fix random entry (@veloman-yunkan #650)
* Various improvements.
  - CI @mgautierfr #640, @kelson42 #638, @legoktm #654
  - Doc @rgaudin #646
libzim 7.0.0
============
Version 7.0.0 is a major release.
The API has been completely rewritten.
Most notable change is that namespaces are now hidden.
The new API is described in documentation, which includes a Transition Guide from v6.
ZIM files created with it use the new ZIM minor version (6.1; see the Header section of the spec).
Both backward and forward compatibility are kept.
Improvements
------------
* Rewrite creator and reader API
This removes the namespace from the API. Articles are automatically put in
the right namespace ('A') and content is retrieved using a
specific API. (@mgautier #454)
* Better handling of the conditional compilation without xapian.
Previously, the search API was present (but returned empty results) if
libzim was compiled without xapian. Now the API is not present anymore.
User code must check whether libzim is compiled with xapian by checking
whether LIBZIM_WITH_XAPIAN is defined. (@mgautierfr #465)
* Add a new specific listing in zim files to list entries considered as "front
article". At creation, wrapper MUST pass the hint `FRONT_ARTICLE` to
correctly mark the entry. Search by title uses this list if present.
(@mgautierfr #487)
* Store the wellknown entries in the `W` namespace (`W/mainPage`)
(@mgautierfr #497)
* Rewrite Search API. Fix potential memory leak and allow correct reuse of
a created search. (@mgautierfr #530)
* New suggestion search API. The API mimics the Search API but is specialized
for suggestions (@maneeshpm #574)
* Add `zim::Archive` constructors to open an archive using an existing file descriptor.
This API is not available on Windows. (@veloman-yunkan #449)
* Make zstd the default compression algorithm (@veloman-yunkan #480)
* The method `zim::Archive::checkIntegrity` now checks if the mimetypes indicated in the
dirents are correct (@veloman-yunkan #505)
* Writer doesn't add a `.zim` extension to the given path. (@maneeshpm #503)
* Implement random entry picking. We choose an entry from the "front
article" list if present. (@mgautierfr #476)
* Creator now creates the `M/Counter` metadata. (@mgautierfr)
* Better Illustration handling. Favicon is replaced by Illustration.
Illustrations can now have different sizes and scales (even if the API does
not use this feature) (@mgautierfr #540)
* Search iterator now has a method `getZimId` to know the Id of the zim
corresponding to the result (useful for multizim search) (@maneeshpm #557)
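The `LIBZIM_WITH_XAPIAN` check described in the list above can be sketched as follows. This is a minimal illustration, not libzim code: the helper names `searchApiAvailable` and `describeSearchSupport` are hypothetical, and the sketch assumes the libzim headers that define the macro were included beforehand.

```cpp
#include <string>

// Hypothetical helper (not part of libzim): reports at compile time whether
// the search API can be used. Assumes the libzim headers that define
// LIBZIM_WITH_XAPIAN (when xapian support is built in) were included first.
constexpr bool searchApiAvailable()
{
#if defined(LIBZIM_WITH_XAPIAN)
  return true;
#else
  return false;
#endif
}

// Hypothetical caller: selects a code path without referencing any search
// symbol when xapian support is absent.
inline std::string describeSearchSupport()
{
  return searchApiAvailable() ? "full-text search enabled"
                              : "full-text search disabled";
}
```

Code that actually uses the search classes would need to sit inside the same `#if defined(LIBZIM_WITH_XAPIAN)` guard, since those declarations are absent otherwise.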
Bug fixes
---------
* The method `zim::Archive::checkIntegrity` now checks if the dirents are
correctly sorted. (@veloman-yunkan #448)
* Handle large MIME-type list. Some zim files may have a pretty large mimetype
list. (@veloman-yunkan #460)
* Fix handling of zim files containing items of size 0. (@mgautierfr #483)
* Better parsing of the entry paths to detect the namespace (@maneeshpm #479)
* Fix zim file creation on Windows (@mgautierfr #508)
* Better algorithm tuning for suggestion search (@maneeshpm #492)
* The default indexer now indexes html content only. (@mgautierfr #511)
* Better suggestion search : Don't use stopwords, use OP_PHRASE
(@maneeshpm #501)
* Remove duplicate in the suggestion search (@maneeshpm #515)
* Remove the termlist from the xapian database, lower memory usage
(@maneeshpm #528)
* Add an anchor in the suggestion search to search term at the beginning of
the title (@maneeshpm #526)
* Make the suggestion search work with special characters (`&`, `+`)
(@veloman-yunkan #534)
* Fix creator issue not detecting that cluster must be extended if it
contains only 32-bit-sized content. (@veloman-yunkan #552)
* Correctly generate suggestion snippet. (@maneeshpm #545)
* Better cluster size configuration (@mgautierfr #555)
* Make search iterator `getTitle` return the real title of the entry and not
the one stored in the xapian database (caseless) (@maneeshpm #586)
* Correctly close a zim creator to avoid a crash when the creator is
destructed without being started (@mgautierfr #613)
* Reduce the creator memory usage by reducing the memory size of the dirent
(@mgautier #616, #628)
* Write the cluster using a bigger chunk size for performance
(@mgautierfr #506)
* Change the default cluster size to 2MiB (@mgautierfr #585)
* The default mimetype for metadata now includes the utf8 charset
(@rgaudin #626)
* Improve the estimation of the number of search/suggestion results by forcing
Xapian to evaluate at least 10 results (@mgautier #625)
Other
-----
* Update xapian stopwords list. (@data-man #447)
* Remove direct pthread dependency (use c++11 thread library). (@mgautierfr #443)
We still need pthread library on linux and freebsd as C++11 is using it internally.
* [CI] Make the libzim CI compile libzim natively on Windows (@mgautierfr #453).
* [CI] Build libzim package for Ubuntu Hirsute and Impish
(@legoktm #459, #580)
* Always create zim file using the major version 6. (@mgautierfr #512)
* Move the test data files out of the git repository. Now test files are
stored in `zim-testing-suite` repository and must be downloaded.
(@mgautierfr #538, #535)
* Add search iterator unit test (@maneeshpm #547)
* Correctly fix search iterator method case to use camelCase everywhere
(@maneeshpm #563)
* Add a cast to string operator on Uuid (@maneeshpm #582)
* Make unittest print the path of the missing zim file when something goes
wrong (@kelson42 #601)
* Delete temporary data (index) after we called `finishZimCreation` instead of
waiting for creator destruction. (@mgautierfr #603)
* Add basic user documentation (@mgautierfr #611)
Known bugs
----------
The suggestion system used in the current libkiwix doesn't work with new zim files
created with this release (and future ones).
A new libkiwix version will fix this and work with both new and old zim files.
libzim 6.3.2
============
This is a hotfix of 6.3.0:
* libzim now creates zim files with zstd compression level 19 instead of 22,
so new libzim does not need to allocate 128MB per cluster at decompression
time.
* At reading time, on 32-bit architectures, zstd clusters are not kept in
cache. This lets us avoid also keeping the decompression stream, which reserves
128MB of memory address space.
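To see why a 128MB reservation per cached cluster matters on 32-bit architectures, here is a small arithmetic sketch. It assumes the figure above means 128 MiB and treats the full 4 GiB address space as an upper bound (a real process usually gets less); the constant names are illustrative only.

```cpp
#include <cstdint>

// 128 MiB, the per-cluster figure quoted above, expressed in bytes
// (assumption: "128MB" is read as 128 MiB).
constexpr std::uint64_t kReservedPerStream = 128ull * 1024 * 1024;

// Size of a 32-bit address space. This is an upper bound: part of it is
// normally reserved for the system and for the program itself.
constexpr std::uint64_t kAddressSpace32 = 1ull << 32;

// Upper bound on the number of 128 MiB regions that fit in 32 bits.
constexpr std::uint64_t kMaxStreams32 = kAddressSpace32 / kReservedPerStream;

static_assert(kReservedPerStream == (1ull << 27), "128 MiB is 2^27 bytes");
static_assert(kMaxStreams32 == 32, "at most 32 such regions fit in 4 GiB");
```

Even in the best case only 32 such reservations fit, so a cache holding a few dozen decompression streams would exhaust the address space.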
libzim 6.3.1
============
The release process of 6.3.1 was buggy. So, no 6.3.1.
libzim 6.3.0
============
* Rewrite internal reader structure to use stream decompression.
This allows libzim to avoid decompressing the whole cluster to get an article's
content. This is a big performance improvement: it speeds up random access by
a factor of 2, with a very small cost when doing "full" incremental reading
(zim-check/zim-dump). (@veloman-yunkan)
* Better dirent lookup.
Dirent lookup is the process of locating article data starting from the url
or title. This improves zim file reading speed by up to 10% (@veloman-yunkan)
* Add basic, first version of `validate` function to check internal structure
of a zim file. (@veloman-yunkan, @MiguelRocha)
* Fix compilation of libzim without xapian (@mgautierfr)
* Remove zlib dependency (and support of very old files created using zlib
compression) (@mgautierfr)
* New unit tests and various small fixes.
libzim 6.2.2
============
* Check blob index before accessing it in the cluster.
* Refactoring of the cluster reading.
libzim 6.2.1 (release process broken)
=====================================
* Update readme and add link to repology.org packages list.
* Fix compilation on windows.
libzim 6.2.0
============
* Fix compilation of libzim on freebsd.
* Rewrite unit tests to remove python based test and use gtest all the time.
* Make libzstd mandatory.
* Support for meson 0.45.
* Fix multipart support on macos.
* Add a documentation system.
* Better cache system implementation (huge speed up).
* Various (and numerous) small refactoring.
libzim 6.1.8
============
* Increase default timeout for test to 120 seconds/test
* Compression algorithm to use can be passed to `zim::writer::Creator`
* Add automatic debian packaging of libzim.
* Fix using of tmpdir (and now use env var TMPDIR) during tests.
libzim 6.1.7
============
* Do not assume urlPtrPos is just after the mimetype list.
* Fix compilation of compression test.
* Do not exit but throw an exception if an ASSERT is not fulfilled.
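The last item above, throwing instead of exiting on a failed assertion, can be illustrated with a minimal sketch. The macro name `SKETCH_ASSERT`, the use of `std::runtime_error`, and the `checkOffset` caller are all assumptions made for illustration; libzim's actual `ASSERT` macro and exception type may differ.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical macro, named for illustration only. On failure it throws
// instead of terminating the process, so the caller can recover.
#define SKETCH_ASSERT(cond)                                         \
  do {                                                              \
    if (!(cond)) {                                                  \
      std::ostringstream msg;                                       \
      msg << "Assertion failed: " #cond " at " << __FILE__ << ":"   \
          << __LINE__;                                              \
      throw std::runtime_error(msg.str());                          \
    }                                                               \
  } while (false)

// Example caller: a failed check surfaces as an exception that is caught
// and turned into an error result instead of killing the program.
inline bool checkOffset(unsigned offset, unsigned fileSize)
{
  try {
    SKETCH_ASSERT(offset < fileSize);
    return true;
  } catch (const std::runtime_error&) {
    return false;
  }
}
```

The benefit for a library is that an application embedding it can report a corrupted file and keep running.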
libzim 6.1.6
============
* Better (faster) implementation of the ordering of article by cluster.
* Fix compression algorithm.
libzim 6.1.5
============
* [Writer] Remove unused declaration of classes.
Those classes were not implemented nor used at all.
libzim 6.1.4
============
* [Writer] Fix excessive memory usage. Cluster data was cleaned at the
end of the process instead of as soon as it was no longer needed.
libzim 6.1.3
============
* [Writer] Use a `.tmp` suffix and rename to `.zim` at the end of the write
process.
* Add unit tests
* Do not include unnecessary `windows.h` headers in zim's public headers.
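The write-then-rename approach from the first item above can be sketched with the C++17 filesystem library. This is not the libzim writer: the helper `writeThenRename` is hypothetical, and appending `.tmp` to the final path is an assumption about how the temporary name is built.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: writes `data` to "<finalPath>.tmp" and renames it to
// `finalPath` only once the write has succeeded, so a partially written file
// never appears under the final name.
inline bool writeThenRename(const fs::path& finalPath, const std::string& data)
{
  fs::path tmpPath = finalPath;
  tmpPath += ".tmp";  // e.g. "out.zim" -> "out.zim.tmp" (assumed naming)

  std::error_code ec;
  {
    std::ofstream out(tmpPath, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << data;
    out.close();
    if (!out) {  // write or close failed: drop the partial file
      fs::remove(tmpPath, ec);
      return false;
    }
  }

  fs::rename(tmpPath, finalPath, ec);
  return !ec;
}
```

A reader that only looks for `.zim` files then never picks up an incomplete archive.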
libzim 6.1.2
============
* [CI] Fix codecov configuration
* [Writer] Fix threads synchronization at end of writing process.
libzim 6.1.1
============
* Fix bug around the find function
libzim 6.1.0
============
* Compile now on OpenBSD
* [Test] Use the main function provided by gtest.
* [CI] Move the CI compilation to github actions.
* Add stopwords for 54 new languages.
* [Writer] Improve the way clusters are written at zim creation time.
  - Clusters are directly written in the zim file instead of using temporary
    files.
  - mimetypes are limited to 944 bytes.
* Add a new type of iterator to iterate over articles in a performant way
reducing decompression of clusters. This is now the new default iterator.
* Add support for zim files compressed with zstd compression algorithm.
It is not yet possible to use zstd to create zim files.
libzim 6.0.2
============
* Fix search suggestion parsing.
libzim 6.0.1
============
* Fix crash when trying to open an empty file.
* Ensure that pytest tests are run on the CI.
libzim 6.0.0
============
* [Writer] Index the articles in different threads. This is a huge speed
improvement as the main thread is not blocked by indexing.
* Index the title only if `shouldIndex` returns true.
libzim 5.1.0
============
* Improve indexation of the title.
* Better pertinence of suggestions (only for new zim files)
* Improvement of the speed of Levenshtein distance for suggestions (for old
zims)
libzim 5.0.2
============
* Improve README.
* Remove gtest as embedded subproject.
* Better lzma compression.
* Better performance of the Levenshtein algorithm (better suggestions
performance)
libzim 5.0.1
============
* Update README.
* [Writer] Add debug information (print progress of the clusters writing).
* [Writer] Correctly print the url to the user.
* [CI] Add code coverage.
libzim 5.0.0
============
* Fix thread slipping for win32 crosscompilation.
* Fix a potential invalid access when reading dirent.
* Fix memory leak in the decompression algorithm.
* [Writer] Fix a memory leak (cluster cleaning)
* [Writer] Write article data in a temporary cluster file instead of a
temporary file per article.
* [Writer] Better algorithm to store the dirent while creating the zim
file. Better memory usage.
* [Writer] [API Change] Url/Ns are now handled using the same struct Url.
* [Writer] [API Change] No more aid and redirectAid. A redirectArticle
has to implement redirectUrl.
* [Writer] Use a memory pool to avoid multiple small memory allocations.
* [Writer] [API Change] Rename `ZimCreator` to `Creator`.
* [API Change] File's `search` and `suggestions` now return a unique_ptr
instead of a raw pointer.
libzim 4.0.7
============
* Build libzim without rpath.
libzim 4.0.6
============
* Support zim files created with clusters not written sequentially.
* Remove a meson warning.
libzim 4.0.5
============
* Store the xapian database in the right url.
* Do not fail when reading very small zim file (<256b).
* Do not print message on normal behavior.
* [BUILDSYSTEM] Be able to build a dynamic lib (libzim.so) but using static
dependencies.
* [CI] Use last version of meson.
* [CI] Use the new deps archive xz
libzim 4.0.4
============
* Fix opening of multi-part zim.
* Fix conversion of path to wpath on Windows.
libzim 4.0.3
============
* Implement low level file manipulation using different backends
libzim 4.0.2
============
* [Windows] Fix opening of zim file bigger than 4GiB
libzim 4.0.1
============
* [Writer] Fix wrong redirection log message
* Make libzim compile natively on windows using MSVC
* Better message when failing to read a zim file.
* Make libzim on windows correctly open unicode path.
* Add compilation option to use less memory (but more I/O).
Useful on low memory devices (android)
* Small fixes
libzim 4.0.0
============
* [Writer] Remove a lot of memory copy.
* [Writer] Add xapian indexing directly in libzim.
* [Writer] Better API.
* [Writer] Use multi-threading to write clusters.
* [Writer] Ensure mimetype of articles is not null.
* Extend test timeout for cluster's test.
* Less memory copy for cluster's test.
* Allow skipping tests using a lot of memory using env variable
`SKIP_BIG_MEMORY_TEST=1`
* Explicitly use the icu namespace to allow using of packaged icu lib.
* Use a temporary file name as long as the ZIM writing process is
not finished (#163)
* [Travis] Do not compile using gcc-5 (but the default trusty's one 4.8)
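The `SKIP_BIG_MEMORY_TEST=1` switch listed above could be read in a test along these lines. This is a sketch, not the libzim test code: the helpers `isFlagSet` and `shouldSkipBigMemoryTest` are hypothetical, and treating only the exact value `1` as "enabled" is an assumption.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: interprets the raw value of an environment variable.
// Assumption: only the exact value "1" enables the flag.
inline bool isFlagSet(const char* value)
{
  return value != nullptr && std::strcmp(value, "1") == 0;
}

// Hypothetical helper: true when the memory-hungry tests should be skipped.
inline bool shouldSkipBigMemoryTest()
{
  return isFlagSet(std::getenv("SKIP_BIG_MEMORY_TEST"));
}
```

A test body would then return early (or mark itself as skipped) when `shouldSkipBigMemoryTest()` is true.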
libzim 3.3.0
============
* Fix handling of big clusters (>4GiB) on 32-bit architectures. This is mainly
done by:
  * Do not mmap the whole cluster by default.
  * MMap only the memory associated with an article.
  * If an article is > 4GiB, the blob associated with it is invalid
    (data==size==0).
  * Other information is still valid (directAccessInformation, ...)
* Fix writing of extended cluster in writer.
* Compile libzim on macos.
* Build libzim setting RPATH.
* Search result urls are now what is stored in the zim file. They should not
start with a `/`. This is a revert of the change made in last release.
(See kiwix/kiwix-lib#123)
* Spelling corrections in README.
libzim 3.2.0
============
* Support geo query if the xapian database has indexed localisation.
* Handle articles bigger than 4GiB in the zim file (#110).
* Use AND operator between search terms.
* Fix compilation with recent clang (#95).
* Add method to get article's data localisation in the zim file.
* Be able to get only a part of article (#77).
* Do not crash if we cannot open the xapian Database for some reason.
(kiwix/kiwix-tools#153)
* Do not assume there is always a checksum in the zim file.
(kiwix/kiwix-tools#150)
* Try to do some sanity checks when opening a zim file.
* Use pytest to do some tests (when cython is available).
* Use levenshtein distance to sort and have better suggestion results.
* Search result urls are now always absolute (start with a '/').
(kiwix/kiwix-lib#110)
* Open the file readonly when checking the zim file (and so be able to check
read-only files).
* Accept absolute url starting with '/' when searching for article.
* Fix various bugs
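As an illustration of the Levenshtein-based ordering of suggestions mentioned in the list above, here is a minimal sketch. It is not the libzim implementation: the function names are hypothetical, the comparison is byte-wise (not Unicode-aware), and sorting by raw distance to the query is an assumption about how the distance is used.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Classic two-row dynamic-programming Levenshtein distance (byte-wise).
inline std::size_t levenshtein(const std::string& a, const std::string& b)
{
  std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;

  for (std::size_t i = 1; i <= a.size(); ++i) {
    curr[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
      curr[j] = std::min({prev[j] + 1,          // deletion
                          curr[j - 1] + 1,      // insertion
                          prev[j - 1] + cost});  // substitution or match
    }
    std::swap(prev, curr);
  }
  return prev[b.size()];
}

// Hypothetical helper: orders candidate titles by closeness to the query,
// keeping the original order among equally distant titles.
inline void sortByDistance(std::vector<std::string>& titles,
                           const std::string& query)
{
  std::stable_sort(titles.begin(), titles.end(),
                   [&](const std::string& x, const std::string& y) {
                     return levenshtein(query, x) < levenshtein(query, y);
                   });
}
```

For the query `Paris`, the titles `Paris, Texas`, `Pariz` and `Paris` would be reordered as `Paris`, `Pariz`, `Paris, Texas`.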
libzim 3.1.0
============
* Lzma is not an optional dependency anymore.
* Better handle (report and not crash) invalid zim file.
* Embed source of gtest (used only if gtest is not available on the system)
* Move zimDump tools out of libzim repository to zim-tools
* ZimCreator tools don't read the command line to set options.
libzim 3.0.0
============
This is a major change to libzim.
Expect a lot of new improvements and API changes.
* Add a suggestion mode to the search
* Fix licensing issues
* Fix wrong stemming of the query when searching
* Deactivate searching (and so avoid crashing) in the embedded database if the zim is
split
* Rewrite the low level memory management of libzim when reading a zim file:
  * We use a buffer-based entity to handle memory and file reading instead of
    reading the file using a stream.
  * MMap the memory when possible to avoid memory copy.
* Use const when possible (API break)
* Move to googletest instead of cxxtools for unit-tests.
* Fix endianness bug on arm.
* Do not install private headers. Those headers declare private structure and
should not be visible (API break)
* Compile libzim with `-Werror` and `-Wall` options.
* Make libzim thread safe for reading article.
The search part is not thread safe, and all search operation must be
protected by a lock.
* Add method to get only a part of an article.
* Move some tools to zim-tools repository.
libzim 2.0.0
============
* Move to meson build system
`libzim` now uses `meson` as its build system instead of `autotools`.
* Move to C++11 standard.
* Fulltext search in zim file.
We have integrated the xapian fulltext search in libzim.
So now, libzim provides an API to search in a zim containing an embedded fulltext
index. This means that:
  * libzim needs xapian as an (optional) dependency (if you want to compile with
    xapian support).
  * The old and unused search API has been removed.
* Remove bzip2 support.
* Remove Symbian support.
* A few API changes:
  * Make some header files private (not installed);
  * A `Blob` can now be cast to a `string` directly;
  * Change a lot of `File` methods to const methods.