Elasticsearch

The search_elastic app adds full-text search for files stored in ownCloud. It requires an Elasticsearch server and can index all file types supported by Apache Tika, e.g., plain text, .docx, .xlsx, .pptx, .odt, .ods, and .pdf files. The source code is available on GitHub. For more information, please read the documentation at https://doc.owncloud.com.

Maintainers

Jörn Friedrich Dreyer

Todo

update elastica lib
restore compatibility with oc8.2
restore compatibility with elasticsearch (requires new indexes)
store groups and users with access to filter search results by group membership
store fileid instead of filenames so we don't have to handle renames
use instanceid to set up index - allows using the same elasticsearch instance for multiple oc instances
store the filename to allow faster search in shared files
- index files and folders
store tags?
store image / video dimensions?
sharing a file immediately after it has been uploaded throws an exception
- fix exception / do not try to update a nonexistent document
- get all users and groups when initially indexing the document
move share updates to background job -> eventually searchable
- descend subdirs when updating
- check permissions again on search and remove results if no longer accessible
  - compensate for removed entries in search results, too many will confuse the paging logic
index in batches (make batch size configurable, 0 = unlimited)
- CLI cron.php executes all jobs
  - limit number of files to 250? per job?
add occ commands
- index all files or only those of a specific user
- enable / disable automatic background scanning via cron
  - admin settings ui for this
check js for result link handling so clicking a result does not do a full page load; there seems to be js in place that already does the file highlighting
- the old filehandler logic does not seem to work, removed it, now using plain link
send code snippets for search_lucene
use file tab
- show index status
- remember index error message in db
check encryption compatibility
- had to jump through a few hoops to get master key working
- not compatible with user individual keys
  - at least index metadata in this case (catch encryption exception and ignore content extraction)
statistics on admin settings page
statistics on personal settings page
cleanup code
port test suite from search_lucene
resolve path for shared files
files with empty content extraction are reindexed indefinitely? e.g., empty text file
more debug logging
wildcard search ... but there is a bug in core js code preventing wildcard search
- partly resolved: * and ? are no longer supported. Instead, we now mimic core by using what Elasticsearch calls a match phrase prefix query
to find out why a node cannot be found by its contents, mark it as "NO CONTENT EXTRACTED"?
how should we handle files in userhome/files_versions/ or userhome/thumbnails/ ... currently a 'vanished' message will be logged ... annoying
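
The wildcard item above mentions a match phrase prefix query. As a rough illustration of what such a request body might look like, here is a minimal Python sketch. The helper name `build_prefix_query`, the field name `file.content`, and the default page size are assumptions made for this example and are not taken from the app's code; only the `match_phrase_prefix`, `from`, and `size` keys come from the Elasticsearch query DSL.

```python
import json


def build_prefix_query(term, field="file.content", size=30, offset=0):
    """Build a search request body using a match phrase prefix query.

    The last word of `term` is treated as a prefix, so "quarterly rep"
    can match "quarterly report". The field name is hypothetical.
    """
    return {
        "from": offset,  # number of hits to skip (paging)
        "size": size,    # maximum number of hits to return
        "query": {
            "match_phrase_prefix": {
                field: {"query": term},
            },
        },
    }


if __name__ == "__main__":
    # Print the body that would be sent to an index's _search endpoint.
    print(json.dumps(build_prefix_query("quarterly rep"), indent=2))
```

Because the last word is matched as a prefix, users get "search as you type" behavior without entering * or ? themselves, which fits the note that those wildcard characters are no longer supported.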

