Improve documentation #15

Merged
merged 2 commits into from
May 31, 2022
40 changes: 0 additions & 40 deletions deployment/ansible/run_searchengine_index_cache_services.yml

This file was deleted.

24 changes: 24 additions & 0 deletions deployment/ansible/run_searchengine_index_services.yml
@@ -0,0 +1,24 @@
# Issue: set up the IP address inside the pg_hba config file for postgres to accept the connection from it
- name: Deploying search engine cache and indexing
  connection: local
  hosts: local
  vars_files:
    - searchengine_vars.yml
  tasks:

    - name: Get data from the postgres database and insert it into the Elasticsearch index using the searchengine docker image
      become: yes
      docker_container:
        image: "{{ searchengine_docker_image }}"
        name: searchengine_index
        cleanup: True
        auto_remove: yes
        command: "get_index_data_from_database"
        networks:
          - name: searchengine-net
            ipv4_address: 10.11.0.11
        published_ports:
          - "5577:5577"
        state: started
        volumes:
          - "{{ apps_folder }}/searchengine/searchengine/:/etc/searchengine/"
4 changes: 2 additions & 2 deletions deployment/ansible/searchengine_vars.yml
@@ -8,8 +8,8 @@ database_user_password: pass1234
cache_rows: 10000
#searchenginecache_folder: /data/searchengine/searchengine/cacheddata/
search_engineelasticsearch_docker_image: docker.elastic.co/elasticsearch/elasticsearch:7.16.2
searchengine_docker_image: openmicroscopy/omero-searchengine:latest
searchengineclient_docker_image: openmicroscopy/omero-searchengineclient:latest
searchengine_docker_image: searchengine
searchengineclient_docker_image: searchengineclient
ansible_python_interpreter: path/to/bin/python
searchengine_cache: searchengine_cache
searchengine_index: searchengine_index
41 changes: 20 additions & 21 deletions docs/configuration/configuration_installtion.rst
@@ -10,9 +10,11 @@ The application should have the access attributes (e.g, URL, username, password,
* DATABASE_USER
* DATABASE_PASSWORD
* DATABASE_NAME
* CASH_FOLDER
* ELASTICSEARCH__URL
* PAGE_SIZE
* Although the user can edit this file to set the values, there are some methods inside manage.py which can help to set the configuration, e.g.:
* set_database_configuration
* set_elasticsearch_configuration

* When the app runs for the first time, it will look for the application configuration file.
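
As a quick sanity check, the configuration file can be read with a few lines of Python. This is a minimal sketch only: it assumes the file is plain YAML and uses the .app_config.yml name shown in the Docker examples later on this page, so the path should be adjusted for the actual installation::

    import os
    import yaml  # PyYAML

    # File name and location are assumptions taken from the Docker examples on this page.
    CONFIG_PATH = os.path.expanduser("~/.app_config.yml")

    # Attribute names as listed above.
    EXPECTED_KEYS = [
        "DATABASE_USER",
        "DATABASE_PASSWORD",
        "DATABASE_NAME",
        "CASH_FOLDER",
        "ELASTICSEARCH__URL",
        "PAGE_SIZE",
    ]

    with open(CONFIG_PATH) as f:
        config = yaml.safe_load(f) or {}

    missing = [key for key in EXPECTED_KEYS if key not in config]
    print("Missing configuration keys:", ", ".join(missing) if missing else "none")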

@@ -26,40 +28,38 @@

There is a need to create the Elasticsearch indices and insert the data into them to be able to use the application.

* There is another method inside manage.py (get_index_data_from_database) to allow indexing automatically from the app.
* There is a method inside manage.py (get_index_data_from_database) to allow indexing automatically from the app.

* Another method to index the data by
* Another method to index the data by:
* The data is extracted from the IDR/OMERO database using some SQL queries and saved to CSV files ({path/to/project}/omero_search_engine/search_engine/cache_functions/elasticsearch/sql_to_csv.py)
* The image index data is generated in one big file, so it is recommended to split it into several files to facilitate processing the data and inserting it into the index. On Linux, users can use the split command to divide the file, for example (a Python alternative is sketched after this list):
* split -l 2600000 images.csv
* create_index: Creates the Elasticsearch indices; it can be used to create a single index or all the indices; the default is to create all of them.
* The index templates are saved in this script ({path/to/project}/omero_search_engine/search_engine/cache_functions/elasticsearch/elasticsearch_templates.py)
* add_resource_data_to_es_index: Inserts the data into the Elasticsearch index; the data can be in a single file (CSV format) or multiple files.
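
If the Linux split command is not available, the large CSV file can be divided with a short Python sketch along the same lines. The chunk size mirrors the split example above; the output file names are only illustrative, and unlike the plain split command this version repeats the header row in every part::

    import csv

    SOURCE = "images.csv"
    ROWS_PER_FILE = 2600000  # mirrors the split -l example above

    with open(SOURCE, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        part, rows, out, writer = 0, 0, None, None
        for row in reader:
            if writer is None or rows == ROWS_PER_FILE:
                # Close the previous part and start a new one with the header.
                if out:
                    out.close()
                part += 1
                rows = 0
                out = open("images_part_%d.csv" % part, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
            writer.writerow(row)
            rows += 1
        if out:
            out.close()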

* There are some utility functions inside the manage.py script to build HDF5 cache files.
* These files contain the available key and value pairs inside the database.
* The user builds them using a direct connection to the PostgreSQL database server.
* The cached data is available to the user through URLs, as described in the user manual (see the sketch below).
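
The exact layout of these cache files is not described here. Purely as an illustration, and assuming they are standard HDF5 files, their contents could be inspected with a few lines of Python (the file name below is hypothetical)::

    import h5py  # assumes the cache files are standard HDF5

    # Hypothetical file name; the real files are created by the manage.py utilities.
    with h5py.File("image_keyvalue_cache.h5", "r") as cache:
        # Print every group/dataset stored in the file.
        cache.visititems(lambda name, obj: print(name, type(obj).__name__))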

Application installation using docker:
======================================
Ubuntu and Centos7 images are provided
* The user should pull the image from:
* The user may build the docker image using the following command:

* Ubuntu: [imageurl]
* Centos: [imageurl]
* docker build . -f deployment/docker/centos/Dockerfile -t searchengine

* The user should first pull the image and then run using a command docker run and then the image name.
* The image runs on port 5569 so mapping this port is required to expose the port to the host machine
* Also, folders (i.e. /etc/searchengine) and user home folder ($HOME) should be mapped to folder inside the the host machine.
* Alternatively, the user can pull the openmicroscopy docker image by using the following command::
* docker pull openmicroscopy/omero-searchengine:latest

* The image runs on port 5577 so mapping this port is required to expose the port to the host machine
* Also, folders (i.e. /etc/searchengine) and the local data folder (e.g. the user home folder) should be mapped to folders inside the host machine.
* It will be used to save the configuration file so the user can configure their instance.
* In addition, it will be used to save the log files and other cached data.

* Example of running the docker run command for Centos image: which maps the etc/searchengine to the user home folder to save the log files, in addition, to mapping the application configuration file
* docker run --rm -p 5569:5569 v /home/kmohamed001/.app_config.yml:/opt/app-root/src/.app_config.yml -v $HOME/:/etc/searchengine/ searchengine
* docker run --rm -p 5577:5577 -d -v $HOME/:/etc/searchengine/ searchengine
* This is an example of a Docker command to run indexing and re-indexing:
* docker run -d --name searchengine_2 -v $HOME/:/etc/searchengine/ -v $HOME/:/opt/app-root/src/logs/ --network=searchengine-net searchengine get_index_data_from_database
* The user can call any method inside manage.py by adding the method name at the end of the run command, e.g.:
* docker run --rm -p 5569:5569 v /home/kmohamed001/.app_config.yml:/opt/app-root/src/.app_config.yml -v $HOME/:/etc/searchengine/ searchengine show_saved_indices

* docker run --rm -p 5577:5577 -v $HOME/:/etc/searchengine/ searchengine show_saved_indices

Searchengine installation and configuration using Ansible:
==========================================================
@@ -69,16 +69,15 @@ There is an ansible playbook (management-searchengine.yml) that has been written
* It will configure and create the required folders
* It will configure the three apps and run them
* There is a variables file (searchengine_vars.yml) that the user needs to edit before running the playbook
* The variable names are self-explained
* The variable names are self-explanatory and should be customized for the host machine
* To check that the apps have been installed and run, the user can use wget or curl to call:
* for searchengine, http://127.0.0.1:5556/api/v2/resources/
* for searchengine, http://127.0.0.1:5556/api/v1/resources/
* for searchengine client, http://127.0.0.1:5556
* for Elasticsearch, http://127.0.0.1:9201
* After deploying the apps using the playbook, it is needed to run another playbook for indexing:
* After deploying the apps, it is needed to run another playbook for indexing:
* run_searchengine_index_services.yml
* If the PostgreSQL database server is located on the same machine which hosts the searchengine, it is needed to:
* Edit pg_hba.conf file (one of the postgresql configuration files) and add two client ips (i.e. 10.11.0.10 and 10.11.0.11)
* Edit the pg_hba.conf file (one of the PostgreSQL configuration files) and add the client IP (i.e. 10.11.0.11)
* Reload the configuration so that PostgreSQL accepts the connection from the indexing and caching services.
* As the caching and indexing processes take a long time, there are two other playbooks that enable the user to check whether they have finished:
* check_indexing_service.yml
* check_caching_service.yml
18 changes: 9 additions & 9 deletions readme.rst
@@ -1,6 +1,6 @@
OMERO Search Engine
--------------------
* OMERO search engine app is used to search metadata (key-value pairs)
* OMERO search engine app is used to search metadata (key-value pairs)

* The search engine query is a dict that has three parts:

@@ -13,11 +13,12 @@

* The second part of the query is or_filters; it contains alternatives for searching the database; it answers a question like finding the images which satisfy one or more of the conditions inside this list. It is also a list of dicts and has the same format as the dicts inside and_filters.

* The third part is the main_attributes, it allows the user to search using one or more of project _id, dataset_id, owner_id, group_id, owner_id, etc. It supports two operators, equals and not_equals. Hence, it is possible to search one project instead of all the projects, also it is possible to search the results which belong to a specific user or a group.
* The third part is the main_attributes; it allows the user to search using one or more of project_id, dataset_id, group_id, owner_id, etc. It supports two operators, equals and not_equals. Hence, it is possible to search one project instead of all the projects; it is also possible to restrict the results to those which belong to a specific user or a group (see the sketch below).
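
As an illustration, a query dict with the three parts described above can be posted to a local instance with the Python requests library. This is a sketch only: the port and API prefix follow the examples later in this document, the searchannotation endpoint name is inferred from the searchannotation_page route shown at the end of this page, the key/value pairs are placeholders, and the exact nesting and operator spellings should be checked against the user manual::

    import requests

    base_url = "http://127.0.0.1:5577/api/v1/"
    # Endpoint name inferred from the searchannotation_page route; verify before use.
    search_url = base_url + "resources/image/searchannotation/"

    query = {
        # and_filters: conditions that must all be satisfied.
        "and_filters": [
            {"name": "Organism", "value": "Homo sapiens", "operator": "equals"},
        ],
        # or_filters: alternatives; at least one should be satisfied.
        "or_filters": [
            {"name": "Organism", "value": "Mus musculus", "operator": "equals"},
        ],
        # main_attributes: e.g. restrict the search to a single project or owner.
        "main_attributes": [
            {"name": "project_id", "value": 101, "operator": "equals"},
        ],
    }

    response = requests.post(search_url, json=query)
    response.raise_for_status()
    results = response.json()
    print(results.get("notice"), results.get("server_query_time"))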

* The search engine returns the results in a JSON which has the following keys:

* 'notice': A message to report an error or a message to the sender.
* 'notice': reports a message to the sender, which may include an error message.
* 'Error': a specific error message
* 'query_details': The submitted query.
* 'resource': The resource, e.g. image
* 'server_query_time': The server query times in seconds
@@ -30,11 +31,11 @@ OMERO Search Engine

* It is possible to query the search engine to get all the available resources (e.g. image) and their keys (names) using the following URL:

* 127.0.0.01:5556/api/v1/resources/all/keys
* 127.0.0.01:5577/api/v1/resources/all/keys

* The user can get the available values for a specific key for a resource, e.g. what are the available values for Organism:

* http://127.0.0.1:5556/api/v1/resources/image/getannotationvalueskey/?key=Organism
* http://127.0.0.1:5577/api/v1/resources/image/getannotationvalueskey/?key=Organism
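
The same calls can be made from Python; a minimal sketch using the requests library, with the host, port and paths taken from the URLs above::

    import requests

    base_url = "http://127.0.0.1:5577/api/v1/"

    # All available resources (e.g. image) and their keys (names).
    keys = requests.get(base_url + "resources/all/keys").json()

    # Available values for one key of one resource, e.g. Organism for images.
    values = requests.get(
        base_url + "resources/image/getannotationvalueskey/",
        params={"key": "Organism"},
    ).json()

    print(keys)
    print(values)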

* The following python script sends a query to the search engine and gets the results

@@ -49,7 +50,7 @@ OMERO Search Engine
# url to get the next page for a query, bookmark is needed
image_page_ext = "/resources/image/searchannotation_page/"
# search engine url
base_url = "http://idr-testing.openmicroscopy.org/searchengineapi/api/v1/"
base_url = "http://127.0.0.1:5577/api/v1/"

import sys

@@ -142,7 +143,6 @@ OMERO Search Engine
* It is used to build the query
* It will display the results when they are ready


* The app uses Elasticsearch
* There is a method inside manage.py (create_index) to create a separate index for image, project, dataset, screen, plate and well using two templates:
* image template (image_template) for the image index. It is derived from several OMERO tables (image, annotation_mapvalue, imageannotationlink, project, dataset, well, plate and screen) to generate a single Elasticsearch index.
@@ -153,7 +153,7 @@
* There is a method inside the manage.py script (add_resource_data_to_es_index) that reads the CSV files and inserts the data into the Elasticsearch index.
* I am investigating automatic updates of the Elasticsearch data in case the data inside the PostgreSQL database has been changed.

* The data can be transferred directly from the Omero database to the Elasticsearch using a method inside manage.py (get_index_data_from_database):
* The data can be transferred directly from the OMERO database to the Elasticsearch using a method inside manage.py (get_index_data_from_database):
* It creates the Elasticsearch indices for each resource
* It queries the OMERO database; after receiving the data, it processes and pushes it to the Elasticsearch indices.
* This process takes a relatively long time, depending on the hosting machine specs. The user can adjust how many rows are processed in one call to the OMERO database:
@@ -164,4 +164,4 @@
* There is a method inside the manage.py script (add_resource_data_to_es_index) which reads the CSV files and inserts the data into the Elasticsearch index.
* I am investigating automatic updates of the Elasticsearch data in case the data inside the PostgreSQL database has been changed.

For the configuration and installation instructions, please read the following document doc/configuration/configuration_installtion.rs
For the configuration and installation instructions, please read the following document doc/configuration/configuration_installtion.rst
3 changes: 1 addition & 2 deletions search_engine/api/v1/resources/urls.py
@@ -8,7 +8,7 @@

@resources.route('/',methods=['GET'])
def index():
return "Omero search engine (API V1)"
return "OMERO search engine (API V1)"

@resources.route('/<resource_table>/searchannotation_page/',methods=['POST'])
def search_resource_page(resource_table):
@@ -185,4 +185,3 @@ def search(resource_table):
from search_engine.api.v1.resources.query_handler import simple_search
results=simple_search(key, value, operator,case_sensitive,bookmark, resource_table)
return jsonify(results)