Install Guide: various improvements #2884
- Glassfish init script #2640
- Solr init script #2401
- more on blocking endpoints #976
- documented :SystemEmail #2663
- a bit on admin and monitoring
- reinstalling fresh
pdurbin committed Jan 28, 2016
1 parent 585f11d commit bdbe607
Showing 6 changed files with 212 additions and 53 deletions.
@@ -0,0 +1,44 @@
#! /bin/sh
# chkconfig: 2345 99 01
# description: GlassFish App Server

set -e

ASADMIN=/usr/local/glassfish4/bin/asadmin

case "$1" in
start)
echo -n "Starting GlassFish server: glassfish"
# Increase file descriptor limit:
ulimit -n 32768
# Allow "memory overcommit":
# (basically, this allows to run exec() calls from inside the
# app, without the Unix fork() call physically hogging 2X
# the amount of memory glassfish is already using)
echo 1 > /proc/sys/vm/overcommit_memory

#echo
#echo "GLASSFISH IS UNDER MAINTENANCE;"
#echo "PLEASE DO NOT USE service init script."
#echo
LANG=en_US.UTF-8; export LANG
$ASADMIN start-domain domain1
echo "."
;;
stop)
echo -n "Stopping GlassFish server: glassfish"
#echo
#echo "GLASSFISH IS UNDER MAINTENANCE;"
#echo "PLEASE DO NOT USE service init script."
#echo

$ASADMIN stop-domain domain1
echo "."
;;

*)
echo "Usage: /etc/init.d/glassfish {start|stop}"
exit 1
esac

exit 0
@@ -0,0 +1,35 @@
****
#!/bin/sh

# Starts, stops, and restarts Apache Solr.
#
# chkconfig: 35 92 08
# description: Starts and stops Apache Solr

SOLR_DIR="/usr/local/solr-4.6.0/example"
JAVA_OPTIONS="-Xmx1024m -DSTOP.PORT=8079 -DSTOP.KEY=mustard -jar start.jar"
LOG_FILE="/var/log/solr.log"
JAVA="/usr/bin/java"

case $1 in
start)
echo "Starting Solr"
cd $SOLR_DIR
$JAVA $JAVA_OPTIONS 2> $LOG_FILE &
;;
stop)
echo "Stopping Solr"
cd $SOLR_DIR
$JAVA $JAVA_OPTIONS --stop
;;
restart)
$0 stop
sleep 1
$0 start
;;
*)
echo "Usage: $0 {start|stop|restart}" >&2
exit 1
;;
esac

23 changes: 21 additions & 2 deletions doc/sphinx-guides/source/installation/administration.rst
@@ -49,18 +49,37 @@ If indexing stops, this command should pick up where it left off based on which

``curl http://localhost:8080/api/admin/index/continue``

Glassfish
---------

``server.log`` is the main place to look when you encounter problems. Hopefully an error message has been logged. If there's a stack trace, it may be of interest to developers, especially if they can trace line numbers back to a tagged version.

For debugging purposes, you may find it helpful to increase logging levels as mentioned in the :doc:`/developers/debugging` section of the Developer Guide.
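
As a quick first pass over ``server.log``, the sketch below prints the most recent lines that look like errors. The log path assumes the default ``domain1`` under ``/usr/local/glassfish4`` used elsewhere in this guide, and the ``SEVERE|Exception`` pattern is only a guess at what is worth reading first; the helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: show the last N log lines that look like errors.
# Assumed default location of server.log for domain1; adjust for your install.
SERVER_LOG="${SERVER_LOG:-/usr/local/glassfish4/glassfish/domains/domain1/logs/server.log}"

recent_errors() {
    # $1 = log file, $2 = how many matching lines to show (default 20)
    # grep -n prefixes each match with its line number in the log.
    grep -n -E 'SEVERE|Exception' "$1" | tail -n "${2:-20}"
}

# Usage: recent_errors "$SERVER_LOG" 50
```

The line numbers in the output make it easier to jump to the surrounding stack trace in the full log.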

This guide has focused on using the command line to manage Glassfish, but you might also be interested in the admin GUI at http://localhost:4848

Monitoring
----------

In production you'll want to monitor the usual suspects such as CPU, memory, free disk space, etc.
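
For free disk space, a minimal sketch of a check that could be run from cron is shown here. The function names and the 90% default threshold are assumptions for illustration, not part of Dataverse.

```shell
#!/bin/sh
# Hypothetical check: warn when a filesystem is fuller than a threshold.
disk_usage_pct() {
    # Print the capacity column for the filesystem holding $1, without the % sign.
    # df -P forces the POSIX one-line-per-filesystem layout.
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

check_disk() {
    # $1 = path, $2 = maximum acceptable usage in percent (default 90)
    used=$(disk_usage_pct "$1")
    if [ "$used" -gt "${2:-90}" ]; then
        echo "WARNING: $1 is ${used}% full" >&2
        return 1
    fi
    echo "OK: $1 is ${used}% full"
}

# Usage: check_disk /usr/local/glassfish4 85
```

A non-zero exit status makes the function easy to hook into whatever alerting you already use.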

https://github.com/IQSS/dataverse/issues/2595 contains some information on enabling monitoring of Glassfish, which is disabled by default.

There is a database table called ``actionlogrecord`` that captures events that may be of interest. See https://github.com/IQSS/dataverse/issues/2729 for more discussion around this table.

User Administration
-------------------

There isn't much in the way of user administration tools built in to Dataverse.

Deleting an API Token
+++++++++++++++++++++

If an API token is compromised it should be deleted. Users can generate a new one for themselves, but someone with access to the database can delete tokens as well.
If an API token is compromised it should be deleted. Users can generate a new one for themselves as explained in the :doc:`/user/account` section of the User Guide, but you may want to preemptively delete tokens from the database.

Using the API token 7ae33670-be21-491d-a244-008149856437 as an example:

``delete from apitoken where tokenstring = '7ae33670-be21-491d-a244-008149856437';``

You should expect the output ``DELETE 1`` after issuing the command above.
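
If tokens need to be deleted more than once, the statement above can be wrapped in a small helper. This is a sketch that assumes tokens are UUID-shaped like the example; the function name is hypothetical, and the ``dvnapp`` user and ``dvndb`` database in the usage comment are the names used in the installation section and may differ on your system.

```shell
#!/bin/sh
# Hypothetical wrapper around the DELETE statement shown above.
# It only prints the SQL; pipe it to psql once you are satisfied with it.
delete_token_sql() {
    token=$1
    # Accept only UUID-shaped tokens so nothing unexpected reaches the database.
    if ! printf '%s\n' "$token" | grep -Eq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
        echo "not a UUID-shaped token: $token" >&2
        return 1
    fi
    printf "delete from apitoken where tokenstring = '%s';\n" "$token"
}

# Usage (assumed user and database names):
#   delete_token_sql 7ae33670-be21-491d-a244-008149856437 | psql -U dvnapp dvndb
```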

After the API token has been deleted, users can generate a new one per :doc:`/user/account`.
24 changes: 21 additions & 3 deletions doc/sphinx-guides/source/installation/config.rst
@@ -16,7 +16,9 @@ Securing Your Installation
Blocking API Endpoints
++++++++++++++++++++++

The :doc:`/api/native-api` contains a useful but potentially dangerous API endpoint called "admin" that allows you to change system settings, make ordinary users into superusers, and more. In addition, there is a "test" API used for development and troubleshooting. By default, both of these APIs can be operated on remotely and without the need for any authentication. https://github.com/IQSS/dataverse/issues/1886 was opened to explore changing these defaults, but until then it is very important to block both the "admin" and "test" endpoint with a script called ``post-install-api-block.sh`` available from https://github.com/IQSS/dataverse/blob/develop/scripts/api/post-install-api-block.sh . See also the section on ``:BlockedApiPolicy`` below.
The :doc:`/api/native-api` contains a useful but potentially dangerous API endpoint called "admin" that allows you to change system settings, make ordinary users into superusers, and more. There is a "test" API endpoint used for development and troubleshooting that has some potentially dangerous methods. The ``builtin-users`` endpoint lets people create a local/builtin user account if they know the ``BuiltinUsers.KEY`` value described below.

By default, all APIs can be operated on remotely and without the need for any authentication. https://github.com/IQSS/dataverse/issues/1886 was opened to explore changing these defaults, but until then it is very important to block both the "admin" and "test" endpoints (and at least consider blocking ``builtin-users``). For details, see the section on ``:BlockedApiPolicy`` below.

Forcing HTTPS
+++++++++++++
@@ -208,9 +210,11 @@ Out of the box, all API endpoints are completely open as mentioned in the sectio
:BlockedApiEndpoints
++++++++++++++++++++

A comma separated list of API endpoints to be blocked. For a production installation, "admin" and "test" should be blocked, as mentioned in the section on security above.
A comma-separated list of API endpoints to be blocked. For a production installation, "admin" and "test" should be blocked (and perhaps "builtin-users" as well), as mentioned in the section on security above:

``curl -X PUT -d "admin,test,builtin-users" http://localhost:8080/api/admin/settings/:BlockedApiEndpoints``

``curl -X PUT -d "admin,test" http://localhost:8080/api/admin/settings/:BlockedApiEndpoints``
See the :doc:`/api/index` for a list of API endpoints.
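
Because the setting is a single comma-separated string, it can be convenient to build it from a list of endpoint names. The helper below is a hypothetical sketch that only prints the resulting ``curl`` command rather than running it; ``SERVER`` is an assumed variable defaulting to the URL used in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: build the :BlockedApiEndpoints curl command from a list
# of endpoint names. It prints the command instead of running it.
SERVER="${SERVER:-http://localhost:8080}"

blocked_endpoints_cmd() {
    # Join the arguments with commas: admin test -> admin,test
    list=$(IFS=,; printf '%s' "$*")
    printf 'curl -X PUT -d "%s" %s/api/admin/settings/:BlockedApiEndpoints\n' "$list" "$SERVER"
}

# Usage: blocked_endpoints_cmd admin test builtin-users
```

Printing first gives you a chance to review the list before changing a security-relevant setting.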

:BlockedApiKey
++++++++++++++
@@ -219,6 +223,20 @@ Used in conjunction with the ``:BlockedApiPolicy`` being set to ``unblock-key``.

``curl -X PUT -d s3kretKey http://localhost:8080/api/admin/settings/:BlockedApiKey``

BuiltinUsers.KEY
++++++++++++++++

The key required to create users via API as documented at :doc:`/api/native-api`. Unlike other database settings, this one doesn't start with a colon.

``curl -X PUT -d builtInS3kretKey http://localhost:8080/api/admin/settings/BuiltinUsers.KEY``

:SystemEmail
++++++++++++

This is the email address that "system" emails, such as password reset links, are sent from.

``curl -X PUT -d "Support <support@example.edu>" http://localhost:8080/api/admin/settings/:SystemEmail``

:DoiProvider
++++++++++++

46 changes: 46 additions & 0 deletions doc/sphinx-guides/source/installation/installation-main.rst
@@ -89,3 +89,49 @@ UnknownHostException While Deploying
++++++++++++++++++++++++++++++++++++

If you are seeing "Caused by: java.net.UnknownHostException: myhost: Name or service not known" in server.log and your hostname is "myhost" the problem is likely that "myhost" doesn't appear in ``/etc/hosts``. See also http://stackoverflow.com/questions/21817809/glassfish-exception-during-deployment-project-with-stateful-ejb/21850873#21850873

Fresh Reinstall
---------------

Early on when you're installing Dataverse, you may think, "I just want to blow away what I've installed and start over." That's fine. You don't have to uninstall the various components like Glassfish, PostgreSQL and Solr, but you should be conscious of how to clear out their data.

Drop database
+++++++++++++

In order to drop the database, you have to stop Glassfish, which will have open connections. Before you stop Glassfish, you may as well undeploy the war file. First, find the name like this:

``asadmin list-applications``

Then undeploy it like this:

``asadmin undeploy dataverse-VERSION``

Stop Glassfish with the init script provided in the :doc:`prerequisites` section or just use:

``asadmin stop-domain``

With Glassfish down, you should now be able to drop your database and recreate it:

``psql -U dvnapp -c 'DROP DATABASE "dvndb"' template1``

Clear Solr
++++++++++

The database is fresh and new but Solr has stale data in it. Clear it out with this command:

``curl http://localhost:8983/solr/update/json?commit=true -H "Content-type: application/json" -X POST -d "{\"delete\": { \"query\":\"*:*\"}}"``
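
To confirm the index is empty afterwards, one option is to query for everything and read the ``numFound`` value from the JSON response. The sketch below assumes the same single-core URL layout as the delete command and a ``/solr/select`` handler that accepts ``wt=json``; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical check that the index is empty after the delete above.
num_found() {
    # Read a Solr JSON response on stdin and print the numFound value.
    sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Usage (assumed query URL):
#   curl -s "http://localhost:8983/solr/select?q=*:*&rows=0&wt=json" | num_found
```

A result of ``0`` would indicate that no documents remain in the index.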


Deleting uploaded files
+++++++++++++++++++++++

The path below will depend on the value for ``dataverse.files.directory`` as described in the :doc:`config` section:

``rm -rf /usr/local/glassfish4/glassfish/domains/domain1/files``

Rerun Installer
+++++++++++++++

With all the data cleared out, you should be ready to rerun the installer per above.
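
The steps above can be collected into one script. This is a sketch rather than a supported tool: it uses only the commands shown in this section, prints them by default, and runs them only when ``DRY_RUN=0``. The ``run`` and ``fresh_reinstall`` names and the ``APP`` and ``FILES_DIR`` variables are hypothetical, and ``dataverse-VERSION`` is a placeholder for the name reported by ``asadmin list-applications``.

```shell
#!/bin/sh
# Sketch of the "fresh reinstall" steps above, in order. By default it only
# prints each command; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
APP="${APP:-dataverse-VERSION}"   # placeholder; take the real name from list-applications
FILES_DIR="${FILES_DIR:-/usr/local/glassfish4/glassfish/domains/domain1/files}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

fresh_reinstall() {
    # Each step runs only if the previous one succeeded.
    run asadmin undeploy "$APP" &&
    run asadmin stop-domain &&
    run psql -U dvnapp -c 'DROP DATABASE "dvndb"' template1 &&
    run curl "http://localhost:8983/solr/update/json?commit=true" \
        -H "Content-type: application/json" -X POST \
        -d '{"delete": {"query": "*:*"}}' &&
    run rm -rf "$FILES_DIR"
}

# Usage: fresh_reinstall            (prints the commands)
#        DRY_RUN=0 fresh_reinstall  (executes them)
```

Rerunning the installer is deliberately left out so that you can inspect the cleared state first.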

Related to all this is a series of scripts at https://github.com/IQSS/dataverse/blob/develop/scripts/deploy/phoenix.dataverse.org/deploy that Dataverse developers use to have the test server http://phoenix.dataverse.org rise from the ashes before integration tests are run against it. Your mileage may vary. :)
93 changes: 45 additions & 48 deletions doc/sphinx-guides/source/installation/prerequisites.rst
@@ -10,8 +10,12 @@ You **may** find it helpful to look at how the configuration is done automatical

Java
----

Dataverse requires Java 8 (also known as 1.8).
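
A quick way to confirm which Java is on the ``PATH`` is to inspect the banner printed by ``java -version``. The sketch below assumes the banner contains ``version "1.8.`` for Java 8, as Oracle JDK and OpenJDK builds of that release do; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: does a "java -version" banner report Java 8 (1.8)?
is_java8() {
    # $1 = first line of "java -version" output, e.g. java version "1.8.0_66"
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (java prints its version on stderr):
#   is_java8 "$(java -version 2>&1 | head -n 1)" || echo "Java 8 not found" >&2
```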

Installing Java
===============

Dataverse should run fine with only the Java Runtime Environment (JRE) installed, but installing the Java Development Kit (JDK) is recommended so that useful tools for troubleshooting production environments are available. We recommend using Oracle JDK or OpenJDK.

The Oracle JDK can be downloaded from http://www.oracle.com/technetwork/java/javase/downloads/index.html
@@ -35,8 +39,10 @@ Glassfish

Glassfish Version 4.1 is required. There are known issues with Glassfish 4.1.1 as chronicled in https://github.com/IQSS/dataverse/issues/2628 so it should be avoided until that issue is resolved.

**Important**: once Glassfish is installed, a new version of the Weld library (v2.2.10.SP1) must be downloaded and installed. This fixes a serious issue in the library supplied with Glassfish 4.1 ( see https://github.com/IQSS/dataverse/issues/647 for details).
Installing Glassfish
====================

**Important**: once Glassfish is installed, a new version of the Weld library (v2.2.10.SP1) must be downloaded and installed. This fixes a serious issue in the library supplied with Glassfish 4.1 (see https://github.com/IQSS/dataverse/issues/647 for details). Please note that if you plan to front Glassfish with Apache you must also patch Grizzly as explained in the :doc:`shibboleth` section.

- Download and install Glassfish (installed in ``/usr/local/glassfish4`` in the example commands below)::

@@ -55,59 +61,31 @@ Glassfish Version 4.1 is required. There are known issues with Glassfish 4.1.1 a

# /usr/local/glassfish4/bin/asadmin osgi lb | grep 'Weld OSGi Bundle'

The Dataverse installation script will start Glassfish if necessary, but while you're configuring Glassfish, you might find the following init script helpful to have Glassfish start on boot::

set -e
ASADMIN=/usr/local/glassfish4/bin/asadmin
case "$1" in
start)
echo -n "Starting GlassFish server: glassfish"
# Increase file descriptor limit:
ulimit -n 32768
# Allow "memory overcommit":
# (basically, this allows to run exec() calls from inside the
# app, without the Unix fork() call physically hogging 2X
# the amount of memory glassfish is already using)
echo 1 > /proc/sys/vm/overcommit_memory

# Set UTF8 as the default encoding:
LANG=en_US.UTF-8; export LANG
$ASADMIN start-domain domain1
echo "."
;;
stop)
echo -n "Stopping GlassFish server: glassfish"

$ASADMIN stop-domain domain1
echo "."
;;

*)
echo "Usage: /etc/init.d/glassfish {start|stop}"
exit 1
esac
exit 0
Glassfish Init Script
=====================

The Dataverse installation script will start Glassfish if necessary, but while you're configuring Glassfish, you might find the following init script helpful to have Glassfish start on boot.

Adjust `this Glassfish init script <../_static/installation/files/etc/init.d/glassfish>`_ for your needs or write your own.

It is not necessary to have Glassfish running before you execute the Dataverse installation script because it will start Glassfish for you.
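
The provided script carries a ``# chkconfig: 2345 99 01`` header, which ``chkconfig`` reads as the runlevels to start in, followed by the start and stop priorities. The hypothetical helper below just prints those three fields from any init script so you can review them before adjusting the script.

```shell
#!/bin/sh
# Hypothetical helper: explain the "# chkconfig:" header of an init script.
describe_chkconfig() {
    # $1 = path to init script.
    # Header format: runlevels, start priority, stop priority.
    sed -n 's/^# chkconfig:[[:space:]]*//p' "$1" | head -n 1 | {
        read -r levels start stop
        echo "runlevels=$levels start=$start stop=$stop"
    }
}

# Usage: describe_chkconfig /etc/init.d/glassfish
```

On RHEL and similar systems, a script copied to ``/etc/init.d/glassfish`` and made executable can then be registered with ``chkconfig --add glassfish``; the destination path is an assumption based on the usage message in the script.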

PostgreSQL
----------

1. Installation
================
Installing PostgreSQL
=======================

Version 9.x is required. Previous versions have not been tested.

1A. RHEL and similar systems:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The version that ships with RHEL 6 and above is fine::

# yum install postgresql-server
# chkconfig postgresql on
# service postgresql initdb
# service postgresql start

2. Configure access to PostgreSQL for the installer script
==========================================================
Configure Access to PostgreSQL for the Installer Script
=======================================================

- When using localhost for the database server, the installer script needs to have direct access to the local PostgreSQL server via Unix domain sockets. This is configured by the line that starts with ``local all all`` in the pg_hba.conf file. The location of this file may vary depending on the distribution. But if you followed the suggested installation instructions above, it will be ``/var/lib/pgsql/data/pg_hba.conf`` on RHEL and similar. Make sure the line looks like this (it will likely be pre-configured like this already)::

@@ -119,8 +97,8 @@ The version that ships with RHEL 6 and above is fine::

This is a security risk, as it opens your database to anyone with a shell on your server. It does not however compromise remote access to your system. Plus you only need this configuration in place to run the installer. After it's done, you can safely reset it to how it was configured before.
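
Since this setting is meant to be temporary, it helps to record what the ``local all all`` line says before and after running the installer. The sketch below prints the authentication method currently configured on that line; the function name is hypothetical and the path in the usage comment is the RHEL default mentioned above.

```shell
#!/bin/sh
# Hypothetical check: print the auth method on the "local all all" line.
local_auth_method() {
    # $1 = path to pg_hba.conf; commented-out lines and other entries are ignored.
    awk '$1 == "local" && $2 == "all" && $3 == "all" { print $4; exit }' "$1"
}

# Usage: local_auth_method /var/lib/pgsql/data/pg_hba.conf
```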

3. Configure database access for the Dataverse application
==========================================================
Configure Database Access for the Dataverse Application
=======================================================

- The application will be talking to PostgreSQL over TCP/IP, using password authentication. If you are running PostgreSQL on the same server as Glassfish, we strongly recommend that you use the localhost interface to connect to the database. Make sure you accept the default value ``localhost`` when the installer asks you for the PostgreSQL server address. Then find the localhost (127.0.0.1) entry that's already in the ``pg_hba.conf`` and modify it to look like this::

@@ -140,13 +118,24 @@ This is a security risk, as it opens your database to anyone with a shell on you

- **Important: you must restart Postgres** for the configuration changes to take effect! On RHEL and similar (provided you installed Postgres as instructed above)::
# service postgresql-9.3 restart
# service postgresql restart

PostgreSQL Init Script
======================

The standard init script that ships with RHEL 6 and similar should work fine. Enable it with this command::

# chkconfig postgresql on

Solr
----

- Download and Install Solr::
The Dataverse search index is powered by Solr.

Installing Solr
===============

Download and install Solr with these commands::

# wget https://archive.apache.org/dist/lucene/solr/4.6.0/solr-4.6.0.tgz
# tar xvzf solr-4.6.0.tgz
@@ -162,18 +151,26 @@ With the Dataverse-specific schema in place, you can now start Solr::

# java -jar start.jar

The command above will start Solr in the foreground which is good for a quick sanity check that Solr accepted the schema file, but you'll want to put the process in the background by appending `` &`` or by using an init script. The Vagrant environment uses this init script for Solr but your mileage may vary: https://github.com/IQSS/dataverse/blob/develop/conf/vagrant/etc/init.d/solr
Solr Init Script
================

The command above will start Solr in the foreground which is good for a quick sanity check that Solr accepted the schema file, but starting Solr with an init script is recommended. You can attempt to adjust `this Solr init script <../_static/installation/files/etc/init.d/solr>`_ for your needs or write your own.

Solr should be running before the installation script is executed.
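
One way to make sure Solr is up before launching the installer is to retry a probe command until it succeeds. The helper below is a hypothetical sketch; the probe in the usage comment assumes Solr answers HTTP requests on the default port 8983.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or attempts run out.
wait_until() {
    # $1 = command string, $2 = max attempts (default 30), $3 = delay in seconds (default 2)
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if sh -c "$1" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        # Sleep only if another attempt is coming.
        [ "$tries" -gt 0 ] && sleep "${3:-2}"
    done
    return 1
}

# Usage (assumed probe URL):
#   wait_until 'curl -sf http://localhost:8983/solr/' 30 2 || echo "Solr is not up" >&2
```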

Securing Solr
=============

Solr must be firewalled off from all hosts except the server(s) running Dataverse. Otherwise, any host that can reach the Solr port (8983 by default) can add or delete data, search unpublished data, and even reconfigure Solr. For more information, please see https://wiki.apache.org/solr/SolrSecurity

You may want to poke a temporary hole in your firewall to play with the Solr GUI. More information on this can be found in the :doc:`/developers/dev-environment` section of the Developer Guide.

jq
--

Installing jq
=============

``jq`` is a command line tool for parsing JSON output that is used by the Dataverse installation script. https://stedolan.github.io/jq explains various ways of installing it, but a relatively straightforward method is described below. Please note that you must download the 64- or 32-bit version based on your architecture. In the example below, the 64-bit version is installed. We confirm it's executable and in our ``$PATH`` by checking the version (1.4 or higher should be fine)::

# cd /usr/bin