Merge branch 'develop' into 2939-shib #2939
pdurbin committed Mar 17, 2016
2 parents 2859eb5 + b39c957 commit bac238f
Showing 10 changed files with 831 additions and 519 deletions.
23 changes: 20 additions & 3 deletions doc/sphinx-guides/source/installation/config.rst
Original file line number Diff line number Diff line change
@@ -36,7 +36,24 @@ The :doc:`prerequisites` section explained that Dataverse requires a specific So
jetty.xml
+++++++++

Stop Solr and edit ``solr-4.6.0/example/etc/jetty.xml`` to have the following value: ``<Set name="requestHeaderSize">102400</Set>``. Without this higher value in place, it will appear that no data has been added to your dataverse installation and ``WARN org.eclipse.jetty.http.HttpParser – HttpParser Full for /127.0.0.1:8983`` will appear in the Solr log. See also https://support.lucidworks.com/hc/en-us/articles/201424796-Error-when-submitting-large-query-strings-
Stop Solr and edit ``solr-4.6.0/example/etc/jetty.xml`` to add a line having to do with ``requestHeaderSize`` as follows:

.. code-block:: xml

    <Call name="addConnector">
      <Arg>
        <New class="org.eclipse.jetty.server.bio.SocketConnector">
          <Set name="host"><SystemProperty name="jetty.host" /></Set>
          <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
          <Set name="maxIdleTime">50000</Set>
          <Set name="lowResourceMaxIdleTime">1500</Set>
          <Set name="statsOn">false</Set>
          <Set name="requestHeaderSize">102400</Set>
        </New>
      </Arg>
    </Call>

Without this ``requestHeaderSize`` line in place, which increases the default size, it will appear that no data has been added to your Dataverse installation and ``WARN org.eclipse.jetty.http.HttpParser – HttpParser Full for /127.0.0.1:8983`` will appear in the Solr log. See also https://support.lucidworks.com/hc/en-us/articles/201424796-Error-when-submitting-large-query-strings-
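
Before restarting Solr, it may help to confirm that the edited file actually contains the setting. The following is a minimal sketch, not part of the commit: the helper name `check_header_size` is hypothetical, and the path in the example call is the one given in the text above and may differ on your system.

```shell
#!/bin/bash
# Hypothetical helper (not part of Dataverse or Solr): succeeds only if the
# given jetty.xml contains the requestHeaderSize line shown above.
check_header_size() {
  grep -q '<Set name="requestHeaderSize">102400</Set>' "$1"
}

# Example call, using the path from the guide (adjust to your Solr location):
# check_header_size solr-4.6.0/example/etc/jetty.xml && echo "requestHeaderSize is set"
```

The check only matches the exact value 102400; if you chose a different size, adjust the pattern accordingly.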

Network Ports
-------------
@@ -112,15 +129,15 @@ The password reset feature requires ``dataverse.fqdn`` to be configured.
dataverse.siteUrl
+++++++++++++++++

| and specify the alternative protocol and port number.
| and specify the protocol and port number you would prefer to be used to advertise the URL for your Dataverse.
| For example, configured in domain.xml:
| ``<jvm-options>-Ddataverse.fqdn=dataverse.foobar.edu</jvm-options>``
| ``<jvm-options>-Ddataverse.siteUrl=http://${dataverse.fqdn}:8080</jvm-options>``
dataverse.files.directory
+++++++++++++++++++++++++

This is how you configure the path to which files uploaded by users are stored. The installer prompted you for this value.
This is how you configure the path to which files uploaded by users are stored.

dataverse.auth.password-reset-timeout-in-minutes
++++++++++++++++++++++++++++++++++++++++++++++++
31 changes: 24 additions & 7 deletions doc/sphinx-guides/source/installation/installation-main.rst
@@ -10,7 +10,10 @@ Running the Dataverse Installer
-------------------------------

A scripted, interactive installer is provided. This script will configure your Glassfish environment, create the database, set some required options and start the application. Some configuration tasks will still be required after you run the installer! So make sure to consult the next section.
At this point the installer only runs on RHEL 6 and similar.
At this point the installer only runs on RHEL 6 and similar, and on MacOS X (recommended as the platform for developers).

Generally, the installer has a better chance of succeeding if you run it against a freshly installed Glassfish node that still has all the default configuration settings. In any event, please make sure that it is still configured to accept http connections on port 8080 - because that's where the installer expects to find the application once it's deployed.


You should have already downloaded the installer from https://github.com/IQSS/dataverse/releases when setting up and starting Solr under the :doc:`prerequisites` section. Again, it's a zip file with "dvinstall" in the name.

@@ -21,31 +24,47 @@ Execute the installer script like this::
# cd dvinstall
# ./install

The script will prompt you for some configuration values. If this is a test/evaluation installation, it should be safe to accept the defaults for most of the settings:
**NEW in Dataverse 4.3:** It is no longer necessary to run the installer as root!
Just make sure the user running the installer has write permission in the Glassfish directory. For example, if your Glassfish directory is owned by root and you try to run the installer as a regular user, it will not work.
(Note that the Glassfish directory should be owned by the same user that will be running Glassfish, and you most likely won't need to run it as root. The only reason to run Glassfish as root would be to serve the application on the default HTTP(S) ports 80 and 443 instead of 8080 and 8181. However, an easier and more secure way to achieve that is to keep Glassfish running on a high port and hide it behind an Apache proxy, via AJP, running on port 80. This configuration is in fact required if you choose to have your Dataverse support Shibboleth authentication. See more discussion on this here: :doc:`shibboleth`.)


The script will prompt you for some configuration values. If this is a test/evaluation installation, it may be possible to accept the default values provided for most of the settings:

- Internet Address of your host: localhost
- Glassfish Directory: /usr/local/glassfish4
- Administrator email address for this Dataverse: (none)
- SMTP (mail) server to relay notification messages: localhost
- Postgres Server: localhost
- Postgres Server Address: [127.0.0.1]
- Postgres Server Port: 5432
- Postgres ADMIN password: secret
- Name of the Postgres Database: dvndb
- Name of the Postgres User: dvnapp
- Postgres user password: secret
- Remote Solr indexing service: LOCAL
- Will this Dataverse be using TwoRavens application: NOT INSTALLED
- Rserve Server: localhost
- Rserve Server Port: 6311
- Rserve User Name: rserve
- Rserve User Password: rserve

**New, as of 4.3:**

- Administration Email address for the installation;
- Postgres admin password - We'll need it in order to create the database and user for the Dataverse to use, without having to run the installer as root. If you don't know your Postgres admin password, you may simply set the authorization level for localhost to "trust" in the PostgreSQL ``pg_hba.conf`` file (see the PostgreSQL section in the Prerequisites). If this is a production environment, you may want to change it back to something more secure, such as "password" or "md5", after the installation is complete.
- Network address of a remote Solr search engine service (if needed) - In most cases, you will be running your Solr server on the same host as the Dataverse application (then you will want to leave this set to the default value of ``LOCAL``). But in a serious production environment you may set it up on a dedicated separate server.
- The URL of the TwoRavens application GUI, if this Dataverse node will be using a companion TwoRavens installation. Otherwise, leave it set to ``NOT INSTALLED``.

The script is to a large degree a derivative of the old installer from DVN 3.x. It is written in Perl. If someone in the community is eager to rewrite it, perhaps in a different language, please get in touch. :)

All the Glassfish configuration tasks performed by the installer are isolated in the shell script ``dvinstall/glassfish-setup.sh`` (as ``asadmin`` commands).

As the installer finishes, it mentions a script called ``post-install-api-block.sh`` which is **very important** to execute for any production installation of Dataverse. Security will be covered in :doc:`config` section but for now, let's make sure your installation is working.
**IMPORTANT:** Please note that "out of the box" the installer will configure Dataverse to allow unrestricted access to the administration APIs from (and only from) localhost. Please consider the security implications of this arrangement (anyone with shell access to the server can potentially mess with your Dataverse). An alternative is to block open access to these sensitive API endpoints completely and to allow only requests supplying a pre-defined "unblock token" (password). If you prefer that solution, please consult the supplied script ``post-install-api-block.sh`` for examples of how to set it up.

Logging In
----------

Out of the box, Glassfish runs on port 8080 and 8181 rather than 80 and 443, respectively, so visiting http://localhost:8080 (substituting your hostname) should bring up a login page. See the :doc:`shibboleth` page for more on ports, but for now, let's confirm we can log in by using port 8080. Poke a temporary hole in your firewall.
Out of the box, Glassfish runs on port 8080 and 8181 rather than 80 and 443, respectively, so visiting http://localhost:8080 (substituting your hostname) should bring up a login page. See the :doc:`shibboleth` page for more on ports, but for now, let's confirm we can log in by using port 8080. Poke a temporary hole in your firewall, if needed.

Superuser Account
+++++++++++++++++
@@ -62,8 +81,6 @@ Use the following credentials to log in:

Congratulations! You have a working Dataverse installation. Soon you'll be tweeting at `@dataverseorg <https://twitter.com/dataverseorg>`_ asking to be added to the map at http://dataverse.org :)

(While you're logged in, you should go ahead and change the email address of the dataverseAdmin account to a real one rather than "dataverse@mailinator.com" so that you receive notifications.)

Trouble? See if you find an answer in the troubleshooting section below.

Next you'll want to check out the :doc:`config` section.
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/installation/prep.rst
@@ -101,6 +101,6 @@ Here are some questions to keep in the back of your mind as you test and move in
Next Steps
----------

Proceed to the :doc:`prerequisites` section will help you get ready to run the Dataverse installation script.
Proceed to the :doc:`prerequisites` section which will help you get ready to run the Dataverse installation script.

.. |3webservers| image:: ./img/3webservers.png
43 changes: 24 additions & 19 deletions doc/sphinx-guides/source/installation/prerequisites.rst
@@ -91,48 +91,53 @@ The version that ships with RHEL 6 and above is fine::
# service postgresql initdb
# service postgresql start

Configure Access to PostgreSQL for the Installer Script
=======================================================
The standard init script that ships with RHEL 6 and similar should work fine. Enable it with this command::

# chkconfig postgresql on



Configuring Database Access for the Dataverse Application (and the Dataverse Installer)
=====================================================================================

- When using localhost for the database server, the installer script needs to have direct access to the local PostgreSQL server via Unix domain sockets. This is configured by the line that starts with ``local all all`` in the pg_hba.conf file. The location of this file may vary depending on the distribution. But if you followed the suggested installation instructions above, it will be ``/var/lib/pgsql/data/pg_hba.conf`` on RHEL and similar. Make sure the line looks like this (it will likely be pre-configured like this already)::
- The application and the installer script will be connecting to PostgreSQL over TCP/IP, using password authentication. In this section we explain how to configure PostgreSQL to accept these connections.

local all all peer

- If the installer still fails to connect to the database, we recommend changing this configuration entry to ``trust``::
- If PostgreSQL is running on the same server as Glassfish, find the localhost (127.0.0.1) entry that's already in the ``pg_hba.conf`` and modify it to look like this::

local all all trust
host all all 127.0.0.1/32 md5

This is a security risk, as it opens your database to anyone with a shell on your server. It does not however compromise remote access to your system. Plus you only need this configuration in place to run the installer. After it's done, you can safely reset it to how it was configured before.
Once you are done with the prerequisites and run the installer script (documented here: :doc:`installation-main`) it will ask you to enter the address of the Postgres server. Simply accept the default value ``127.0.0.1`` there.

Configure Database Access for the Dataverse Application
=======================================================

- The application will be talking to PostgreSQL over TCP/IP, using password authentication. If you are running PostgreSQL on the same server as Glassfish, we strongly recommend that you use the localhost interface to connect to the database. Make sure you accept the default value ``localhost`` when the installer asks you for the PostgreSQL server address. Then find the localhost (127.0.0.1) entry that's already in the ``pg_hba.conf`` and modify it to look like this::
- The Dataverse installer script will need to connect to PostgreSQL **as the admin user**, in order to create and set up the database that the Dataverse will be using. If for whatever reason it is failing to connect (for example, if you don't know/remember what your Postgres admin password is), you may choose to temporarily disable all the access restrictions on localhost connections, by changing the above line to::

host all all 127.0.0.1/32 trust

Note that this rule opens access to the database server **via localhost only**. Still, in a production environment, this may constitute a security risk. So you will likely want to change it back to "md5" once the installer has finished.

host all all 127.0.0.1/32 password

- If the Dataverse application is running on a different server, you will need to add a new entry to the ``pg_hba.conf`` granting it access by its network address::

host all all [ADDRESS] 255.255.255.255 password
host all all [ADDRESS] 255.255.255.255 md5

(``[ADDRESS]`` should be the numeric IP address of the Glassfish server).
Where ``[ADDRESS]`` is the numeric IP address of the Glassfish server. Enter this address when the installer asks for the PostgreSQL server address.

- In some distributions, PostgreSQL is pre-configured so that it doesn't accept network connections at all. Check that the ``listen_addresses`` line in the configuration file ``postgresql.conf`` is not commented-out and looks like this::
- In some distributions, PostgreSQL is pre-configured so that it doesn't accept network connections at all. Check that the ``listen_addresses`` line in the configuration file ``postgresql.conf`` is not commented out and looks like this::

listen_addresses='*'

The file ``postgresql.conf`` will be located in the same directory as the ``pg_hba.conf`` above.

- **Important: you must restart Postgres** for the configuration changes to take effect! On RHEL and similar (provided you installed Postgres as instructed above)::
- **Important: PostgreSQL must be restarted** for the configuration changes to take effect! On RHEL and similar (provided you installed Postgres as instructed above)::
# service postgresql restart
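
To double-check which authentication method the localhost entry ends up with (``md5`` normally, ``trust`` only temporarily while the installer runs), the file can be inspected from the shell. This is a rough sketch under stated assumptions: the helper name `pg_hba_localhost_method` is hypothetical, and the example path is the RHEL location mentioned in this guide, which may differ on other distributions.

```shell
#!/bin/bash
# Hypothetical helper: print the auth method of every uncommented "host"
# entry for 127.0.0.1/32 in the given pg_hba.conf.
# Expected columns: TYPE DATABASE USER ADDRESS METHOD
pg_hba_localhost_method() {
  awk '$1 == "host" && $4 == "127.0.0.1/32" { print $5 }' "$1"
}

# Example call (path as on RHEL and similar; adjust as needed):
# pg_hba_localhost_method /var/lib/pgsql/data/pg_hba.conf
```

If this still prints ``trust`` after the installation is complete, that is a reminder to switch the entry back to ``md5`` and restart PostgreSQL.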

PostgreSQL Init Script
======================
On MacOS X a "Reload Configuration" icon is usually supplied in the PostgreSQL application folder. Or you could look up the process id of the PostgreSQL postmaster process, and send it the SIGHUP signal::

kill -1 PROCESS_ID

The standard init script that ships RHEL 6 and similar should work fine. Enable it with this command::

# chkconfig postgresql on

Solr
----
22 changes: 19 additions & 3 deletions scripts/api/post-install-api-block.sh
@@ -1,11 +1,27 @@
#!/bin/bash

# Run this script post-installation, to block all the settings that
# should not be available to the general public in a production Dataverse installation.
# This script can be run on a system that was set up with unrestricted access to
# the sensitive API endpoints, in order to block it for the general public.

# First, revoke the authentication token from the built-in user:
curl -X DELETE http://localhost:8080/api/admin/settings/BuiltinUsers.KEY

# Block the sensitive endpoints:
# Relevant settings:
# - :BlockedApiPolicy - one of allow, drop, localhost-only, unblock-key
# - :BlockedApiKey - when using the unblock-key policy, pass this key in the unblock-key query param to allow the call to a blocked endpoint
# - :BlockedApiEndpoints - comma separated list of blocked api endpoints.
# - :BlockedApiEndpoints - comma separated list of blocked api endpoints

# This leaves /api/admin and /api/test blocked to all connections except from those
# coming from localhost:
curl -X PUT -d localhost-only http://localhost:8080/api/admin/settings/:BlockedApiPolicy
curl -X PUT -d admin,test http://localhost:8080/api/admin/settings/:BlockedApiEndpoints

# In some situations, you may prefer an alternative solution - to block ALL connections to
# these endpoints completely; but allow connections authenticated with the defined
# "unblock key" (password):

#curl -X PUT -d YOURSUPERSECRETUNBLOCKKEY http://localhost:8080/api/admin/settings/:BlockedApiKey
#curl -X PUT -d unblock-key http://localhost:8080/api/admin/settings/:BlockedApiPolicy
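
With the ``unblock-key`` policy enabled, the settings notes above say the key is passed in the ``unblock-key`` query parameter. The sketch below only shows how such a URL could be built; the helper name `with_unblock_key` is hypothetical, the key is the same placeholder used in the script, and the endpoint in the example call is an assumption chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: append the unblock key to a URL as the "unblock-key"
# query parameter, using "&" if the URL already has a query string.
with_unblock_key() {
  url="$1"
  key="$2"
  case "$url" in
    *\?*) printf '%s&unblock-key=%s\n' "$url" "$key" ;;
    *)    printf '%s?unblock-key=%s\n' "$url" "$key" ;;
  esac
}

# Example call (placeholder key; endpoint assumed for illustration):
# curl "$(with_unblock_key http://localhost:8080/api/admin/settings YOURSUPERSECRETUNBLOCKKEY)"
```

The helper does no URL encoding, so it is only suitable for keys made of plain alphanumeric characters.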


42 changes: 40 additions & 2 deletions scripts/api/setup-all.sh
@@ -1,4 +1,23 @@
#!/bin/bash

SECURESETUP=1

for opt in "$@"
do
  case $opt in
    "--insecure")
      SECURESETUP=0
      ;;
    "-insecure")
      SECURESETUP=0
      ;;
    *)
      # Report the bad option on stderr, then abort.
      echo "invalid option: $opt" >&2
      exit 1
      ;;
  esac
done

command -v jq >/dev/null 2>&1 || { echo >&2 '`jq` ("sed for JSON") is required, but not installed. Download the binary for your platform from http://stedolan.github.io/jq/ and make sure it is in your $PATH (/usr/bin/jq is fine) and executable with `sudo chmod +x /usr/bin/jq`. On Mac, you can install it with `brew install jq` if you use homebrew: http://brew.sh . Aborting.'; exit 1; }

echo "deleting all data from Solr"
@@ -34,7 +53,6 @@ curl -X PUT -d 10.5072/FK2 "$SERVER/admin/settings/:Authority"
curl -X PUT -d EZID "$SERVER/admin/settings/:DoiProvider"
curl -X PUT -d / "$SERVER/admin/settings/:DoiSeparator"
curl -X PUT -d burrito $SERVER/admin/settings/BuiltinUsers.KEY
curl -X PUT -d empanada $SERVER/admin/settings/:BlockedApiKey
curl -X PUT -d localhost-only $SERVER/admin/settings/:BlockedApiPolicy
echo

@@ -58,4 +76,24 @@ echo
# OPTIONAL USERS AND DATAVERSES
#./setup-optional.sh

echo "Setup done. Consider running post-install-api-block.sh for blocking the sensitive API."
if [ $SECURESETUP = 1 ]
then
# Revoke the "burrito" super-key;
# Block the sensitive API endpoints;
curl -X DELETE $SERVER/admin/settings/BuiltinUsers.KEY
curl -X PUT -d admin,test $SERVER/admin/settings/:BlockedApiEndpoints
echo "Access to the /api/admin and /api/test is now disabled, except for connections from localhost."
else
echo "IMPORTANT!!!"
echo "You have run the setup script in the INSECURE mode!"
echo "Do keep in mind, that access to your admin API is now WIDE-OPEN!"
echo "Also, your built-in user is still set up with the default authentication token"
echo "(that is distributed as part of this script, hence EVERYBODY KNOWS WHAT IT IS!)"
echo "Please consider the consequences of this choice. You can block access to the"
echo "/api/admin and /api/test endpoints, for all connections except from localhost,"
echo "and revoke the authentication token from the built-in user by executing the"
echo "script post-install-api-block.sh."
fi

echo
echo "Setup done."
2 changes: 1 addition & 1 deletion scripts/deploy/phoenix.dataverse.org/post
@@ -1,6 +1,6 @@
#!/bin/sh
cd scripts/api
./setup-all.sh | tee /tmp/setup-all.sh.out
./setup-all.sh --insecure | tee /tmp/setup-all.sh.out
cd ../..
psql -U dvnapp dvndb -f scripts/database/reference_data.sql
scripts/search/tests/publish-dataverse-root