New version of Deployment Guide for Ubuntu 16. #48

Open
wants to merge 47 commits into base: master

@csqrs csqrs commented May 4, 2018

I would like to contribute this to your project. I decided to offer a new file, rather than e.g.
interleave Ubuntu 16 instructions in the existing one. That would have been too messy.
Please let me know if you are interested in this, and/or what I can do to bring it to an
acceptable form. I've worked through the instructions from start to finish building a
server from scratch, and it works. However, I'm sure there are further tests you make
to verify functionality. And thank you very much for the original document! It's invaluable!

```
hash -r
```

The following should tell apt to not upgrade ffmpeg (we need version 1.1.1):
Contributor

We have actually tested up to 3.3.x and it works fine; we just have not had the opportunity to update the deployment guide.

Author

Updated to 3.3.7.

Contributor

Thank you!

Contributor

Forgot to point out: the newer ffmpeg doesn't support libfaac. See https://trac.ffmpeg.org/wiki/Encode/AAC

MP4s won't work right if you don't change the codec:

drush -y vset islandora_video_mp4_audio_codec "libfdk_aac"
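
One way to guard against this on a given server is to check that the installed ffmpeg build actually includes libfdk_aac before pointing Islandora at it. This is a minimal sketch, assuming `ffmpeg` and `drush` are on the PATH and that it is run from the Drupal site directory; `has_encoder` is a hypothetical helper, not something from the guide:

```shell
# has_encoder NAME: reads `ffmpeg -encoders` output on stdin and succeeds
# only if NAME appears in the encoder-name column (hypothetical helper).
has_encoder() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

if command -v ffmpeg >/dev/null 2>&1 && command -v drush >/dev/null 2>&1; then
  if ffmpeg -hide_banner -encoders 2>/dev/null | has_encoder libfdk_aac; then
    # Only switch the codec when this ffmpeg build can encode with it.
    drush -y vset islandora_video_mp4_audio_codec "libfdk_aac"
  else
    echo "ffmpeg was built without libfdk_aac; MP4 derivatives may fail" >&2
  fi
fi
```

If the warning is printed, ffmpeg would presumably need to be rebuilt with libfdk_aac support before MP4 derivatives are generated.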

Author

Fixed a typo in the ffmpeg section and added this drush command
in the later Drupal configuration section.

Author

PS. I tested an mp4 ingest and it worked.


#### Blazegraph <a id="blazegraph"></a>

This step is optional. Instructions are based on http://dev.digibess.it/doku.php?id=reloaded:be_blazeg. We have to install under a separate Tomcat from Fedora, as Fedora cannot be running when rebuilding the Resource Index (see https://github.com/discoverygarden/trippi-sail). This also makes migrating this service to a separate machine easier.

```
cd /usr/local

wget http://projects.iq.harvard.edu/files/fits/files/fits-0.10.1.zip
```

Author

Updated to 1.1.1.

Contributor

Thanks!

```
unzip -o fits-0.10.1.zip && rm -rf fits-0.10.1.zip && ln -s fits-0.10.1 fits && chmod a+x /usr/local/fits/fits.sh
```

#### Adore-Djatoka Install<a id="adore-djatoka-install"></a>
Contributor

If we get rid of Djatoka, we should be able to use OpenJDK, which makes maintenance a lot easier.


CustomLog ${APACHE_LOG_DIR}/access.log combined

# These are needed for getting Adore-Djatoka working over https
Contributor

Most of these proxy pass settings are not needed. The following endpoints do not need to be exposed for general functionality:

/fedora/get
/fedora/services
/fedora/describe
/fedora/risearch
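
To confirm on a running server that these Fedora endpoints are not reachable through the front-end web server, something like the following could be used. It is only a sketch: the host name in the usage line is a placeholder, `fedora_urls` and `check_exposed` are hypothetical helpers, and which status code counts as "not exposed" depends on your Apache configuration.

```shell
# fedora_urls BASE: print the four Fedora endpoint URLs under BASE.
fedora_urls() {
  for path in /fedora/get /fedora/services /fedora/describe /fedora/risearch; do
    printf '%s%s\n' "${1%/}" "$path"
  done
}

# check_exposed BASE: print the HTTP status code the web server returns
# for each endpoint (000 means the connection itself failed).
check_exposed() {
  fedora_urls "$1" | while read -r url; do
    code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$url")
    echo "$code $url"
  done
}

# Usage (placeholder host):
#   check_exposed https://islandora.example.org
```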

Author

I've left these in place, as in the Islandora Vagrant, and with the target of
deploying 'https' whether with Adore-Djatoka or Cantaloupe.

Contributor

I'm not sure I understand: why is exposing these endpoints required for https?

/fedora/get /fedora/services /fedora/describe /fedora/risearch

I was just curious about exposing these on a production system when they don't need to be. I wasn't referring to the Cantaloupe or Adore-Djatoka proxy pass settings; those would still be needed.

For https to work, a number of approaches can be taken, since the usual issue is that the servlet container doesn't know how to trust the cert. If you are using a cert from a provider that is supported by Java's built-in trust store, you could remove "-Djavax.net.ssl.trustStore=/usr/local/fedora/server/truststore -Djavax.net.ssl.trustStorePassword=tomcat" from the Java OPTs and it might just work. Otherwise, you could import the SSL cert chain, e.g. https://tomcat.apache.org/tomcat-7.0-doc/ssl-howto.html#Importing_the_Certificate.
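
Importing a CA certificate into the trust store named in the Java options above could look roughly like this. It is a sketch rather than a tested procedure: the trust store path and password are the ones quoted above, the alias and certificate file in the usage line are placeholders, and `import_ca` is a hypothetical helper.

```shell
TRUSTSTORE="${TRUSTSTORE:-/usr/local/fedora/server/truststore}"
STOREPASS="${STOREPASS:-tomcat}"

# import_ca ALIAS PEM_FILE: add one CA certificate to the trust store.
# keytool stores one trusted certificate per alias.
import_ca() {
  [ -r "$2" ] || { echo "cannot read certificate: $2" >&2; return 1; }
  keytool -importcert -noprompt -trustcacerts \
    -alias "$1" -file "$2" \
    -keystore "$TRUSTSTORE" -storepass "$STOREPASS"
}

# Usage (placeholder alias and file), followed by a Tomcat restart:
#   import_ca my-ca /path/to/ca-cert.pem
```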

Author

The origin of these pass throughs as needed for Adore-Djatoka over https is, I believe, this
thread: https://groups.google.com/forum/?pli=1#!msg/islandora/yWlpQtL8gEU/cxF4ZmvbY8YJ

In my experience, which is limited, Cantaloupe never needed more than the standard two-line
proxy pass, whether over http or https, but Adore-Djatoka just wouldn't work under https
with those basic two.

I'll review this in the next couple of days and see if I can't make this clearer/cleaner.

Author

I've updated this part. I'm not sure why all those endpoints were needed, plus
I haven't actually tested them with Adore-Djatoka. But they are present in the Vagrant.
... but that's not production ready. Suggestions? Maybe it's time to drop Adore? I do
think it's important to have a working https configuration.

Author

I think your suggestion about the Java cert handling is probably the right way to go,
since that's probably the underlying problem. However, that doesn't explain why
https seems to work fine for Cantaloupe but not for Adore-Djatoka.

Contributor

Hi,

Vagrant is not intended to be used as a reference for a production-ready setup; it is geared towards development.

None of these (https://github.com/Islandora-Labs/islandora_vagrant_base_box/blob/master/scripts/drupal.sh#L76-L88) should be needed in a production-level setup, with or without https. You only need the ones for Djatoka and Cantaloupe.

The exposed Fedora endpoints could be useful for someone in development who wants to bypass Islandora and hit Fedora directly (or perhaps for another use case outside of Islandora that I am not aware of); otherwise they are not needed.

In Cantaloupe, did you have https://github.com/discoverygarden/cantaloupe_configs/blob/master/cantaloupe.properties#L152 set to true? If so, it will work with certs that are not trusted.

Author

Re: Cantaloupe certs checking, yes, I use that set to 'true'.

I've removed all the bogus proxy pass settings and left a note about potential issues for Adore-Djatoka.


`cp $FEDORA_HOME/server/config/spring/akubra-llstore.xml $FEDORA_HOME/server/config/spring/akubra-llstore.xml.bak`

Edit that file and find the `<bean name="fsObjectStore"` and `<bean name="fsDatastreamStore"` clauses and replace the `<constructor-arg value="PATH"` values with appropriate ones for your installation, e.g. `/srv/fedora/data/objectStore` and `/srv/fedora/data/datastreamStore`.
Contributor

Symlinks also work
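
The symlink approach might look like the sketch below. It assumes Fedora's default store locations are `$FEDORA_HOME/data/objectStore` and `$FEDORA_HOME/data/datastreamStore` (check the `PATH` values in your own akubra-llstore.xml), that Fedora is stopped first, and that any existing data has already been moved; `link_store` is a hypothetical helper.

```shell
# link_store TARGET LINK: point LINK at TARGET, refusing to replace a
# real directory that may already hold data.
link_store() {
  target=$1
  link=$2
  [ -d "$target" ] || { echo "missing target: $target" >&2; return 1; }
  if [ -e "$link" ] && [ ! -L "$link" ]; then
    echo "refusing to replace existing directory: $link" >&2
    return 1
  fi
  ln -sfn "$target" "$link"
}

# Usage (default locations are an assumption):
#   link_store /srv/fedora/data/objectStore     "$FEDORA_HOME/data/objectStore"
#   link_store /srv/fedora/data/datastreamStore "$FEDORA_HOME/data/datastreamStore"
```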


#### Replace Mulgara with Blazegraph <a id="mulgara-blazegraph"></a>

This is optional. The instructions are based on http://dev.digibess.it/doku.php?id=reloaded:be_repmulg.

#sed -i "s|upload_max_filesize = 2M|upload_max_filesize = 2048M|g" /etc/php/7.1/apache2/php.ini

echo "apc.shm_size = 64M" >> $APC_CONFIG_FILE
Contributor

Is APC even used anymore?

Author

Probably not, but I'm not an expert, so I leave that to you.

Contributor

It is not used anymore.

We should probably use opcache instead, but I'm not sure about including that here, since the settings could impact people who change their code semi-frequently. I'm curious what kind of luck others have had with it.
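
For discussion, an opcache setup that keeps checking files for changes could be sketched as follows. The values are illustrative and untested against Islandora, the target path in the usage line is an assumption based on the PHP 7.1 layout used elsewhere in the guide, and `write_opcache_ini` is a hypothetical helper.

```shell
# write_opcache_ini FILE: write illustrative OPcache settings to FILE.
# validate_timestamps=1 makes PHP re-check files for changes (at most
# every revalidate_freq seconds), so edited code is still picked up.
write_opcache_ini() {
  cat > "$1" <<'EOF'
; Illustrative values only; tune for your own site.
opcache.enable=1
opcache.memory_consumption=128
opcache.validate_timestamps=1
opcache.revalidate_freq=2
EOF
}

# Usage (path is an assumption), followed by an Apache restart:
#   write_opcache_ini /etc/php/7.1/apache2/conf.d/99-opcache-local.ini
```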

Author

There's a note in the 'install.properties' section that mentions that APC is deprecated in favour of Zend Opcache, which is probably based on what I found when trying to install and configure APC. The note says "to be dealt with at a later date". I'm adding another comment at this point.


sed -i "s|resolver.static = FilesystemResolver|resolver.static = HttpResolver|" cantaloupe.properties

sed -i "s|HttpResolver.trust_all_certs = false|HttpResolver.trust_all_certs = true|" cantaloupe.properties
Contributor

Not sure we should leave "trust_all_certs" enabled by default. It would be better to get Java/Tomcat to trust the SSL cert.

Author

I think for a back-end server like this, which doesn't connect to the outside world,
it's probably fine. Do we even use ssl? 'https' is disabled.

Contributor

When Adore-Djatoka or Cantaloupe have to retrieve an image from the frontend webserver and it (the webserver) is using HTTPS it will attempt to check the SSL cert. If it doesn't know how to trust the cert the request will fail. Generally, if this is the case you should make sure the CA root certificate/chain is added to the Truststore. Cantaloupe provides an option so it will trust the cert anyway even if it doesn't know how to trust the cert or if it is invalid. Generally, we should avoid doing this in production but it is useful for dev/testing/troubleshooting.
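
A quick way to spot a configuration that still trusts every certificate before it reaches production is sketched below. The default path is the one used in the Cantaloupe service definition in this pull request; `trust_all_certs_enabled` is a hypothetical helper.

```shell
# trust_all_certs_enabled FILE: succeeds if FILE sets
# HttpResolver.trust_all_certs to true.
trust_all_certs_enabled() {
  grep -Eq '^[[:space:]]*HttpResolver\.trust_all_certs[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

CONFIG="${CONFIG:-/usr/local/cantaloupe/cantaloupe.properties}"
if [ -r "$CONFIG" ] && trust_all_certs_enabled "$CONFIG"; then
  echo "WARNING: $CONFIG trusts all certificates; avoid this in production" >&2
fi
```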

Author

I'm going to be in Victoria until the 18th so I won't be able to look at this until then.
I guess we should add something about how to update the Tomcat trust store.
I'm running Cantaloupe stand alone, so that's a different kettle of fish...

@giancarlobi

@lutaylor @DiegoPino @csqrs I successfully upgraded Cantaloupe 3.4 to 4.0. You can find my steps here, in case you think they should be added to the Guide.


csqrs commented Jun 29, 2018

@giancarlobi I've updated to 4.0 as you suggested.


csqrs commented Jun 29, 2018

I'm going to be on vacation for the next 3 weeks, so won't be able to respond to suggestions etc.


csqrs commented Jun 29, 2018

Ah. Right. Missed that. So that's why you enabled the endpoint API. ;) Will fix shortly.


giancarlobi commented Jun 29, 2018 via email

…e of the API is required for this operation.
[Service]
Type=simple
User=cantaloupe
ExecStart=/usr/bin/nohup /usr/bin/java -Dcantaloupe.config=/usr/local/cantaloupe/cantaloupe.properties -Xmx256m -jar /usr/local/cantaloupe/cantaloupe-4.0.war


There is no need for nohup, as Cantaloupe is started as a service, so:
ExecStart=/usr/bin/java -Dcantaloupe.config=/usr/local/cantaloupe/cantaloupe.properties -Xmx256m -jar /usr/local/cantaloupe/cantaloupe-4.0.war


csqrs commented Jul 26, 2018

Thanks @giancarlobi . Works for me.


csqrs commented Aug 6, 2018

@giancarlobi @DiegoPino @lutaylor Following the conversation here:

https://groups.google.com/forum/?pli=1#!topic/islandora-dev/18rWjTRdCS0

I've updated the "install Blazegraph" section to include running it "standalone". It turns out it's
pretty easy, and works for me. Maybe we should replace the first option with the second. That
would remove some concerns about running so many Tomcats.

@giancarlobi

@csqrs thanks, I really think that running it standalone is the best solution for the whole stack.
I have no experience with BZ standalone but I'd like to test/migrate to this solution, maybe later in August.

@DiegoPino

@giancarlobi @csqrs We run Blazegraph as a standalone (Jetty-based) service on Ubuntu for some of our partners and it works pretty well. The only issue is that you need to be aware of its real memory needs and reduce what you give to Tomcat accordingly to keep everything fast and stable; otherwise you could end up assigning too many resources to the JVMs or, worse, too few. If you need me to share some of my settings, feel free to ask 👍 I really see good work happening here =)

@giancarlobi

@csqrs Sorry, maybe I'm wrong, but do you also have to copy https://github.com/discoverygarden/blazegraph_conf to the standalone installation?


csqrs commented Aug 6, 2018

@giancarlobi As far as I can tell, the DG blazegraph config is not needed. The deb installs
what's needed in /etc/blazegraph/ as well as an 'init' script in /etc/init.d/. If you install the deb
on top of an existing installation based on the DG material, the init file will get clobbered. ;)

Also, @DiegoPino there's a handy "testing" java options line in /etc/default/blazegraph
installed by the deb that lets you give just a small amount of memory for testing.

@giancarlobi

@csqrs great, many thanks !!!


lutaylor commented Aug 7, 2018

@csqrs @giancarlobi @DiegoPino
Hello folks,

The RWStore.properties file might need to be tweaked a bit. I am not sure what is provided with the deb or how it performs. The example in https://github.com/discoverygarden/blazegraph_conf is based on the quads RWStore.properties sample for NanoSparqlServer (https://wiki.blazegraph.com/wiki/index.php/NanoSparqlServer). The sample to date has scaled well with large installations with millions of objects. I have been meaning to try out the deb as a possible alternative to the Tomcat setup. Has anyone used it yet on a larger install, e.g. over a million objects?


csqrs commented Aug 7, 2018

@lutaylor Comparing the two configs, there are some differences, though the deb
provided one is also 'quads'. I certainly haven't tested it on a large installation.
@DiegoPino can probably speak to that, since he's seen some real installations.

The most notable differences have to do with the 'btree' 'branchingFactor' values
(smaller in the deb provided conf) and an 'rdf' related 'textIndex' is 'false' in the
deb provided conf while 'true' in the DG conf, it seems to me. But I'm no expert.

Also, the log level is set to WARN by default by the deb, but ERROR by the DG
provided log4j config.

I'll add some comments about these differences, as well as the java memory requirements.
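
A comparison like the one described above can be reproduced with a small helper that ignores comments, blank lines and ordering. This is a sketch: `props_diff` is a hypothetical helper, and both file locations in the usage line are assumptions (the deb is only said above to install its configuration under /etc/blazegraph/).

```shell
# props_diff A B: show settings that differ between two Java
# .properties files; exit status follows diff (0 same, 1 different).
props_diff() {
  pd_a=$(mktemp) && pd_b=$(mktemp) || return 1
  grep -Ev '^[[:space:]]*(#|$)' "$1" | sort > "$pd_a"
  grep -Ev '^[[:space:]]*(#|$)' "$2" | sort > "$pd_b"
  diff "$pd_a" "$pd_b" && pd_status=0 || pd_status=$?
  rm -f "$pd_a" "$pd_b"
  return "$pd_status"
}

# Usage (both paths are assumptions):
#   props_diff /etc/blazegraph/RWStore.properties blazegraph_conf/RWStore.properties
```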
