4364 install prov flask #4461
Formatting is bad, will be fixed
new file: scripts/vagrant/install-provenance.sh
Short version:
- A cleaner separation between building the CPL RPM and installing it in production might be helpful (or the existing separation could be clarified). Possibly one script to set up the build VM/container, one to build the RPM, and one to install and configure it?
- It is preferable to use Apache or nginx rather than the Flask development server in production.
If you're feeling adventurous, you can attempt to install CPL directly on your Mac but this is not recommended. Rather, below CPL runs as a REST service within Vagrant.

First, install Vagrant and VirtualBox as described in the :doc:`tools` section.
It seems like it would be less complex to move these two files (possibly renamed if they're too DV specific) to the prov-cpl repo.
If not, it would probably be more consistent if these were moved to conf/ (rather than doc/); or we should move the docker-related stuff to a different directory with the idea of having all dev VM/containerization/etc living in one place.
I put files in doc to make them easily downloadable from Sphinx. If anything I would move more conf files under Sphinx when they are used as examples. "Download this file, edit it, etc."
We can certainly see if @jacksonokuhn would like to have Vagrant files in the prov-cpl repo. I didn't want to inhibit progress by making that a dependency. My goal was to unblock Dataverse developers who wanted to get started on prov related work and needed a prov system running on their laptops. Incremental progress. We can make improvements in the future.
curl -X PUT -d 'http://localhost:7777' http://localhost:8080/api/admin/settings/:ProvServiceUrl
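For anyone scripting this step instead of typing the curl command, here is a rough Python equivalent. It is only a sketch: `build_setting_request` and `send` are hypothetical helper names, and the URLs are the ones from the curl line above.

```python
import urllib.request


def build_setting_request(base_url, name, value):
    """Build (but do not send) a PUT request that stores one admin setting.

    base_url, name and value mirror the pieces of the curl command:
    the Dataverse base URL, the setting name, and the request body.
    """
    url = f"{base_url}/api/admin/settings/{name}"
    return urllib.request.Request(url, data=value.encode("utf-8"), method="PUT")


def send(request, timeout=10):
    """Send a prepared request and return (status code, response text)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Usage would be something like `send(build_setting_request("http://localhost:8080", ":ProvServiceUrl", "http://localhost:7777"))`, assuming Dataverse is listening on port 8080 as in the curl example.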

Installing CPL on Ubuntu
It would be good to mention which versions of Ubuntu are expected to work. One version is mentioned below, but it might be clearer to move it to earlier in this section.
As someone who is relatively familiar with Vagrant, I would disagree with documenting the version of Ubuntu because what matters is the "base box" in the Vagrantfile. This is currently config.vm.box = "ubuntu/xenial64" which means 16.04 (the latest LTS as of this writing). Basically a comment here has the potential to get out of date. That said, I'm fine adding a comment if it's truly helpful.
Build the RPM:

``rpmbuild -ba dataverse-provenance.spec``
For consistency, it might be better to put these steps into a script that can be run from within the Vagrantfile. Manually editing the existing Vagrantfile might be simpler, but would also slow down automating things.
Maybe, but I think @landreev and I are the only ones on the team who have ever built an RPM. I purposefully documented the steps in prose. I could see adding more automation once we've gotten more familiar with the process. Unfortunately, we can't use the HMDC Jenkins server to automate the RPM building because it's running el6 and there appear to be incompatibilities between el6 and el7. I left a comment about this at #4364 (comment). For now the plan is to build this RPM in Vagrant and I don't want to be the only one who knows how. Note also that the version of the RPM matters. At this stage in the game, a human needs to make decisions about when to bump the version and when to get in touch with the upstream author about cutting a release, etc. It's complicated.
Ok - I was reminded of the discussions we had in #4369 about documentation / scripts for "surgery" on dependencies, which was why I mentioned it.
Download :download:`dataverse-provenance.service <../_static/installation/files/etc/systemd/system/dataverse-provenance.service>` and place it at ``/etc/systemd/system/dataverse-provenance.service``. Here are the default contents of that file but see below for adjustments you might want to make:

.. literalinclude:: ../_static/installation/files/etc/systemd/system/dataverse-provenance.service
Using the flask development server in production is probably something we should avoid.
Huh. Sure enough http://flask.pocoo.org/docs/0.12/deploying/#deployment says, "While lightweight and easy to use, Flask’s built-in server is not suitable for production as it doesn’t scale well and by default serves only one request at a time." Do you have any suggestions on how to deploy a Flask app in a better way? Heads up to @jacksonokuhn about this.
Apache / wsgi would be the way that I'd go; but there are other approaches.
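To make the Apache / WSGI suggestion a bit more concrete: a WSGI server needs an importable callable rather than a script that starts its own server. Flask application objects are WSGI callables, and mod_wsgi looks for a module-level name `application` by default. The sketch below is a standard-library stand-in, not the real cpl-rest.py. The response body is a placeholder, and only the /provapi/version path is taken from this PR.

```python
import json


def application(environ, start_response):
    """Stand-in WSGI app: answers /provapi/version and returns 404 otherwise.

    The JSON payloads are placeholders; the real service defines its own.
    """
    if environ.get("PATH_INFO") == "/provapi/version":
        status, payload = "200 OK", {"version": "placeholder"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}

    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    # WSGI expects an iterable of bytes objects as the response body.
    return [body]
```

With a Flask app the equivalent entry point would probably be a short module that imports the Flask object and exposes it under the name `application`, assuming cpl-rest.py can be imported without starting the development server.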
@@ -0,0 +1,79 @@
#!/bin/bash
I'd suggest splitting these into build-provenance.sh and install-provenance.sh?
If anything, I might delete this script. It doesn't quite work, especially the part around postgresql-setup-conf.sql.
That sounds like a way to reduce the confusion.
sudo pip install flask
cd /prov-cpl/bindings/python/RestAPI
REST_SERVICE_USER=postgres # FIXME: create a "cplrest" user?
su $REST_SERVICE_USER -s /bin/sh -c "python cpl-rest.py --host=0.0.0.0"
It was a little surprising to me that vagrant up did not return. I'd agree about creating a cplrest user, even though it shouldn't matter inside a dev VM.
I also ran into the "vagrant halt doesn't halt" problem with this file, but that may be more related to my system than this Vagrantfile.
Interesting. I haven't run into problems with vagrant halt (or vagrant destroy). I know it's weird to run the service as the postgres user but like you said, this is a dev VM and doesn't really matter. For production I'm using the name @jacksonokuhn and I talked about (cplservice) but I'm open to changing it.
cplservice for production sounds reasonable to me; but I'll defer to @jacksonokuhn. postgres for dev works (even if it's non-standard).
Install ``dataverse-provenance`` RPM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Download this :download:`dataverse-provenance RPM <../_static/installation/files/home/rpmbuild/rpmbuild/RPMS/x86_64/dataverse-provenance-0.1-1.x86_64.rpm>` and install it with:
guides.dataverse.org isn't currently available over HTTPS; distributing RPMs over plain HTTP is something that's probably worth avoiding.
That's a fair point but no one has complained (that I'm aware of) about downloading the rapache RPM over plain HTTP: http://guides.dataverse.org/en/4.8.5/installation/r-rapache-tworavens.html#c-rapache
We did have a very security-minded individual take a look at our dev guide a couple years ago. You might enjoy reading through this comment, @pameyer: #2863 (comment)
If I installed rapache, it would be an issue I'd raise.
echo "Starting CPL REST service..."
sudo bash -c "su -l $CPL_SERVICE_USER -c 'LD_LIBRARY_PATH=/usr/local/lib python cpl-rest.py --host=0.0.0.0 &'"
echo "Checking version of CPL REST service..."
curl http://localhost:5000/provapi/version
These checks failed when I ran vagrant up; cpl-rest.py wasn't running. My initial guess was a cwd mismatch, but that was incorrect: the CPL module is not installed in the default python of the cplsystem user.
As I mentioned above, the line about postgresql-setup-conf.sql isn't quite working. That's probably why the checks fail. The database user probably hasn't been created.
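Whatever the root cause turns out to be, the version check in the script runs immediately after the service is started in the background, so a failure there is easy to miss. One option would be to poll the endpoint and fail loudly. The following is a minimal sketch under assumptions: `wait_for_service` is a hypothetical helper, the retry counts are arbitrary, and the `fetch` parameter exists only so the logic can be exercised without a running service.

```python
import time
import urllib.request


def wait_for_service(url, attempts=10, delay=1.0, fetch=None):
    """Poll url until it answers; return the body or raise RuntimeError."""
    if fetch is None:
        def fetch(target):
            with urllib.request.urlopen(target, timeout=5) as resp:
                return resp.read().decode("utf-8")

    last_error = None
    for _ in range(attempts):
        try:
            return fetch(url)
        except OSError as exc:
            # urllib.error.URLError is a subclass of OSError, so this also
            # covers "connection refused" while the service is starting up.
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(
        f"{url} did not respond after {attempts} attempts: {last_error}"
    )
```

Calling `wait_for_service("http://localhost:5000/provapi/version")` would not fix a missing CPL module or database user, but it would turn a silent failure into an explicit error during provisioning.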
echo "Install dependencies for CPL REST service..."
sudo yum install -y python-flask
echo "Downloading CPL REST service script..."
sudo bash -c "su -l $CPL_SERVICE_USER -c 'wget https://raw.githubusercontent.com/ProvTools/prov-cpl/8150ee315abc21712b49da2bf4cfdbf308eef1d7/bindings/python/RestAPI/cpl-rest.py'"
I'm confused why this script would download a file from a repository that has already been cloned.
The postgres user doesn't have access to /home/vagrant which is where the git repo has been cloned.
Closing. As explained at #4364 (comment) the RPM will be built using a Vagrantfile in the prov-cpl repo. The RPM will likely be hosted as a binary uploaded to a release of that repo. Installation instructions will reside in the prov-cpl repo.
Connects to #4364: As a Dataverse Installation Administrator, I want to set up the Provenance system to run alongside Dataverse so that researchers using my installation can track and view provenance.