ansible install question - seems to be failing on last glassfish4 step #9

Closed
jmjamison opened this issue Nov 1, 2017 · 69 comments

Comments

@jmjamison

I've been able to use the vagrant install with no trouble. But running the ansible script I do get a fatal error on the last glassfish step.

fatal: [dataverse]: FAILED! => {"changed": true, "cmd": "unzip -d /tmp /tmp/glassfish-4.1.zip", "delta": "0:00:00.109161", "end": "2017-10-31 23:23:59.418016", "failed": true, "rc": 1, "start": "2017-10-31 23:23:59.308855", "stderr": "replace /tmp/glassfish4/bin/asadmin? [y]es, [n]o, [A]ll, [N]one, [r]ename: NULL\n(EOF or read error, treating as "[N]one" ...)", "stdout": "Archive: /tmp/glassfish-4.1.zip\n creating: /tmp/glassfish4/.org.opensolaris,pkg/download/\n creating: /tmp/glassfish4/.org.opensolaris,pkg/file/", "stdout_lines": ["Archive: /tmp/glassfish-4.1.zip", " creating: /tmp/glassfish4/.org.opensolaris,pkg/download/", " creating: /tmp/glassfish4/.org.opensolaris,pkg/file/"], "warnings": ["Consider using unarchive module rather than running unzip"]}

I can log in and check all the services in the readme document but am unable to log into the dataverse app (as I did with the vagrant install on my laptop).

Thanks,
Jamie Jamison
UCLA Social Science Data Archive
jamison@library.ucla.edu

@pdurbin
Member

pdurbin commented Nov 1, 2017

I'm glad to hear the vagrant install works, at least. 😄

I added a note to myself to ping @donsizemore in IRC tomorrow. Maybe @pameyer too.

@donsizemore
Contributor

it'll be an ansible version thing - which is why it works in the vagrant box but not in a bare OS. probably time to update the vagrant image and deal with whatever breaks.

@pdurbin
Member

pdurbin commented Nov 2, 2017

Oh, I forgot that this repo has a Vagrantfile too. I assumed @jmjamison was talking about the Vagrantfile from the main Dataverse repo.

@jmjamison while this gets sorted out, if you're interested, I'm looking for people to try out NDS Labs Workbench and leave comments with how it went at IQSS/dataverse#4152 😄

@jmjamison
Author

jmjamison commented Nov 2, 2017

Both vagrant and ansible are on the same repo. And I'll try out IQSS/dataverse#4152.

@donsizemore
Contributor

@jmjamison Just out of curiosity, what version of Ansible are you using? EPEL 7 has bumped to 2.4.0, Ubuntu 16.04 provides 2.0.0. Just trying to gauge which deprecated modules to ignore ;->

@donsizemore
Contributor

Hmmm. works for me using EPEL-provided ansible-2.4.0

@jmjamison
Author

I'm using ansible 2.2.1.0 on Ubuntu. Maybe I should update ansible.

@jmjamison
Author

I've updated Ansible to 2.4.1.0. I tried rerunning the install and got this error:
fatal: [dataverse]: FAILED! => {"changed": true, "cmd": "unzip -d /tmp /tmp/glassfish-4.1.zip", "delta": "0:00:00.103068", "end": "2017-11-03 20:59:00.779011", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2017-11-03 20:59:00.675943", "stderr": "replace /tmp/glassfish4/bin/asadmin? [y]es, [n]o, [A]ll, [N]one, [r]ename: NULL\n(EOF or read error, treating as "[N]one" ...)", "stderr_lines": ["replace /tmp/glassfish4/bin/asadmin? [y]es, [n]o, [A]ll, [N]one, [r]ename: NULL", "(EOF or read error, treating as "[N]one" ...)"], "stdout": "Archive: /tmp/glassfish-4.1.zip\n creating: /tmp/glassfish4/.org.opensolaris,pkg/download/\n creating: /tmp/glassfish4/.org.opensolaris,pkg/file/", "stdout_lines": ["Archive: /tmp/glassfish-4.1.zip", " creating: /tmp/glassfish4/.org.opensolaris,pkg/download/", " creating: /tmp/glassfish4/.org.opensolaris,pkg/file/"]}

I'll go over this and see if I can figure out the new error.

@jmjamison
Author

The task it seems to be failing at is: < TASK [dataverse : unzip glassfish] >

I also get a warning before the fatal message:
[WARNING]: Consider using unarchive module rather than running unzip

@pameyer

pameyer commented Nov 3, 2017

@jmjamison it looks to me like the zip file was already unzipped into the tmp directory on a previous run. The ansible task doesn't check whether the directory already exists before running, so unzip asks whether to overwrite and fails waiting on input that never comes.

So possibly adding -o to the current unzip -d /tmp ... would help; or removing any /tmp/glassfish* directories.
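
A minimal sketch of both suggestions, assuming a POSIX shell on the target host and the paths shown in the error output. `remove_stale_dir` is a hypothetical helper name used only for illustration; it is not part of the playbook.

```shell
#!/bin/sh
# remove_stale_dir DIR: delete a leftover extraction directory so that a
# later unzip never stops to ask whether it may overwrite existing files.
remove_stale_dir() {
    dir="$1"
    if [ -d "$dir" ]; then
        rm -rf -- "$dir"
        echo "removed $dir"
    else
        echo "nothing to remove at $dir"
    fi
}

# Usage on the target host (paths taken from the error output above):
#   remove_stale_dir /tmp/glassfish4
#   unzip -o -d /tmp /tmp/glassfish-4.1.zip   # -o: overwrite without prompting
```

Either step alone should avoid the prompt; doing both simply makes a rerun start from a known state.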

@jmjamison
Author

I removed the /tmp/glassfish directories and tried again. It almost worked but found another instance of the GlassFish server, so I hadn't cleaned up everything. Will do so and try again, though I think updating my version of ansible was all that was needed.

@jmjamison
Author

jmjamison commented Nov 3, 2017

I'm running into a problem with port 4848. I've tried changing that in the domain.xml file which didn't work. Next I'm going to see if I can change that in AWS and install again.

tcp6 0 0 :::4848 :::* LISTEN -

@jmjamison
Author

On a fresh aws instance I'm still failing at Glassfish:
fatal: [dataverse]: FAILED! => {"changed": true, "cmd": "nohup /usr/local/glassfish4/bin/asadmin deploy /tmp/dvinstall/dataverse.war", "delta": "0:02:26.683220", "end": "2017-11-04 00:46:19.298230", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2017-11-04 00:43:52.615010", "stderr": "Remote server does not listen for requests on [localhost:4848]. Is the server up?\nUnable to get remote commands. \nClosest matching local command(s): \n help", "stderr_lines": ["Remote server does not listen for requests on [localhost:4848]. Is the server up?", "Unable to get remote commands. ", "Closest matching local command(s): ", " help"], "stdout": "Command deploy failed.", "stdout_lines": ["Command deploy failed."]}

@pdurbin
Member

pdurbin commented Nov 4, 2017

Is the server up?

If it helps, the way the Dataverse installer checks if Glassfish is up or not is with asadmin list-domains: https://github.com/IQSS/dataverse/blob/v4.8.1/scripts/installer/install#L1057
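
A rough sketch of that kind of check, written as a filter over the `asadmin list-domains` output. It assumes the output has one line per domain such as `domain1 running` or `domain1 not running`, and that the domain is called `domain1`; `domain_running` is a hypothetical helper name, not something the installer provides.

```shell
#!/bin/sh
# domain_running NAME: succeed if standard input contains "NAME running".
# Assumes list-domains prints one line per domain, for example
# "domain1 running" or "domain1 not running".
domain_running() {
    awk -v name="$1" '
        $1 == name && $2 == "running" { found = 1 }
        END { exit !found }
    '
}

# Usage on the server (asadmin path taken from the error output above):
#   /usr/local/glassfish4/bin/asadmin list-domains | domain_running domain1 \
#       && echo "Glassfish is up"
```

Matching on the second field rather than searching for the word "running" keeps a "not running" line from being counted as up.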

@jmjamison I'm wondering if Ansible is a hard requirement for you. I'm all for reproducible, but as a first cut maybe you could go through the vanilla non-Ansible process described in the Installation Guide at http://guides.dataverse.org

Just a thought. I'm also wondering if you got a chance to try out that NDS Labs Workbench. 😄

@jmjamison
Author

Actually we do need a scriptable install. Sorry, but I haven't gotten too far yet with NDS Labs Workbench.

@pdurbin
Member

pdurbin commented Nov 7, 2017

@jmjamison that's cool. There's a non-interactive mode described at http://guides.dataverse.org/en/4.8/installation/installation-main.html if that's of interest.

@jmjamison
Author

Progress - a new error! Got past the Glassfish problems. From reading posts it seems that I needed to add my hostname to the /etc/hosts file. That worked. Then I got:
fatal: [dataverse]: FAILED! => {"changed": true, "cmd": "unzip -d /tmp /tmp/dvinstall.zip", "delta": "0:00:00.021204", "end": "2017-11-08 21:14:58.360153", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2017-11-08 21:14:58.338949", "stderr": "replace /tmp/dvinstall/data/dv-uma-sub2.json? [y]es, [n]o, [A]ll, [N]one, [r]ename: NULL\n(EOF or read error, treating as "[N]one" ...)", "stderr_lines": ["replace /tmp/dvinstall/data/dv-uma-sub2.json? [y]es, [n]o, [A]ll, [N]one, [r]ename: NULL", "(EOF or read error, treating as "[N]one" ...)"], "stdout": "Archive: /tmp/dvinstall.zip", "stdout_lines": ["Archive: /tmp/dvinstall.zip"]}
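
For the /etc/hosts step mentioned in this comment, a minimal sketch of adding the entry only when it is missing, so that repeated runs do not pile up duplicate lines. The helper name `ensure_hosts_entry` is hypothetical, and which address the hostname should map to is an assumption; the thread does not say.

```shell
#!/bin/sh
# ensure_hosts_entry FILE IP NAME: append "IP NAME" to FILE unless NAME
# already appears as a hostname on a non-comment line.
ensure_hosts_entry() {
    hosts_file="$1"
    ip="$2"
    name="$3"
    if awk -v name="$name" '
        $1 ~ /^#/ { next }    # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$hosts_file"; then
        return 0    # entry already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage as root (the address to map is an assumption, adjust as needed):
#   ensure_hosts_entry /etc/hosts 127.0.0.1 "$(hostname)"
```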

@donsizemore
Contributor

@jmjamison the ansible dataverse role isn't idempotent as ansible aims to be, mostly because the glassfish / dataverse installation process is entirely different from subsequent upgrades. are you wiping out /usr/local/glassfish4 or its equivalent in between runs? the role pretty much expects a clean OS.

@jmjamison
Author

jmjamison commented Nov 8, 2017 via email

@jmjamison
Author

Given the problems I'm going to try to install directly on the server (localhost) rather than ssh. Has anyone been able to get the script to run successfully that way?

@pdurbin
Member

pdurbin commented Nov 9, 2017

Given the problems I'm going to try to install directly on the server (localhost) rather than ssh.

I'm not sure what you mean by "rather than ssh". It's quite common to ssh into a server and run the Dataverse installer. Are you talking about ssh vs. attaching directly to a physical console (KVM)? It shouldn't matter.

@jmjamison
Author

This is the task where I'm having the problem.
First this:
/ TASK [dataverse : install glassfish upstart script for
\ Debian/Ubuntu] /

    \   ^__^
     \  (oo)\_______
        (__)\       )\/\
            ||----w |
            ||     ||

- name: install glassfish upstart script for Debian/Ubuntu
  template: src=glassfish.conf.j2 dest=/etc/init
            owner=root group=root mode=0644
  when: ansible_os_family == "Debian"
  tags: glassfish

skipping: [dataverse] => {"changed": false, "skip_reason": "Conditional result was False", "skipped": true}

I presume it is false in that I'm installing on RedHat, then
/ TASK [dataverse : install glassfish upstart script for
\ Debian/Ubuntu] /

    \   ^__^
     \  (oo)\_______
        (__)\       )\/\
            ||----w |
            ||     ||

skipping: [dataverse] => {"changed": false, "skip_reason": "Conditional result was False", "skipped": true}
which runs:
- name: start glassfish with asadmin so subsequent Ansible-initiated restarts succeed on RedHat/CentOS
  become: yes
  become_user: "{{ dataverse.glassfish.user }}"
  shell: "nohup {{ dataverse.glassfish.root }}/bin/asadmin start-domain"
  when: ansible_os_family == "RedHat" and
        ansible_distribution_major_version == "7"
  tags: glassfish

and gives this ubiquitous error:
- name: start glassfish with asadmin so subsequent Ansible-initiated restarts succeed on RedHat/CentOS
  become: yes
  become_user: "{{ dataverse.glassfish.user }}"
  shell: "nohup {{ dataverse.glassfish.root }}/bin/asadmin start-domain"
  when: ansible_os_family == "RedHat" and
        ansible_distribution_major_version == "7"
  tags: glassfish

That's the part I'm still trying to research.

@pdurbin
Member

pdurbin commented Nov 14, 2017

@donsizemore what's with all the cowsay? 😄

@jmjamison
Author

Sorry, I need to output more before posting.

@pdurbin
Member

pdurbin commented Nov 14, 2017

@jmjamison it's all good. Did you see my comment the other day about how the Dataverse installer has a non-interactive mode? @pameyer developed it and I asked him about it. My understanding is that he mostly uses Ansible (his own code, not this repo) to install prerequisites like Postgres and such.

I'm not sure if this comment is helpful or not. In Vagrant I do all my setup with shell scripts ( https://github.com/IQSS/dataverse/tree/v4.8.2/scripts/vagrant ). I haven't really gotten into Ansible yet (but I did use Puppet back in the day).

@pdurbin
Member

pdurbin commented Nov 14, 2017

@jmjamison I was just chatting a bit with @donsizemore at http://irclog.iq.harvard.edu/dataverse/2017-11-14#i_60164 (you're welcome to join us at http://chat.dataverse.org !) and I think the fundamental problem might be Ubuntu. Would you be able to try on CentOS? As a project, Dataverse has committed resources getting Dataverse to run on RHEL/CentOS but not Debian/Ubuntu (please see IQSS/dataverse#1059 ). @donsizemore is a member of the community and has made a valiant attempt to support Ubuntu in this Ansible playbook but as he said in IRC just now (see link above), "i never coded it for Ubuntu/Debian. the Readme.md says CentOS 7 and means it".

I will say that there are people out in the community running Dataverse on Ubuntu, perhaps in production. The best way to get in touch with them is to ask at https://groups.google.com/forum/#!forum/dataverse-community . I hope this helps and that it's not too much of a downer that Dataverse supports RHEL/CentOS rather than supporting both RHEL/CentOS and Ubuntu. We had to pick one due to limited resources. 😞

@jmjamison
Author

I tried out Ubuntu to see if there was a difference. Mostly I've been trying this on RHEL 7 but I will try CentOS.

@pdurbin
Member

pdurbin commented Nov 14, 2017

@jmjamison oh! You started with RHEL and it didn't work and then you tried Ubuntu? I'm surprised it didn't work on RHEL.

@jmjamison
Author

Worked ok on Vagrant but not Ansible. Problem with Glassfish. I still think it might be something I'm doing.

@pdurbin
Member

pdurbin commented Nov 15, 2017

@jmjamison do you think doing a standard install will help you better understand and troubleshoot the Ansible scripts? I'm sure @donsizemore would welcome a pull request if you figure out a solution.

@pameyer

pameyer commented May 24, 2018

When the script was paused, could you get a response at 8080 with the default "welcome to glassfish message"?
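
Instead of a fixed pause, the check could be polled until it succeeds. The sketch below is only an illustration: `retry` is a hypothetical helper, and the URL in the usage comment assumes Glassfish answers on localhost port 8080.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds, waiting
# DELAY seconds between tries; give up after ATTEMPTS tries.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Usage: wait up to about five minutes for a response on port 8080
# (host and URL are assumptions):
#   retry 30 10 curl -f -s -o /dev/null http://localhost:8080/
```

A loop like this returns as soon as the server answers and still fails clearly if it never does.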

@jmjamison
Author

So far I'd only tried 1 minute and will try 180 next.

Haven't checked for the default welcome. I have to pack up but will try both tomorrow.

@jmjamison
Author

I've tried up to 5 minutes and it is still failing the same way.

@pdurbin
Member

pdurbin commented May 24, 2018

Bah! Any ideas, @donsizemore or @danschmidt5189 ?

@pdurbin
Member

pdurbin commented Jun 5, 2018

@jmjamison we're having trouble reproducing this but we've been talking about it in IRC the last couple days.

@pdurbin
Member

pdurbin commented Jun 8, 2018

@jmjamison hi, just a heads up that dataverse-ansible now installs Dataverse 4.9 if you want to give it another try. Also, you're welcome to join me, @donsizemore and @pameyer at http://chat.dataverse.org some day if you're still having trouble. We're all on Eastern time.

@jmjamison
Author

jmjamison commented Jul 19, 2018

It finally ran to completion once. One issue I have run into is that Glassfish by default looks in the applications folder for the war file. So in the dataverse-install.yml file I added a line to copy the war file over to the glassfish applications directory. Given my inexperience with Glassfish I didn't want to start changing its configuration.

So far I'm back to different Glassfish problems. It may be that I'm not getting everything cleaned out correctly between playbook runs.

@jmjamison
Author

Also, ran into an odd problem today: http://guides.dataverse.org/en/latest/_static/installation/files/issues/2180/grizzly-patch/glassfish-grizzly-extra-all.jar does not seem to be reachable. I need to see if it is our connection that's causing this.

@pdurbin
Member

pdurbin commented Jul 20, 2018

http://guides.dataverse.org/en/latest/_static/installation/files/issues/2180/grizzly-patch/glassfish-grizzly-extra-all.jar does not seem to be reachable.

This is due to https://rce-docs.hmdc.harvard.edu/event/rce-and-all-other-hmdc-servers-down-due-planned-building-power-outage and it should be back up in 24 hours or so.

Not sure what to say about cleaning up between runs. Sounds like progress!

@jmjamison
Author

That was it. I didn't realize there was an outage. The vagrant install works now but (don't laugh) I have to figure out what the admin login is. Otherwise vagrant works.

For the ansible install I still have to work out the glassfish start or restart.

@pdurbin
Member

pdurbin commented Jul 20, 2018

@jmjamison no problem. Please see http://guides.dataverse.org/en/4.9.1/installation/installation-main.html#logging-in for the login:

  • username: dataverseAdmin
  • password: admin

@jmjamison
Author

Thank you

@jmjamison
Author

jmjamison commented Jul 23, 2018

I've been able to get the glassfish install to run. But I have to manually kill the glassfish process and delete the files mentioned in the documentation for rerunning install. It doesn't hurt to reboot between reruns.

Now running into solr issues. The configsets directory seems to be missing from the solr 4.6.0 zip file. The install fails when /usr/local/solr/server/solr/configsets/_default is copied to /usr/local/solr/server/solr/collection1.

Also, I am not sure that 4.6.0 is even the correct version to be using but that is what the script is downloading. 4.6.0 is the version listed in my ~/defaults/main.yml file. I should probably do a pull from here since my fork may be out of date.

@pdurbin
Member

pdurbin commented Jul 24, 2018

@jmjamison yeah, I would suggest doing a pull because @donsizemore already upgraded to the newer version of Solr as part of #17.

@jmjamison
Author

Updated the repo. I'm back to glassfish problems - failing because there is another service using port 4848. I can't find anything that might be holding the port. Tried both restart and start. I'm going to have to do more glassfish research.

@pameyer

pameyer commented Jul 24, 2018

4848 should be the glassfish admin port

@jmjamison
Author

Yes, when I've had the glassfish install run to completion, that's the port I use to log into the web interface.
What I get from netstat is:
tcp6 0 0 :::4848 :::* LISTEN 7014/java
The 7014 is:
root 7014 0.1 17.0 4220828 172700 ? Sl Jul23 1:20 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-8.b10.el7
That would be the java that ansible installs?

By the way, do you prefer posting here or the irc channel?

@pameyer

pameyer commented Jul 24, 2018

Sometimes glassfish doesn't stop cleanly when asked to (systemd, init or asadmin) - I usually check the ps output and kill it if necessary.
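
A small sketch of that check, assuming `netstat -tlnp` style lines like the one quoted earlier in this thread. `pid_on_port` is a hypothetical helper name; it only parses text from standard input and does not kill anything itself.

```shell
#!/bin/sh
# pid_on_port PORT: read netstat -tlnp style lines from standard input and
# print the PID of each process listening on PORT.
pid_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, parts, "/")    # field 7 looks like "7014/java"
            print parts[1]
        }
    '
}

# Usage as root on the server:
#   pid=$(netstat -tlnp | pid_on_port 4848)
#   ps -f -p "$pid"     # confirm it is the leftover Glassfish java process
#   kill "$pid"
```

If the printed value is `-` instead of a number, netstat was most likely run without enough privileges to see the owning process.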

I'm usually on irc (along with other folks who might have better suggestions)

@jmjamison
Author

Actually I do check and kill it manually but this time I can't find it.

@pameyer

pameyer commented Jul 24, 2018

@jmjamison I might be interpreting the netstat output incorrectly, but it looks to me like it's listening on an IPv6 address. Would it be practical for you to disable IPv6 (aka - just use IPv4) on the system for troubleshooting?

@jmjamison
Author

I have to do some research. I've never done that, so I will look up disabling IPv6 and try that.

@pdurbin
Member

pdurbin commented Jul 25, 2018

@jmjamison hi! Sorry I missed you at http://irclog.iq.harvard.edu/dataverse/2018-07-24#i_70628 but my reply was "Are you still on Solr 4.x? collection1 exists out of the box on Solr 4.x. In Solr 7.x we have to create it."

@jmjamison
Author

jmjamison commented Jul 26, 2018

I'm now using the latest. I've updated upstream so am up to date I believe. Also, disabled ipv6 which seems to be helping or solving the glassfish problem.

Solr version is 7.3. It's failing on the last step in dataverse-solr.yml (120-21). It creates Collection1, then fails because it is not able to delete it.

It appears that I have to be sure to manually kill both glassfish and solr between reruns.

Moving forward - now I'm failing on the deploy step.

@jmjamison
Author

jmjamison commented Aug 1, 2018

I've tried deploying the war file from the glassfish web interface and on the command line as well as the script. It looks like glassfish crashes while deploying though I'm able to start it back up.

Tried downloading the dataverse installation zip file and running that. The error for that is that it can't connect to the local instance of PostgreSQL as admin user. Postgres needs ipv6, but I've disabled ipv6 so I could get glassfish to install. For tomorrow I guess I'll try re-enabling ipv6 (since glassfish is up) and go from there.

@jmjamison
Author

OK, problem solved (in an embarrassing manner): memory. I did not read the fine manual carefully enough. Vagrant ran fine because the script sets up the necessary memory and my laptop has more than enough. This was my first time working on AWS and I didn't choose a large enough server. The install worked on a t2.large.
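
A minimal sketch of a pre-flight memory check that could be run before the playbook, assuming a Linux host with /proc/meminfo. The helper name `has_min_memory_kb` is hypothetical, and the threshold in the usage comment is an illustrative number, not a figure from the Dataverse documentation.

```shell
#!/bin/sh
# has_min_memory_kb MEMINFO_FILE REQUIRED_KB: succeed if the MemTotal value
# in MEMINFO_FILE (in kB, as in /proc/meminfo) is at least REQUIRED_KB.
has_min_memory_kb() {
    meminfo="$1"
    required_kb="$2"
    total_kb=$(awk '/^MemTotal:/ { print $2 }' "$meminfo")
    [ -n "$total_kb" ] && [ "$total_kb" -ge "$required_kb" ]
}

# Usage (the threshold is illustrative only; check the Installation Guide
# for the real requirement):
#   has_min_memory_kb /proc/meminfo 4000000 \
#       || echo "This instance is probably too small" >&2
```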

@pdurbin
Member

pdurbin commented Aug 6, 2018

@jmjamison great news! I saw at http://irclog.iq.harvard.edu/dataverse/2018-08-02#i_70858 that you're having trouble uploading files, though. Do you still need any help?

@pdurbin
Member

pdurbin commented Aug 6, 2018

@jmjamison never mind! I see that you opened #20 about the file upload problem.

pallinger pushed a commit to dsd-sztaki-hu/dataverse-ansible-deprecated that referenced this issue Jan 28, 2022