Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dockerize basgra #2566

Merged
merged 21 commits into from
Apr 24, 2020
Merged

Dockerize basgra #2566

merged 21 commits into from
Apr 24, 2020

Conversation

istfer
Copy link
Contributor

@istfer istfer commented Apr 11, 2020

Description

I want to work with BASGRA in the docker version. As explained here, BASGRA code lives in the PEcAn package.

I made these changes and rebuild the images locally (successfully). Not sure if it means everything will work as expected, but I don't know how to proceed to test.

Review Time Estimate

  • Immediately
  • Within one week
  • When possible

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the CHANGELOG.md.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@istfer istfer requested a review from robkooper April 11, 2020 13:00
@istfer
Copy link
Contributor Author

istfer commented Apr 11, 2020

Q: Documentation says "It is important values for type and version are set correct" does this mean they should be same as DB entry of non-docker versions, or should they just be consistent within themselves?

If the former, I had BASGRA_N_v1.0 as a version, and docker complains that it must be lowercase:

invalid argument "pecan/model-basgra-BASGRA_N_v1.0:testing" for "-t, --tag" flag: invalid reference format: repository name must be lowercase

In that case, original record is on BU servers, should I change the record there and wait for it to propagate and/or change locally?

@robkooper
Copy link
Member

Looks like M models/basgra/man/run_BASGRA.Rd has changed:

diff --git a/models/basgra/man/run_BASGRA.Rd b/models/basgra/man/run_BASGRA.Rd
index dcd2ae9..d8a2e58 100644
--- a/models/basgra/man/run_BASGRA.Rd
+++ b/models/basgra/man/run_BASGRA.Rd
@@ -4,8 +4,16 @@
 \alias{run_BASGRA}
 \title{run BASGRA model}
 \usage{
-run_BASGRA(run_met, run_params, site_harvest, start_date, end_date, outdir,
-  sitelat, sitelon)
+run_BASGRA(
+  run_met,
+  run_params,
+  site_harvest,
+  start_date,
+  end_date,
+  outdir,
+  sitelat,
+  sitelon
+)
 }
 \arguments{
 \item{run_met}{path to CF met}

docker.sh Outdated Show resolved Hide resolved
istfer and others added 7 commits April 12, 2020 10:10
Co-Authored-By: Rob Kooper <kooper@illinois.edu>
Co-Authored-By: Rob Kooper <kooper@illinois.edu>
Co-Authored-By: Rob Kooper <kooper@illinois.edu>
@istfer
Copy link
Contributor Author

istfer commented Apr 12, 2020

@infotroph do I need to change Rcheck_reference.log to pass travis?

./scripts/time.sh " models/basgra" Rscript scripts/check_with_errors.R models/basgra
make__models_basgra
1 warnings found in models/basgra.
1 notes found in models/basgra.
── R CMD check comparison───────────── PEcAn.BASGRA 1.7.1 / 1.7.1 ────
Status: BROKEN
── Still failing
✖ checking compilation flags in Makevars ... WARNING
── Newly failing
✖ checking top-level files ... NOTE
R check of models/basgra reports the following new problems. Please fix these and resubmit:
checking top-level files ... NOTE
Non-standard files/directories found at top level:
  ‘Dockerfile’ ‘model_info.json’
Error: Please fix these and resubmit.
Execution halted
Makefile:139: recipe for target '.check/models/basgra' failed
make: *** [.check/models/basgra] Error 1
make: *** Waiting for unfinished jobs....
5 warnings found in modules/uncertainty.
3 notes found in modules/uncertainty.
Function make__modules_uncertainty takes 0 min 38 sec
echo `date` > .check/modules/uncertainty
The command "scripts/travis/script.sh" exited with 1.
cache.2
store build cache

@istfer
Copy link
Contributor Author

istfer commented Apr 12, 2020

@robkooper thanks a lot for the review and suggestions. I am now able to run docker.sh with the BASGRA_N_v1.0 version.

As I didn't specify any IMAGE_VERSION while running docker.sh, images were built with testing tag. I set PECAN_VERSION=testing in .env file, added basgra to docker-compose and had a basgra container up pecan/model-basgra-basgra_n_v1.0:testing

But I'm still not sure how to run the model in practice. It doesn't show up on the web interface. Documentation says "The PEcAn code will use these to register the model with the BETY database" but at which stage this happens? (I didn't rebuild bety image, so all my other containers have testing tag but bety has latest tag) When I go into the pecan/bety:latest container and run the code below, I still don't see basgra for example:

psql -d bety -h postgres -p 5432 -U postgres -c "SELECT * FROM models LEFT JOIN dbfiles ON dbfiles.container_id = models.id LEFT JOIN machines ON dbfiles.machine_id = machines.id WHERE machines.hostname = 'docker';"

@istfer
Copy link
Contributor Author

istfer commented Apr 12, 2020

Is this the correct way to add basgra to docker-compose.yml?

 # PEcAn basgra model runner
  basgra:
    image: pecan/model-basgra-basgra_n_v1.0:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
       - rabbitmq
    volumes:
      - pecan:/data

Or is this not the way to go at all, beacuse basgra code is compiled with pecan code?

@robkooper
Copy link
Member

That should be correct. For the model registration to work you need the following pieces running:

  • the model container (duh)
  • rabbitmq
  • monitor
    -postgres with BETY database

Each model will send out a heartbeat every 5 minutes using rabbutnq. This heartbeat contains the model_info.json. Anybody can listen to these heartbeats and see what models are active.
The monitor container will listen for these heartbeats and create a GUI of all models it has seen, as well as register the model with the BETY database.

So from startup to actual registration it might take about 5 minutes, you should be able to see all models as well at: http://server:9999, or http://server/monitor

@infotroph
Copy link
Member

infotroph commented Apr 12, 2020

do I need to change Rcheck_reference.log to pass travis?

I think you can handle the new complaints by creating an .Rbuildignore file (same principle and similar syntax as .gitignore) that lists Dockerfile and model_info.json as files to leave out when building the R package.

...While I'm thinking about it: Are we expecting all models to have a model_info.json going forward and will it live in the package root (rather than inst/ like register.*.xml did)? If so we should add samples to models/template/.

@istfer
Copy link
Contributor Author

istfer commented Apr 12, 2020

@infotroph thanks, I'll try to fix that tomorrow

@robkooper thanks, looks like my monitor currently keeps restarting, meaning it keeps going down? I'll try to see why (hint: it still happens when I remove basgra section from my docker-compose.yml, but it restarts and stays up when I comment out PECAN_VERSION=testing in the .env file, I then re-added basgra while PECAN_VERSION=testing remains commented out but hardcoded image: pecan/model-basgra-basgra_n_v1.0:testing in the docker-compose.yml, this way it shows up on the monitor but still not on the UI dropdown... what might have gone wrong for monitor when I ran ./docker.sh locally ), but if you have any ideas I'd love to hear it.

Screenshot from 2020-04-12 19-17-12

@robkooper
Copy link
Member

What do you see when you type docker-compose logs monitor?

@istfer
Copy link
Contributor Author

istfer commented Apr 13, 2020

Here are the log files in my cases:

OK.log: when monitor container stays up (but there are still errors, e.g. Error adding model to database)

NOTOK.log: when monitor keeps restarting. I guess in this one the main problem is:

  File "monitor.py", line 250, in insert_model
    if conn is not None:
UnboundLocalError: local variable 'conn' referenced before assignment

Looking at previous version of monitor.py there used to be a conn = None line. So I went ahead and inserted conn = None in monitor.py-Line150 and it seemed to solve this particular problem.

FIX.log is what I get after conn = None fix

Not sure what to do about this invalid port number issue:

monitor_1        | 2020-04-13 09:12:51,256 [MainThread     ] ERROR   : root - Error adding model to database
monitor_1        | Traceback (most recent call last):
monitor_1        |   File "monitor.py", line 155, in insert_model
monitor_1        |     conn = psycopg2.connect(postgres_uri)
monitor_1        |   File "/usr/local/lib/python3.5/site-packages/psycopg2/__init__.py", line 130, in connect
monitor_1        |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
monitor_1        | psycopg2.OperationalError: invalid port number: "postgres"

Not sure if relevant but this is what I had in my docker-compose.override.yml:

  postgres:
    ports:
      - 7654:5432

Also could be related, I'm not sure why, but after I rebuilt images my bety container name became something like 59e40a881709_pecan_postgres_1 with hash in the front:
59e40a881709 mdillon/postgis:9.5 "docker-entrypoint.s…" 3 days ago Up About a minute 0.0.0.0:7654->5432/tcp 59e40a881709_pecan_postgres_1

Btw, is this a typo? PGHOST=posgres --> PGHOST=postgres

@istfer
Copy link
Contributor Author

istfer commented Apr 14, 2020

to work around this psycopg2.OperationalError: invalid port number: "postgres" problem, I hardcoded the port in this line:

postgres_port = os.getenv('PGPORT', 5432)

recreated the image (./docker.sh) and it worked, at least basgra showed up from the dropdown menu. Then I started to deal with this permission issues and couldn't test further but I'll get back to it once bring my docker environment back up.

But where is the code supposed to get this PGPORT from? one of the config files? .env file? The only place I see it being set up is here. Also, was the line in monitor.py above supposed to have 5432 all along?

@istfer
Copy link
Contributor Author

istfer commented Apr 14, 2020

After going back to original volume and keeping the changes I made to the monitor.py, I created the basgra image and rebuilt all the images and the whole stack.

Now basgra show up in the drop down menu, record gets inserted into the DB (I can now check it from the bety web interface as well 👍)

I got some errors in the met.process which may or maynot be related to docker, I'll dig deeper later.

But then put some already processed met files to an acessible path, just to move along the workflow.
I ended up with this error in the run stage:

DEBUG [start.model.runs] : JOB.SH submit status: python3: can't open file '/work/sender.py': [Errno 2] No such file or directory 

there is no sender.py under /work indeed, hmmm what might have gone wrong?

edit: it's a bit weird, I was trying to run it through RStudio, and when I navigate there to /work there is no sender.py, bu when I go inside the executor on terminal (docker exec -it pecan_executor_1 /bin/bash) and navigate to /work I can see sender.py. Both services have the same volumes mounted in the docker-compose.yml and docker-compose.override.yml

@robkooper robkooper mentioned this pull request Apr 14, 2020
@robkooper
Copy link
Member

sender.py should come from pecan/models docker image.

@istfer
Copy link
Contributor Author

istfer commented Apr 15, 2020

I am not sure if I follow what you mean by

sender.py should come from pecan/models docker image

I have a successful run when I give the same exact .xml through the web interface, then I open Rstudio, navigate to the output dir, e.g. /data/workflows/PEcAn_15000000006, then I read the pecan.CONFIGS.xml to give another run through RStudio and can't. Could be another permission issue, looking at /work folder and PEcAn_15000000006 directory, their contents are owned by the root but I'm trying to use RStudio as carya

@istfer
Copy link
Contributor Author

istfer commented Apr 17, 2020

I guess this PR in itself is complete, in practice I was able to get a successful run from the web interface. Looks like permission issues need to have their own PR.

This PR also fixes and closes the monitor bugs (if these are the fixes we want)

One question tho: should I leave basgra in the docker-compose.yml? Will the basgra image be available once this PR goes in or should I push it somehow separately? (currently I built the image locally) I also don't see biocro on the docker-compose.yml

@istfer istfer changed the title WIP: Dockerize basgra Dockerize basgra Apr 17, 2020
@@ -258,6 +258,19 @@ services:
# PEcAn models, list each model you want to run below
# ----------------------------------------------------------------------

# PEcAn basgra model runner
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is fine to have it here since the image will automagically be build by the docker.sh file that is part of the pull request.

# Setup model_info file
COPY model_info.json /work/model.json
RUN sed -i -e "s/@VERSION@/${MODEL_VERSION}/g" \
-e "s#@BINARY@#/usr/local/lib/R/site-library/PEcAn.BASGRA/libs/PEcAn.BASGRA.so#g" /work/model.json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably want to figure out a better way to do the binaries for models that are in R.

],
"inputs": {},
"bibtex": [
"Hoglind M, van Oijen M, Cameron D, Persson T (2016) Process-based simulation of growth and overwintering of grassland using the BASGRA model. Ecological Modelling 335:1-15.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glad this is used, need to figure out a better way to show this in the interface in the future.

Copy link
Member

@robkooper robkooper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an entry to ChangeLog.md, otherwise it is good to go.

@robkooper
Copy link
Member

@istfer Going to squash the pull request and merge it if that is ok so we end up with a single commit instead of many commits.

@istfer
Copy link
Contributor Author

istfer commented Apr 24, 2020

that's perfectly fine

@robkooper robkooper merged commit cdb94d4 into PecanProject:develop Apr 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants