Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use dockerized Fuseki for PHPUnit tests #1496

Merged
merged 5 commits into from
Sep 1, 2023

Conversation

osma
Copy link
Member

@osma osma commented Aug 25, 2023

Reasons for creating this PR

Modernizing the test suite. See #1488

Link to relevant issue(s), if any

Description of the changes in this PR

See commit 817dd97 for the changes that are specific to this PR (not just copied from #1495).

  • rewritten init_fuseki.sh to use docker compose to start containers (instead of downloading, installing and starting up Fuseki from a tarball distribution) and regular curl commands for loading the vocabularies (instead of s-put that comes with the Fuseki tarball).
  • changed the PHPUnit tests to use the dockerized Fuseki at http://localhost:9030/skosmos/sparql instead of http://localhost:13030/skosmos-test/sparql that was used in the previous configuration.
  • Fuseki configuration (for the container) changed to enable unionDefaultGraph, because it is currently required by Skosmos for some functionality (see The REST API types query relies on union default graph #678) and some tests will fail if it's not enabled.
  • adjusted GitHub Actions CI configuration a little (no need to use source for running init_fuseki.sh anymore)

Known problems or uncertainties in this PR

  1. After starting up containers, the new init_fuseki.sh script sleeps for 5 seconds before trying to load vocabularies. I had to add this as a stopgap because Fuseki wasn't always immediately ready to handle the queries. But I think a more elegant mechanism would be needed here, for example waiting in a loop that checks if Fuseki is ready.
  2. The docker compose command used here starts up three containers (skosmos, skosmos-cache, fuseki) but the PHPUnit tests only need the fuseki container. However, we will soon need the others as well for running Cypress tests (see e.g. Run cypress #1479), so I think it's not too bad.
  3. One of the tests is failing. This is JenaTextSparql::testQueryConceptsAlphabeticalOrderBy which checks for locale-aware collation in Jena SPARQL queries. It seems to me that collation is not currently working in the dockerized Fuseki - perhaps locale data is missing from the container?

I tested this also directly with the example query from the Jena ARQ Collation documentation. Basically I put this in collation.rq:

PREFIX arq: <http://jena.apache.org/ARQ/function#>
SELECT ?label WHERE {
    VALUES ?label { "tsahurin kieli"@fi "tšekin kieli"@fi "tulun kieli"@fi "töyhtöhyyppä"@fi }
}
ORDER BY arq:collation("fi", ?label)

then ran it against the Fuseki Docker container using rsparql (from Jena) and the result was this:

$ rsparql --query collation.rq --service http://localhost:9030/skosmos/sparql
-----------------------
| label               |
=======================
| "töyhtöhyyppä"@fi   |
| "tsahurin kieli"@fi |
| "tšekin kieli"@fi   |
| "tulun kieli"@fi    |
-----------------------

This is not the correct, locale-aware order; tsahurin kieli should be first and töyhtöhyyppä last. So I think that the Fuseki container configuration should be modified to enable support for locale-aware collation. @kinow would you be able to help with this?

Checklist

  • phpUnit tests pass locally with my changes
  • I have added tests that show that the new code works, or tests are not relevant for this PR (e.g. only HTML/CSS changes)
  • The PR doesn't reduce accessibility of the front-end code (e.g. tab focus, scaling to different resolutions, use of .sr-only class, color contrast)
  • The PR doesn't introduce unintended code changes (e.g. empty lines or useless reindentation)

@osma osma added this to the 3.0 milestone Aug 25, 2023
@osma osma self-assigned this Aug 25, 2023
Copy link
Collaborator

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After starting up containers, the new init_fuseki.sh script sleeps for 5 seconds before trying to load vocabularies. I had to add this as a stopgap because Fuseki wasn't always immediately ready to handle the queries. But I think a more elegant mechanism would be needed here, for example waiting in a loop that checks if Fuseki is ready.

I will have time to test it in some days or in one or two weeks. But with a healthcheck we should be able to remove the sleep 5.

https://docs.docker.com/compose/compose-file/compose-file-v3/#healthcheck

@osma
Copy link
Member Author

osma commented Aug 25, 2023

I will have time to test it in some days or in one or two weeks. But with a healthcheck we should be able to remove the sleep 5.

Thanks @kinow . This is not a big problem, I can probably find a better solution myself and the sleep 5 works for now.

The bigger issue from my perspective is the lack of support for locale-aware collation in the Fuseki container. I think the Jena Fuseki Dockerfile needs to be modified, perhaps installing more locale packages and/or generating locales. But it's based on Alpine Linux and I'm not very familiar with it. Do you have an idea how to approach this?

@kinow
Copy link
Collaborator

kinow commented Aug 25, 2023

For Alpine I normally start a container with alpine and then do an apk search $pkg, apk add $pkg. Then if it works I update the dockerfile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now it clicked to me. We will have to modify this image to add the extra packages. Meaning it might be harder for us to upgrade it as Jena does so 😥

The best solution would really be if with every release of Jena, they also published the Docker image to Docker Hub…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly.

It could also be argued that there is a bug, or at least a missing feature, in the Jena Fuseki Dockerfile. Jena itself supports locale-aware collation, but the container currently doesn't. Is this something that should be reported as a Jena issue? Or contributed to upstream Jena when we have figured out a fix?

I've looked a bit at Alpine locale support, as explained in this post. It seems to me that locale support in Alpine is a bit lacking, but it is possible to install a limited set of locales like in this recent gist. I didn't yet have time to test this in practice with the Jena Dockerfile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could also be argued that there is a bug, or at least a missing feature, in the Jena Fuseki Dockerfile. Jena itself supports locale-aware collation, but the container currently doesn't. Is this something that should be reported as a Jena issue? Or contributed to upstream Jena when we have figured out a fix?

I've looked a bit at Alpine locale support, as explained in this post. It seems to me that locale support in Alpine is a bit lacking, but it is possible to install a limited set of locales like in this recent gist. I didn't yet have time to test this in practice with the Jena Dockerfile.

Yes, and if Alpine limits the installation or configuration of these locales, we could switch to another slim image, like debian-slim.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a look and the Jena container has only the English locale. The blog & gist you linked sound like the perfect solution, @osma. I've raised an issue in Jena, with three implementation options. Let me know what you think (here or there) 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kinow !

I'm currently playing around with this. I found a more up to date recipe for installing Alpine locale support, not specific to Docker, but seems applicable. It seems it would be nowadays possible to install locale support for Alpine without compiling anything.

Alas, I haven't been able to make ARQ collation work yet. I'm considering skipping the failing PHPUnit test for now, so that I can continue working on other aspects of the testing infrastructure and leave this for later (maybe even hoping it will get fixed on the Jena side 🤞). Thanks for opening the Jena issue!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for replying there, and for the better explanation & examples. I'll try to have a look and see if I can help, but I am in Brazil right now so timezone & family events might stop me from helping much, I am afraid 🙇

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kinow and no hurry, enjoy your time in Brazil!

I decided to skip the collation unit test for now, so that we can move on with this PR. I will open a separate Skosmos issue that tracks the collation problem on this side. I hope that it can be fixed directly in upstream Jena and then we can copy the solution to our Dockerfile as well.

@codecov
Copy link

codecov bot commented Aug 30, 2023

Codecov Report

Patch coverage has no change and project coverage change: +6.10% 🎉

Comparison is base (3cb4297) 63.89% compared to head (e541940) 70.00%.
Report is 38 commits behind head on skosmos-3.

Additional details and impacted files
@@               Coverage Diff               @@
##             skosmos-3    #1496      +/-   ##
===============================================
+ Coverage        63.89%   70.00%   +6.10%     
- Complexity        1634     1770     +136     
===============================================
  Files               32       32              
  Lines             4290     4657     +367     
===============================================
+ Hits              2741     3260     +519     
+ Misses            1549     1397     -152     

see 13 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@osma
Copy link
Member Author

osma commented Aug 30, 2023

I created a separate issue about the failing collation so that we can move on with this PR.

Also, I made init_fuseki.sh wait for Fuseki to start up using a bash while loop and curl -f. I couldn't get the docker compose healthcheck to work in a useful way, but this gets the job done and is better than blind sleep 5 which seems to fail once in a while.

@osma osma marked this pull request as ready for review August 30, 2023 08:16
@osma osma requested a review from miguelvaara August 30, 2023 08:18
@osma
Copy link
Member Author

osma commented Aug 30, 2023

Still TODO:

  • check and update the wiki documentation for running the unit tests (now that docker is required)
  • adjust github actions CI: no need to cache the fuseki installs in the traditional way
  • do we need to cache Docker images/layers instead?

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

Copy link
Contributor

@miguelvaara miguelvaara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! 👍

When I ran init_fuseki.sh, I got the following error message:
./init_fuseki.sh: line 5: docker: command not found
Waiting for Fuseki to get ready
...waiting...
...waiting...

I installed Docker Ubuntu 22.04 packages, but the result was the same. Based on the tip you provided 👍 , I installed the official Docker packages, after which the tests passed. Although the tests had some skip parts, the intention was not to test the application itself, but to test its testing. And it works very well! 👍

However, here are some minor formatting-related suggestions:

The arrangement/style of code in the Dockerfile.ubuntu (review & code) file is inconsistent. It would be easier to read and interpret the file if it followed the same style from start to finish:

There are two newlines between the following, although usually there are no:
RUN php composer.phar install --no-dev
[newLine]
[newLine]
# Configure Skosmos

Sometimes related or similar commands are grouped together under one header, and sometimes they are not:
# skosmos layer
COPY . /var/www/html
RUN php composer.phar install --no-dev

vs.

RUN a2enmod rewrite
RUN a2enmod expires

Sometimes a single and separate command is labeled, and sometimes it's not:
# timezone
RUN sed -i 's/;date.timezone =/date.timezone = "UTC"/g' /etc/php/8.1/apache2/php.ini

vs.

COPY dockerfiles/config/000-default.conf /etc/apache2/sites-available/000-default.conf

It certainly does not matter what formatting we follow, but it does matter that it is consistent and easy to read.

@kinow
Copy link
Collaborator

kinow commented Aug 31, 2023

It certainly does not matter what formatting we follow, but it does matter that it is consistent and easy to read.

👍 good feedback! Either on this PR or later we can do a cleanup of the files for docker and improve docs, formatting, and consistency -- when I updated these recently I probably didn't take all those aspects into account 🙇

@osma osma changed the base branch from skosmos-3 to master August 31, 2023 16:27
@osma osma changed the base branch from master to skosmos-3 August 31, 2023 16:27
@osma
Copy link
Member Author

osma commented Aug 31, 2023

Thanks for testing and feedback @miguelvaara !

I see now that due to the complex branching history, this PR shows commits that were already merged into the skosmos-3 branch earlier, and also the diff view is similarly misleading. This PR actually does not change Dockerfile.ubuntu at all from how it already looks on the skosmos-3 branch. The recent changes to Dockerfile.ubuntu (upgrading to Ubuntu 22.04 and PHP 8.1) were done in PR #1498, already merged.

Rebasing this PR on top of current skosmos-3 branch should fix the commits and diff view, but I don't want to do that, because I've already opened PR #1508 which includes the same commits, and rebasing would likely cause a conflict.

I tried changing the base branch to master and back to skosmos-3, and it helped a bit, as the number of commits in this PR went from 8 to 5, but still the diff view incorrectly shows changes to Dockerfile.ubuntu which are not actually changes w.r.t. to the current skosmos-3 branch.

Would it be OK to merge this PR and deal with the formatting of Dockerfile.ubuntu (which is not related to this PR at all) separately?

@osma
Copy link
Member Author

osma commented Aug 31, 2023

Actual changes in this PR can be seen in this diff view.

@miguelvaara
Copy link
Contributor

Would it be OK to merge this PR and deal with the formatting of Dockerfile.ubuntu (which is not related to this PR at all) separately?

The code cleanup can be handled at a later time. I see no obstacle to merging 👍

@osma osma merged commit cfedf5d into skosmos-3 Sep 1, 2023
@osma osma deleted the issue1488-phpunit-fuseki-docker branch September 1, 2023 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done (verified in test.dev.finto.fi, set Milestone 3.0 for both issue & PR)
Development

Successfully merging this pull request may close these issues.

3 participants