Releases: gwu-libraries/sfm-docker
Version 2.5.0
- See changes to example.docker-compose.yml, example.prod.docker-compose.yml, and smoketests.docker-compose.yml reflecting upgraded images in this release (using Python 3.8/Debian 10) (sfm-ui #1071).
- Upgrades the processing and smoketests images to use Python 3.8.
- Upgrades dependencies in those images for use with Python 3.8.
Version 2.4.0
Changes in this release:
- Introduces support for hosting data volumes on different filesystems, rather than as subdirectories in a single sfm-data directory (#30). This allows storage for RabbitMQ, Postgres, and SFM data for exports, containers, and collection sets to be separately configured. Thank you, @SvenLieber, for code contributions to add this feature! Existing SFM instances upgrading to 2.4.0 should read the notes below carefully for required new environment variables and changes to docker-compose.yml.
- Updates dependencies in the processing container, including Twarc and JWAT Tools. (#31)
- Adds configuration for AWS Elastic Load Balancer. (#22) Thanks, @justinlittman!
New configurations for existing SFM instances:
IMPORTANT: Because of the numerous changes required for existing .env and docker-compose.yml files, we strongly recommend either:
- Re-copying the new example.env to .env and the new example.prod.docker-compose.yml to docker-compose.yml, and re-customizing.
- Updating .env and docker-compose.yml based on the new example.env and example.prod.docker-compose.yml files in version 2.4.0.
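The first option can be sketched as a small shell function. The function name refresh_config and the .bak backup suffix are illustrative assumptions, not part of sfm-docker; the example file names are the ones listed above.

```shell
# Hypothetical helper: back up the current config, then start over from the
# 2.4.0 example files. Pass the path of your sfm-docker directory (default: .).
refresh_config() {
    dir="${1:-.}"
    # Keep copies of the current files so customizations can be re-applied.
    cp "$dir/.env" "$dir/.env.bak" || return 1
    cp "$dir/docker-compose.yml" "$dir/docker-compose.yml.bak" || return 1
    # Replace the live files with the new examples.
    cp "$dir/example.env" "$dir/.env" || return 1
    cp "$dir/example.prod.docker-compose.yml" "$dir/docker-compose.yml" || return 1
}
```

After running it, compare .env.bak with the new .env (for example with diff) and re-apply your own settings by hand.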
Existing SFM instances will need to set the following new variables in their .env to point to their existing data directories.
DATA_VOLUME_MQ
DATA_VOLUME_DB
DATA_VOLUME_EXPORT
DATA_VOLUME_CONTAINERS
DATA_VOLUME_COLLECTION_SET
Example: Each value has the path to your data on the host filesystem on the left of the : and the path inside the container on the right, in the general form:
DATA_VOLUME_MQ=/some-local-path/sfm-mq-data:/sfm-mq-data
If your data is currently on the filesystem in /sfm-data, use the following values:
DATA_VOLUME_MQ=/sfm-data:/sfm-mq-data
DATA_VOLUME_DB=/sfm-data:/sfm-db-data
DATA_VOLUME_EXPORT=/sfm-data:/sfm-export-data
DATA_VOLUME_CONTAINERS=/sfm-data:/sfm-containers-data
DATA_VOLUME_COLLECTION_SET=/sfm-data:/sfm-collection-set-data
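Before restarting SFM, it may help to confirm that all five variables are set and that each host path exists. The following is a minimal sketch; check_data_volumes is a hypothetical helper, and it assumes each variable appears in .env as an uncommented NAME=host:container line.

```shell
# Hypothetical check: for each new DATA_VOLUME_* variable, confirm it is set
# in the .env file and that the host side (left of the ":") is a directory.
check_data_volumes() {
    env_file="${1:-.env}"
    status=0
    for name in DATA_VOLUME_MQ DATA_VOLUME_DB DATA_VOLUME_EXPORT \
                DATA_VOLUME_CONTAINERS DATA_VOLUME_COLLECTION_SET; do
        # Take the last uncommented assignment of this variable, if any.
        value=$(sed -n "s/^${name}=//p" "$env_file" | tail -n 1)
        if [ -z "$value" ]; then
            echo "$name: not set"
            status=1
            continue
        fi
        host_path=${value%%:*}   # everything before the first ":"
        if [ -d "$host_path" ]; then
            echo "$name: ok ($host_path)"
        else
            echo "$name: host path missing ($host_path)"
            status=1
        fi
    done
    return $status
}
```

Calling check_data_volumes .env prints one line per variable and returns a non-zero status if any variable is unset or points at a missing directory.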
In order for SFM to find WARCs and exports at former internal paths, configure in .env:
DATA_VOLUME_FORMER_COLLECTION_SET
DATA_VOLUME_FORMER_EXPORT
Example:
DATA_VOLUME_FORMER_EXPORT=/sfm-data/export:/sfm-data/export
DATA_VOLUME_FORMER_COLLECTION_SET=/sfm-data/collection_set:/sfm-data/collection_set
In docker-compose.yml, these data volumes need to be uncommented at the end of the data container definition.
Monitoring space usage on a shared filesystem
With data volumes now configurable to live on mounted filesystems, SFM will monitor space usage of each volume. Thresholds to trigger warning emails can be set for each volume. However, most SFM instances currently store all data on a single filesystem. For monitoring to be meaningful in that case, existing SFM instances must include the new environment variables (see example.env).
- In .env: set DATA_SHARED_USED to True and set DATA_SHARED_DIR to the path of the parent directory on the filesystem, e.g. /sfm-data.
- In .env: Provide a threshold for space usage warning emails to be sent by updating DATA_THRESHOLD_SHARED.
- In docker-compose.yml: uncomment the volumes section in the ui container definition so that the DATA_SHARED_DIR is accessible to SFM for monitoring. See example.prod.docker-compose.yml and example.docker-compose.yml for these configurations.
- SFM instances which are not using a shared filesystem for data and which are making use of the new option to store data volumes on mounted filesystems should:
  - In .env: set DATA_SHARED_USED to False and comment out DATA_SHARED_DIR.
  - In docker-compose.yml, comment out the ui container's volumes section which refers to DATA_SHARED_DIR.
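When choosing a value for DATA_THRESHOLD_SHARED, it can help to see how full the shared filesystem already is. The sketch below is not SFM's own monitoring code; shared_dir_usage is a hypothetical helper that wraps the standard df command. See example.env for the format the threshold value should take.

```shell
# Hypothetical helper: print the percentage of space used on the filesystem
# that holds the given directory (for example, your DATA_SHARED_DIR).
shared_dir_usage() {
    # -P selects the portable one-line-per-filesystem format; -k reports in KiB.
    # Column 5 of the data row is the used-capacity percentage.
    # Assumes the filesystem name in column 1 contains no spaces.
    df -Pk "$1" | awk 'NR == 2 { print $5 }'
}
```

For a shared data directory at /sfm-data, run shared_dir_usage /sfm-data.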
Changes to Postgres and RabbitMQ environment variable names
There are several new environment variables that must be included in docker-compose.yml container definitions. See example.prod.docker-compose.yml and example.docker-compose.yml for these updates. Note in particular:
- The db container's reference to POSTGRES_PASSWORD is changed to POSTGRES_PASSWORD=${SFM_POSTGRES_PASSWORD}.
- The mq container has changes to environment variables.
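A quick way to see whether an existing docker-compose.yml already carries the db change is to search for the new reference. This is a minimal sketch; uses_new_db_password is a hypothetical helper, and it only recognizes the exact string shown above, so a file that writes the setting in a different YAML form would need a manual check.

```shell
# Hypothetical check: succeed only if the compose file passes the Postgres
# password through the new SFM_POSTGRES_PASSWORD variable.
uses_new_db_password() {
    # -F matches the string literally, so "$", "{" and "}" are not special.
    grep -qF 'POSTGRES_PASSWORD=${SFM_POSTGRES_PASSWORD}' "${1:-docker-compose.yml}"
}
```

For example: uses_new_db_password docker-compose.yml || echo "db container still uses the old reference".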
Version 2.3.0
This release requires an upgrade of the Postgres database. See required upgrade steps below.
Changes in this version include:
- Upgrade of Postgres database from 9.4 to 9.6.
- Optional cookie consent pop-up (#1009). Instructions for enabling below.
- Optional GW footer (#1003). Instructions for enabling below.
For a complete list of tickets, see sfm-ui milestone 2.3.0.
Upgrading Postgres
Stop SFM and bring up only the database container
- Stop containers
docker-compose stop -t 180 twitterstreamharvester
docker-compose stop -t 45
- Bring up just the database container
docker-compose up -d db
Create a backup
- Before doing the upgrade, we recommend you first create a backup of the database, using the following command, where pgdump is the name of the backup file:
docker exec sfm_db_1 pg_dumpall -U postgres > pgdump
- You can then review the dump file:
cat pgdump | less
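Beyond reading the dump by eye, a quick automated check can catch an empty or truncated backup before the old container is removed. The sketch below relies on the closing comment that pg_dumpall writes at the end of a cluster dump; backup_looks_complete is a hypothetical helper, and passing this check does not replace a test restore.

```shell
# Hypothetical sanity check for a pg_dumpall backup file: it must be
# non-empty and contain the closing marker written at the end of the dump.
backup_looks_complete() {
    [ -s "$1" ] && grep -q 'PostgreSQL database cluster dump complete' "$1"
}
```

For example: backup_looks_complete pgdump || echo "backup is empty or truncated; do not proceed".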
Upgrade the database
- Remove the existing database container
docker-compose stop db
docker-compose rm -v db
- Create an initial Postgres 9.6 database in a new directory alongside the existing postgres database. Use the path for your /sfm-data/postgresql directory as the first element of the volume parameter in the docker run command. Substitute your actual postgres password (this is in your .env file) for password. For example, if your existing database is within /sfm-data/postgresql (it is probably in /sfm-data/postgresql/data) and your password is password123, the command would look like:
docker run --name postgres -d -v /sfm-data/postgresql/9.6/data:/var/lib/postgresql/data \
-e POSTGRES_PASSWORD=password123 postgres:9.6
- Stop and remove the postgres container:
docker stop postgres
docker rm -v postgres
- Run the postgres upgrade image, changing the sfm-data path to match yours:
docker run --rm \
-v /sfm-data/postgresql/data:/var/lib/postgresql/9.4/data \
-v /sfm-data/postgresql/9.6/data:/var/lib/postgresql/9.6/data \
tianon/postgres-upgrade:9.4-to-9.6
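Once the upgrade image has finished, the new data directory can be checked through the PG_VERSION file that Postgres keeps at the top of every data directory. This is a minimal sketch; pg_data_version is a hypothetical helper, and because the directory is typically owned by the postgres user inside the container, reading it from the host may require elevated privileges.

```shell
# Hypothetical check: print the major version recorded in a Postgres data
# directory. Postgres writes this to a file named PG_VERSION.
pg_data_version() {
    cat "$1/PG_VERSION"
}
```

With the paths used above, pg_data_version /sfm-data/postgresql/9.6/data should report 9.6, while the old /sfm-data/postgresql/data directory should still report 9.4.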
Proceed with the rest of the SFM upgrade
Continue with the SFM upgrade, following step 2 of the upgrade instructions, "Make a copy of your existing docker-compose.yml and .env files".
Cookie consent popup
Version 2.3.0 adds a new configurable cookie consent popup. The user's consent is valid until the user clears their browser cookies, for a maximum of 365 days. This feature is disabled by default.
To enable and configure the cookie consent popup, you will need to modify two files in your sfm-docker directory:
- docker-compose.yml. View 2.2.0...2.3.0 to see the new lines added to example.prod.docker-compose.yml and example.docker-compose.yml. Apply the same changes to your docker-compose.yml.
- .env (environment settings file). View 2.2.0...2.3.0 to see the new lines added to example.env. Copy these new lines into your .env file. Configure the new variables as follows:
  - Set SFM_ENABLE_COOKIE_CONSENT to True.
  - Modify SFM_COOKIE_CONSENT_HTML to your institution's preferred message text. Note that the text may include HTML tags; for example, you may wish to use <a href> to link to your institution's privacy policy.
  - If desired, modify SFM_COOKIE_CONSENT_BUTTON_TEXT to change the wording on the button that closes the message banner. The default wording is I consent.
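To review what is currently configured, the three cookie consent variables can be listed from .env in one step. This is a minimal sketch; show_cookie_settings is a hypothetical helper and it only reports uncommented lines.

```shell
# Hypothetical helper: list the cookie consent settings currently active
# (uncommented) in an .env file.
show_cookie_settings() {
    grep -E '^SFM_(ENABLE_COOKIE_CONSENT|COOKIE_CONSENT_HTML|COOKIE_CONSENT_BUTTON_TEXT)=' "${1:-.env}"
}
```

If show_cookie_settings .env prints nothing, the new lines from example.env have not yet been copied in or are still commented out.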
GW footer
Version 2.3.0 adds a new, GW-specific footer which is disabled by default. When enabled, the GW footer appears below the standard footer. If you opt to use this footer, you will need to modify two files in your sfm-docker directory:
- In your .env file, set SFM_ENABLE_GW_FOOTER to True. View 2.2.0...2.3.0 to see the new lines added to example.env.
- In your docker-compose.yml file, add SFM_ENABLE_GW_FOOTER to the environment variables for the ui container. View 2.2.0...2.3.0 to see the new lines added to example.prod.docker-compose.yml and example.docker-compose.yml.
Release notes for specific components:
Version 2.2.0
Bump version in example files.
Version 2.1.0
This release adds the SFM_EMAIL_FROM environment variable. It is optional, unless you are using AWS SES (Simple Email Service) to send notification emails.
Version 2.0.2
Various minor tweaks:
- Fixed serialization / deserialization and other management commands.
- Fixed display issue with credentials on collection detail page.
- Made SFM UI queue length configurable.
See release notes for 2.0.0 for relevant information. As an alternative to a full upgrade, you can set only SFM UI and SFM UI Consumer to 2.0.2.
Version 2.0.1
Patch for warcprox threading bug.
See release notes for 2.0.0 for relevant information.
Version 2.0.0
Major improvements in SFM in version 2.0.0:
- Upgraded to Python 3.
- Upgraded to Django 2.
- Upgraded to warcprox 2.
- Upgraded most other dependencies to latest.
- Replaced deprecated IA WARC library with warcio.
Known issues:
- All existing scheduled harvests are removed. See deployment notes for how to handle.
Release notes for specific components:
- sfm-ui
- sfm-utils
- sfm-twitter-harvester
- sfm-flickr-harvester
- sfm-weibo-harvester
- sfm-tumblr-harvester
For a complete list of tickets, see sfm-ui milestone 2.0.0.
To upgrade to this version of SFM, follow the general upgrade instructions.
Because of changes in apscheduler (which is used to schedule harvests), all scheduled jobs are purged during the upgrade. To fix this, all collections that are turned on (excluding Twitter filter and sample stream collections) must be turned off and then turned back on.
A collection that must be rescheduled is on, but is not scheduled. This is indicated on the collection detail page by a red Turn off button on the upper right, but no blue notification that says "Next harvest scheduled for ...".
If you press the Turn off button and then the Turn on button, the collection will be scheduled, as indicated by the blue notification that says "Next harvest scheduled for ...".
After this upgrade, monitor your collections to make sure harvesting is occurring properly.
Version 1.12.1
Installed python3 in processing container.
To use this patch, change the version of processingcontainer to 1.12.1 in docker-compose.yml.
Version 1.12.0
Major improvements in SFM in version 1.12.0:
- Deprecated web harvester. This will be replaced by other approaches in future releases that involve sending URLs to external web archives. Existing web harvests will not be deleted, but no new web harvests will be performed.
- Deprecated ELK. This is replaced by TweetSets, which provides a more scalable approach for indexing social media posts in ElasticSearch. An existing ELK instance can continue to run, but no new social media posts will be loaded.
- To improve citability of datasets, added public links field to collections and citation guidance to documentation.
- Added automatic, configurable seed deletion for seeds that have been suspended, deleted, protected, etc.
- Added support for deactivating credentials, for credentials which are no longer valid.
- Removed pinning of transitive dependencies to assist with managing dependency change.
- Worked to enable clean shutdown (status code 0) of containers.
- Switched to using Twarc's Json2Csv for exporting tweets.
Changes in sfm-docker:
- Upgraded processing containers to newer Ubuntu and added / upgraded tools.
- Removed ELK and web harvester.
Changes in docs:
- Fixed links to Twitter docs.
- Added citation guidance page.
- Updated processing container docs to reflect changes / additions.
- Corrected smoke test instructions.
- Deprecated web harvester and ELK.
- Updated Twitter data dictionary to reflect change in Twitter export.
- Updated Export documentation to add detail about time zones.
Known issues:
- No significant known issues
Release notes for specific components:
- sfm-ui
- sfm-utils
- sfm-twitter-harvester
- sfm-flickr-harvester
- sfm-weibo-harvester
- sfm-tumblr-harvester
For a complete list of tickets, see sfm-ui milestone 1.12.0.
To upgrade to this version of SFM, follow the general upgrade instructions. In your .env file, remove the WEB HARVESTER CONFIGURATION SECTION and the WEB_REQS line.
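The WEB_REQS part of this cleanup can be scripted; the web harvester configuration section is best removed by hand, since its exact extent depends on your file. The sketch below uses a hypothetical helper, remove_web_reqs, and assumes the line starts with WEB_REQS= and that a .bak copy next to the file is acceptable.

```shell
# Hypothetical helper: delete the WEB_REQS line from an .env file, keeping a
# backup copy next to it. Assumes the line starts with "WEB_REQS=".
remove_web_reqs() {
    env_file="${1:-.env}"
    # Save the original first, then rewrite the file without the WEB_REQS line.
    cp "$env_file" "$env_file.bak" || return 1
    sed '/^WEB_REQS=/d' "$env_file.bak" > "$env_file"
}
```

After running remove_web_reqs .env, the original file remains available as .env.bak.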
Also, change the versions of twitterrestexporter and twitterstreamexporter to 1.12.1 in your docker-compose.yml file.
After SFM is upgraded, execute docker system prune -a.