This repository has been archived by the owner on Mar 30, 2023. It is now read-only.

Releases: RSE-Cambridge/data-acc

Fixes for large buffers.

12 Mar 15:48
4f78c3a
Pre-release

Very large buffers eventually stop requesting an MDT for every OST, which previously caused a failure.

sha256 576717be2abc9b37dd1f07f30320f42c9c2ce35d1b929c446dc0348759136556

Extra config to aid testing

12 Mar 11:36
50a6896
Pre-release

Added support for the DAC_MAX_MDT_COUNT and DAC_MDT_SIZE_MB config settings.

sha256 beed4ab5ee72f68b244c4e5fd3d32584c7c5d0efcf165c7077ba7d468c2b8efb

Move from Partitions to LVM

11 Mar 11:32
b780e7a
Pre-release

Includes replacing the DAC_MDT_SIZE setting with DAC_MDT_SIZE_GB, i.e. drop any units from your old config and set it to a plain integer number of GB.
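
As an illustration of the "plain integer" requirement, here is a minimal Go sketch of how such a value could be validated. The helper name parseMDTSizeGB and its error messages are hypothetical and are not taken from the data-acc code base; only the setting name DAC_MDT_SIZE_GB comes from this release note.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMDTSizeGB is a hypothetical helper (not from data-acc) that accepts
// only a bare positive integer, so an old-style value with units such as
// "20GB" is rejected instead of being silently misread.
func parseMDTSizeGB(raw string) (int, error) {
	value, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil {
		return 0, fmt.Errorf("DAC_MDT_SIZE_GB must be a plain integer number of GB, got %q", raw)
	}
	if value <= 0 {
		return 0, fmt.Errorf("DAC_MDT_SIZE_GB must be positive, got %d", value)
	}
	return value, nil
}
```

In this sketch a caller would pass in the raw setting, for example the result of os.Getenv("DAC_MDT_SIZE_GB"), and refuse to start on error.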

sha256 26671f456294a7d22839a58f3bf979617b084d604d3e931aec998d87abd136f6

Wait for umount to succeed

09 Jan 17:38
6fba633
Pre-release

Removes lazy umount, so we spot umount errors when they happen. Also times out commands that hang (for example, ansible runs talking to clients with a dead mount). Reworks some of the very old watching logic to use the new channel-based code paths.

sha256 21dc9f34952309dfd47306a249d6a5ec7be21ee38f2e73836893c09b0f67c131

Unmount related fixes

08 Jan 08:51
1a00af1
Pre-release

Only attempt to mount and umount the attachments we have detected as changed. Also ignore any swapoff-related errors, as they are not critical and likely interact with some sites' Slurm post-amble scripts.

sha256 a42a11f18c4ec40a347c37c18024a217a1cfa77d8b04faea2df34b70ef1ae83e

Stability Fixes

07 Jan 12:14
98ffcad
Pre-release

Hold a lock while doing any volume operation, to avoid the races between the setup and teardown logic that we are currently seeing.

sha256 a21aef60b907de200a1a3dfe8ed14d014aec2ef6fb47ef4cca92a42580d9a88c

Every OST has matching MDT

04 Jan 15:41
18cc1c6
Pre-release

First attempt at giving every OST its own matching MDT by partitioning the assigned device.

NOTE: to adopt this release you need to rebuild your full dac environment, because fs-ansible is not backwards compatible with old releases.

sha256 c59fffada05f8dd13dd3797c96e84c7cf9259687284c9970d8306178dc9a347f

setup allows nodehostnamefile

03 Jan 22:54
8dbaa2e
Pre-release

Turns out setup can be passed nodehostnamefile. It is just accepted and ignored for now.

sha256 0cca50ad72a4064cf7be24aa7d562bdd3b96d6da081b628782623539ea5e939a

dacd watch fixes

03 Jan 16:56
6e41214
Pre-release

Re-worked how dacd watches for new volumes and how it watches volumes once it sees them appear. Extra debug info added to help track down when we seem to lose events. Includes calling each instance of processNewPrimaryBlock inside a new goroutine.

sha256 ba7b956cbef2b5143294e27b93dc521a784e156778780ceccfcdb880b73fe0f6

Very experimental stage_in and stage_out support

03 Jan 09:33
90b482f

Try using rsync as the buffer user to stage in and out. Needs lots of hardening, and only tries to support buffers in $DW_JOB_STRIPED mode.

To work around any stage_in and stage_out related failures, remove #DW stage_in and stage_out from your submission scripts.

sha256 ecc6fef8eafcb13208817be810b4b09ab12f44a3d755a52ad608fb898bfb249b