Add HA autotests #427
@JeffreyDevloo, can you please give an update on the status?
Jeffrey's update:
My virtual environment:
```shell
export LIBOVSVOLUMEDRIVER_XIO_KEEPALIVE_TIME=60
export LIBOVSVOLUMEDRIVER_XIO_KEEPALIVE_INTVL=20
export LIBOVSVOLUMEDRIVER_XIO_KEEPALIVE_PROBES=3
/tmp/fio.bin.latest --iodepth=32 --rw=readwrite --bs=4k --direct=1 \
  --rwmixread=100 --rwmixwrite=0 --ioengine=openvstorage \
  --hostname=10.100.69.121 --port=26203 --protocol=tcp --enable_ha=1 \
  --group_reporting=1 \
  --name=test --volumename=ci_scenario_hypervisor_ha_test_vdisk01 \
  --name=test2 --volumename=cnanakos3
exec bash
```
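To drive this from a test script rather than a shell one-liner, the keepalive overrides and fio flags above can be assembled programmatically. A minimal sketch, assuming the same binary path and flags as the command above (the helper name `build_fio_invocation` is mine, not from the test code):

```python
import os
import subprocess


def build_fio_invocation(hostname, port, volumename):
    """Build env + argv for an HA fio run with relaxed TCP keepalive settings."""
    env = dict(os.environ)
    # Same keepalive overrides as in the shell snippet above.
    env.update({
        "LIBOVSVOLUMEDRIVER_XIO_KEEPALIVE_TIME": "60",
        "LIBOVSVOLUMEDRIVER_XIO_KEEPALIVE_INTVL": "20",
        "LIBOVSVOLUMEDRIVER_XIO_KEEPALIVE_PROBES": "3",
    })
    argv = [
        "/tmp/fio.bin.latest",
        "--iodepth=32", "--rw=readwrite", "--bs=4k", "--direct=1",
        "--rwmixread=100", "--rwmixwrite=0", "--ioengine=openvstorage",
        "--hostname={}".format(hostname), "--port={}".format(port),
        "--protocol=tcp", "--enable_ha=1", "--group_reporting=1",
        "--name=test", "--volumename={}".format(volumename),
    ]
    return env, argv


# Not executed here, since fio and the cluster are external:
# env, argv = build_fio_invocation("10.100.69.121", 26203,
#                                  "ci_scenario_hypervisor_ha_test_vdisk01")
# subprocess.check_call(argv, env=env)
```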
Problems along the way:

- **Use of a not-so-powerful environment**: wait times were a lot longer. Initially tested with virtual machines, but their setup took way too much time; switched to FIO instead to get quick results.
- **Not killing the volumedriver process but killing the whole node**: the connection will remain when the node is shot, as the host can no longer respond (if it is alive but no longer listening on the port, the kernel will kill the internal connection). This initially gave issues, to the point where we had to adjust timeout settings.
- **Not being able to manage the parent hypervisor of my environment**: the KVM SDK that I wrote for creating and managing VMs cannot take a password without a serious config change on the hypervisor.
- **Threading in Python**: I am no expert in threading and thread management. Working with threads that perform different tasks, such as monitoring, was a challenge; controlling the flow of those threads even more so, as I needed to coordinate them around reading my shared resource.

Test cases

Used settings (the settings below are now the default):
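Regarding the thread-control problem mentioned above: one common pattern is to guard the shared resource with a `threading.Lock` and signal the monitor thread with a `threading.Event`. A minimal sketch with illustrative names (not the actual test code):

```python
import threading
import time

shared_results = []                  # shared resource written by the monitor
results_lock = threading.Lock()      # guards shared_results
stop_monitoring = threading.Event()  # lets the main thread stop the monitor


def monitor(poll_interval=0.01):
    """Poll some condition until told to stop, recording observations."""
    while not stop_monitoring.is_set():
        with results_lock:
            shared_results.append("sample")  # placeholder for a real probe
        time.sleep(poll_interval)


t = threading.Thread(target=monitor)
t.start()
time.sleep(0.05)          # main thread does its own work here
stop_monitoring.set()     # ask the monitor to finish
t.join()                  # wait for it, then read the shared resource safely
with results_lock:
    print(len(shared_results) > 0)
```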
Fio cmd:
Fail over 1 volume
Fail over 25 volumes
Fail over 50 volumes
Fail over 100 volumes
Only 55-56 volumes failed over; the others reported an IO error (expected, as all connections were aborted due to the FD limit).
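The FD-limit failure mode above can be checked up front in the test instead of surfacing as IO errors mid-run. A sketch using the stdlib `resource` module; the FDs-per-volume estimate is an assumption to be tuned for the real client:

```python
import resource


def check_fd_headroom(num_volumes, fds_per_volume=2, reserve=64):
    """Return True if the soft RLIMIT_NOFILE can accommodate the run.

    fds_per_volume is a rough assumption (e.g. one data plus one control
    connection per volume); adjust it for the actual client behaviour.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = num_volumes * fds_per_volume + reserve
    if needed <= soft:
        return True
    # Try to raise the soft limit up to the hard limit before giving up.
    new_soft = min(needed, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return needed <= new_soft
```

Calling `check_fd_headroom(100)` before the 100-volume fail-over would either raise the soft limit or let the test skip/fail early with a clear reason.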
Increasing BS to 1 MB, as it previously showed errors. Fio cmd:
Changes in cmd:
Fail over 1 volume
Fail over 25 volumes
Fail over 50 volumes
Fail over 100 volumes
Fixed by #442
The test will require the parent hypervisor information for the first tests. This can be changed later by killing certain processes instead.
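For the later variant that kills processes instead of going through the hypervisor, the test can SIGKILL the target process directly. A sketch against a stand-in process (a `sleep` here; in the real test the target would be the volumedriver PID):

```python
import os
import signal
import subprocess

# Stand-in for the process we want to trigger a fail-over from.
victim = subprocess.Popen(["sleep", "60"])

# Simulate the failure: hard-kill, so the process cannot catch or handle it.
os.kill(victim.pid, signal.SIGKILL)
victim.wait()

print(victim.returncode)  # negative signal number on POSIX: -9 for SIGKILL
```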
Flow: